CN109711718A - Hierarchical transfer learning method, medium, apparatus, and device for few-sample classification - Google Patents

Hierarchical transfer learning method, medium, apparatus, and device for few-sample classification

Info

Publication number
CN109711718A
CN109711718A CN201811593430.3A
Authority
CN
China
Prior art keywords
sample
distinguishing characteristics
level
training data
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811593430.3A
Other languages
Chinese (zh)
Inventor
朱军
田天
杨建军
王思宇
宋世虹
郭楠
程雨航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201811593430.3A
Publication of CN109711718A
Legal status: Pending


Abstract

Embodiments of the present invention provide a hierarchical transfer learning method for few-sample classification. The method comprises: establishing a hierarchical model of an object to be processed, such that the model includes at least a control parameter for each level and a distinguishing-feature-based signature for at least one level; and obtaining the control parameter of each level using samples of different categories. With this method, a general model for processing industry sample data can be established: rich semantic annotations for multiple industries are collected by crowdsourcing, model parameters are learned from industries with abundant enterprise samples, and the parameters are then transferred to the analysis of industries where sample data are scarce, reducing the annotation inaccuracy caused by the lack of samples. Embodiments of the present invention further provide a hierarchical transfer learning medium, apparatus, and device for few-sample classification.

Description

Hierarchical transfer learning method, medium, apparatus, and device for few-sample classification
Technical field
Embodiments of the present invention relate to the field of data processing, and more specifically, to a hierarchical transfer learning method, medium, apparatus, and device for few-sample classification.
Background art
This section is intended to provide a background or context for the embodiments of the present invention set forth in the claims. The description here is not admitted to be prior art merely because it is included in this section.
The merit rating of enterprises is of great significance for guiding the upgrading of manufacturing enterprises and for the relevant industrial departments in promoting the development of manufacturing. An enterprise's merit rating usually relies on a capability rating model of the enterprise as its main basis, but in practice the assessment often faces the problem of unbalanced sample sizes across industries: enterprise data are relatively abundant in some fields and relatively scarce in others. Rating a sample-scarce industry is a challenging problem.
The traditional approach relies on expert knowledge to construct multi-dimensional attribute descriptions for different data samples, but this usually requires a large amount of expert labor, places rather high demands on the experts, and is relatively costly. If the corresponding semantic attributes could instead be obtained from public evaluations and enterprises' self-assessments, the difficulty of the evaluation would be greatly reduced, which would in turn help the transfer learning of samples. In practice, however, although people make feature-based judgments all the time, the process is often subconscious and difficult to articulate; moreover, different people perceive things differently and so attend to different features, which makes it usually infeasible to ask the public directly to state the features on which a certain kind of evaluation is based.
Summary of the invention
Against this background, embodiments of the present invention are intended to provide a hierarchical transfer learning method, medium, apparatus, and device for few-sample classification.
In a first aspect of the embodiments of the present invention, a hierarchical transfer learning method for few-sample classification is provided, comprising:
establishing a hierarchical model of an object to be processed, such that the model includes at least a control parameter for each level and a distinguishing-feature-based signature for at least one level; and
obtaining the control parameter of each level using samples of different categories.
In another embodiment of the present invention, the levels include at least one of a category level, a sample level, and a task level.
In another embodiment of the present invention, obtaining the control parameter of each level using samples of different categories comprises:
obtaining a predetermined quantity of training data using samples of different categories; and
obtaining the control parameter of each level using the training data.
In another embodiment of the present invention, obtaining a predetermined quantity of training data using samples of different categories comprises:
repeatedly performing a task until the predetermined quantity of training data is obtained, wherein the task comprises:
selecting samples of multiple different categories from the samples;
taking one of the selected samples as a target sample;
extracting a distinguishing feature between the other selected samples, i.e., those other than the target sample; and
determining the training data of the target sample based on the target sample, the other selected samples, and the distinguishing feature.
In another embodiment of the present invention, the training data characterizes whether the sample it belongs to is similar, with respect to the distinguishing feature, to one of the samples from which the distinguishing feature was extracted.
In another embodiment of the present invention, the training data is encoded in a predetermined manner.
In another embodiment of the present invention, each task includes two other samples in addition to the target sample.
In another embodiment of the present invention, each training datum is encoded as b^t_{x,ij}, where b represents the training datum, x represents the index of the sample the training datum belongs to, i and j represent the indices of the other, different samples in the task, and t represents the index of the task the training datum belongs to.
In another embodiment of the present invention, the distinguishing-feature-based signature of the category level is used to characterize the performance of the standard sample under that category on each extracted distinguishing feature.
In another embodiment of the present invention, the distinguishing-feature-based signature of the sample level is used to characterize the distance, on each extracted distinguishing feature, between the sample and the standard sample of the category it belongs to.
In another embodiment of the present invention, before constructing the distinguishing-feature-based signatures of the levels, the method further comprises:
merging any two distinguishing features whose pairwise similarity reaches a preset threshold, until the similarity between any two distinguishing features is below the preset threshold.
In another embodiment of the present invention, the similarity between distinguishing features is computed based on the characteristic signatures of the distinguishing features.
In another embodiment of the present invention, the characteristic signature is used to characterize the performance, on the distinguishing feature, of the standard samples of every category other than the category of the sample from which the distinguishing feature was extracted.
In another embodiment of the present invention, the performance on a distinguishing feature includes:
being similar, on the distinguishing feature, to one of the samples from which the distinguishing feature was extracted; and/or
not possessing the distinguishing feature.
In another embodiment of the present invention, the control parameters are obtained using a posterior inference method and the training data.
In another embodiment of the present invention, in the posterior inference method, likelihood functions are constructed based on all performances of the target sample on the distinguishing features in the tasks, and a posterior probability distribution is constructed based on the likelihood functions.
In another embodiment of the present invention, constructing likelihood functions based on all performances of the target sample on the distinguishing features in the tasks comprises:
constructing the likelihood function from the JS divergences between the probability distribution of the target sample's individual signature in a task and the probability distributions of the other samples in the task.
In another embodiment of the present invention, in the posterior inference method, a variational distribution is introduced and maximum a posteriori estimation is performed based on the posterior probability distribution to obtain the control parameters.
In another embodiment of the present invention, the method further comprises:
continually optimizing the variational distribution so that it approaches the posterior probability distribution.
In another embodiment of the present invention, the evidence lower bound of the variational distribution is continually optimized so that the variational distribution approaches the posterior probability distribution.
In another embodiment of the present invention, the evidence lower bound of the variational distribution is optimized using a coordinate ascent method.
In a second aspect of the embodiments of the present invention, a hierarchical transfer learning apparatus for few-sample classification is provided, comprising:
a model building module configured to establish a hierarchical model of an object to be processed, such that the model includes at least a control parameter for each level and a distinguishing-feature-based signature for at least one level; and
a parameter calculating module configured to obtain the control parameter of each level using samples of different categories.
In a third aspect of the embodiments of the present invention, a computer-readable storage medium is provided, storing program code which, when executed by a processor, implements the method described in one of the above embodiments.
In a fourth aspect of the embodiments of the present invention, a computing device is provided, comprising a processor and a storage medium storing program code which, when executed by the processor, implements the method described in one of the above embodiments.
With the method according to the embodiments of the present invention, high-quality semantic annotations can be obtained by crowdsourcing, category attributes can be learned from industries with abundant enterprise samples, and the learned model can be transferred to the analysis of industry samples where data are scarce, reducing the annotation inaccuracy caused by the lack of samples.
Brief description of the drawings
The above and other objects, features, and advantages of the exemplary embodiments of the present invention will become easier to understand by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the present invention are shown by way of example rather than limitation, in which:
Fig. 1 schematically shows a scenario diagram according to an embodiment of the present invention;
Fig. 2 schematically shows a flow diagram of an embodiment of the hierarchical transfer learning method for few-sample classification according to the present invention;
Fig. 3 schematically shows a flow diagram of the step of obtaining training data according to the method of the present invention;
Fig. 4 schematically shows a structural block diagram of the model constructed by the method provided according to an embodiment of the present invention;
Fig. 5 schematically shows a feature-category relation chart provided according to an embodiment of the present invention;
Fig. 6 schematically shows a computer-readable storage medium provided according to an embodiment of the present invention;
Fig. 7 schematically shows a hierarchical transfer learning apparatus for few-sample classification provided according to an embodiment of the present invention;
Fig. 8 schematically shows a computing device provided according to an embodiment of the present invention.
In the accompanying drawings, identical or corresponding reference numerals indicate identical or corresponding parts.
Detailed description of embodiments
The principles and spirit of the present invention are described below with reference to several illustrative embodiments. It should be understood that these embodiments are provided only so that those skilled in the art can better understand and thereby implement the present invention, and not to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the present invention can be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, microcode, etc.), or a combination of hardware and software.
According to embodiments of the present invention, a hierarchical transfer learning method, medium, apparatus, and device for few-sample classification are proposed.
In addition, any number of elements in the drawings is for illustration rather than limitation, and any naming is used only for distinction, without any limiting meaning.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments of the invention.
Application scenario overview
Referring first to Fig. 1, Fig. 1 is a schematic diagram of an application scenario of the hierarchical model constructed by the present invention. The left part of the figure shows the sampled data of multiple objects to be processed (N sampled data per object, as shown in the figure); the right part shows the multiple features obtained after the sampled data are processed by the hierarchical model of the object to be processed, constructed according to the method disclosed by the present invention (M features per object, as shown in the figure). The hierarchical model may be deployed on a local computing device, or on a server, server cluster, or virtual server that provides services locally through a network (local area network / Internet), so that the sampled data of an object to be processed can be supplied in order to obtain its features.
Exemplary method
With reference to the application scenario of Fig. 1, the hierarchical transfer learning method for few-sample classification according to an exemplary embodiment of the present invention is described below with reference to Fig. 2. It should be noted that the above application scenario is shown merely to facilitate understanding of the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect; rather, the embodiments of the present invention may be applied to any applicable scenario.
As explained in the Background section, when an evaluation of an entity object is needed, making the final result as objective and accurate as possible generally requires evaluations made by a sufficient number of assessors. Even where relatively objective evaluation criteria exist, the obtained evaluations are inevitably influenced by the assessors' subjective factors; to reduce this influence, evaluations of the entity object by a large number of assessors are generally required, from which a relatively objective and accurate evaluation can then be derived. For well-known entity objects, obtaining a large number of evaluations is comparatively easy; for more obscure entity objects, which few people know, it is difficult to obtain a sufficient number of evaluations, and therefore difficult to reach a relatively objective and accurate evaluation. To this end, the inventors propose a hierarchical transfer learning method for few-sample classification, comprising:
Step S101: establish a hierarchical model of the object to be processed, such that it includes at least a control parameter for each level and a distinguishing-feature-based signature for at least one level.
In this step, a general hierarchical probabilistic model of the object to be processed is established; by processing multiple sampled data of an object, the model can produce multiple accurate, independent features of the object.
In one embodiment of this implementation, the levels include at least one of a category level, a sample level, and a task level, and each level has a corresponding control parameter. Specifically, for an enterprise (the object to be processed), the method disclosed by the present invention establishes a model comprising three levels (a category level, a sample level, and a task level), with a corresponding control parameter α for the category level, σ for the sample level, and β for the task level; in addition, the category level and the sample level each have a corresponding signature to describe the model.
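For concreteness, a minimal sketch of how the model's levels and control parameters might be laid out in code is given below; the class name, field names, and initial values are illustrative assumptions, not part of the patent.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class HierarchicalModel:
    """Illustrative container for the three-level model described above."""
    M: int              # number of categories
    N: int              # number of samples
    K: int              # number of distinguishing features
    sigma: float = 1.0  # control parameter of the sample level

    def __post_init__(self):
        # Control parameter of the category level: prior over {1, 0, -1}.
        self.alpha = np.full(3, 1.0 / 3)
        # Control parameter of the task level: prior over the K features.
        self.beta = np.full(self.K, 1.0 / self.K)
        # Class signatures (category level), one row per category.
        self.z = np.zeros((self.M, self.K))
        # Individual signatures (sample level), one row per sample.
        self.y = np.zeros((self.N, self.K))
```

The training procedure of step S102 then refines alpha, beta, sigma, z, and y from crowdsourced task data.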
Step S102: obtain the control parameter of each level using samples of different categories.
The data used for model training (the training data) should have a unified format, whereas the sample data of the different categories of samples, all collected in advance, may consist of descriptive sentences whose formats are inconsistent; the sample data therefore need to be processed to obtain training data in a unified format before the model can be trained. In one embodiment of this implementation, obtaining the control parameter of each level using samples of different categories comprises:
obtaining a predetermined quantity of training data using samples of different categories.
In this step, a sufficient quantity of training data is obtained by processing the sample data of the different samples. Specifically, obtaining a predetermined quantity of training data using samples of different categories comprises:
repeatedly performing a task until the predetermined quantity of training data is obtained, wherein the task comprises:
selecting samples of multiple different categories from the samples;
taking one of the selected samples as a target sample;
extracting a distinguishing feature between the other selected samples, i.e., those other than the target sample; and
determining the training data of the target sample based on the target sample, the other selected samples, and the distinguishing feature.
The training datum characterizes whether the sample it belongs to (the target sample) is similar, with respect to the distinguishing feature, to one of the samples from which the distinguishing feature was extracted. For example, referring to Fig. 3, three samples S1, S2, S3 of different categories are selected from the samples; sample S1 is taken as the target sample, and a distinguishing feature k is extracted from the sample data of samples S2 and S3. The training datum of the target sample, namely whether target sample S1 is similar to S2 or to S3 with respect to the distinguishing feature k, is then determined based on the target sample, the other selected samples, and the distinguishing feature. It will be understood that the training datum can be a specific numerical value; specifically, b_{S1} ∈ {1, 0, −1}, where b_{S1} = 1 can indicate that target sample S1 is similar to sample S2 with respect to feature k, b_{S1} = −1 can indicate that target sample S1 is similar to sample S3 with respect to feature k, and b_{S1} = 0 can indicate that target sample S1 does not possess feature k.
It should be noted that the training datum can be configured as any value, as long as it clearly expresses which of the samples from which the distinguishing feature was extracted the target sample is similar to with respect to that feature; the present invention places no limitation on this.
It will be appreciated that the training datum has N possible values, where N is the number of samples compared in the task, and the N values respectively characterize the target sample as being similar, with respect to the distinguishing feature, to one of the samples from which the feature was extracted, or as not possessing the feature.
It should also be noted that, in order to obtain a sufficient quantity of training data covering every case, so that the learned model is accurate enough, the samples may also be traversed, so that each sample in turn serves as the target sample of a task execution and is compared with all the other samples to generate training data.
Since a training datum should clearly express whether the sample it belongs to (the target sample) is similar, with respect to the distinguishing feature, to one of the samples from which the feature was extracted, or does not possess the feature, the training data are encoded in a predetermined manner. Specifically, each task includes two other samples in addition to the target sample, and each training datum is encoded as b^t_{x,ij}, where b represents the training datum, x represents the index of the sample the training datum belongs to, i and j represent the indices of the other, different samples in the task, and t represents the index of the task the training datum belongs to.
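As an illustration, one such task could be simulated as follows. This is a minimal sketch under two assumptions: per-feature sample values are already available as a matrix, and the human annotator is replaced by a nearest-value rule; all names and the threshold tau are hypothetical.

```python
import numpy as np

def run_task(values, categories, rng, tau=0.3):
    """One task: pick a target sample x and two other samples i, j so that
    the three samples come from three different categories, pick a
    distinguishing feature c_t, and emit the encoded datum b in {1, 0, -1}."""
    N, K = values.shape
    while True:
        x, i, j = rng.choice(N, size=3, replace=False)
        if len({categories[x], categories[i], categories[j]}) == 3:
            break
    c_t = int(rng.integers(K))                   # feature examined in this task
    vx, vi, vj = values[x, c_t], values[i, c_t], values[j, c_t]
    if abs(vx) < tau:
        b = 0                                    # target does not have the feature
    else:
        b = 1 if abs(vx - vi) <= abs(vx - vj) else -1  # closer to i or to j
    return b, x, i, j, c_t

rng = np.random.default_rng(0)
values = rng.normal(size=(30, 4))                # toy per-feature sample values
categories = rng.integers(0, 3, size=30)         # toy category labels
data = [run_task(values, categories, rng) for _ in range(1000)]  # repeat the task
```

Repeating the task, as in the last line, yields the predetermined quantity of encoded training data b^t_{x,ij}.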
After the training data are obtained, the control parameters θ of the levels are obtained using the training data. Specifically, in one embodiment of this implementation, a posterior inference method together with the training data can be used to obtain the control parameters θ. For example, in one embodiment of this implementation, for a hierarchical probabilistic model comprising the three levels (category level, sample level, and task level), the control parameters are θ = {α, β, σ}, where α, β, and σ are the control parameters of the category level, the task level, and the sample level, respectively. The hierarchical model can be viewed from the following generative perspective:
For each category m and each feature k, sample z_{m,k} ~ Mult(α).
For each sample n, sample y_n ~ N(μ_n, σ²I), where μ_n = z_{l_n} is the class signature of the category l_n of sample s_n.
For each task t, sample c_t ~ Mult(β).
For each question (x, i, j) in task t, sample the label b^t_{x,ij} from the likelihood p(b^t_{x,ij} | y, c_t) defined below.
Specifically, referring to Fig. 4, the vector z_m ∈ {1, 0, −1}^K denotes the class signature of category m, i.e. the distinguishing-feature-based signature of the category level, which characterizes the performance of the standard sample under that category on each extracted distinguishing feature. Each element of the vector takes the value 1, −1, or 0: z_{m,k} = 1 or −1 indicates that the standard sample of category m takes a positive or negative value on feature k, and z_{m,k} = 0 indicates that the standard sample of category m does not possess feature k. Since z_{m,k} is a discrete variable, we assume the prior distribution
p(z_{m,k}) = Mult(z_{m,k} | α).
Let l_i ∈ [M] (for i ∈ [N]) denote the category of sample s_i, where [M] denotes all integers from 1 to M. Ideally, the samples under one category would take identical values on the same feature; that is, on each feature the value would equal the class signature z_{l_i} of category l_i. In reality, however, distinct individuals always differ, so sample s_i carries an individual signature (item signature) y_i ∈ R^K, i.e. the distinguishing-feature-based signature of the sample level, which characterizes the distance, on each extracted distinguishing feature, between the sample and the standard sample of its category. (For example, suppose there are enterprises of different categories, electronics, packaging, and mining, and the distinguishing features extracted from them are design, production, logistics, and sales; then the number of categories is M = 3 and the number of features is K = 4. Suppose the electronics category takes the values 1, 1, 0, −1 on the four features, i.e. positive "design" and "production" features, a negative "sales" feature, and no "logistics" feature; then the class signature of this category of enterprise is z = (1, 1, 0, −1). The values of a particular electronics enterprise A on the four features will not be exactly the standard electronics class signature (1, 1, 0, −1) but will deviate from it, e.g. (1.1, 0.8, −0.2, −0.85); these deviated values are the sample signature y of enterprise A.) Here R denotes the set of real numbers and R^K the set of K-dimensional real vectors. The k-th element of y_i expresses how far sample s_i's value on feature k deviates from 1, 0, or −1; writing v_i for this per-feature deviation of sample s_i, we have
y_i = z_{l_i} + v_i.
The prior distribution is
p(y_i | l_i) = N(y_i; μ_{l_i}, σ²I),
where μ_{l_i} = z_{l_i} is the class signature of the category l_i to which sample s_i belongs, and N denotes the Gaussian distribution.
For each task t, let c_t ∈ [K] denote the distinguishing feature selected in the task; its prior distribution is
p(c_t) = Mult(c_t | β).
From this generative viewpoint the dependencies between the variables can be seen: the variable z is determined only by the parameter α, c_t is determined only by the parameter β, and y_n depends on the class signature z of the category the sample belongs to; finally, the training data b^t_{x,ij} we obtain depend on the sample signatures y_x, y_i, y_j of the samples involved and on the distinguishing feature c_t obtained in the task.
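A minimal sketch of this generative draw, under the same illustrative assumptions as above (alpha is a probability vector over {1, 0, -1} and beta a probability vector over the K features):

```python
import numpy as np

def sample_latents(M, N, K, T, categories, alpha, beta, sigma, rng):
    """Draw class signatures z, individual signatures y, and per-task
    features c according to the dependencies described above."""
    vals = np.array([1.0, 0.0, -1.0])
    # z_{m,k} ~ Mult(alpha): each entry of each class signature
    z = vals[rng.choice(3, size=(M, K), p=alpha)]
    # y_n ~ N(z_{l_n}, sigma^2 I): individual signature centered on its class signature
    y = z[categories] + sigma * rng.standard_normal((N, K))
    # c_t ~ Mult(beta): the distinguishing feature examined by task t
    c = rng.choice(K, size=T, p=beta)
    return z, y, c
```

The labels b^t_{x,ij} are then drawn from the likelihood defined next.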
With the hierarchical model understood from this generative viewpoint, in the posterior inference method, likelihood functions are constructed based on all performances of the target sample on the distinguishing features in the tasks, and a posterior probability distribution is constructed based on the likelihood functions.
The performance on a distinguishing feature includes:
being similar, on the distinguishing feature, to one of the samples from which the distinguishing feature was extracted; and/or
not possessing the distinguishing feature.
Constructing likelihood functions based on all performances of the target sample on the distinguishing features in the tasks comprises:
constructing the likelihood function from the JS divergences between the probability distribution of the target sample's individual signature in a task and the probability distributions of the other samples in the task.
Specifically, in a task whose distinguishing feature is c_t, for samples s_i, s_j, s_x the likelihood can be defined as
$$p(b^t_{x,ij} = 1 \mid y, c_t) = \frac{\exp\{-\mathrm{JS}(P_x \,\|\, P_i)\}}{\exp\{-\mathrm{JS}(P_x \,\|\, P_i)\} + \exp\{-\mathrm{JS}(P_x \,\|\, P_j)\} + \exp\{-\mathrm{JS}(P_x \,\|\, U)\}},$$
where P_x = N(y_{x,c_t}, τ²), P_i = N(y_{i,c_t}, τ²), P_j = N(y_{j,c_t}, τ²), and JS(p ∥ q) is defined as the Jensen-Shannon divergence of the distributions p and q:
$$\mathrm{JS}(p \,\|\, q) = \frac{1}{2}\,\mathrm{KL}\Big(p \,\Big\|\, \frac{p+q}{2}\Big) + \frac{1}{2}\,\mathrm{KL}\Big(q \,\Big\|\, \frac{p+q}{2}\Big),$$
where KL(p ∥ q) denotes the KL divergence of the two distributions p and q, i.e. KL(p ∥ q) = ∫ p(x) log(p(x)/q(x)) dx, and U = N(0, σ²) corresponds to the case of not possessing the feature.
The probability above therefore takes the JS divergences between the distributions N(y_{x,c_t}, τ²), N(y_{i,c_t}, τ²), and U as input and converts them through the softmax function into a probability. Hence, the closer the distributions N(y_{x,c_t}, τ²) and N(y_{i,c_t}, τ²) are, the smaller their Jensen-Shannon distance, the larger the probability value obtained through the softmax function, and thus the larger p(b^t_{x,ij} = 1 | y, c_t).
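A sketch of this likelihood for a single question follows. It assumes the softmax-over-negative-JS form just described; since the JS divergence between Gaussians has no simple closed form, it is estimated here by numerical integration on a grid.

```python
import numpy as np
from scipy.stats import norm

def js_divergence(mu1, sd1, mu2, sd2, grid=np.linspace(-10, 10, 4001)):
    """Numerical JS divergence between N(mu1, sd1^2) and N(mu2, sd2^2)."""
    p, q = norm.pdf(grid, mu1, sd1), norm.pdf(grid, mu2, sd2)
    m = 0.5 * (p + q)
    kl = lambda a, b: np.trapz(a * np.log((a + 1e-12) / (b + 1e-12)), grid)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def label_probs(y, x, i, j, c_t, tau, sigma):
    """p(b = 1), p(b = -1), p(b = 0) for question (x, i, j) on feature c_t."""
    d_i = js_divergence(y[x, c_t], tau, y[i, c_t], tau)   # b = 1: x similar to i
    d_j = js_divergence(y[x, c_t], tau, y[j, c_t], tau)   # b = -1: x similar to j
    d_0 = js_divergence(y[x, c_t], tau, 0.0, sigma)       # b = 0: x lacks the feature
    logits = -np.array([d_i, d_j, d_0])
    e = np.exp(logits - logits.max())                     # numerically stable softmax
    return e / e.sum()
```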
Similarly, exchanging i and j defines p(b^t_{x,ij} = −1 | y, c_t).
The posterior probability is then
$$p(c, z, y \mid b) \propto \prod_{m,k} p(z_{m,k}) \cdot \prod_{n} p(y_n \mid z) \cdot \prod_{t} \Big( p(c_t) \prod_{(x,i,j) \in t} p(b^t_{x,ij} \mid y, c_t) \Big).$$
In the formula above, ∏_{m,k} p(z_{m,k}) is the product of the priors of all class signatures and represents the prior distribution of the variable z; ∏_n p(y_n | z) is the prior distribution of the sample signatures; and p(c_t) ∏_{(x,i,j)∈t} p(b^t_{x,ij} | y, c_t) expresses, for task t, the probability of selecting feature c_t and obtaining the given labels.
Since directly performing this posterior inference is very difficult, in one embodiment of this implementation a variational distribution q(c, z, y) is introduced to approximate the maximum a posteriori estimate. Under the mean-field assumption, the variational distribution factorizes as
$$q(c, z, y) = \prod_{t} q(c_t \mid \gamma_t) \prod_{m,k} q(z_{m,k} \mid \phi_{m,k}) \prod_{n} q(y_n \mid \psi_n),$$
where
$$q(c_t \mid \gamma_t) = \mathrm{Mult}(c_t; \gamma_t),$$
$$q(z_{m,k} \mid \phi_{m,k}) = \mathrm{Mult}(z_{m,k}; \phi_{m,k}),$$
$$q(y_n \mid \psi_n) = \mathcal{N}(y_n; \psi_n, \tau^2 I).$$
Each factor is a distribution involving only the parameters {γ, φ, ψ, τ} and not the labels b. Further, q(c, z, y) can be optimized to approach p(c, z, y | b). Since directly optimizing q(c, z, y) against the posterior is extremely difficult, in one embodiment of this implementation the evidence lower bound (ELBO) used in variational inference is introduced, which makes the optimization of q much simpler. Specifically:
$$\mathrm{ELBO}(q) = \mathbb{E}_q[\log p(c, z, y, b)] - \mathbb{E}_q[\log q(c, z, y)].$$
In one embodiment of this implementation, a coordinate ascent algorithm can be used to iteratively update the parameters {γ, φ, ψ, τ} of q so as to optimize the ELBO.
First, the ELBO is expanded:
$$\mathrm{ELBO}(q) = \mathbb{E}_q[\log p(c)] + \mathbb{E}_q[\log p(z)] + \mathbb{E}_q[\log p(y \mid z)] + \mathbb{E}_q[\log p(b \mid y, c)] - \mathbb{E}_q[\log q(c)] - \mathbb{E}_q[\log q(z)] - \mathbb{E}_q[\log q(y)].$$
The first four terms are the decomposition of the joint distribution p(c, z, y, b), which follows from the generative view of the model: the joint distribution first separates out p(c) and p(z); conditioning on z then determines the distribution p(y | z); and conditioning on c and y in turn determines the distribution p(b | y, c). The last three terms are the simple decomposition of q(c, z, y) under the mean-field assumption.
Each term of the above formula is then handled separately. For simplicity, in one embodiment of this implementation τ can be fixed; the partial derivatives of ELBO(q) with respect to the parameters γ, φ, and ψ are then computed, and a gradient-based optimization method is used to update and optimize the ELBO:
$$\arg\max_{\gamma, \phi, \psi} \ \mathrm{ELBO}(q(c, z, y \mid \gamma, \phi, \psi)). \qquad (4)$$
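A minimal sketch of such a coordinate-ascent loop is given below. The ELBO estimator is left as a caller-supplied function, and the block-cycling order, finite-difference gradient, and step size are illustrative assumptions rather than the patent's exact updates.

```python
import numpy as np

def coordinate_ascent(elbo, params, lr=0.05, iters=200, eps=1e-5):
    """Cycle over the parameter blocks (e.g. gamma, phi, psi), taking a
    gradient step on one block while holding the others fixed, as in (4)."""
    for _ in range(iters):
        for name in params:                       # one block at a time
            theta = params[name]
            grad = np.zeros_like(theta)
            flat, gflat = theta.ravel(), grad.ravel()
            for idx in range(flat.size):          # finite-difference gradient
                old = flat[idx]
                flat[idx] = old + eps
                up = elbo(params)
                flat[idx] = old - eps
                down = elbo(params)
                flat[idx] = old
                gflat[idx] = (up - down) / (2.0 * eps)
            theta += lr * grad                    # ascend on this block
    return params
```

In practice, closed-form coordinate updates or automatic differentiation would replace the finite differences; the sketch only shows the block structure of the optimization.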
At this point the parameters θ = {α, β, σ} of each level of the model have been obtained, so the probability distribution model with parameters θ = {α, β, σ} can be transferred to a sample-scarce industry such as mining, in order to correct the training data of the mining industry, reduce the training-data inaccuracy caused by the lack of samples, and obtain relatively accurate features (evaluations) in the mining industry.
It is easy to see that, in the course of obtaining the training data, a large number of distinguishing features between samples of two categories will be obtained, and among these there are bound to be some similar or identical distinguishing features.
Since the training and learning of the model is non-convex, the learning results are very sensitive to the initial distinguishing features and to the number of features. The large set of distinguishing features therefore needs to be processed, reducing as much as possible the influence of the distinguishing features (including the initial features and the number of features) on the learning results. In one embodiment of this implementation, before constructing the distinguishing-feature-based signatures of the levels, the method further comprises:
merging any two distinguishing features whose pairwise similarity reaches a preset threshold, until the similarity between any two distinguishing features is below the preset threshold.
In this embodiment, whether two distinguishing features are merged is decided based on the similarity between them. Computing the similarity between two features can be treated as computing the text similarity between them. For example, given the two distinguishing features "design" and "production", whether to merge them can be judged from the text similarity between "design" and "production"; a text similarity algorithm (e.g. one based on edit distance, as sketched after this passage) can be chosen to compute the similarity between distinguishing features.
Sometimes, however, when measuring the similarity of two short texts, or more directly of two words, the literal distance (edit distance) is not adequate. For example, the pairs China-Beijing and Italy-Rome should have similar similarity distances, since both express the relation between a country and its capital; likewise, (man, boy) and (woman, girl) express the same relation, yet their literal similarity is low. We therefore need to take the meaning of the text (the distinguishing feature) into account and decide whether to merge based on the similarity of meanings; to make this processable by a computer, the corresponding text can be converted into a vector, from which the text similarity is then determined.
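For reference, a minimal version of the literal (edit-distance) similarity mentioned above; the normalization to [0, 1] is an assumption:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def literal_similarity(a: str, b: str) -> float:
    """1.0 for identical strings, approaching 0.0 for unrelated ones."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))
```

As the text notes, such a literal measure fails on pairs like China-Beijing versus Italy-Rome, which motivates the vector representation described next.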
In one embodiment of this implementation, the performances of the samples of different categories on a distinguishing feature are aggregated to serve as the vector of that distinguishing feature, and this vector is used to measure the similarity between distinguishing features. Specifically, in this embodiment the characteristic signature of the distinguishing feature (which characterizes the performance, on that feature, of the standard samples of every category other than the category of the sample from which the feature was extracted) can be used as the vector of the distinguishing feature. For example, as shown in Fig. 5, suppose there are three categories m1, m2, m3 and four features k1, k2, k3, k4, and that the performances of the standard samples of m1, m2, m3 are 1, 0, 0 on feature k1; 0, 0, 1 on feature k2; −1, 1, −1 on feature k3; and −1, −1, 0 on feature k4. Then the characteristic signature h1 of feature k1 is (1, 0, 0), the characteristic signature h2 of feature k2 is (0, 0, 1), the characteristic signature h3 of feature k3 is (−1, 1, −1), and the characteristic signature h4 of feature k4 is (−1, −1, 0). Once the distinguishing features have been vectorized in this way, the similarity between distinguishing features can be determined from their vector values, and distinguishing features whose similarity reaches the preset threshold are merged. In this embodiment, this can be determined as follows:
M is the number of categories, {r_1, …, r_K} denotes the set of the K (an integer greater than 1) independent characteristic signatures, and l_k = {t | c_t = k} denotes the indices of all tasks whose c_t equals k. A feature with characteristic signature h_t is merged into the independent feature k whose signature r_k is most similar to it, provided that similarity reaches the threshold λ = (1 − ρ)M, where ρ ∈ [0, 1] is a relaxation factor. It can be shown that when the maximum similarity between h_t and all the r_k is less than ρM, the distinguishing feature of h_t is kept as a separate distinguishing feature.
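A sketch of this merging step, under the assumption that the similarity of two characteristic signatures is measured by their inner product (the patent's exact similarity measure is not reproduced here):

```python
import numpy as np

def merge_features(H, M, rho=0.2):
    """Greedy merging: a characteristic signature joins the most similar
    existing independent feature if their similarity reaches
    lambda = (1 - rho) * M; otherwise it starts a new independent feature."""
    lam = (1.0 - rho) * M
    reps, groups = [], []        # representative signatures and merged members
    for t, h in enumerate(H):
        sims = [float(np.dot(h, r)) for r in reps]
        if sims and max(sims) >= lam:
            k = int(np.argmax(sims))
            groups[k].append(t)  # merge feature t into independent feature k
        else:
            reps.append(np.asarray(h, dtype=float))  # keep h as a new feature
            groups.append([t])
    return reps, groups

# Fig. 5 example: with M = 3 and rho = 0.2, lambda = 2.4; no pair among
# h1..h4 reaches it, so all four features stay separate.
H = [(1, 0, 0), (0, 0, 1), (-1, 1, -1), (-1, -1, 0)]
reps, groups = merge_features(H, M=3)
```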
With the above method, a general model for processing industry sample data can be established: rich semantic annotations for multiple industries are obtained by crowdsourcing, model parameters are learned from industries with abundant enterprise samples, and the parameters are transferred to the analysis of industry samples where data are scarce, reducing the annotation inaccuracy caused by the lack of samples.
Exemplary medium
Having described the method and apparatus of the exemplary embodiments of the present invention, the computer-readable storage medium of an exemplary embodiment of the present invention is next described with reference to Fig. 6. The computer-readable storage medium shown there is an optical disc 60 on which a computer program (i.e. a program product) is stored; when run by a processor, the computer program implements the steps recorded in the above method embodiments, for example: establishing a hierarchical model of the object to be processed, such that it includes at least a control parameter for each level and a distinguishing-feature-based signature for at least one level; and obtaining the control parameter of each level using samples of different categories. The specific implementation of each step is not repeated here. It should be noted that examples of the computer-readable storage medium also include, but are not limited to: phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, and other optical or magnetic storage media, which are not enumerated one by one here.
Exemplary apparatus
Having described the medium of the exemplary embodiments of the present invention, the hierarchical transfer learning apparatus for few-sample classification of an exemplary embodiment of the present invention is next described with reference to Fig. 7, comprising:
a model building module 701 configured to establish a hierarchical model of an object to be processed, such that the model includes at least a control parameter for each level and a distinguishing-feature-based signature for at least one level; and
a parameter calculating module 702 configured to obtain the control parameter of each level using samples of different categories.
In another embodiment of the present invention, the levels include at least one of a category level, a sample level, and a task level.
In another embodiment of the present invention, the parameter calculating module 702 comprises:
a training data acquiring unit configured to obtain a predetermined quantity of training data using samples of different categories; and
a parameter calculation unit configured to obtain the control parameter of each level using the training data.
In another embodiment of the present invention, the training data acquiring unit comprises:
a task execution subunit configured to repeatedly perform a task until the predetermined quantity of training data is obtained, wherein the task comprises:
selecting samples of multiple different categories from the samples;
taking one of the selected samples as a target sample;
extracting a distinguishing feature between the other selected samples, i.e., those other than the target sample; and
determining the training data of the target sample based on the target sample, the other selected samples, and the distinguishing feature.
In another embodiment of the present invention, the training data characterizes whether the sample it belongs to is similar, with respect to the distinguishing feature, to one of the samples from which the distinguishing feature was extracted.
In another embodiment of the present invention, the training data is encoded in a predetermined manner.
In another embodiment of the present invention, each task includes two other samples in addition to the target sample.
In another embodiment of the present invention, each training datum is encoded as b^t_{x,ij}, where b represents the training datum, x represents the index of the sample the training datum belongs to, i and j represent the indices of the other, different samples in the task, and t represents the index of the task the training datum belongs to.
In another embodiment of the present invention, the distinguishing-feature-based signature of the category level is used to characterize the performance of the standard sample under that category on each extracted distinguishing feature.
In another embodiment of the present invention, the distinguishing-feature-based signature of the sample level is used to characterize the distance, on each extracted distinguishing feature, between the sample and the standard sample of the category it belongs to.
In another embodiment of the present invention, the apparatus further comprises:
a feature processing module configured to merge any two distinguishing features whose pairwise similarity reaches a preset threshold, until the similarity between any two distinguishing features is below the preset threshold.
In another embodiment of the present invention, the feature processing module computes the similarity between distinguishing features based on the characteristic signatures of the distinguishing features.
In another embodiment of the present invention, the characteristic signature is used to characterize the performance, on the distinguishing feature, of the standard samples of every category other than the category of the sample from which the distinguishing feature was extracted.
In another embodiment of the present invention, the performance on a distinguishing feature includes:
being similar, on the distinguishing feature, to one of the samples from which the distinguishing feature was extracted; and/or
not possessing the distinguishing feature.
In another embodiment of the present invention, the parameter calculating module is further configured to obtain the control parameters using a posterior inference method and the training data.
In another embodiment of the present invention, the parameter calculating module comprises:
a likelihood function construction unit configured, in the posterior inference method, to construct likelihood functions based on all performances of the target sample on the distinguishing features in the tasks; and
a posterior probability distribution construction unit configured to construct a posterior probability distribution based on the likelihood functions.
In another embodiment of the present invention, the likelihood function construction unit is further configured to construct the likelihood function from the JS divergences between the probability distribution of the target sample's individual signature in a task and the probability distributions of the other samples in the task.
In another embodiment of the present invention, in the posterior inference method, a variational distribution is introduced and maximum a posteriori estimation is performed based on the posterior probability distribution to obtain the control parameters.
In another embodiment of the present invention, the apparatus further comprises:
an optimization module configured to continually optimize the variational distribution so that it approaches the posterior probability distribution.
In another embodiment of the present invention, the optimization module comprises:
an evidence lower bound optimization unit configured to continually optimize the evidence lower bound of the variational distribution so that the variational distribution approaches the posterior probability distribution.
In another embodiment of the present invention, the evidence lower bound optimization unit is configured to optimize the evidence lower bound of the variational distribution using a coordinate ascent method.
Exemplary computing device
Having described the method, apparatus, and medium of the exemplary embodiments of the present invention, the computing device of an exemplary embodiment of the present invention is next described with reference to Fig. 8. Fig. 8 shows a block diagram of an exemplary computing device 80 suitable for implementing an embodiment of the present invention; the computing device may be a computer system or a server. The computing device 80 shown in Fig. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 8, the components of the computing device 80 may include, but are not limited to: one or more processors or processing units 801, a system memory 802, and a bus 803 connecting the different system components (including the system memory 802 and the processing unit 801).
The computing device 80 typically comprises a variety of computer-system-readable media. These media may be any usable media accessible by the computing device 80, including volatile and non-volatile media and removable and non-removable media.
The system memory 802 may include computer-system-readable media in the form of volatile memory, such as random-access memory (RAM) 8021 and/or cache memory 8022. The computing device 80 may further include other removable/non-removable, volatile/non-volatile computer-system storage media. By way of example only, ROM 8023 can be used for reading from and writing to non-removable, non-volatile magnetic media (not shown in Fig. 8, commonly referred to as a "hard disk drive"). Although not shown in Fig. 8, a disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disc drive for reading from and writing to a removable non-volatile optical disc (e.g., a CD-ROM, DVD-ROM, or other optical media), may be provided. In these cases, each drive may be connected to the bus 803 through one or more data media interfaces. The system memory 802 may include at least one program product having a set of (e.g., at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 8025 having a set of (at least one) program modules 8024 may be stored, for example, in the system memory 802. Such program modules 8024 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 8024 generally perform the functions and/or methods of the embodiments described in the present invention.
The computing device 80 may also communicate with one or more external devices 804 (such as a keyboard, a pointing device, or a display). Such communication may take place through input/output (I/O) interfaces 805. Moreover, the computing device 80 may also communicate, through a network adapter 806, with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet). As shown in Fig. 8, the network adapter 806 communicates through the bus 803 with the other modules of the computing device 80 (such as the processing unit 801). It should be understood that, although not shown in Fig. 8, other hardware and/or software modules may be used in conjunction with the computing device 80.
The processing unit 801 runs the programs stored in the system memory 802, thereby performing various functional applications and data processing, for example executing and implementing the steps of the hierarchical transfer learning method for few-sample classification: establishing a hierarchical model of the object to be processed, such that it includes at least a control parameter for each level and a distinguishing-feature-based signature for at least one level; and obtaining the control parameter of each level using samples of different categories. The specific implementation of each step is not repeated here. It should be noted that although the detailed description above mentions several units/modules or subunits/submodules of the hierarchical transfer learning apparatus for few-sample classification, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the present invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module; conversely, the features and functions of one unit/module described above may be further divided among and embodied by multiple units/modules.
In addition, although the operations of the method of the present invention are described in a particular order in the drawings, this does not require or imply that the operations must be performed in that particular order, or that all of the operations shown must be performed to achieve the desired results. Additionally or alternatively, certain steps may be omitted, multiple steps may be merged into one step, and/or one step may be decomposed into multiple steps.
Although the spirit and principles of the present invention have been described with reference to several preferred embodiments, it should be understood that the invention is not limited to the specific embodiments disclosed, and the division into aspects does not mean that the features in these aspects cannot be combined to advantage; that division is merely for convenience of expression. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Through the above description, the embodiments of the present invention provide the following technical solutions, but are not limited thereto:
1. a kind of layering transfer learning method for weary sample classification, comprising:
The hierarchical model of object to be processed is established, so that it at least includes the control parameter and at least one of each level Signature of a level based on distinguishing characteristics;
The control parameter of each level is obtained using different classes of sample.
2. method as described in technical solution 1, wherein the level includes at least class hierarchy, sample level and task One of level.
3. the method as described in technical solution 1 or 2, wherein obtain each level using different classes of sample Control parameter includes:
The training data of predetermined quantity is obtained using different classes of sample;
The control parameter of each level is obtained using the training data.
4. method as described in technical solution 3, wherein obtain the training data of predetermined quantity using different classes of sample Include:
Task is repeated until obtaining the training data of predetermined quantity, wherein the task includes:
The sample of multiple and different classifications is chosen from the sample;
It regard one of sample selected as target sample;
The distinguishing characteristics between other samples in the sample selected described in extraction in addition to the target sample;
Based in the target sample, the sample selected in addition to the target sample other samples and area Other feature determines the training data of the target sample.
5. the method as described in technical solution 1 or 2, wherein the training data characterizes the sample belonging to it and extracts area It is approximate that one of sample of other feature is based on the distinguishing characteristics.
6. the method as described in technical solution 1-5 is any, wherein the training data is encoded according to predetermined manner.
7. the method as described in technical solution 1-6 is any, wherein in addition to target sample include in each task two other Sample.
8. the method as described in technical solution 6 or 7, wherein each training data is encoded asWherein b represents instruction Practice data, x represents the serial number of the affiliated sample of the training data, and i and j respectively represent the serial number of the different samples of other in task, t Represent the serial number of the affiliated task of the training data.
9. the method as described in technical solution 2-8 is any, wherein the class hierarchy is used for based on the signature of distinguishing characteristics Characterize performance of the master sample on each distinguishing characteristics extracted under the classification.
10. the method as described in technical solution 2-9 is any, wherein the sample level is used based on the signature of distinguishing characteristics In characterize the sample and the master sample under its generic based on each distinguishing characteristics extracted at a distance from.
11. the method as described in technical solution 1-10 is any, wherein construct the signature at all levels based on distinguishing characteristics it Before, the method also includes:
Merge the distinguishing characteristics that similarity between any two reaches preset threshold, the phase between any two distinguishing characteristics Preset threshold is respectively less than like degree.
12. the method as described in technical solution 11, wherein the characteristic signature based on the distinguishing characteristics calculates each difference Similarity between feature.
13. the method as described in technical solution 12, wherein the characteristic signature is extracted for characterizing to remove in the sample Performance of the master sample of other each classifications except the sample generic of the distinguishing characteristics on the distinguishing characteristics.
14. the method as described in technical solution 1-13 is any, wherein the performance on a distinguishing characteristics includes:
It is approximate with one of the sample that extracts the distinguishing characteristics on distinguishing characteristics;And/or
Do not have the distinguishing characteristics.
15. the method as described in technical solution 1-14 is any, wherein inferring that method and the training data obtain using posteriority The control parameter.
16. such as the method for technical solution 15, wherein in the posteriority deduction method, target sample is in institute in task based access control All performances stated on distinguishing characteristics construct likelihood function respectively, construct Posterior probability distribution based on the likelihood function.
17, such as the method for technical solution 16, wherein all tables of the target sample on the distinguishing characteristics in task based access control Now building likelihood function includes: respectively
In task based access control in the probability distribution and task of the individual signature of target sample other any samples probability distribution JS divergence construct the likelihood function.
18. The method according to technical solution 16, wherein in the posterior inference method, a variational distribution is introduced and maximum a posteriori (MAP) estimation is performed based on the posterior probability distribution to obtain the control parameters.
19. The method according to technical solution 18, wherein the method further comprises:
continuously optimizing the variational distribution so that the variational distribution approaches the posterior probability distribution.
20. The method according to technical solution 19, wherein the evidence lower bound of the variational distribution is continuously optimized so that the variational distribution approaches the posterior probability distribution.
21. The method according to technical solution 20, wherein the evidence lower bound of the variational distribution is optimized using a coordinate ascent method.
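Coordinate ascent maximizes the evidence lower bound (ELBO) by optimizing one variational factor at a time while the others are held fixed. A generic skeleton, assuming hypothetical `elbo` and `update_factor` callbacks in place of the model-specific updates:

```python
def coordinate_ascent_elbo(factors, elbo, update_factor, max_iters=100, tol=1e-6):
    """Generic coordinate-ascent loop: update each variational factor in turn,
    holding the rest fixed, until the ELBO stops improving.
    `update_factor(factors, k)` returns the optimal k-th factor given the rest;
    `elbo(factors)` evaluates the bound."""
    prev = elbo(factors)
    for _ in range(max_iters):
        for k in range(len(factors)):
            factors[k] = update_factor(factors, k)
        cur = elbo(factors)
        if abs(cur - prev) < tol:
            break
        prev = cur
    return factors
```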
22. A layering transfer learning device for weary sample classification, comprising:
a model building module configured to establish a hierarchical model of an object to be processed, such that the model includes at least a control parameter of each level and a signature, based on distinguishing characteristics, of at least one level;
a parameter calculating module configured to obtain the control parameter of each level using samples of different classes.
23. The device according to technical solution 22, wherein the levels include at least one of a class level, a sample level, and a task level.
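As a sketch only, the three levels named in technical solution 23 could be carried by a structure like the following; all names and field choices are hypothetical:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Level:
    """One level of the hierarchical model (names hypothetical)."""
    control_params: np.ndarray                        # control parameters of this level
    signatures: dict = field(default_factory=dict)    # signature per entity, over distinguishing characteristics

@dataclass
class HierarchicalModel:
    class_level: Level    # standard-sample performance per class
    sample_level: Level   # per-sample distance to its class's standard sample
    task_level: Level     # per-task parameters
```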
24. The device according to technical solution 22 or 23, wherein the parameter calculating module comprises:
a training data acquiring unit configured to obtain a predetermined quantity of training data using samples of different classes;
a parameter calculating unit configured to obtain the control parameter of each level using the training data.
25. The device according to technical solution 24, wherein the training data acquiring unit comprises:
a task execution subunit configured to repeatedly execute a task until the predetermined quantity of training data is obtained, wherein the task includes:
selecting samples of multiple different classes from the samples;
taking one of the selected samples as a target sample;
extracting the distinguishing characteristics between the samples, other than the target sample, among the selected samples;
determining the training data of the target sample based on the target sample, the other selected samples excluding the target sample, and the distinguishing characteristics.
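A minimal sketch of the task loop described in technical solution 25, with `extract_distinguishing` and `make_datum` as hypothetical callables standing in for the unspecified extraction and labeling steps:

```python
import random

def run_task(samples_by_class, extract_distinguishing, make_datum, n_classes=3):
    """One data-collection task: pick samples from several classes, designate a
    target, extract the distinguishing characteristics among the non-target
    samples, and emit a training datum for the target."""
    classes = random.sample(list(samples_by_class), n_classes)
    chosen = [random.choice(samples_by_class[c]) for c in classes]
    target, others = chosen[0], chosen[1:]
    characteristics = extract_distinguishing(others)
    return make_datum(target, others, characteristics)

def collect_training_data(samples_by_class, extract_distinguishing, make_datum, n):
    """Repeat tasks until the predetermined quantity of training data is obtained."""
    return [run_task(samples_by_class, extract_distinguishing, make_datum)
            for _ in range(n)]
```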
26. The device according to technical solution 22 or 23, wherein the training data characterizes that the sample to which it belongs approximates, based on the distinguishing characteristics, one of the samples from which the distinguishing characteristics were extracted.
27. The device according to any one of technical solutions 22-26, wherein the training data is encoded according to a predetermined manner.
28. The device according to any one of technical solutions 22-27, wherein each task includes two other samples in addition to the target sample.
29. The device according to technical solution 27 or 28, wherein each training datum is encoded as b_{x,ij}^{t}, where b denotes the training datum, x denotes the serial number of the sample to which the training datum belongs, i and j denote the serial numbers of the two other samples in the task, and t denotes the serial number of the task to which the training datum belongs.
30. The device according to any one of technical solutions 23-29, wherein the signature of the class level based on distinguishing characteristics is used to characterize the performance, on each extracted distinguishing characteristic, of the standard sample under that class.
31. The device according to any one of technical solutions 23-30, wherein the signature of the sample level based on distinguishing characteristics is used to characterize the distance, based on each extracted distinguishing characteristic, between the sample and the standard sample under the class to which the sample belongs.
32. The device according to any one of technical solutions 22-31, wherein the device further comprises:
a feature processing module configured to merge distinguishing characteristics whose pairwise similarity reaches a preset threshold, until the similarity between any two distinguishing characteristics is less than the preset threshold.
33. The device according to technical solution 32, wherein the feature processing module calculates the similarity between distinguishing characteristics based on the characteristic signatures of the distinguishing characteristics.
34. The device according to technical solution 33, wherein the characteristic signature is used to characterize the performance, on the distinguishing characteristic, of the standard samples of each class other than the class of the sample from which the distinguishing characteristic was extracted.
35. The device according to any one of technical solutions 22-34, wherein the performance on a distinguishing characteristic includes:
being approximate, on the distinguishing characteristic, to one of the samples from which the distinguishing characteristic was extracted; and/or
not having the distinguishing characteristic.
36. The device according to any one of technical solutions 22-35, wherein the parameter calculating module is further configured to obtain the control parameters using a posterior inference method and the training data.
37. The device according to technical solution 36, wherein the parameter calculating module comprises:
a likelihood function construction unit configured to, in the posterior inference method, construct likelihood functions respectively based on all the performances of the target sample on the distinguishing characteristics in a task;
a posterior probability distribution construction unit configured to construct a posterior probability distribution based on the likelihood functions.
38. The device according to technical solution 37, wherein the likelihood function construction unit is further configured to construct the likelihood function based on the JS divergence between the probability distribution of the individual signature of the target sample in the task and the probability distribution of any other sample in the task.
39. The device according to technical solution 37, wherein in the posterior inference method, a variational distribution is introduced and MAP estimation is performed based on the posterior probability distribution to obtain the control parameters.
40. The device according to technical solution 39, wherein the device further comprises:
an optimization module configured to continuously optimize the variational distribution so that the variational distribution approaches the posterior probability distribution.
41. The device according to technical solution 40, wherein the optimization module comprises:
an evidence lower bound optimization unit configured to continuously optimize the evidence lower bound of the variational distribution so that the variational distribution approaches the posterior probability distribution.
42. The device according to technical solution 41, wherein the evidence lower bound optimization unit is configured to optimize the evidence lower bound of the variational distribution using a coordinate ascent method.
43. A computer-readable storage medium storing program code which, when executed by a processor, implements the method according to any one of technical solutions 1-21.
44. A computing device comprising a processor and a storage medium storing program code which, when executed by the processor, implements the method according to any one of technical solutions 1-21.
Finally, it should be noted that, in this disclosure, relational terms such as left and right, first and second, and the like are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements inherent to such a process, method, article, or device. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
Although the present disclosure has been described above through specific embodiments, it should be understood that those skilled in the art can devise various modifications, improvements, or equivalents of the disclosure within the spirit and scope of the appended technical solutions. Such modifications, improvements, or equivalents should also be considered to fall within the scope claimed by this disclosure.

Claims (10)

1. A layering transfer learning method for weary sample classification, comprising:
establishing a hierarchical model of an object to be processed, such that the model includes at least a control parameter of each level and a signature, based on distinguishing characteristics, of at least one level;
obtaining the control parameter of each level using samples of different classes.
2. The method according to claim 1, wherein the levels include at least one of a class level, a sample level, and a task level.
3. The method according to claim 1 or 2, wherein obtaining the control parameter of each level using samples of different classes comprises:
obtaining a predetermined quantity of training data using samples of different classes;
obtaining the control parameter of each level using the training data.
4. The method according to claim 3, wherein obtaining a predetermined quantity of training data using samples of different classes comprises:
repeatedly executing a task until the predetermined quantity of training data is obtained, wherein the task includes:
selecting samples of multiple different classes from the samples;
taking one of the selected samples as a target sample;
extracting the distinguishing characteristics between the samples, other than the target sample, among the selected samples;
determining the training data of the target sample based on the target sample, the other selected samples excluding the target sample, and the distinguishing characteristics.
5. The method according to claim 1 or 2, wherein the training data characterizes that the sample to which it belongs approximates, based on the distinguishing characteristics, one of the samples from which the distinguishing characteristics were extracted.
6. The method according to any one of claims 1-5, wherein the training data is encoded according to a predetermined manner.
7. The method according to any one of claims 1-6, wherein each task includes two other samples in addition to the target sample.
8. A layering transfer learning device for weary sample classification, comprising:
a model building module configured to establish a hierarchical model of an object to be processed, such that the model includes at least a control parameter of each level and a signature, based on distinguishing characteristics, of at least one level;
a parameter calculating module configured to obtain the control parameter of each level using samples of different classes.
9. A computer-readable storage medium storing program code which, when executed by a processor, implements the method according to any one of claims 1-7.
10. A computing device comprising a processor and a storage medium storing program code which, when executed by the processor, implements the method according to any one of claims 1-7.
CN201811593430.3A 2018-12-25 2018-12-25 For layering transfer learning method, medium, device and the equipment of weary sample classification Pending CN109711718A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811593430.3A CN109711718A (en) 2018-12-25 2018-12-25 For layering transfer learning method, medium, device and the equipment of weary sample classification


Publications (1)

Publication Number Publication Date
CN109711718A true CN109711718A (en) 2019-05-03

Family

ID=66258318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811593430.3A Pending CN109711718A (en) 2018-12-25 2018-12-25 For layering transfer learning method, medium, device and the equipment of weary sample classification

Country Status (1)

Country Link
CN (1) CN109711718A (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4975975A (en) * 1988-05-26 1990-12-04 Gtx Corporation Hierarchical parametric apparatus and method for recognizing drawn characters
CN105912633A (en) * 2016-04-11 2016-08-31 上海大学 Sparse sample-oriented focus type Web information extraction system and method
US20180212996A1 (en) * 2017-01-23 2018-07-26 Cisco Technology, Inc. Entity identification for enclave segmentation in a network
CN107316061A (en) * 2017-06-22 2017-11-03 华南理工大学 A kind of uneven classification ensemble method of depth migration study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Tian Tian et al.: "Learning Attributes from the Crowdsourced Relative Labels", Thirty-First AAAI Conference on Artificial Intelligence. *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110782043A (en) * 2019-10-29 2020-02-11 腾讯科技(深圳)有限公司 Model optimization method and device, storage medium and server
CN110782043B (en) * 2019-10-29 2023-09-22 腾讯科技(深圳)有限公司 Model optimization method, device, storage medium and server

Similar Documents

Publication Publication Date Title
Shi et al. Manufacturability analysis for additive manufacturing using a novel feature recognition technique
Stach et al. Expert-based and computational methods for developing fuzzy cognitive maps
CN109885768A (en) Worksheet method, apparatus and system
CN110196908A (en) Data classification method, device, computer installation and storage medium
CN107885499A (en) A kind of interface document generation method and terminal device
CN106778878B (en) Character relation classification method and device
CN105786898B (en) A kind of construction method and device of domain body
CN108304382A (en) Mass analysis method based on manufacturing process text data digging and system
CN109933783A (en) A kind of essence of a contract method of non-performing asset operation field
Ledur et al. Towards a domain-specific language for geospatial data visualization maps with big data sets
JP7347179B2 (en) Methods, devices and computer programs for extracting web page content
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
CN109711718A (en) For layering transfer learning method, medium, device and the equipment of weary sample classification
Chander et al. Data clustering using unsupervised machine learning
CN106844765B (en) Significant information detection method and device based on convolutional neural network
CN111699472B (en) Method for determining a system for developing complex embedded or information physical systems
Bliek et al. EXPObench: benchmarking surrogate-based optimisation algorithms on expensive black-box functions
Demir et al. Clustering-based deep brain multigraph integrator network for learning connectional brain templates
CN105373561B (en) The method and apparatus for identifying the logging mode in non-relational database
CN106775694A (en) A kind of hierarchy classification method of software merit rating code product
CN114281950B (en) Data retrieval method and system based on multi-graph weighted fusion
CN112836507B (en) Method for extracting domain text theme
CN113010687B (en) Exercise label prediction method and device, storage medium and computer equipment
Yousif et al. Shape clustering using k-medoids in architectural form finding
WO2022127124A1 (en) Meta learning-based entity category recognition method and apparatus, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20190503)