CN103324610A - Sample training method and device for mobile device - Google Patents

Sample training method and device for mobile device

Info

Publication number
CN103324610A
Authority
CN
China
Prior art keywords
character
results
subspace
commendation
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013102308120A
Other languages
Chinese (zh)
Inventor
李寿山
高伟
周国栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN2013102308120A
Publication of CN103324610A
Legal status: Pending

Abstract

The invention discloses a sample training method and device for a mobile device. The method is applied in the device, and the device is applied in the mobile device. All feature values in preset samples are extracted; the feature values are decomposed according to a preset rule to obtain at least one feature-value subspace; machine-learning classification training is performed on each feature-value subspace; and a base classifier corresponding to each feature-value subspace is obtained. Because every base classifier is trained on a single feature-value subspace, the number of feature values it handles is much smaller than the total number of feature values, and the memory required during sample training is correspondingly small.

Description

Sample training method and device applied to a mobile device
Technical field
The present invention relates to the field of information processing, and in particular to a sample training method and device applied to a mobile device.
Background technology
With the rapid development of the Internet, people have become increasingly accustomed to expressing their opinions online, so that a large amount of emotionally charged text has emerged on the network. Such opinionated text usually takes the form of product reviews, forum comments and blog posts, and is often key text, or text the user is interested in.
Text sentiment classification analyzes the speaker's attitude (also called opinion or emotion), that is, the subjective information in the text. Sentiment classification is a basic task in sentiment analysis. It aims to label text as commendatory or derogatory according to its emotional tendency, and is considered more challenging than traditional topic-based text classification. Concretely, the task is to divide text into positive text and negative text. For example, "I am delighted with this film" is classified as positive text by sentiment classification, while "This film is very poor" is classified as negative text.
At present, the training process of supervised machine-learning classification methods usually requires manually annotating positive and negative samples of a certain scale. The classification accuracy of such methods is relatively high, but as the number of training samples increases, the number of features also rises significantly, and the classification process occupies a large amount of memory. Restricted by memory size, a mobile terminal device therefore finds it difficult to perform the text classification task.
Summary of the invention
The invention provides a sample training method and device applied to a mobile device, to solve the prior-art problem that text classification cannot be performed on a mobile device because of its small memory.
The specific technical scheme is as follows:
A sample training method applied to a mobile device, the method comprising:
extracting all feature values in preset samples;
decomposing all the feature values according to a preset rule to obtain at least one feature-value subspace;
performing machine-learning classification training on each feature-value subspace to obtain a base classifier corresponding to each feature-value subspace.
Preferably, the method further comprises:
classifying a sample to be classified with the base classifiers to obtain a classification result corresponding to each base classifier, wherein the classification result may be a first classification result of commendatory character and a second classification result of derogatory character;
fusing the first classification results of commendatory character and the second classification results of derogatory character respectively with a fusion rule, to obtain a first fusion result of commendatory character and a second fusion result of derogatory character;
judging whether the first fusion result of commendatory character is greater than the second fusion result of derogatory character; if so, the sample to be classified is judged commendatory; if not, the sample to be classified is judged derogatory.
Preferably, the process of extracting all feature values in the preset samples comprises:
extracting all feature values in the preset samples with a feature-value extraction method.
Preferably, the process of decomposing all the feature values according to the preset rule to obtain at least one feature-value subspace comprises:
decomposing all the feature values by average division or random extraction to obtain at least one feature-value subspace.
Preferably, the process of performing machine-learning classification training on each feature-value subspace to obtain the base classifier corresponding to each feature-value subspace comprises:
training each feature-value subspace with the same machine-learning classification method or with different machine-learning classification methods to obtain the base classifier corresponding to each feature-value subspace.
Preferably, the process of fusing the first classification results of commendatory character and the second classification results of derogatory character respectively with the fusion rule, to obtain the first fusion result of commendatory character and the second fusion result of derogatory character, comprises:
fusing the first classification results of commendatory character and the second classification results of derogatory character respectively with the Bayes fusion rule, to obtain the first fusion result of commendatory character and the second fusion result of derogatory character.
A sample training device applied to a mobile device, the device comprising an extraction module, a decomposing module and a training module;
wherein the extraction module is configured to extract all feature values in preset samples;
the decomposing module is configured to decompose all the feature values according to a preset rule to obtain at least one feature-value subspace;
the training module is configured to perform machine-learning classification training on each feature-value subspace to obtain a base classifier corresponding to each feature-value subspace.
Preferably, the device further comprises a classification module, a fusion module and a judging module;
wherein the classification module is configured to classify a sample to be classified with the base classifiers and obtain a classification result corresponding to each base classifier, wherein the classification result may be a first classification result of commendatory character and a second classification result of derogatory character;
the fusion module is configured to fuse the first classification results of commendatory character and the second classification results of derogatory character respectively with a fusion rule, to obtain a first fusion result of commendatory character and a second fusion result of derogatory character;
the judging module is configured to judge whether the first fusion result of commendatory character is greater than the second fusion result of derogatory character; if so, the sample to be classified is judged commendatory; if not, the sample to be classified is judged derogatory.
As can be seen from the above technical schemes, the invention provides a sample training method and device applied to a mobile device. The method is applied in the device, and the device is applied in the mobile device. All feature values in preset samples are extracted and decomposed according to a preset rule to obtain at least one feature-value subspace; machine-learning classification training is performed on each feature-value subspace, and a base classifier corresponding to each feature-value subspace is obtained. Because each base classifier is obtained by training on a single feature-value subspace, the number of feature values it handles is much smaller than the total number of feature values, and the memory required during sample training is therefore much smaller as well.
Description of drawings
In order to describe the technical schemes in the embodiments of the invention or in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in the invention, and those of ordinary skill in the art can obtain other drawings from them without creative work.
Fig. 1 is a schematic flow chart of a sample training method applied to a mobile device disclosed in Embodiment 1 of the invention;
Fig. 2 is a schematic flow chart of a sample training method applied to a mobile device disclosed in Embodiment 2 of the invention;
Fig. 3 is a schematic structural diagram of a sample training device applied to a mobile device disclosed in Embodiment 3 of the invention;
Fig. 4 is a schematic structural diagram of a sample training device applied to a mobile device disclosed in Embodiment 4 of the invention.
Embodiment
With the rapid development of the Internet, people increasingly like to express their likes and dislikes online, so that a large amount of emotionally charged text has emerged on the network. To distinguish the sentiment categories of such text, the method commonly used at present is the supervised classification method of machine learning, whose training process usually requires manually annotating positive and negative samples of a certain scale. The classification accuracy of this method is relatively high, but as the number of training samples increases, the number of features also rises significantly, and the classification process occupies a large amount of memory. Restricted by memory size, a mobile terminal device therefore finds it difficult to perform the text classification task.
The invention therefore proposes a sample training method and device applied to a mobile device. The method is applied in the device: all feature values extracted from the preset samples are decomposed to obtain at least one feature-value subspace, machine-learning classification training is performed on each feature-value subspace, and a base classifier corresponding to each feature-value subspace is obtained. Because each base classifier is obtained by training on a single feature-value subspace, the number of feature values it handles is much smaller than the total number of feature values, and the memory required during sample training is therefore much smaller as well.
The technical schemes in the embodiments of the invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the invention without creative work shall fall within the protection scope of the invention.
Embodiment 1 of the invention discloses a sample training method applied to a mobile device. Referring to Fig. 1, the method comprises:
Step S101: extract all feature values in the preset samples;
Specifically, all feature values in the preset samples are extracted with a feature-value extraction method.
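For illustration only, a minimal Python sketch of this extraction step follows. The patent does not name a concrete feature-value extraction method, so bag-of-words counts via scikit-learn's CountVectorizer are assumed; the helper name extract_features is likewise hypothetical and is reused in the later sketches.

```python
# Sketch of step S101 under the assumption that feature values are
# bag-of-words counts; any other feature-value extraction method would
# fit the description equally well.
from sklearn.feature_extraction.text import CountVectorizer

def extract_features(texts):
    """Return the full feature matrix X (n samples x m feature values)."""
    vectorizer = CountVectorizer()        # unigram bag-of-words (assumption)
    X = vectorizer.fit_transform(texts)   # sparse matrix of shape (n, m)
    return X, vectorizer
```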
Step S102: decompose all the feature values according to a preset rule to obtain at least one feature-value subspace;
Specifically, all the feature values are decomposed by average division or random extraction to obtain at least one feature-value subspace.
Suppose the preset sample set is X = (X_1, X_2, ..., X_n), where each X_i is an m-dimensional vector X_i = (x_{i1}, x_{i2}, ..., x_{im}). Specifically, the method selects r features (r < m) at random from the original m-dimensional feature space to construct an r-dimensional random subspace; in this way the r-dimensional samples X_i^r (i = 1, 2, ..., n) built from the selected features form a new training set X^r = (X_1^r, X_2^r, ..., X_n^r).
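A minimal sketch of this decomposition step follows, covering both the average-division and the random-extraction variants described above; the helper name decompose_features, the subspace count and the random seed are illustrative choices, not terms from the patent.

```python
# Sketch of step S102: split the m feature indices into feature-value
# subspaces, either by even partition ("average division") or by drawing
# r indices at random per subspace ("random extraction").
import numpy as np

def decompose_features(m, n_subspaces, mode="random", r=None, seed=0):
    rng = np.random.default_rng(seed)
    indices = np.arange(m)
    if mode == "average":
        return np.array_split(indices, n_subspaces)
    r = r if r is not None else m // n_subspaces   # subspace dimension r < m
    return [rng.choice(indices, size=r, replace=False)
            for _ in range(n_subspaces)]
```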
Step S103: perform machine-learning classification training on each feature-value subspace to obtain the base classifier corresponding to each feature-value subspace;
Specifically, each feature-value subspace is trained with the same machine-learning classification method or with different machine-learning classification methods to obtain the base classifier corresponding to each feature-value subspace. Using different machine-learning methods can also be considered: it increases the diversity of the base classifiers, which helps improve the classification performance of the system;
The machine-learning classification method used in this embodiment is the maximum entropy classification method:
The maximum entropy classification method is based on maximum-entropy information theory. Its basic idea is to build a model for all known factors and exclude all unknown factors; that is, to find a probability distribution that satisfies all known facts while leaving the unknown factors as random as possible. Compared with the naive Bayes method, its biggest advantage is that it does not require conditional independence between features. It is therefore well suited to fusing various different features without having to consider the interactions between them;
Under the maximum entropy model, the conditional probability P(c_i|D) is predicted as
P(c_i \mid D) = \frac{1}{Z(D)} \exp\Big( \sum_k \lambda_{k,c} F_{k,c}(D, c_i) \Big)
where Z(D) is a normalization factor and F_{k,c} is a feature function defined as
F_{k,c}(D, c') = \begin{cases} 1, & n_k(D) > 0 \text{ and } c' = c \\ 0, & \text{otherwise;} \end{cases}
Besides the maximum entropy classification method, common machine-learning classification methods include naive Bayes and support vector machines.
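A minimal sketch of step S103 follows. Binary maximum entropy classification coincides with logistic regression, so scikit-learn's LogisticRegression stands in for the maximum entropy classifier here; train_base_classifiers is a hypothetical helper, and naive Bayes or an SVM could be substituted per subspace exactly as the description allows.

```python
# Sketch of step S103: train one base classifier per feature-value subspace.
from sklearn.linear_model import LogisticRegression

def train_base_classifiers(X, y, subspaces):
    """Return a list of (column_indices, fitted_classifier) pairs."""
    classifiers = []
    for cols in subspaces:
        clf = LogisticRegression(max_iter=1000)  # max entropy ~ logistic regression
        clf.fit(X[:, cols], y)                   # y: 1 = commendatory, 0 = derogatory
        classifiers.append((cols, clf))
    return classifiers
```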
This embodiment discloses a sample training method applied to a mobile device. In the method, all feature values in the preset samples are extracted and decomposed according to a preset rule to obtain at least one feature-value subspace; machine-learning classification training is performed on each feature-value subspace, and the base classifier corresponding to each feature-value subspace is obtained. Because each base classifier is obtained by training on a single feature-value subspace, the number of feature values it handles is much smaller than the total number of feature values, and the memory required during sample training is therefore much smaller as well.
Embodiment 2 of the invention discloses a sample training method applied to a mobile device. Referring to Fig. 2, the method comprises:
Step S201: extract all feature values in the preset samples;
Step S202: decompose all the feature values according to a preset rule to obtain at least one feature-value subspace;
Step S203: perform machine-learning classification training on each feature-value subspace to obtain the base classifier corresponding to each feature-value subspace;
The specific implementation of steps S201-S203 is the same as that of steps S101-S103 disclosed in Embodiment 1 and is not repeated here.
Step S204: classify a sample to be classified with the base classifiers to obtain the classification result corresponding to each base classifier, wherein the classification result may be a first classification result of commendatory character and a second classification result of derogatory character;
It should be noted that each base classifier has a corresponding classification result, and the classification result contains both the probability of the commendatory outcome and the probability of the derogatory outcome;
Step S205: fuse the first classification results of commendatory character and the second classification results of derogatory character respectively with a fusion rule, to obtain the first fusion result of commendatory character and the second fusion result of derogatory character;
Specifically, the Bayes fusion rule is used to fuse the first classification results of commendatory character and the second classification results of derogatory character respectively, obtaining the first fusion result of commendatory character and the second fusion result of derogatory character.
Since each base classifier has a corresponding classification result containing both the probability of the commendatory outcome and the probability of the derogatory outcome, namely the first classification result of commendatory character and the second classification result of derogatory character, the Bayes fusion rule is applied to the commendatory classification results to obtain the first fusion result of commendatory character, and likewise to the derogatory classification results to obtain the second fusion result of derogatory character;
Suppose P_l(c_+|D) and P_l(c_-|D) denote the results given by the l-th base classifier.
The Bayes fusion rule assumes that the results given by the individual classifiers are mutually independent. Under this assumption, the posterior probability P(c_+|D) that the sample belongs to the commendatory class and the posterior probability P(c_-|D) that the sample belongs to the derogatory class are given by the Bayes formula as
P(c_+ \mid D) = P(c_+) \prod_{l=1}^{N} P_l(c_+ \mid D)
P(c_- \mid D) = P(c_-) \prod_{l=1}^{N} P_l(c_- \mid D)
where P(c_+) and P(c_-) are the prior probabilities that the sample belongs to the commendatory class and the derogatory class, respectively. The invention ignores the influence of the priors and sets both to 0.5.
Step S206: judge whether the first fusion result of commendatory character is greater than the second fusion result of derogatory character; if so, execute step S207; if not, execute step S208;
The text-orientation category of the sample to be classified is decided by the posterior probabilities P(c_+|D) and P(c_-|D), with the following concrete decision rule:
if P(c_+|D) > P(c_-|D), the sample to be classified is commendatory; otherwise it is derogatory.
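A minimal sketch of steps S205 and S206 follows, multiplying the per-classifier posteriors under the Bayes fusion rule with both priors fixed at 0.5 as in the description; classify_with_fusion is a hypothetical helper, and the [negative, positive] column order of predict_proba is an assumption that holds when the labels are coded 0/1.

```python
# Sketch of steps S205-S206: Bayes fusion of the base classifiers'
# posteriors, followed by the commendatory/derogatory decision.
def classify_with_fusion(x, classifiers, prior_pos=0.5, prior_neg=0.5):
    fused_pos, fused_neg = prior_pos, prior_neg
    for cols, clf in classifiers:
        p_neg, p_pos = clf.predict_proba(x[:, cols])[0]  # P_l(c-|D), P_l(c+|D)
        fused_pos *= p_pos                               # P(c+) * prod_l P_l(c+|D)
        fused_neg *= p_neg                               # P(c-) * prod_l P_l(c-|D)
    return "commendatory" if fused_pos > fused_neg else "derogatory"
```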
Step S207: the sample to be classified is judged to be commendatory;
Step S208: the sample to be classified is judged to be derogatory.
This embodiment discloses a sample training method applied to a mobile device. On the basis of Embodiment 1, it adds a method of sentiment classification of a sample to be classified using the base classifiers: the base classifiers classify the sample to be classified and produce the classification result corresponding to each base classifier, wherein the classification result may be a first classification result of commendatory character and a second classification result of derogatory character; a fusion rule fuses the first classification results of commendatory character and the second classification results of derogatory character respectively to obtain the first fusion result of commendatory character and the second fusion result of derogatory character; and whether the first fusion result of commendatory character is greater than the second fusion result of derogatory character is judged: if so, the sample to be classified is judged commendatory, and if not, derogatory. Because this embodiment uses the base classifiers for the sentiment classification of the sample to be classified, and each base classifier is obtained by training on a single feature-value subspace, the number of feature values is much smaller than the total number of feature values and the memory required during sample training is much smaller; the method is therefore suitable for a mobile device.
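Tying the sketches above together, a hypothetical end-to-end usage example follows; the texts and labels are illustrative placeholders based on the example sentences in the background section, not data from the patent.

```python
# Illustrative end-to-end run of Embodiments 1 and 2 using the hypothetical
# helpers sketched earlier (extract_features, decompose_features,
# train_base_classifiers, classify_with_fusion).
texts = ["I am delighted with this film", "This film is very poor"]
labels = [1, 0]                                    # 1 = commendatory, 0 = derogatory

X, vectorizer = extract_features(texts)
subspaces = decompose_features(X.shape[1], n_subspaces=3, mode="random")
classifiers = train_base_classifiers(X, labels, subspaces)

x_new = vectorizer.transform(["I am delighted with this movie"])
print(classify_with_fusion(x_new, classifiers))    # "commendatory" or "derogatory"
```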
Embodiment 3 of the invention discloses a sample training device applied to a mobile device. Referring to Fig. 3, the device comprises an extraction module 101, a decomposing module 102 and a training module 103;
wherein the extraction module 101 is configured to extract all feature values in the preset samples;
the extraction module 101 may extract all feature values in the preset samples with a feature-value extraction method.
The decomposing module 102 is configured to decompose all the feature values according to a preset rule to obtain at least one feature-value subspace;
the decomposing module may decompose all the feature values by average division or random extraction to obtain at least one feature-value subspace.
Suppose the preset sample set is X = (X_1, X_2, ..., X_n), where each X_i is an m-dimensional vector X_i = (x_{i1}, x_{i2}, ..., x_{im}). Specifically, the device selects r features (r < m) at random from the original m-dimensional feature space to construct an r-dimensional random subspace; in this way the r-dimensional samples X_i^r (i = 1, 2, ..., n) built from the selected features form a new training set X^r = (X_1^r, X_2^r, ..., X_n^r).
The training module 103 is configured to perform machine-learning classification training on each feature-value subspace to obtain the base classifier corresponding to each feature-value subspace.
The training module may train each feature-value subspace with the same machine-learning classification method or with different machine-learning classification methods to obtain the base classifier corresponding to each feature-value subspace. Using different machine-learning methods can also be considered: it increases the diversity of the base classifiers, which helps improve the classification performance of the system;
The machine-learning classification method used in this embodiment is the maximum entropy classification method:
The maximum entropy classification method is based on maximum-entropy information theory. Its basic idea is to build a model for all known factors and exclude all unknown factors; that is, to find a probability distribution that satisfies all known facts while leaving the unknown factors as random as possible. Compared with the naive Bayes method, its biggest advantage is that it does not require conditional independence between features. It is therefore well suited to fusing various different features without having to consider the interactions between them;
Under the maximum entropy model, the conditional probability P(c_i|D) is predicted as
P(c_i \mid D) = \frac{1}{Z(D)} \exp\Big( \sum_k \lambda_{k,c} F_{k,c}(D, c_i) \Big)
where Z(D) is a normalization factor and F_{k,c} is a feature function defined as
F_{k,c}(D, c') = \begin{cases} 1, & n_k(D) > 0 \text{ and } c' = c \\ 0, & \text{otherwise;} \end{cases}
Besides the maximum entropy classification method, common machine-learning classification methods include naive Bayes and support vector machines.
This embodiment discloses a sample training device applied to a mobile device. The device extracts all feature values in the preset samples, decomposes them according to a preset rule to obtain at least one feature-value subspace, performs machine-learning classification training on each feature-value subspace, and obtains the base classifier corresponding to each feature-value subspace. Because each base classifier is obtained by training on a single feature-value subspace, the number of feature values it handles is much smaller than the total number of feature values, and the memory required during sample training is therefore much smaller as well.
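For illustration, the module structure of Embodiment 3 could be mirrored by a plain Python class as sketched below; the class and method names are hypothetical and simply delegate to the helpers sketched earlier.

```python
# Schematic mapping of the extraction, decomposing and training modules of
# the device onto a single class; internals reuse the hypothetical helpers
# extract_features, decompose_features and train_base_classifiers.
class SampleTrainingDevice:
    def __init__(self, n_subspaces=3, mode="random"):
        self.n_subspaces = n_subspaces
        self.mode = mode

    def extraction_module(self, texts):               # module 101
        self.X, self.vectorizer = extract_features(texts)
        return self.X

    def decomposing_module(self):                     # module 102
        self.subspaces = decompose_features(self.X.shape[1],
                                            self.n_subspaces, self.mode)
        return self.subspaces

    def training_module(self, labels):                # module 103
        self.classifiers = train_base_classifiers(self.X, labels,
                                                   self.subspaces)
        return self.classifiers
```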
Embodiment 4 of the invention discloses a sample training device applied to a mobile device. Referring to Fig. 4, the device comprises an extraction module 101, a decomposing module 102, a training module 103, a classification module 104, a fusion module 105 and a judging module 106;
the extraction module 101, the decomposing module 102 and the training module 103 are consistent with the extraction module 101, the decomposing module 102 and the training module 103 disclosed in Embodiment 3;
wherein the classification module 104 is configured to classify a sample to be classified with the base classifiers and obtain the classification result corresponding to each base classifier, wherein the classification result may be a first classification result of commendatory character and a second classification result of derogatory character;
it should be noted that each base classifier has a corresponding classification result, and the classification result contains both the probability of the commendatory outcome and the probability of the derogatory outcome;
the fusion module 105 is configured to fuse the first classification results of commendatory character and the second classification results of derogatory character respectively with a fusion rule, to obtain the first fusion result of commendatory character and the second fusion result of derogatory character;
specifically, the Bayes fusion rule is used to fuse the first classification results of commendatory character and the second classification results of derogatory character respectively, obtaining the first fusion result of commendatory character and the second fusion result of derogatory character.
Since each base classifier has a corresponding classification result containing both the probability of the commendatory outcome and the probability of the derogatory outcome, namely the first classification result of commendatory character and the second classification result of derogatory character, the Bayes fusion rule is applied to the commendatory classification results to obtain the first fusion result of commendatory character, and likewise to the derogatory classification results to obtain the second fusion result of derogatory character;
Suppose P_l(c_+|D) and P_l(c_-|D) denote the results given by the l-th base classifier.
The Bayes fusion rule assumes that the results given by the individual classifiers are mutually independent. Under this assumption, the posterior probability P(c_+|D) that the sample belongs to the commendatory class and the posterior probability P(c_-|D) that the sample belongs to the derogatory class are given by the Bayes formula as
P(c_+ \mid D) = P(c_+) \prod_{l=1}^{N} P_l(c_+ \mid D)
P(c_- \mid D) = P(c_-) \prod_{l=1}^{N} P_l(c_- \mid D)
where P(c_+) and P(c_-) are the prior probabilities that the sample belongs to the commendatory class and the derogatory class, respectively. The invention ignores the influence of the priors and sets both to 0.5.
The judging module 106 is configured to judge whether the first fusion result of commendatory character is greater than the second fusion result of derogatory character; if so, the sample to be classified is judged commendatory; if not, the sample to be classified is judged derogatory;
the text-orientation category of the sample to be classified is decided by the posterior probabilities P(c_+|D) and P(c_-|D), with the following concrete decision rule:
if P(c_+|D) > P(c_-|D), the sample to be classified is commendatory; otherwise it is derogatory.
This embodiment discloses a sample training device applied to a mobile device. The base classifiers are used for the sentiment classification of the sample to be classified, and each base classifier is obtained by training on a single feature-value subspace, so the number of feature values is much smaller than the total number of feature values; the memory required during sample training is therefore much smaller, and the device is suitable for a mobile device.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments can be referred to one another. Since the devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, their description is relatively brief, and the relevant parts can be found in the description of the methods.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A sample training method applied to a mobile device, characterized in that the method comprises:
extracting all feature values in preset samples;
decomposing all the feature values according to a preset rule to obtain at least one feature-value subspace;
performing machine-learning classification training on each feature-value subspace to obtain a base classifier corresponding to each feature-value subspace.
2. The method according to claim 1, characterized in that it further comprises:
classifying a sample to be classified with the base classifiers to obtain a classification result corresponding to each base classifier, wherein the classification result may be a first classification result of commendatory character and a second classification result of derogatory character;
fusing the first classification results of commendatory character and the second classification results of derogatory character respectively with a fusion rule, to obtain a first fusion result of commendatory character and a second fusion result of derogatory character;
judging whether the first fusion result of commendatory character is greater than the second fusion result of derogatory character; if so, the sample to be classified is judged commendatory; if not, the sample to be classified is judged derogatory.
3. The method according to claim 1, characterized in that the process of extracting all feature values in the preset samples comprises:
extracting all feature values in the preset samples with a feature-value extraction method.
4. The method according to claim 1, characterized in that the process of decomposing all the feature values according to the preset rule to obtain at least one feature-value subspace comprises:
decomposing all the feature values by average division or random extraction to obtain at least one feature-value subspace.
5. The method according to claim 1, characterized in that the process of performing machine-learning classification training on each feature-value subspace to obtain the base classifier corresponding to each feature-value subspace comprises:
training each feature-value subspace with the same machine-learning classification method or with different machine-learning classification methods to obtain the base classifier corresponding to each feature-value subspace.
6. The method according to claim 2, characterized in that the process of fusing the first classification results of commendatory character and the second classification results of derogatory character respectively with the fusion rule, to obtain the first fusion result of commendatory character and the second fusion result of derogatory character, comprises:
fusing the first classification results of commendatory character and the second classification results of derogatory character respectively with the Bayes fusion rule, to obtain the first fusion result of commendatory character and the second fusion result of derogatory character.
7. A sample training device applied to a mobile device, characterized in that the device comprises an extraction module, a decomposing module and a training module;
wherein the extraction module is configured to extract all feature values in preset samples;
the decomposing module is configured to decompose all the feature values according to a preset rule to obtain at least one feature-value subspace;
the training module is configured to perform machine-learning classification training on each feature-value subspace to obtain a base classifier corresponding to each feature-value subspace.
8. The device according to claim 7, characterized in that it further comprises a classification module, a fusion module and a judging module;
wherein the classification module is configured to classify a sample to be classified with the base classifiers and obtain a classification result corresponding to each base classifier, wherein the classification result may be a first classification result of commendatory character and a second classification result of derogatory character;
the fusion module is configured to fuse the first classification results of commendatory character and the second classification results of derogatory character respectively with a fusion rule, to obtain a first fusion result of commendatory character and a second fusion result of derogatory character;
the judging module is configured to judge whether the first fusion result of commendatory character is greater than the second fusion result of derogatory character; if so, the sample to be classified is judged commendatory; if not, the sample to be classified is judged derogatory.
CN2013102308120A 2013-06-09 2013-06-09 Sample training method and device for mobile device Pending CN103324610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013102308120A CN103324610A (en) 2013-06-09 2013-06-09 Sample training method and device for mobile device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013102308120A CN103324610A (en) 2013-06-09 2013-06-09 Sample training method and device for mobile device

Publications (1)

Publication Number Publication Date
CN103324610A true CN103324610A (en) 2013-09-25

Family

ID=49193360

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013102308120A Pending CN103324610A (en) 2013-06-09 2013-06-09 Sample training method and device for mobile device

Country Status (1)

Country Link
CN (1) CN103324610A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6502081B1 (en) * 1999-08-06 2002-12-31 Lexis Nexis System and method for classifying legal concepts using legal topic scheme
CN102682124A (en) * 2012-05-16 2012-09-19 苏州大学 Emotion classifying method and device for text
CN102789498A (en) * 2012-07-16 2012-11-21 钱钢 Method and system for carrying out sentiment classification on Chinese comment text on basis of ensemble learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
SHOUSHAN LI et al.: "Sentiment Classification through Combining Classifiers with Multiple Feature Sets", International Conference on Natural Language Processing and Knowledge Engineering *
SHOUSHAN LI et al.: "Sentiment Classification through Combining Classifiers with Multiple Feature Sets", International Conference on Natural Language Processing and Knowledge Engineering, 1 September 2007 (2007-09-01), pages 135-140, XP031153219 *
叶云龙 et al.: "Multiple classifier ensemble based on random subspaces" (基于随机子空间的多分类器集成), Journal of Nanjing Normal University (Engineering and Technology Edition) *
苏艳 et al.: "Research on semi-supervised sentiment classification based on random feature subspaces" (基于随机特征子空间的半监督情感分类方法研究), Journal of Chinese Information Processing *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017124930A1 (en) * 2016-01-18 2017-07-27 阿里巴巴集团控股有限公司 Method and device for feature data processing
US11188731B2 (en) 2016-01-18 2021-11-30 Alibaba Group Holding Limited Feature data processing method and device
CN107992887A (en) * 2017-11-28 2018-05-04 东软集团股份有限公司 Classifier generation method, sorting technique, device, electronic equipment and storage medium
CN108090503A (en) * 2017-11-28 2018-05-29 东软集团股份有限公司 On-line tuning method, apparatus, storage medium and the electronic equipment of multi-categorizer
CN108090503B (en) * 2017-11-28 2021-05-07 东软集团股份有限公司 Online adjustment method and device for multiple classifiers, storage medium and electronic equipment
CN109993312A (en) * 2018-01-02 2019-07-09 中国移动通信有限公司研究院 A kind of equipment and its information processing method, computer storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130925