CN106446230A - Method for optimizing word classification in machine learning text - Google Patents

Method for optimizing word classification in machine learning text

Info

Publication number
CN106446230A
CN106446230A (application CN201610881132.9A)
Authority
CN
China
Prior art keywords
classification
word
text
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610881132.9A
Other languages
Chinese (zh)
Inventor
郭宇
李永波
季统凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
G Cloud Technology Co Ltd
Original Assignee
G Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by G Cloud Technology Co Ltd filed Critical G Cloud Technology Co Ltd
Priority to CN201610881132.9A priority Critical patent/CN106446230A/en
Publication of CN106446230A publication Critical patent/CN106446230A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the fields of data processing and machine learning classification, and in particular to a method for optimizing word classification in machine learning text. On the basis of text classification, user-defined, semantically related features are filtered out by a feature selection regulator based on regular expressions; after feature selection, the user defines the classification categories in the training data, and classification training is conducted with these features and categories according to a naive Bayesian model. After training is completed, in the application stage, if sentences conforming to the feature selection regulator exist in text requiring word classification, classification is completed by combining the trained model. According to the method, the model's capacity for word classification is not limited to the word data in the training samples; the method can be applied to the optimization and application of machine learning text word classification and its derived functions.

Description

A method for optimizing word classification in machine learning text
Technical field
The present invention relates to the fields of data processing and machine learning classification, and in particular to a method for optimizing word classification in machine learning text.
Background technology
With the rapid development of information technology, the amount of information in modern society is growing explosively. In today's big data era, how to make good use of mass data and mine the truly valuable information from it has become a focus of social concern. Machine learning plays an increasingly obvious role in data mining. For natural language processing problems such as text classification, machine learning replaces traditional rule-customization methods with statistical methods; practice has proved that this approach works well and is more efficient. On the basis of text classification, it is further required to classify each word and keyword within the text and extract the required keyword information, which places higher demands on machine learning classification.
Summary of the invention
The technical problem solved by the present invention is to provide a method for optimizing word classification in machine learning text, solving the classification problem of user-defined keywords in current text classification.
The technical scheme by which the present invention solves the above technical problem is as follows:
On the basis of text classification, a feature selection regulator based on regular expressions filters out user-defined, semantically related features. After feature selection, the user defines the classification categories in the training data; these features and categories are then used to carry out classification training according to a naive Bayesian model. After training is completed, in the application stage, when a sentence conforming to the feature selection regulator appears in text requiring word classification, classification is completed by combining the trained model.
The concrete steps of the described method are:
S1, training set creation: create training text data that meets the requirements according to actual needs, completing training set creation in combination with the real environment;
S2, data preprocessing: when the text to be classified involves Chinese, the text in the training set must undergo preprocessing such as word segmentation and stop word removal;
S3, in the feature selection regulator, input user-defined regular expressions as filter conditions; the feature selection regulator filters out the matching text in the training set according to the regular expression rules, and puts the words at the wildcard positions of the regular expressions into a segmentation queue;
S4, according to the feature vector model generated by the feature selection regulator, check whether each word in the segmentation queue satisfies the regular expression in which each feature is located, and calculate the weights of each word's vector;
S5, according to the feature vector of each word and the classification results defined by the user in the training set, combined with a naive Bayesian classifier, calculate each class-conditional probability and each prior probability, thereby completing the training of the classification model; after the model training is completed, test the model with a pre-prepared test set, compare the test results with the true results to form a performance evaluation of the classification model, propose possible modifications, and optimize the model;
S6, use the trained classifier to classify the text data that actually requires word classification.
One wildcard in a user-defined expression of the feature selection regulator represents one feature value. When the feature selection regulator checks the input text, if a sentence satisfying an expression exists, that sentence is extracted and the word or word set at the wildcard position is entered into the classification queue as an object to be classified. The user can customize the meaning of the feature value represented by each wildcard.
Before training the classification model, the user processes the training data, first defining the required classification items; for example, the word set conforming to the feature selection regulator in the whole text is divided into three classes A, B, and C, and the final classification result of each individual in each training text is labeled.
During model training, if a word satisfies the first rule, the feature value represented by that rule is recorded as 1; otherwise it is recorded as 0.
After model training is completed, the test results are analyzed. At this point, feature weights may be calculated with factors such as word position and occurrence frequency as consideration indices.
The invention provides a method for optimizing word classification in machine learning text by using regular expressions to accurately match semantics. It can be applied to the optimization of text word classification and its derived functions, and to related applications in the machine learning category.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawings:
Fig. 1 is a schematic diagram of the classification process of the present invention.
Specific embodiment
As shown in Fig. 1, on the basis of the traditional machine learning text classification method, the present invention uses a feature selection regulator based on regular expressions to filter out user-defined, semantically related features. After feature selection, the user defines the classification categories in the training data, and then uses these features and categories to carry out classification training according to a naive Bayesian model. After training is completed, in the application stage, when a sentence conforming to the feature selection regulator appears in text requiring word classification, the classification task is completed by combining the trained model.
The feature selector is based on regular expressions: one wildcard in a user-defined regular expression represents one feature value. For example, the "." in ".*[xyz]+" can represent a specific feature, such as "the word matching the rule at this position is always a country name" or "the word matching the rule at this position is always related to religion". One feature selection regulator can contain one or more rules, and these rules constitute the basis for forming the feature vector model. The feature selection regulator selects the sentences in the text that match the regular expressions; the set of words at the corresponding wildcard positions in these sentences are exactly the words to be classified. When Chinese word classification is involved, a Chinese word segmentation tool and a stop word handling process are needed to standardize the words in the classification queue.
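As a minimal illustration (not part of the patent text — the patent specifies no programming language), the wildcard-based extraction described above can be sketched in Python, using a regex capture group to stand in for the wildcard position. The rule and sample sentence below are hypothetical:

```python
import re

def extract_candidates(text, rules):
    """For each rule, collect the words captured at the wildcard
    (capture-group) position into the classification queue."""
    queue = []
    for rule in rules:
        for match in re.finditer(rule, text):
            queue.extend(g for g in match.groups() if g is not None)
    return queue

# Hypothetical rule: the word before "is a country" is a country name.
rules = [r"(\w+) is a country"]
print(extract_candidates("France is a country in Europe.", rules))
# ['France']
```

The capture group plays the role of the patent's user-defined wildcard: whatever it matches is enqueued for classification.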
According to whether these words satisfy the rules specified in the feature selection regulator, their feature vector model is established. The number of dimensions in the feature vector model is the number of features in the rules, expressed as {feature 1, feature 2, ..., feature N}. If a word satisfies the first rule, the feature value represented by that rule is recorded as 1; otherwise it is recorded as 0. Thus the feature vector of any word to be classified can be represented in a form similar to {1,0,0,1,...}. After obtaining the feature vector of each word in the training set, classification training is carried out according to the naive Bayesian model.
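The binary feature vector construction can be sketched as follows (a Python illustration with hypothetical rules, not the patent's own code):

```python
import re

def feature_vector(word, sentence, rules):
    """Dimension i is 1 if the sentence matches rule i with the word
    at the wildcard (capture-group) position, otherwise 0."""
    vector = []
    for rule in rules:
        match = re.search(rule, sentence)
        vector.append(1 if match and word in match.groups() else 0)
    return vector

# Two hypothetical rules: "country name" and "religion-related".
rules = [r"(\w+) is a country", r"temple of (\w+)"]
print(feature_vector("France", "France is a country", rules))
# [1, 0]
```

Each rule contributes one dimension, so every candidate word maps to a vector like {1,0,0,1,...} as described above.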
Here, following the naive Bayes assumption that the features are mutually independent, the feature vector of each word to be classified and its manually defined classification category are established according to the pre-prepared training set. That is, before training the classification model, the user processes the training data and defines the required classification items; for example, the word set conforming to the feature selection regulator in the whole text is divided into three classes A, B, and C, and the classification results of the words to be classified in each training text are manually labeled. The prior probability of each class and the class-conditional probability of each feature can then be obtained by naive Bayesian model training; at this point the training of the model is complete.
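A minimal sketch of this training step (assuming binary feature vectors and Laplace smoothing — the smoothing choice is mine, not stated in the patent):

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples, alpha=1.0):
    """samples: list of (binary_feature_vector, class_label).
    Returns class priors and per-class Bernoulli feature probabilities
    (the class-conditional probabilities), with Laplace smoothing."""
    class_counts = Counter(label for _, label in samples)
    ones = defaultdict(Counter)          # per class: count of 1s per feature
    for vec, label in samples:
        for i, v in enumerate(vec):
            ones[label][i] += v
    n_features = len(samples[0][0])
    priors = {c: class_counts[c] / len(samples) for c in class_counts}
    cond = {c: [(ones[c][i] + alpha) / (class_counts[c] + 2 * alpha)
                for i in range(n_features)]
            for c in class_counts}
    return priors, cond

# Tiny hypothetical training set with classes A and B.
samples = [([1, 0], "A"), ([1, 1], "A"), ([0, 1], "B")]
priors, cond = train_naive_bayes(samples)
# priors["A"] == 2/3; cond["A"][0] == (2+1)/(2+2) == 0.75
```

Smoothing keeps a feature never seen in a class from forcing a zero probability at classification time.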
After model training is completed, the data in the test set must be tested. After the test, the test results are analyzed so as to evaluate the performance of the model and, as far as possible, optimize the model to a certain extent. For example, the feature weight calculation may no longer use 0/1 to indicate whether the regular expression is matched, but instead combine factors such as word position and occurrence frequency as consideration indices.
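One way such a weighted feature might look (this particular weighting formula is an assumption for illustration; the patent only names position and frequency as factors):

```python
def weighted_feature(matches, position, text_length, frequency, total_words):
    """Sketch: replace the bare 0/1 feature value with a weight combining
    the word's relative position (earlier = heavier) and its occurrence
    frequency in the text. The exact scheme is a hypothetical choice."""
    if not matches:
        return 0.0
    position_weight = 1.0 - position / max(text_length, 1)
    frequency_weight = frequency / max(total_words, 1)
    return position_weight + frequency_weight

# Word at the start of a 10-word text, appearing twice.
print(weighted_feature(True, 0, 10, 2, 10))
# 1.2
```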
In the actual classification task, the feature selection regulator checks the input text; if a sentence satisfying an expression exists, that sentence is extracted, and the word or word set at the wildcard position is entered into the classification queue as an object to be classified. When the words to be classified involve Chinese, a general Chinese word segmentation tool is used as the solution. After segmentation, stop words can be handled according to demand; the feature vector of each word is then modeled in the manner described above, and the classification work is completed through the trained classifier model.
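The final classification step — scoring each class by its prior and class-conditional probabilities — can be sketched like this (the trained parameters below are hypothetical values standing in for a real training run):

```python
import math

def classify(vector, priors, cond):
    """Choose the class maximizing log prior plus the sum of log
    Bernoulli feature likelihoods (the naive independence assumption)."""
    def score(c):
        s = math.log(priors[c])
        for v, p in zip(vector, cond[c]):
            s += math.log(p if v else 1.0 - p)
        return s
    return max(priors, key=score)

# Hypothetical trained parameters for classes A and B over two features.
priors = {"A": 0.5, "B": 0.5}
cond = {"A": [0.9, 0.2], "B": [0.1, 0.8]}
print(classify([1, 0], priors, cond))
# A
```

Working in log space avoids floating-point underflow when many features are multiplied together.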
The concrete steps of the above scheme can be as follows:
S1, training set creation: create training text data that meets the requirements according to actual needs; training set creation can be completed in combination with the real environment. In the texts of the training set, the words to be classified already have manually assigned classification results. It should be noted that the training set must be created for one or several specific regular expression rules; these rules will be cited in the feature selection regulator to generate the corresponding feature items.
S2, data preprocessing: when the text to be classified involves Chinese, the text in the training set must undergo preprocessing such as word segmentation and stop word removal. Chinese word segmentation can use currently common Chinese word segmentation tools such as SCWS or Jcseg. Stop word handling is fairly simple: a common stop word list can be used as the basis for removing the corresponding words from the text. Note that when a rule in the feature selection regulator uses certain stop words, those stop words are not removed.
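The stop-word exception described in S2 — keep a stop word if a rule depends on it — can be sketched as follows (the token list and stop-word sets are hypothetical):

```python
def remove_stop_words(tokens, stop_words, rule_words):
    """Drop stop words after segmentation, but keep any stop word that
    also appears in a rule of the feature selection regulator."""
    return [t for t in tokens if t not in stop_words or t in rule_words]

# Hypothetical segmented text; "of" is a stop word used by a rule.
tokens = ["the", "capital", "of", "France"]
print(remove_stop_words(tokens, {"the", "of"}, {"of"}))
# ['capital', 'of', 'France']
```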
S3, in the feature selection regulator, input user-defined regular expressions as filter conditions; the feature selection regulator filters out the matching text in the training set according to the regular expression rules, and puts the words at the wildcard positions of the regular expressions into the segmentation queue.
S4, according to the feature vector model generated by the feature selection regulator, check whether each word in the segmentation queue satisfies the regular expression in which each feature is located, and calculate the weights of each word's vector. The concrete method is: when the text containing the word satisfies regular expression 1 in the feature selection regulator, the weight of the feature vector component corresponding to that expression is set to 1, otherwise it is set to 0. The resulting feature vector of each word is thus expressed in a form similar to {1,0,0,1,...}.
S5, according to the feature vector of each word and the classification results defined by the user in the training set, combined with a naive Bayesian classifier, calculate each class-conditional probability and each prior probability, thereby completing the training of the classification model. After the model training is completed, test the model with a pre-prepared test set; compare the test results with the true results to form a performance evaluation of the classification model, propose possible modifications, and optimize the model.
S6, use the trained classifier to classify the text data that actually requires word classification. It should be noted here that the text to be classified must conform to the rules in the feature selection regulator, and the categories to be classified must be consistent with the user-defined categories in the feature selection regulator; otherwise the rules and classification items must be redefined and the model retrained before the new classification task can be completed.
The implementation case described above is one example of the present invention and not all of them. Based on the example in the present invention, all other examples obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.

Claims (7)

1. A method for optimizing word classification in machine learning text, characterized in that: on the basis of text classification, a feature selection regulator based on regular expressions filters out user-defined, semantically related features; after feature selection, the user defines the classification categories in the training data, and these features and categories are then used to carry out classification training according to a naive Bayesian model; after training is completed, in the application stage, when a sentence conforming to the feature selection regulator appears in the text requiring word classification, classification is completed by combining the trained model.
2. The method according to claim 1, characterized in that the concrete steps of the method are:
S1, training set creation: create training text data that meets the requirements according to actual needs, completing training set creation in combination with the real environment;
S2, data preprocessing: when the text to be classified involves Chinese, the text in the training set must undergo preprocessing such as word segmentation and stop word removal;
S3, in the feature selection regulator, input user-defined regular expressions as filter conditions; the feature selection regulator filters out the matching text in the training set according to the regular expression rules, and puts the words at the wildcard positions of the regular expressions into a segmentation queue;
S4, according to the feature vector model generated by the feature selection regulator, check whether each word in the segmentation queue satisfies the regular expression in which each feature is located, and calculate the weights of each word's vector;
S5, according to the feature vector of each word and the classification results defined by the user in the training set, combined with a naive Bayesian classifier, calculate each class-conditional probability and each prior probability, thereby completing the training of the classification model; after the model training is completed, test the model with a pre-prepared test set, compare the test results with the true results to form a performance evaluation of the classification model, propose possible modifications, and optimize the model;
S6, use the trained classifier to classify the text data that actually requires word classification.
3. The method according to claim 1, characterized in that: one wildcard in a user-defined expression of the feature selection regulator represents one feature value; when the feature selection regulator checks the input text, if a sentence satisfying an expression exists, that sentence is extracted and the word or word set at the wildcard position is entered into the classification queue as an object to be classified; the user can customize the meaning of the feature value represented by each wildcard.
4. The method according to claim 2, characterized in that: one wildcard in a user-defined expression of the feature selection regulator represents one feature value; when the feature selection regulator checks the input text, if a sentence satisfying an expression exists, that sentence is extracted and the word or word set at the wildcard position is entered into the classification queue as an object to be classified; the user can customize the meaning of the feature value represented by each wildcard.
5. The method according to any one of claims 1 to 4, characterized in that: before training the classification model, the user processes the training data, first defining the required classification items; the word set conforming to the feature selection regulator in the whole text can be divided into three classes A, B, and C, and the final classification result of each individual in each training text is labeled.
6. The method according to any one of claims 1 to 4, characterized in that:
during model training, if a word satisfies the first rule, the feature value represented by that rule is recorded as 1, otherwise as 0;
after model training is completed, the test results are analyzed; at this point, feature weights are calculated with factors such as word position and occurrence frequency as consideration indices.
7. The method according to claim 5, characterized in that:
during model training, if a word satisfies the first rule, the feature value represented by that rule is recorded as 1, otherwise as 0;
after model training is completed, the test results are analyzed; at this point, feature weights are calculated with factors such as word position and occurrence frequency as consideration indices.
CN201610881132.9A 2016-10-08 2016-10-08 Method for optimizing word classification in machine learning text Pending CN106446230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610881132.9A CN106446230A (en) 2016-10-08 2016-10-08 Method for optimizing word classification in machine learning text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610881132.9A CN106446230A (en) 2016-10-08 2016-10-08 Method for optimizing word classification in machine learning text

Publications (1)

Publication Number Publication Date
CN106446230A true CN106446230A (en) 2017-02-22

Family

ID=58172086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610881132.9A Pending CN106446230A (en) 2016-10-08 2016-10-08 Method for optimizing word classification in machine learning text

Country Status (1)

Country Link
CN (1) CN106446230A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368464A (en) * 2017-07-28 2017-11-21 深圳数众科技有限公司 A kind of method and device for obtaining bid product information
CN107679734A (en) * 2017-09-27 2018-02-09 成都四方伟业软件股份有限公司 It is a kind of to be used for the method and system without label data classification prediction
CN108470116A (en) * 2018-03-03 2018-08-31 淄博职业学院 The personal identification method and device of a kind of computer system and its user
CN108491390A (en) * 2018-03-28 2018-09-04 江苏满运软件科技有限公司 A kind of main line logistics goods title automatic recognition classification method
CN108519978A (en) * 2018-04-10 2018-09-11 成都信息工程大学 A kind of Chinese document segmenting method based on Active Learning
CN109144999A (en) * 2018-08-02 2019-01-04 东软集团股份有限公司 A kind of data positioning method, device and storage medium, program product
CN109409533A (en) * 2018-09-28 2019-03-01 深圳乐信软件技术有限公司 A kind of generation method of machine learning model, device, equipment and storage medium
CN109508370A (en) * 2018-09-28 2019-03-22 北京百度网讯科技有限公司 Opinions Extraction method, equipment and storage medium
CN110457566A (en) * 2019-08-15 2019-11-15 腾讯科技(武汉)有限公司 Method, device, electronic equipment and storage medium
CN110968687A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Method and device for classifying texts
CN111428518A (en) * 2019-01-09 2020-07-17 科大讯飞股份有限公司 Low-frequency word translation method and device
CN112579733A (en) * 2019-09-30 2021-03-30 华为技术有限公司 Rule matching method, rule matching device, storage medium and electronic equipment
CN113536785A (en) * 2021-06-15 2021-10-22 合肥讯飞数码科技有限公司 Text recommendation method, intelligent terminal and computer readable storage medium
CN113742479A (en) * 2020-05-29 2021-12-03 北京沃东天骏信息技术有限公司 Method and device for screening target text
CN117556049A (en) * 2024-01-10 2024-02-13 杭州光云科技股份有限公司 Text classification method of regular expression generated based on large language model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559174A (en) * 2013-09-30 2014-02-05 东软集团股份有限公司 Semantic emotion classification characteristic value extraction method and system
US20140279761A1 (en) * 2013-03-15 2014-09-18 Konstantinos (Constantin) F. Aliferis Document Coding Computer System and Method With Integrated Quality Assurance
CN105808524A (en) * 2016-03-11 2016-07-27 江苏畅远信息科技有限公司 Patent document abstract-based automatic patent classification method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140279761A1 (en) * 2013-03-15 2014-09-18 Konstantinos (Constantin) F. Aliferis Document Coding Computer System and Method With Integrated Quality Assurance
CN103559174A (en) * 2013-09-30 2014-02-05 东软集团股份有限公司 Semantic emotion classification characteristic value extraction method and system
CN105808524A (en) * 2016-03-11 2016-07-27 江苏畅远信息科技有限公司 Patent document abstract-based automatic patent classification method

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368464B (en) * 2017-07-28 2020-07-10 深圳数众科技有限公司 Method and device for acquiring bidding product information
CN107368464A (en) * 2017-07-28 2017-11-21 深圳数众科技有限公司 A kind of method and device for obtaining bid product information
CN107679734A (en) * 2017-09-27 2018-02-09 成都四方伟业软件股份有限公司 It is a kind of to be used for the method and system without label data classification prediction
CN108470116A (en) * 2018-03-03 2018-08-31 淄博职业学院 The personal identification method and device of a kind of computer system and its user
CN108491390A (en) * 2018-03-28 2018-09-04 江苏满运软件科技有限公司 A kind of main line logistics goods title automatic recognition classification method
CN108519978A (en) * 2018-04-10 2018-09-11 成都信息工程大学 A kind of Chinese document segmenting method based on Active Learning
CN109144999A (en) * 2018-08-02 2019-01-04 东软集团股份有限公司 A kind of data positioning method, device and storage medium, program product
CN109144999B (en) * 2018-08-02 2021-06-08 东软集团股份有限公司 Data positioning method, device, storage medium and program product
CN109409533A (en) * 2018-09-28 2019-03-01 深圳乐信软件技术有限公司 A kind of generation method of machine learning model, device, equipment and storage medium
CN109508370A (en) * 2018-09-28 2019-03-22 北京百度网讯科技有限公司 Opinions Extraction method, equipment and storage medium
CN109409533B (en) * 2018-09-28 2021-07-27 深圳乐信软件技术有限公司 Method, device, equipment and storage medium for generating machine learning model
CN110968687B (en) * 2018-09-30 2023-06-16 北京国双科技有限公司 Method and device for classifying text
CN110968687A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Method and device for classifying texts
CN111428518A (en) * 2019-01-09 2020-07-17 科大讯飞股份有限公司 Low-frequency word translation method and device
CN111428518B (en) * 2019-01-09 2023-11-21 科大讯飞股份有限公司 Low-frequency word translation method and device
CN110457566A (en) * 2019-08-15 2019-11-15 腾讯科技(武汉)有限公司 Method, device, electronic equipment and storage medium
WO2021063089A1 (en) * 2019-09-30 2021-04-08 华为技术有限公司 Rule matching method, rule matching apparatus, storage medium and electronic device
CN112579733A (en) * 2019-09-30 2021-03-30 华为技术有限公司 Rule matching method, rule matching device, storage medium and electronic equipment
CN112579733B (en) * 2019-09-30 2023-10-20 华为技术有限公司 Rule matching method, rule matching device, storage medium and electronic equipment
CN113742479A (en) * 2020-05-29 2021-12-03 北京沃东天骏信息技术有限公司 Method and device for screening target text
CN113536785A (en) * 2021-06-15 2021-10-22 合肥讯飞数码科技有限公司 Text recommendation method, intelligent terminal and computer readable storage medium
CN117556049A (en) * 2024-01-10 2024-02-13 杭州光云科技股份有限公司 Text classification method of regular expression generated based on large language model
CN117556049B (en) * 2024-01-10 2024-05-17 杭州光云科技股份有限公司 Text classification method of regular expression generated based on large language model

Similar Documents

Publication Publication Date Title
CN106446230A (en) Method for optimizing word classification in machine learning text
CN108573047A (en) A kind of training method and device of Module of Automatic Chinese Documents Classification
CN108199951A (en) A kind of rubbish mail filtering method based on more algorithm fusion models
CN106776538A (en) The information extracting method of enterprise's noncanonical format document
CN102622373B (en) Statistic text classification system and statistic text classification method based on term frequency-inverse document frequency (TF*IDF) algorithm
CN107944480A (en) A kind of enterprises ' industry sorting technique
CN107193801A (en) A kind of short text characteristic optimization and sentiment analysis method based on depth belief network
CN106021410A (en) Source code annotation quality evaluation method based on machine learning
CN106844424A (en) A kind of file classification method based on LDA
CN107301171A (en) A kind of text emotion analysis method and system learnt based on sentiment dictionary
CN107291723A (en) The method and apparatus of web page text classification, the method and apparatus of web page text identification
CN104834940A (en) Medical image inspection disease classification method based on support vector machine (SVM)
CN106202561A (en) Digitized contingency management case library construction methods based on the big data of text and device
CN105373606A (en) Unbalanced data sampling method in improved C4.5 decision tree algorithm
CN106294783A (en) A kind of video recommendation method and device
CN103995876A (en) Text classification method based on chi square statistics and SMO algorithm
CN101876987A (en) Overlapped-between-clusters-oriented method for classifying two types of texts
KR20120109943A (en) Emotion classification method for analysis of emotion immanent in sentence
CN109670039A (en) Sentiment analysis method is commented on based on the semi-supervised electric business of tripartite graph and clustering
CN106897290B (en) Method and device for establishing keyword model
CN102541838A (en) Method and equipment for optimizing emotional classifier
CN110008309A (en) A kind of short phrase picking method and device
CN103593431A (en) Internet public opinion analyzing method and device
CN108280164A (en) A kind of short text filtering and sorting technique based on classification related words
CN109145108A (en) Classifier training method, classification method, device and computer equipment is laminated in text

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20170222