CN107169035A - Text classification method combining a long short-term memory network and a convolutional neural network - Google Patents
Text classification method combining a long short-term memory network and a convolutional neural network
- Publication number
- CN107169035A CN107169035A CN201710257132.6A CN201710257132A CN107169035A CN 107169035 A CN107169035 A CN 107169035A CN 201710257132 A CN201710257132 A CN 201710257132A CN 107169035 A CN107169035 A CN 107169035A
- Authority
- CN
- China
- Prior art keywords
- sentence
- layer
- convolutional neural network
- long short-term memory network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a text classification method that combines a long short-term memory (LSTM) network with a convolutional neural network (CNN). The method fully exploits the strength of a bidirectional LSTM network at learning the contextual information of text and the strength of a CNN at learning local features of text. After a bidirectional LSTM network learns the context of each word, a CNN further extracts local features from the context-enriched word vectors; a second bidirectional LSTM then learns the context of these local features and forms an output of fixed dimension, which is finally classified by a multilayer perceptron. The method further improves classification accuracy, generalizes well, and achieves good results on all of the test corpora.
Description
Technical field
The present invention relates to the field of natural language processing, and in particular to a text classification method that combines a long short-term memory network with a convolutional neural network.
Background technology
Automatic text classification based on machine learning has been one of the most active research directions in natural language processing in recent years, and has found wide application in information retrieval, search engines, automatic question answering, e-commerce, digital libraries, automatic summarization, news portals, and many other fields. Automatic text classification is the process of analyzing the content of a text by means of machine learning and automatically assigning it to categories from a given taxonomy. Before the 1990s, automatic text classification relied mainly on knowledge engineering, i.e., manual classification by domain experts, which was costly, slow, and labor-intensive. Since the 1990s, many researchers have applied statistical and machine learning methods to automatic text classification, such as support vector machines, the AdaBoost algorithm, the naive Bayes algorithm, the k-nearest-neighbor (KNN) algorithm, and logistic regression. In recent years, with the rapid development of deep learning and neural network models, text classification methods based on deep learning have attracted close attention and research from academia and industry. Typical neural network models, such as recurrent neural networks (chiefly the long short-term memory network LSTM and the GRU) and convolutional neural networks (CNN), have been widely used in text classification and have achieved good results. Existing research and applications have shown that recurrent neural networks are suited to learning long-range dependencies between linguistic units in a sentence, while convolutional neural networks are suited to learning local features of a sentence. Current research, however, does not adequately combine the respective advantages of recurrent and convolutional neural networks, nor does it take into account the contextual information of the linguistic units in a sentence.
Summary of the invention
The purpose of the present invention is to address the above deficiencies of the prior art by providing a text classification method that combines a long short-term memory network with a convolutional neural network. A bidirectional LSTM learns the preceding and following context of each word in a text sentence; a CNN then further extracts local features from the learned representations; a second bidirectional LSTM layer learns the relations between these local features; and finally a multilayer perceptron classifies the result and produces the output.
The purpose of the present invention can be achieved through the following technical solution:
A text classification method combining a long short-term memory network and a convolutional neural network, the method comprising the following steps:
Step 1: preprocess the sentences in the text; determine a sentence length threshold from the length distribution and variance of the sentences in the training corpus and normalize all sentences to that length; obtain the vector representation of each word in the input text from a pretrained word-vector table, forming a continuous, dense real-valued vector matrix.
Step 2: for the word vectors of an input sentence, learn the preceding context of each word with a forward LSTM network and the following context with a backward LSTM network, and concatenate the two results, converting the word-vector representation, which carries only semantic information, into a representation that carries both semantic and contextual information.
Step 3: apply two-dimensional convolutions to the word-vector matrix output by the bidirectional LSTM, using several kernel matrices of different widths and different weights, to extract local convolution features and generate multiple local convolution feature maps.
Step 4: down-sample the local convolution feature maps with a one-dimensional max-pooling algorithm to obtain the sentence's multi-layer global feature matrices, and concatenate the results.
Step 5: learn the long-range dependencies between the sentence's local features with two LSTM networks running in opposite directions, and output the final learning result.
Step 6: pass the output of step 5 through a fully connected hidden layer and then a softmax layer to predict the category of the sentence.
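The concatenation ("series connection merging") of step 2 is simple to state precisely. The sketch below is illustrative only, not the patented implementation: the per-word hidden states and the toy hidden size of 2 per direction are hypothetical. It shows how the forward and backward LSTM outputs for the same word are joined, doubling the dimension.

```python
# Toy sketch of step 2's merge: a forward LSTM and a backward LSTM each emit
# one hidden vector per word; the two vectors for the same word are
# concatenated, so the per-word dimension doubles.

def concat_bidirectional(forward_states, backward_states):
    """Concatenate per-word forward and backward hidden states."""
    assert len(forward_states) == len(backward_states)
    return [f + b for f, b in zip(forward_states, backward_states)]

# 3 words, hidden size 2 per direction -> concatenated size 4 per word
fwd = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # forward LSTM outputs (hypothetical)
bwd = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.4]]   # backward LSTM outputs (hypothetical)
merged = concat_bidirectional(fwd, bwd)
```

In a real model the two directions would be trained LSTMs; only the merge itself is shown here.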
Further, the text classification method combining a long short-term memory network and a convolutional neural network is carried out in a single multilayer neural network: step 1 is carried out in the first layer, the input layer; step 2 in the second layer, a bidirectional LSTM layer; step 3 in the third layer, a CNN layer; step 4 in the fourth layer, a pooling layer; step 5 in the fifth layer, a bidirectional LSTM layer; and step 6 in the sixth layer, the output layer.
Further, the second-layer bidirectional LSTM layer learns the contextual information of each word of the original input sentence and outputs the concatenated learning results of all words, while the fifth-layer bidirectional LSTM layer learns the contextual information between the post-convolution sentence features and outputs only the learning result of the final step.
Further, the preprocessing of sentences in step 1 includes punctuation filtering, abbreviation completion, whitespace removal, word segmentation, and illegal-character filtering.
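A minimal preprocessing sketch, for illustration only: the patent does not fix an exact algorithm, so the regex, the crude whitespace tokenizer, and the `<pad>` token below are all assumptions.

```python
# Illustrative step-1 preprocessing: filter punctuation, segment into words,
# then pad/truncate every sentence to one unified length.
import re

PAD = "<pad>"  # hypothetical padding token

def preprocess(sentence, max_len):
    sentence = re.sub(r"[^\w\s]", " ", sentence)      # punctuation filtering
    tokens = sentence.lower().split()                 # crude word segmentation
    tokens = tokens[:max_len]                         # truncate long sentences
    tokens += [PAD] * (max_len - len(tokens))         # pad short sentences
    return tokens

tokens = preprocess("Hybrid LSTM-CNN models, in short, work well!", 8)
```

After this step each token would be looked up in the pretrained word-vector table to build the input matrix.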
Further, step 3 is a local-feature learning process: two-dimensional convolution windows of several different widths (in words) and their convolution kernels are applied to the context-enriched word vectors, yielding phrase information at different granularities.
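The windowed convolution of step 3 can be sketched in pure Python with toy values. The kernel weights and dimensions below are illustrative, not trained parameters: a window of width k slides over the L word positions and each step takes a dot product with the kernel, giving a feature map of length L - k + 1.

```python
# Illustrative step-3 convolution over a word-vector matrix (no padding).

def conv_feature_map(matrix, kernel):
    """matrix: L word vectors of dimension d; kernel: k x d weights."""
    k = len(kernel)
    length = len(matrix) - k + 1          # "valid" convolution length
    feature_map = []
    for i in range(length):
        window = matrix[i:i + k]          # k consecutive word vectors
        s = sum(window[r][c] * kernel[r][c]
                for r in range(k) for c in range(len(kernel[0])))
        feature_map.append(s)
    return feature_map

words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]  # L=4, d=2 (toy)
kernel2 = [[1.0, 1.0], [1.0, 1.0]]                         # width-2 kernel (toy)
fmap = conv_feature_map(words, kernel2)                    # length 4-2+1 = 3
```

Running kernels of widths 2, 3, and 4 in parallel, as the method describes, would yield one feature map per width, capturing phrase features at different granularities.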
Further, step 4 is a sampling and dimension-reduction process: the multi-layer local convolution feature maps are down-sampled by a one-dimensional max-pooling algorithm, which keeps the most significant feature value in each pooling window of the sentence as the feature representation of that local window.
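Step 4's one-dimensional max pooling can be sketched with toy feature values; the non-overlapping stride below is an assumption for illustration.

```python
# Illustrative step-4 max pooling: each window keeps only its largest
# feature value, down-sampling the feature map.

def max_pool_1d(feature_map, window, stride=None):
    stride = stride or window             # non-overlapping by default (assumed)
    return [max(feature_map[i:i + window])
            for i in range(0, len(feature_map) - window + 1, stride)]

fmap = [0.2, 0.9, 0.1, 0.4, 0.8, 0.3]     # toy convolution feature map
pooled = max_pool_1d(fmap, window=2)
```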
Further, step 5 is context learning over the local features: a bidirectional LSTM learns the contextual information between the local features and outputs the learning result of the last position, forming a one-dimensional output of fixed dimension.
Further, step 6 is the classification output: a fully connected multilayer perceptron makes the classification decision, and the final output is derived from the probability distribution over the specified taxonomy.
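The softmax-and-argmax mechanics of this output stage can be illustrated as follows; the class scores ("logits") are hypothetical, and only the probability computation is shown.

```python
# Illustrative step-6 output: turn class scores into a probability
# distribution with softmax, then predict the highest-probability class.
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]   # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]          # hypothetical scores from the hidden layer
probs = softmax(logits)           # probability distribution over the taxonomy
predicted = probs.index(max(probs))
```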
Further, step 6 is carried out in a two-layer multilayer perceptron consisting of a fully connected hidden layer and a softmax layer; the output of step 6 is the predicted category of the corresponding text.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
By fully combining the strength of a bidirectional LSTM at learning the contextual information of text with the strength of a CNN at learning local features of text, the invention proposes a hybrid LSTM-CNN text classification method: a bidirectional LSTM first learns the context of each word, a CNN then extracts local features from the context-enriched word vectors, a second bidirectional LSTM learns the context of these local features and forms an output of fixed dimension, and a multilayer perceptron finally produces the classification output. The method further improves classification accuracy, generalizes well, and achieves good results on all of the test corpora.
Brief description of the drawings
Fig. 1 is the overall architecture diagram of the multilayer neural network model of an embodiment of the present invention.
Embodiment
The present invention is described in further detail below with reference to an embodiment and the accompanying drawing, but embodiments of the present invention are not limited thereto.
Embodiment:
This embodiment provides a text classification method combining a long short-term memory network and a convolutional neural network, the method comprising the following steps:
Step 1: preprocess the sentences in the text, including punctuation filtering, abbreviation completion, whitespace removal, word segmentation, and illegal-character filtering; determine a sentence length threshold from the length distribution and variance of the sentences in the training corpus and normalize all sentences to that length; then obtain the vector representation of each word in the input text from a pretrained word-vector table, forming a continuous, dense real-valued vector matrix.
Step 2: for the word vectors of an input sentence, learn the preceding context of each word with a forward LSTM network and the following context with a backward LSTM network, and concatenate the two results, converting the word-vector representation, which carries only semantic information, into a representation that carries both semantic and contextual information.
Step 3: apply two-dimensional convolutions to the word-vector matrix output by the bidirectional LSTM, using several kernel matrices of different widths and different weights, to extract local convolution features and generate multiple local convolution feature maps.
Step 4: down-sample the local convolution feature maps with a one-dimensional max-pooling algorithm to obtain the sentence's multi-layer global feature matrices, and concatenate the results.
Step 5: learn the long-range dependencies between the sentence's local features with two LSTM networks running in opposite directions, and output the final learning result.
Step 6: pass the output of step 5 through a fully connected hidden layer and then a softmax layer to predict the category of the sentence.
The above text classification method combining a long short-term memory network and a convolutional neural network is carried out in a single multilayer neural network, whose architecture is shown in Fig. 1. Step 1 is carried out in the first layer, the input layer. Step 2 is carried out in the second layer, a bidirectional LSTM layer whose output dimension is 256. Step 3 is carried out in the third layer, a CNN layer whose convolution window widths are 2, 3, and 4 words and whose output dimension is 128. Step 4 is carried out in the fourth layer, the pooling layer, using one-dimensional max pooling with pooling window widths of 2, 3, and 4 words. Step 5 is carried out in the fifth layer, a bidirectional LSTM layer whose output dimension is 128 and which outputs only the learning result of the last word. Step 6 is carried out in the sixth layer, the output layer, a two-layer multilayer perceptron consisting of a fully connected hidden layer of dimension 128 with a dropout rate of 0.5 and a softmax layer; the output of step 6 is the predicted category of the corresponding text. During training, the loss function is the categorical cross-entropy, used with the RMSProp optimizer.
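Under one reading of these hyperparameters, the tensor shapes through the six layers can be traced as follows. This is an illustrative sketch, not the patent's specification: it assumes the stated 256 and 128 dimensions are the concatenated bidirectional outputs, "valid" (unpadded) convolutions, non-overlapping pooling, and hypothetical values for sentence length (40), embedding dimension (300), and number of classes (5).

```python
# Shape walkthrough for the embodiment's stated hyperparameters, under the
# assumptions named above. Everything here is plain shape arithmetic.

def layer_shapes(sent_len, embed_dim, n_classes):
    shapes = {"input": (sent_len, embed_dim)}
    shapes["bilstm_1"] = (sent_len, 256)               # per-word BiLSTM output
    # one feature map per window width; 128 filters each, valid convolution
    shapes["cnn"] = {k: (sent_len - k + 1, 128) for k in (2, 3, 4)}
    # non-overlapping max pooling with window width k over each feature map
    shapes["pool"] = {k: ((sent_len - k + 1) // k, 128) for k in (2, 3, 4)}
    shapes["bilstm_2"] = (128,)                        # last time step only
    shapes["hidden"] = (128,)                          # fully connected, dropout 0.5
    shapes["softmax"] = (n_classes,)
    return shapes

shapes = layer_shapes(sent_len=40, embed_dim=300, n_classes=5)
```

The fixed-dimension vector from the fifth layer is what makes the final two-layer perceptron independent of sentence length.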
Here, the second-layer bidirectional LSTM layer learns the contextual information of each word of the original input sentence and outputs the concatenated learning results of all words, while the fifth-layer bidirectional LSTM layer learns the contextual information between the post-convolution sentence features and outputs only the learning result of the final step.
Step 3 is a local-feature learning process: two-dimensional convolution windows of several different widths (in words) and their convolution kernels are applied to the context-enriched word vectors, yielding phrase information at different granularities. Step 4 is a sampling and dimension-reduction process: the multi-layer local convolution feature maps are down-sampled by a one-dimensional max-pooling algorithm, which keeps the most significant feature value in each pooling window of the sentence as the feature representation of that local window. Step 5 is context learning over the local features: a bidirectional LSTM learns the contextual information between the local features and outputs the learning result of the last position, forming a one-dimensional output of fixed dimension. Step 6 is the classification output: a fully connected multilayer perceptron makes the classification decision, and the final output is derived from the probability distribution over the specified taxonomy.
The above is only a preferred embodiment of the present patent, but the scope of protection of the present patent is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art within the scope disclosed by the present patent, in accordance with its technical solution and inventive concept, falls within the scope of protection of the present patent.
Claims (9)
1. A text classification method combining a long short-term memory network and a convolutional neural network, characterized in that the method comprises the following steps:
step 1: preprocessing the sentences in the text, determining a sentence length threshold from the length distribution and variance of the sentences in the training corpus and normalizing all sentences to that length, and obtaining the vector representation of each word in the input text from a pretrained word-vector table, forming a continuous, dense real-valued vector matrix;
step 2: for the word vectors of an input sentence, learning the preceding context of each word with a forward LSTM network and the following context with a backward LSTM network, and concatenating the two results, so as to convert the word-vector representation, which carries only semantic information, into a representation carrying both semantic and contextual information;
step 3: applying two-dimensional convolutions to the word-vector matrix output by the bidirectional LSTM, using several kernel matrices of different widths and different weights, so as to extract local convolution features and generate multiple local convolution feature maps;
step 4: down-sampling the local convolution feature maps with a one-dimensional max-pooling algorithm to obtain the multi-layer global feature matrices of the sentence, and concatenating the results;
step 5: learning the long-range dependencies between the local features of the sentence with two LSTM networks running in opposite directions, and outputting the final learning result;
step 6: passing the output of step 5 through a fully connected hidden layer and then a softmax layer to predict the category of the sentence.
2. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 1, characterized in that: the method is carried out in a single multilayer neural network; step 1 is carried out in the first layer, the input layer; step 2 in the second layer, a bidirectional LSTM layer; step 3 in the third layer, a CNN layer; step 4 in the fourth layer, a pooling layer; step 5 in the fifth layer, a bidirectional LSTM layer; and step 6 in the sixth layer, the output layer.
3. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 2, characterized in that: the second-layer bidirectional LSTM layer learns the contextual information of each word of the original input sentence and outputs the concatenated learning results of all words, while the fifth-layer bidirectional LSTM layer learns the contextual information between the post-convolution sentence features and outputs only the learning result of the final step.
4. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 1, characterized in that: the preprocessing of sentences in step 1 comprises punctuation filtering, abbreviation completion, whitespace removal, word segmentation, and illegal-character filtering.
5. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 1, characterized in that: step 3 is a local-feature learning process in which two-dimensional convolution windows of several different widths, in words, and their convolution kernels are applied to the context-enriched word vectors, yielding phrase information at different granularities.
6. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 1, characterized in that: step 4 is a sampling and dimension-reduction process in which the multi-layer local convolution feature maps are down-sampled by a one-dimensional max-pooling algorithm, the most significant feature value in each pooling window of the sentence being kept as the feature representation of that local window.
7. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 1, characterized in that: step 5 is context learning over the local features, in which a bidirectional LSTM learns the contextual information between the local features and outputs the learning result of the last position, forming a one-dimensional output of fixed dimension.
8. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 1, characterized in that: step 6 is the classification output, in which a fully connected multilayer perceptron makes the classification decision and the final output is derived from the probability distribution over the specified taxonomy.
9. The text classification method combining a long short-term memory network and a convolutional neural network according to claim 1, characterized in that: step 6 is carried out in a two-layer multilayer perceptron consisting of a fully connected hidden layer and a softmax layer, and the output of step 6 is the predicted category of the corresponding text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710257132.6A CN107169035B (en) | 2017-04-19 | 2017-04-19 | Text classification method combining a long short-term memory network and a convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107169035A true CN107169035A (en) | 2017-09-15 |
CN107169035B CN107169035B (en) | 2019-10-18 |
Family
ID=59812256
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710257132.6A Expired - Fee Related CN107169035B (en) | 2017-04-19 | 2017-04-19 | Text classification method combining a long short-term memory network and a convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107169035B (en) |
2017-04-19 CN CN201710257132.6A patent/CN107169035B/en not_active Expired - Fee Related
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104572892A (en) * | 2014-12-24 | 2015-04-29 | 中国科学院自动化研究所 | Text classification method based on recurrent convolutional network |
CN104834747A (en) * | 2015-05-25 | 2015-08-12 | 中国科学院自动化研究所 | Short text classification method based on convolutional neural network |
US20170032221A1 (en) * | 2015-07-29 | 2017-02-02 | Htc Corporation | Method, electronic apparatus, and computer readable medium of constructing classifier for disease detection |
CN106547735A (en) * | 2016-10-25 | 2017-03-29 | 复旦大学 | Construction and usage of context-aware dynamic word or character vectors based on deep learning |
Non-Patent Citations (2)
Title |
---|
Minlie Huang: "Modeling Rich Contexts for Sentiment Classification with LSTM", arXiv preprint arXiv:1605.01478 * |
Huang Lei et al.: "Research on Text Classification Based on Recursive Neural Networks", Journal of Beijing University of Chemical Technology * |
Cited By (98)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110019784A (en) * | 2017-09-29 | 2019-07-16 | 北京国双科技有限公司 | A text classification method and device |
CN107679199A (en) * | 2017-10-11 | 2018-02-09 | 北京邮电大学 | A kind of external the Chinese text readability analysis method based on depth local feature |
CN108415923A (en) * | 2017-10-18 | 2018-08-17 | 北京邮电大学 | The intelligent interactive system of closed domain |
CN108415923B (en) * | 2017-10-18 | 2020-12-11 | 北京邮电大学 | Intelligent man-machine conversation system of closed domain |
CN109947932A (en) * | 2017-10-27 | 2019-06-28 | 中移(苏州)软件技术有限公司 | A kind of pushed information classification method and system |
CN110019793A (en) * | 2017-10-27 | 2019-07-16 | 阿里巴巴集团控股有限公司 | A kind of text semantic coding method and device |
WO2019080864A1 (en) * | 2017-10-27 | 2019-05-02 | 阿里巴巴集团控股有限公司 | Semantic encoding method and device for text |
JP2021501390A (en) * | 2017-10-27 | 2021-01-14 | アリババ・グループ・ホールディング・リミテッドAlibaba Group Holding Limited | Text Semantic Coding Methods and Devices |
CN107832400B (en) * | 2017-11-01 | 2019-04-16 | 山东大学 | A method for relation classification using position-based combined LSTM and CNN models |
CN107832400A (en) * | 2017-11-01 | 2018-03-23 | 山东大学 | A method for relation classification using position-based combined LSTM and CNN models |
CN107908620A (en) * | 2017-11-15 | 2018-04-13 | 珠海金山网络游戏科技有限公司 | A method and apparatus for predicting a user's occupation from job documents |
CN108376558A (en) * | 2018-01-24 | 2018-08-07 | 复旦大学 | A multi-modal nuclear magnetic resonance image medical record report automatic generation method |
CN108376558B (en) * | 2018-01-24 | 2021-08-20 | 复旦大学 | Automatic generation method for multi-modal nuclear magnetic resonance image medical record report |
CN108334499A (en) * | 2018-02-08 | 2018-07-27 | 海南云江科技有限公司 | A kind of text label tagging equipment, method and computing device |
CN108415972A (en) * | 2018-02-08 | 2018-08-17 | 合肥工业大学 | text emotion processing method |
CN108334499B (en) * | 2018-02-08 | 2022-03-18 | 海南云江科技有限公司 | Text label labeling device and method and computing device |
CN109033413B (en) * | 2018-03-12 | 2022-12-23 | 上海大学 | Neural network-based demand document and service document matching method |
CN109033413A (en) * | 2018-03-12 | 2018-12-18 | 上海大学 | A neural-network-based requirement document and service document matching method |
CN108595409A (en) * | 2018-03-16 | 2018-09-28 | 上海大学 | A requirement document and service document matching method based on neural networks |
CN108520320A (en) * | 2018-03-30 | 2018-09-11 | 华中科技大学 | An equipment life prediction method based on multiple long short-term memory networks and empirical Bayes |
CN108536825A (en) * | 2018-04-10 | 2018-09-14 | 苏州市中地行信息技术有限公司 | A method for identifying duplicate housing listing data |
CN108595429A (en) * | 2018-04-25 | 2018-09-28 | 杭州闪捷信息科技股份有限公司 | A method for text feature extraction based on deep convolutional neural networks |
CN108614815A (en) * | 2018-05-07 | 2018-10-02 | 华东师范大学 | Sentence exchange method and device |
CN108710651A (en) * | 2018-05-08 | 2018-10-26 | 华南理工大学 | A kind of large scale customer complaint data automatic classification method |
CN108710651B (en) * | 2018-05-08 | 2022-03-25 | 华南理工大学 | Automatic classification method for large-scale customer complaint data |
CN108595440A (en) * | 2018-05-11 | 2018-09-28 | 厦门市美亚柏科信息股份有限公司 | Short text content categorizing method and system |
CN108595440B (en) * | 2018-05-11 | 2022-03-18 | 厦门市美亚柏科信息股份有限公司 | Short text content classification method and system |
CN108717439A (en) * | 2018-05-16 | 2018-10-30 | 哈尔滨理工大学 | A Chinese text classification method based on fused attention mechanism and feature enhancement |
CN108829737B (en) * | 2018-05-21 | 2021-11-05 | 浙江大学 | Text cross combination classification method based on bidirectional long-short term memory network |
CN108829737A (en) * | 2018-05-21 | 2018-11-16 | 浙江大学 | Text cross-combination classification method based on bidirectional long short-term memory network |
CN108804591A (en) * | 2018-05-28 | 2018-11-13 | 杭州依图医疗技术有限公司 | A text classification method and device for medical record text |
CN108874776A (en) * | 2018-06-11 | 2018-11-23 | 北京奇艺世纪科技有限公司 | A junk text recognition method and device |
CN108874776B (en) * | 2018-06-11 | 2022-06-03 | 北京奇艺世纪科技有限公司 | Junk text recognition method and device |
CN109086892A (en) * | 2018-06-15 | 2018-12-25 | 中山大学 | A visual question reasoning model and system based on general dependency trees |
CN109086892B (en) * | 2018-06-15 | 2022-02-18 | 中山大学 | General dependency tree-based visual problem reasoning model and system |
CN109062996A (en) * | 2018-07-05 | 2018-12-21 | 贵州威爱教育科技有限公司 | A kind of management method and system of cloud file |
CN109101552B (en) * | 2018-07-10 | 2022-01-28 | 东南大学 | Phishing website URL detection method based on deep learning |
CN109101552A (en) * | 2018-07-10 | 2018-12-28 | 东南大学 | A kind of fishing website URL detection method based on deep learning |
CN108984745B (en) * | 2018-07-16 | 2021-11-02 | 福州大学 | Neural network text classification method fusing multiple knowledge maps |
CN108984745A (en) * | 2018-07-16 | 2018-12-11 | 福州大学 | A neural network text classification method fusing multiple knowledge graphs |
CN108961816A (en) * | 2018-07-19 | 2018-12-07 | 泰华智慧产业集团股份有限公司 | Road parking berth prediction technique based on optimization LSTM model |
CN109213896B (en) * | 2018-08-06 | 2021-06-01 | 杭州电子科技大学 | Underwater video abstract generation method based on long-short term memory network reinforcement learning |
CN109213896A (en) * | 2018-08-06 | 2019-01-15 | 杭州电子科技大学 | Underwater video summary generation method based on long short-term memory network reinforcement learning |
CN109271537A (en) * | 2018-08-10 | 2019-01-25 | 北京大学 | A text-to-image generation method and system based on distillation learning |
CN109271537B (en) * | 2018-08-10 | 2021-11-23 | 北京大学 | Text-to-image generation method and system based on distillation learning |
CN110837227A (en) * | 2018-08-15 | 2020-02-25 | 格力电器(武汉)有限公司 | Electric appliance control method and device |
CN109241284A (en) * | 2018-08-27 | 2019-01-18 | 中国人民解放军国防科技大学 | Document classification method and device |
CN109726268A (en) * | 2018-08-29 | 2019-05-07 | 中国人民解放军国防科技大学 | Text representation method and device based on hierarchical neural network |
CN109308355A (en) * | 2018-09-17 | 2019-02-05 | 清华大学 | Legal judgment result prediction method and device |
CN109308355B (en) * | 2018-09-17 | 2020-03-13 | 清华大学 | Legal judgment result prediction method and device |
CN109508811A (en) * | 2018-09-30 | 2019-03-22 | 中冶华天工程技术有限公司 | Sewage treatment effluent parameter prediction method based on principal component analysis and long short-term memory network |
CN111126556B (en) * | 2018-10-31 | 2023-07-25 | 百度在线网络技术(北京)有限公司 | Training method and device for artificial neural network model |
CN111126556A (en) * | 2018-10-31 | 2020-05-08 | 百度在线网络技术(北京)有限公司 | Training method and device of artificial neural network model |
US11010560B2 (en) | 2018-11-08 | 2021-05-18 | International Business Machines Corporation | Multi-resolution convolutional neural networks for sequence modeling |
CN109542585A (en) * | 2018-11-14 | 2019-03-29 | 山东大学 | A kind of Virtual Machine Worker load predicting method for supporting irregular time interval |
CN109542585B (en) * | 2018-11-14 | 2020-06-16 | 山东大学 | Virtual machine workload prediction method supporting irregular time intervals |
CN109271521A (en) * | 2018-11-16 | 2019-01-25 | 北京九狐时代智能科技有限公司 | A text classification method and device |
CN109508377A (en) * | 2018-11-26 | 2019-03-22 | 南京云思创智信息科技有限公司 | Text feature, device, chat robots and storage medium based on Fusion Model |
CN109582794A (en) * | 2018-11-29 | 2019-04-05 | 南京信息工程大学 | Long article classification method based on deep learning |
CN109359198A (en) * | 2018-12-04 | 2019-02-19 | 北京容联易通信息技术有限公司 | A text classification method and device |
CN109743732B (en) * | 2018-12-20 | 2022-05-10 | 重庆邮电大学 | Junk short message distinguishing method based on improved CNN-LSTM |
CN109743732A (en) * | 2018-12-20 | 2019-05-10 | 重庆邮电大学 | Junk short message discrimination method based on improved CNN-LSTM |
CN110222953A (en) * | 2018-12-29 | 2019-09-10 | 北京理工大学 | A kind of power quality hybrid perturbation analysis method based on deep learning |
CN109840279A (en) * | 2019-01-10 | 2019-06-04 | 山东亿云信息技术有限公司 | Text classification method based on convolutional recurrent neural network |
CN109918503A (en) * | 2019-01-29 | 2019-06-21 | 华南理工大学 | Slot filling method extracting semantic features based on a dynamic-window self-attention mechanism |
CN109918503B (en) * | 2019-01-29 | 2020-12-22 | 华南理工大学 | Groove filling method for extracting semantic features based on dynamic window self-attention mechanism |
CN109902293A (en) * | 2019-01-30 | 2019-06-18 | 华南理工大学 | A text classification method based on local and global mutual attention mechanisms |
CN109815456A (en) * | 2019-02-13 | 2019-05-28 | 北京航空航天大学 | A method for compressing word vector storage space based on character pair encoding |
CN109902301B (en) * | 2019-02-26 | 2023-02-10 | 广东工业大学 | Deep neural network-based relationship reasoning method, device and equipment |
CN109902301A (en) * | 2019-02-26 | 2019-06-18 | 广东工业大学 | Relation inference method, device and equipment based on deep neural network |
CN110020431A (en) * | 2019-03-06 | 2019-07-16 | 平安科技(深圳)有限公司 | Feature extraction method, device, computer equipment and storage medium for text information |
CN109992781A (en) * | 2019-04-02 | 2019-07-09 | 腾讯科技(深圳)有限公司 | Text feature processing method, device, storage medium and processor |
CN110046253A (en) * | 2019-04-10 | 2019-07-23 | 广州大学 | A kind of prediction technique of language conflict |
CN110046253B (en) * | 2019-04-10 | 2022-01-04 | 广州大学 | Language conflict prediction method |
CN110083832A (en) * | 2019-04-17 | 2019-08-02 | 北大方正集团有限公司 | Recognition methods, device, equipment and the readable storage medium storing program for executing of article reprinting relationship |
CN110083832B (en) * | 2019-04-17 | 2020-12-29 | 北大方正集团有限公司 | Article reprint relation identification method, device, equipment and readable storage medium |
CN110263152B (en) * | 2019-05-07 | 2024-04-09 | 平安科技(深圳)有限公司 | Text classification method, system and computer equipment based on neural network |
CN110263152A (en) * | 2019-05-07 | 2019-09-20 | 平安科技(深圳)有限公司 | Text classification method, system and computer equipment based on neural networks |
CN110059192A (en) * | 2019-05-15 | 2019-07-26 | 北京信息科技大学 | Character-level text classification method based on five codes |
CN110196913A (en) * | 2019-05-23 | 2019-09-03 | 北京邮电大学 | Joint multiple entity relation extraction method and device based on text generation |
WO2021004118A1 (en) * | 2019-07-05 | 2021-01-14 | 深圳壹账通智能科技有限公司 | Correlation value determination method and apparatus |
CN110704890A (en) * | 2019-08-12 | 2020-01-17 | 上海大学 | Automatic text causal relationship extraction method fusing convolutional neural network and cyclic neural network |
CN110781939A (en) * | 2019-10-17 | 2020-02-11 | 中国铁塔股份有限公司 | Method and device for detecting similar pictures and project management system |
CN111371806A (en) * | 2020-03-18 | 2020-07-03 | 北京邮电大学 | Web attack detection method and device |
CN111552808A (en) * | 2020-04-20 | 2020-08-18 | 北京北大软件工程股份有限公司 | Administrative illegal case law prediction method and tool based on convolutional neural network |
CN111914085A (en) * | 2020-06-18 | 2020-11-10 | 华南理工大学 | Text fine-grained emotion classification method, system, device and storage medium |
CN111914085B (en) * | 2020-06-18 | 2024-04-23 | 华南理工大学 | Text fine granularity emotion classification method, system, device and storage medium |
CN112052675A (en) * | 2020-08-21 | 2020-12-08 | 北京邮电大学 | Method and device for detecting sensitive information of unstructured text |
CN112434156A (en) * | 2020-11-02 | 2021-03-02 | 浙江大有实业有限公司杭州科技发展分公司 | Power grid operation warning method and device based on mixed text classification model |
CN113780610A (en) * | 2020-12-02 | 2021-12-10 | 北京沃东天骏信息技术有限公司 | Customer service profile construction method and device |
CN112883708A (en) * | 2021-02-25 | 2021-06-01 | 哈尔滨工业大学 | Textual entailment recognition method based on 2D-LSTM |
WO2022227211A1 (en) * | 2021-04-30 | 2022-11-03 | 平安科技(深圳)有限公司 | Bert-based multi-intention recognition method for discourse, and device and readable storage medium |
CN115563286A (en) * | 2022-11-10 | 2023-01-03 | 东北农业大学 | Knowledge-driven milk cow disease text classification method |
CN115563286B (en) * | 2022-11-10 | 2023-12-01 | 东北农业大学 | Knowledge-driven dairy cow disease text classification method |
CN116308464B (en) * | 2023-05-11 | 2023-09-08 | 广州市沃钛移动科技有限公司 | Target client acquisition system and method |
CN116308464A (en) * | 2023-05-11 | 2023-06-23 | 广州钛动科技股份有限公司 | Target client acquisition system and method |
CN116721361A (en) * | 2023-06-09 | 2023-09-08 | 中国测绘科学研究院 | Wetland remote sensing extraction method compatible with space-time discontinuous images |
CN116721361B (en) * | 2023-06-09 | 2024-01-02 | 中国测绘科学研究院 | Wetland remote sensing extraction method compatible with space-time discontinuous images |
Also Published As
Publication number | Publication date |
---|---|
CN107169035B (en) | 2019-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107169035B (en) | A text classification method mixing long short-term memory network and convolutional neural networks | |
CN110866117B (en) | Short text classification method based on semantic enhancement and multi-level label embedding | |
CN107133213B (en) | Method and system for automatically extracting text abstract based on algorithm | |
CN107066553B (en) | Short text classification method based on convolutional neural network and random forest | |
Wang et al. | Research on Web text classification algorithm based on improved CNN and SVM | |
CN107832400A (en) | A method for relation classification using position-based combined LSTM and CNN models | |
CN109189925A (en) | Word vector model based on mutual information and CNN-based text classification method | |
CN107818164A (en) | An intelligent question answering method and system | |
CN107918782A (en) | A method and system for generating natural language describing picture content | |
CN110321563B (en) | Text emotion analysis method based on hybrid supervision model | |
CN106599933A (en) | Text emotion classification method based on the joint deep learning model | |
CN107193801A (en) | A short text feature optimization and sentiment analysis method based on deep belief network | |
CN110765260A (en) | Information recommendation method based on convolutional neural network and joint attention mechanism | |
CN107451278A (en) | Chinese text classification method based on multi-hidden-layer extreme learning machines | |
CN111143563A (en) | Text classification method based on integration of BERT, LSTM and CNN | |
CN110415071B (en) | Automobile competitive product comparison method based on viewpoint mining analysis | |
CN111858878B (en) | Method, system and storage medium for automatically extracting answer from natural language text | |
Zhang | Research on text classification method based on LSTM neural network model | |
CN112287106A (en) | Online comment emotion classification method based on dual-channel hybrid neural network | |
CN106570170A (en) | Integrated text classification and named entity recognition method and system based on deep recurrent neural network | |
CN116484262B (en) | Textile equipment fault auxiliary processing method based on text classification | |
CN113220890A (en) | Deep learning method combining news headlines and news long text contents based on pre-training | |
CN111651602A (en) | Text classification method and system | |
CN114239585A (en) | Biomedical nested named entity recognition method | |
CN114462420A (en) | False news detection method based on feature fusion model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20191018 |
|