CN105740226A - Method for implementing Chinese segmentation by using tree neural network and bilateral neural network - Google Patents
- Publication number
- CN105740226A CN105740226A CN201610037336.4A CN201610037336A CN105740226A CN 105740226 A CN105740226 A CN 105740226A CN 201610037336 A CN201610037336 A CN 201610037336A CN 105740226 A CN105740226 A CN 105740226A
- Authority
- CN
- China
- Prior art keywords
- sentence
- term memory
- long term
- neural networks
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Abstract
The invention relates to a method, a system, a device and a computer program for implementing Chinese word segmentation using a tree neural network and a bidirectional neural network. The method comprises: converting each character of an input sentence into a character vector to form a first input sequence; feeding the first input sequence to a three-layer long short-term memory (LSTM) neural network, i.e. the tree neural network, with a sentence vector serving as the initial value of each hidden layer, to generate a second input sequence; feeding the second input sequence to a bidirectional LSTM neural network, again with the sentence vector as the initial value of each hidden layer, to generate a third input sequence; feeding the third input sequence to a logSoftMax layer, i.e. a multi-class classification layer; and finally generating the segmentation label sequence.
Description
Technical field
The invention belongs to the field of natural language processing and relates to a method for implementing Chinese word segmentation using a tree-structured neural network and a bidirectional neural network.
Background technology
Conventional Chinese word segmentation techniques include character-by-character traversal, segmentation methods based on dictionary matching, full-segmentation methods, and methods based on word-frequency statistics, all of which are algorithmic approaches. Traditional approaches also include two well-known model-based methods, the hidden Markov model and the conditional random field model, both of which map an input sequence to a target label sequence; the conditional random field model generally performs better than the hidden Markov model. With the growth of computing power and the maturation of neural network models, a method for implementing Chinese word segmentation using a tree-structured neural network and a bidirectional neural network is proposed here.
Summary of the invention
It is an object of the invention to propose, at least to some extent, a method for implementing Chinese word segmentation based on neural networks, and to illustrate how the segmentation label sequence corresponding to an input sentence is generated.
In order to achieve the above object, the technical solution adopted by the invention is as follows: obtain an input sentence and convert each character of the sentence into a character vector to form the first input sequence; pass the first input sequence to a three-layer long short-term memory (LSTM) neural network, i.e. the tree-structured neural network, to produce the second input sequence, thereby extracting phrase and semantic information; pass the second input sequence to a bidirectional LSTM neural network, whose hidden-layer initial states are initialized in a special way, to produce the third input sequence, thereby extracting each character's contextual information; and pass the third input sequence to a logSoftMax layer, i.e. a multi-class classification layer, to obtain the final segmentation label sequence. In order to capture tree-structured information, each sub-network must first be trained individually before the whole neural network is trained.
The details of some embodiments of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the method for implementing Chinese word segmentation with tree-structured and bidirectional neural networks will become apparent from the description, the drawings, and the claims.
Accompanying drawing explanation
Fig. 1 shows the overall neural network structure.
Fig. 2 shows part of the three-layer long short-term memory neural network.
Fig. 3 shows a bidirectional long short-term memory neural network.
Detailed description of the invention
The overall technical solution and the whole neural network are described clearly and completely below with reference to the accompanying drawings of the invention.
The present disclosure provides a technical solution for Chinese word segmentation based on neural networks, comprising four parts: converting the sentence into vectors; training the three-layer long short-term memory network, i.e. the tree-structured part of the network; training the bidirectional long short-term memory network; and training the whole network.
Fig. 1 shows the whole flow from the input sentence to the output of the final segmentation label sequence. The input sentence is an example input to the system that converts a sentence into a character-vector input sequence; the systems, components, and techniques described below can be implemented therein.
Characters are converted into character vectors, which can be obtained in two ways. 1) Treat the character vectors as parameters included in the neural network and learn them while training the whole network; however, vectors obtained this way show no obvious relationship between similar Chinese characters, and possibly no relationship at all. 2) Pre-train a vector vocabulary with a mature neural network, such as word2vec or GloVe; vectors trained by these two algorithms exhibit certain linear relationships, or clear non-linear relationships, between similar characters or words, so the similar words of a given word can be found from its vector. So that the vectors carry more semantics, the invention uses GloVe to train a 300-dimensional vector vocabulary.
Count the number N of characters in the corpus; represent each character with a one-hot vector (a vector of dimension N in which exactly one position is 1 and all others are 0); use the one-hot vector to look up the vector corresponding to the character; and finally convert the sentence into its vector representation.
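The lookup above can be sketched as follows; the toy vocabulary, the dimensions, and the helper names are our illustration, not taken from the patent:

```python
import numpy as np

vocab = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}  # toy corpus alphabet, N = 5
N, d = len(vocab), 300                             # 300-dim GloVe-style vectors
E = np.random.randn(N, d)                          # pretrained vector table

def one_hot(index, size):
    """Vector of dimension `size` with a single 1 at `index`, 0 elsewhere."""
    v = np.zeros(size)
    v[index] = 1.0
    return v

def sentence_to_matrix(sentence):
    """Look up each character's vector via one_hot @ E and stack the rows."""
    return np.stack([one_hot(vocab[ch], N) @ E for ch in sentence])

X = sentence_to_matrix("ABCDE")  # shape (5, 300): one row per character
```

Multiplying a one-hot vector by the table E simply selects the matching row, which is exactly the lookup the text describes.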
Fig. 2 shows part of the three-layer long short-term memory neural network. Each LSTM layer consists of 100 standard LSTM (long short-term memory) nodes. A standard LSTM mainly processes variable-length sequences and solves the long-range dependency problem; it includes three gates: an input gate, a forget gate, and an output gate. Using a multi-layer LSTM network is equivalent to defining a tree-structured neural network.
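The standard LSTM cell with its three gates can be sketched as follows; this is a minimal numpy illustration, and the weight shapes and gate ordering in the stacked matrices are our assumptions, not the patent's:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W: (4H, D) input weights, U: (4H, H) recurrent
    weights, b: (4H,) bias; stacked gate order: input, forget, output, candidate."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0*H:1*H])   # input gate: how much new information enters
    f = sigmoid(z[1*H:2*H])   # forget gate: how much old cell state survives
    o = sigmoid(z[2*H:3*H])   # output gate: how much of the cell is exposed
    g = np.tanh(z[3*H:4*H])   # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
D, H = 4, 3
W = rng.standard_normal((4*H, D))
U = rng.standard_normal((4*H, H))
b = np.zeros(4*H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.standard_normal((6, D)):   # run over a length-6 toy sequence
    h, c = lstm_step(x, h, c, W, U, b)
```

The forget gate's multiplicative path through the cell state c is what lets gradients flow over long distances, which is the long-range dependency property the text refers to.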
To give the three-layer long short-term memory network its tree-structured function, the input for training this sub-network is the sentence vector sequence and the target is a sequence representation of the syntactic parse tree of the input sentence. For example: input = { "use tree-like neural network and bidirectional neural network to implement Chinese word segmentation" }, target = { "(ROOT (IP (VP (VP (VV use) (NP (NP (NN tree-like) (NN neural) (NN network)) (CC and) (NP (ADJP (JJ bidirectional)) (NP (NN neural) (NN network))))) (VP (VV implement) (NP (NN Chinese) (NN segmentation))))))" }. When training this sub-network individually, a linear transformation layer and a logSoftMax layer must be added on top of it so that the output of the 100-node standard LSTM layer can be matched to the tree sequence representation, which amounts to encoding and decoding. The initial hidden state of a conventional LSTM network is all zeros or very small random numbers; for the initial state of this three-layer LSTM network, the invention instead uses sentence2vec (a neural network algorithm that converts a sentence into a vector) to generate a sentence vector representing the input sentence, converts the sentence vector to a vector of the same dimension as the hidden layer by multiplying it by a matrix parameter, and obtains the matrix parameter by training the whole network.
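Using the bracketed parse tree as a sequence target requires flattening it into tokens. A minimal sketch of such a tokenisation; the regular expression and function name are our illustration, not specified by the patent:

```python
import re

def linearize(tree):
    """Split a bracketed parse into a flat token sequence usable as a
    seq2seq target: each parenthesis, label, and word becomes one token."""
    return re.findall(r"\(|\)|[^\s()]+", tree)

toks = linearize("(ROOT (VP (VV use) (NP (NN neural) (NN network))))")
# The brackets stay balanced in the token stream, so the tree is recoverable.
```

Because opening and closing brackets survive as tokens, the decoder's output sequence can be parsed back into a tree, which is the "encoding and decoding" correspondence mentioned above.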
Fig. 3 shows the bidirectional long short-term memory (BIDIRECTIONAL-LSTM) neural network. A bidirectional LSTM comprises one recurrent LSTM that processes the sequence front-to-back and one that processes it back-to-front; each recurrent LSTM consists of standard LSTM memory units of a specified length and block count, and the maximum sequence length adopted here is 100. Each unit includes an input gate, a forget gate, and an output gate, i.e. a standard LSTM memory unit. A bidirectional LSTM can capture information on both sides of each character and therefore captures semantics better. The initial state of the hidden layers at both ends of a standard BIDIRECTIONAL-LSTM is all zeros or very small random numbers; as with the three-layer LSTM network above, the invention uses sentence2vec (a neural network algorithm that converts a sentence into a vector) to generate a sentence vector representing the input sentence, converts the sentence vector to a vector of the same dimension as the hidden layer by multiplying it by a matrix parameter, and obtains the matrix parameter by training the whole network.
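The bidirectional pass with sentence-vector initialisation of the hidden states can be sketched as follows; sentence2vec is replaced here by a random vector, and the projection matrix M stands in for the trained matrix parameter (all shapes are toy values):

```python
import numpy as np

rng = np.random.default_rng(1)
D, H, S, T = 4, 3, 6, 5   # input dim, hidden dim, sentence-vector dim, length

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(xs, h0, W, U, b):
    """Run one direction over xs, starting from the hidden state h0."""
    h, c, out = h0, np.zeros(H), []
    for x in xs:
        z = W @ x + U @ h + b
        i, f, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
        c = f * c + i * np.tanh(z[3*H:])
        h = o * np.tanh(c)
        out.append(h)
    return out

s = rng.standard_normal(S)         # stand-in for the sentence2vec vector
M = rng.standard_normal((H, S))    # stand-in for the trained matrix parameter
h0 = M @ s                         # hidden state initialised from the sentence

xs = list(rng.standard_normal((T, D)))
mk = lambda: (rng.standard_normal((4*H, D)),
              rng.standard_normal((4*H, H)), np.zeros(4*H))
fwd = lstm_pass(xs, h0, *mk())            # front-to-back direction
bwd = lstm_pass(xs[::-1], h0, *mk())[::-1]  # back-to-front, re-aligned
bi = [np.concatenate(p) for p in zip(fwd, bwd)]  # 2H features per character
```

Concatenating the two directions gives each position features from both its left and right context, which is exactly the advantage claimed for the bidirectional network.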
For the output sequence in Fig. 1, a logSoftMax layer, i.e. a multi-class classification layer, is placed on the output sequence. Each output position produces a column vector of dimension 4 representing the BEMS tags, where B (Begin) marks a word-initial character, E (End) a word-final character, M (Middle) a word-internal character, and S a single-character word. The maximum probability value is taken and the tag at the corresponding position is found; this tag is exactly the tag of the character at the corresponding position of the input sentence. The same operation is applied to every output, and the final segmentation label sequence is obtained.
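The per-position decoding can be sketched as follows; the 4-dimensional score columns are toy values, not real network outputs:

```python
import numpy as np

TAGS = ["B", "E", "M", "S"]   # Begin, End, Middle, Single

def log_softmax(z):
    """Numerically stable log-softmax of one score column."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def decode(columns):
    """Take the highest-probability BEMS tag of each position independently."""
    return "".join(TAGS[int(np.argmax(log_softmax(c)))] for c in columns)

columns = np.array([[9, 1, 1, 1],   # strongest score at B
                    [1, 9, 1, 1],   # strongest score at E
                    [1, 1, 1, 9],   # strongest score at S
                    [9, 1, 1, 1],
                    [1, 9, 1, 1]], dtype=float)
print(decode(columns))   # -> "BESBE"
```

Since log-softmax is monotonic, the argmax over log-probabilities equals the argmax over raw scores; the log form matters only for the training loss, not for decoding.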
To train the bidirectional long short-term memory (BIDIRECTIONAL-LSTM) network and the logSoftMax classification layer individually, the input sentence is first passed through the previously trained three-layer long short-term memory network, whose output serves as the input to the bidirectional LSTM network; the target is the segmentation label sequence corresponding to the sentence.
The above is a complete description of the whole neural network structure and its processing. Finally, the whole network must be trained before it can be used. The input is a sentence and the target is a segmentation label sequence, e.g. input = { "use tree-like neural network and bidirectional neural network to implement Chinese word segmentation" } (rendered here in English; the BEMS tags correspond one-to-one to the characters of the original Chinese sentence), target = { "BEBEBEBESBEBEBEBEBEBE" }. At inference time, one need only input a sentence to obtain the output segmentation label sequence.
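The BEMS target of a training pair is derived from the gold segmentation of the sentence. A minimal sketch of that derivation; the Chinese word list below is our reconstruction of the translated example sentence, shown only to illustrate the tag format:

```python
# One tag per character: S for a single-character word, otherwise the word
# is tagged B (begin), zero or more M (middle), then E (end).
def bems_labels(words):
    out = []
    for w in words:
        out.append("S" if len(w) == 1 else "B" + "M" * (len(w) - 2) + "E")
    return "".join(out)

# Reconstructed segmentation of the example sentence (an assumption):
words = ["使用", "树形", "神经", "网络", "和",
         "双向", "神经", "网络", "实现", "中文", "分词"]
print(bems_labels(words))   # -> "BEBEBEBESBEBEBEBEBEBE"
```

Note the single-character word 和 ("and") produces the lone S in the middle of the tag string, matching the example target above.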
Although this specification contains some implementation details, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features of particular embodiments. All equivalent changes made according to the idea of the invention shall fall within the protection scope of the invention.
Claims (10)
1. A method for implementing Chinese word segmentation using tree-structured and bidirectional neural networks, comprising the following steps: obtaining an input sentence, the input sentence comprising a plurality of grammatically ordered inputs; converting each character of the sentence into a character vector using a language model to form a first input sequence; passing the first input sequence to a three-layer long short-term memory neural network, i.e. the tree-structured neural network, while generating a sentence vector from the input sentence as the initialization input of each hidden layer of the three-layer long short-term memory network; training the three-layer long short-term memory network to produce a second input sequence; passing the second input sequence to a bidirectional long short-term memory network, while generating a sentence vector from the input sentence as the initialization input of the bidirectional long short-term memory network's hidden layers, to produce a third input sequence; and passing the third input sequence to a logSoftMax layer, i.e. a multi-class classification layer, to produce the segmentation label sequence of the input sentence.
2. The method according to claim 1, wherein the input sentence is a variable-length input sentence not exceeding a specified length.
3. The method according to claim 1, wherein the language model refers to a neural network model that converts characters or words into vectors.
4. The method according to any one of claims 1 to 3, wherein processing the input sentence comprises: replacing unrecognized items in the input sentence with a designated token to produce a modified input sentence.
5. The method according to claim 1, wherein the sentence vector refers to the vector representation obtained by converting the input sentence with a mature neural network model.
6. The method according to claim 1, wherein the initialization input of the hidden layers comprises the front-to-back and back-to-front initial states of the bidirectional long short-term memory network's hidden layers, as well as the initial state of each layer of the three-layer long short-term memory network, all of which adopt the sentence vector of the sentence.
7. The method according to any one of claims 1 to 6, further comprising: training the three-layer long short-term memory network and the bidirectional long short-term memory network using stochastic gradient descent.
8. The method according to any one of claims 1 to 7, wherein the input sentence is grammatically well formed, and the segmentation label sequence is a character string composed of 4 kinds of tags.
9. The method according to claim 8, wherein the 4 kinds of tags refer to BMES, where B (Begin) denotes a word-initial character, E (End) a word-final character, M (Middle) a word-internal character, and S (Single) a single-character word.
10. The method according to claim 1, wherein training the three-layer long short-term memory network refers to adding an extra linear transformation layer and a logSoftMax layer, taking the vector representation of the sentence as input and the sequence representation of the syntactic parse tree of the sentence as target, and training the network parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610037336.4A CN105740226A (en) | 2016-01-15 | 2016-01-15 | Method for implementing Chinese segmentation by using tree neural network and bilateral neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610037336.4A CN105740226A (en) | 2016-01-15 | 2016-01-15 | Method for implementing Chinese segmentation by using tree neural network and bilateral neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105740226A true CN105740226A (en) | 2016-07-06 |
Family
ID=56246271
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610037336.4A Pending CN105740226A (en) | 2016-01-15 | 2016-01-15 | Method for implementing Chinese segmentation by using tree neural network and bilateral neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105740226A (en) |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106372107A (en) * | 2016-08-19 | 2017-02-01 | 中兴通讯股份有限公司 | Generation method and device of natural language sentence library |
CN106919646A (en) * | 2017-01-18 | 2017-07-04 | 南京云思创智信息科技有限公司 | Chinese text summarization generation system and method |
CN107145483A (en) * | 2017-04-24 | 2017-09-08 | 北京邮电大学 | A kind of adaptive Chinese word cutting method based on embedded expression |
CN107193865A (en) * | 2017-04-06 | 2017-09-22 | 上海奔影网络科技有限公司 | Natural language is intended to understanding method and device in man-machine interaction |
CN107463928A (en) * | 2017-07-28 | 2017-12-12 | 顺丰科技有限公司 | Word sequence error correction algorithm, system and its equipment based on OCR and two-way LSTM |
CN107480680A (en) * | 2017-07-28 | 2017-12-15 | 顺丰科技有限公司 | Method, system and the equipment of text information in identification image based on OCR and Bi LSTM |
CN107608970A (en) * | 2017-09-29 | 2018-01-19 | 百度在线网络技术(北京)有限公司 | part-of-speech tagging model generating method and device |
CN107797986A (en) * | 2017-10-12 | 2018-03-13 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on LSTM CNN |
CN107844475A (en) * | 2017-10-12 | 2018-03-27 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on LSTM |
CN107894975A (en) * | 2017-10-12 | 2018-04-10 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on Bi LSTM |
CN107894976A (en) * | 2017-10-12 | 2018-04-10 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on Bi LSTM |
CN107943783A (en) * | 2017-10-12 | 2018-04-20 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on LSTM CNN |
CN107967252A (en) * | 2017-10-12 | 2018-04-27 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on Bi-LSTM-CNN |
CN107977354A (en) * | 2017-10-12 | 2018-05-01 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on Bi-LSTM-CNN |
CN107992467A (en) * | 2017-10-12 | 2018-05-04 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on LSTM |
CN108090070A (en) * | 2016-11-22 | 2018-05-29 | 北京高地信息技术有限公司 | A kind of Chinese entity attribute abstracting method |
CN108595428A (en) * | 2018-04-25 | 2018-09-28 | 杭州闪捷信息科技股份有限公司 | The method segmented based on bidirectional circulating neural network |
WO2018232699A1 (en) * | 2017-06-22 | 2018-12-27 | 腾讯科技(深圳)有限公司 | Information processing method and related device |
CN109388806A (en) * | 2018-10-26 | 2019-02-26 | 北京布本智能科技有限公司 | A kind of Chinese word cutting method based on deep learning and forgetting algorithm |
CN109685137A (en) * | 2018-12-24 | 2019-04-26 | 上海仁静信息技术有限公司 | A kind of topic classification method, device, electronic equipment and storage medium |
CN109781094A (en) * | 2018-12-24 | 2019-05-21 | 上海交通大学 | Earth magnetism positioning system based on Recognition with Recurrent Neural Network |
CN109791731A (en) * | 2017-06-22 | 2019-05-21 | 北京嘀嘀无限科技发展有限公司 | A kind of method and system for estimating arrival time |
CN109800438A (en) * | 2019-02-01 | 2019-05-24 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN109871843A (en) * | 2017-12-01 | 2019-06-11 | 北京搜狗科技发展有限公司 | Character identifying method and device, the device for character recognition |
CN109948149A (en) * | 2019-02-28 | 2019-06-28 | 腾讯科技(深圳)有限公司 | A kind of file classification method and device |
CN110019784A (en) * | 2017-09-29 | 2019-07-16 | 北京国双科技有限公司 | A kind of file classification method and device |
CN110473534A (en) * | 2019-07-12 | 2019-11-19 | 南京邮电大学 | A kind of nursing old people conversational system based on deep neural network |
CN110750986A (en) * | 2018-07-04 | 2020-02-04 | 普天信息技术有限公司 | Neural network word segmentation system and training method based on minimum information entropy |
CN111160009A (en) * | 2019-12-30 | 2020-05-15 | 北京理工大学 | Sequence feature extraction method based on tree-shaped grid memory neural network |
US10769522B2 (en) | 2017-02-17 | 2020-09-08 | Wipro Limited | Method and system for determining classification of text |
US11200269B2 (en) | 2017-06-15 | 2021-12-14 | Microsoft Technology Licensing, Llc | Method and system for highlighting answer phrases |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101566998A (en) * | 2009-05-26 | 2009-10-28 | 华中师范大学 | Chinese question-answering system based on neural network |
CN105068998A (en) * | 2015-07-29 | 2015-11-18 | 百度在线网络技术(北京)有限公司 | Translation method and translation device based on neural network model |
CN105185374A (en) * | 2015-09-11 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Prosodic hierarchy annotation method and device |
2016
- 2016-01-15 CN CN201610037336.4A patent/CN105740226A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101566998A (en) * | 2009-05-26 | 2009-10-28 | 华中师范大学 | Chinese question-answering system based on neural network |
CN105068998A (en) * | 2015-07-29 | 2015-11-18 | 百度在线网络技术(北京)有限公司 | Translation method and translation device based on neural network model |
CN105185374A (en) * | 2015-09-11 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Prosodic hierarchy annotation method and device |
Non-Patent Citations (1)
Title |
---|
XINCHI CHEN等: "Long Short-Term Memory Neural Networks for Chinese Word Segmentation", 《PROCEEDINGS OF THE 2015 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING》 * |
Cited By (45)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106372107B (en) * | 2016-08-19 | 2020-01-17 | 中兴通讯股份有限公司 | Method and device for generating natural language sentence library |
CN106372107A (en) * | 2016-08-19 | 2017-02-01 | 中兴通讯股份有限公司 | Generation method and device of natural language sentence library |
CN108090070A (en) * | 2016-11-22 | 2018-05-29 | 北京高地信息技术有限公司 | A kind of Chinese entity attribute abstracting method |
CN108090070B (en) * | 2016-11-22 | 2021-08-24 | 湖南四方天箭信息科技有限公司 | Chinese entity attribute extraction method |
CN106919646A (en) * | 2017-01-18 | 2017-07-04 | 南京云思创智信息科技有限公司 | Chinese text summarization generation system and method |
CN106919646B (en) * | 2017-01-18 | 2020-06-09 | 南京云思创智信息科技有限公司 | Chinese text abstract generating system and method |
US10769522B2 (en) | 2017-02-17 | 2020-09-08 | Wipro Limited | Method and system for determining classification of text |
CN107193865A (en) * | 2017-04-06 | 2017-09-22 | 上海奔影网络科技有限公司 | Natural language is intended to understanding method and device in man-machine interaction |
CN107193865B (en) * | 2017-04-06 | 2020-03-10 | 上海奔影网络科技有限公司 | Natural language intention understanding method and device in man-machine interaction |
CN107145483A (en) * | 2017-04-24 | 2017-09-08 | 北京邮电大学 | A kind of adaptive Chinese word cutting method based on embedded expression |
CN107145483B (en) * | 2017-04-24 | 2018-09-04 | 北京邮电大学 | A kind of adaptive Chinese word cutting method based on embedded expression |
US11200269B2 (en) | 2017-06-15 | 2021-12-14 | Microsoft Technology Licensing, Llc | Method and system for highlighting answer phrases |
CN109791731A (en) * | 2017-06-22 | 2019-05-21 | 北京嘀嘀无限科技发展有限公司 | A kind of method and system for estimating arrival time |
WO2018232699A1 (en) * | 2017-06-22 | 2018-12-27 | 腾讯科技(深圳)有限公司 | Information processing method and related device |
US10789415B2 (en) | 2017-06-22 | 2020-09-29 | Tencent Technology (Shenzhen) Company Limited | Information processing method and related device |
US11079244B2 (en) | 2017-06-22 | 2021-08-03 | Beijing Didi Infinity Technology And Development Co., Ltd. | Methods and systems for estimating time of arrival |
CN107480680A (en) * | 2017-07-28 | 2017-12-15 | 顺丰科技有限公司 | Method, system and the equipment of text information in identification image based on OCR and Bi LSTM |
CN107463928A (en) * | 2017-07-28 | 2017-12-12 | 顺丰科技有限公司 | Word sequence error correction algorithm, system and its equipment based on OCR and two-way LSTM |
CN110019784B (en) * | 2017-09-29 | 2021-10-15 | 北京国双科技有限公司 | Text classification method and device |
CN107608970A (en) * | 2017-09-29 | 2018-01-19 | 百度在线网络技术(北京)有限公司 | part-of-speech tagging model generating method and device |
CN110019784A (en) * | 2017-09-29 | 2019-07-16 | 北京国双科技有限公司 | A kind of file classification method and device |
CN107608970B (en) * | 2017-09-29 | 2024-04-26 | 百度在线网络技术(北京)有限公司 | Part-of-speech tagging model generation method and device |
CN107943783A (en) * | 2017-10-12 | 2018-04-20 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on LSTM CNN |
CN107894976A (en) * | 2017-10-12 | 2018-04-10 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on Bi LSTM |
CN107967252A (en) * | 2017-10-12 | 2018-04-27 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on Bi-LSTM-CNN |
CN107894975A (en) * | 2017-10-12 | 2018-04-10 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on Bi LSTM |
CN107977354A (en) * | 2017-10-12 | 2018-05-01 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on Bi-LSTM-CNN |
CN107844475A (en) * | 2017-10-12 | 2018-03-27 | 北京知道未来信息技术有限公司 | A kind of segmenting method based on LSTM |
CN107797986B (en) * | 2017-10-12 | 2020-12-11 | 北京知道未来信息技术有限公司 | LSTM-CNN-based mixed corpus word segmentation method |
CN107992467A (en) * | 2017-10-12 | 2018-05-04 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on LSTM |
CN107797986A (en) * | 2017-10-12 | 2018-03-13 | 北京知道未来信息技术有限公司 | A kind of mixing language material segmenting method based on LSTM CNN |
CN109871843A (en) * | 2017-12-01 | 2019-06-11 | 北京搜狗科技发展有限公司 | Character identifying method and device, the device for character recognition |
CN109871843B (en) * | 2017-12-01 | 2022-04-08 | 北京搜狗科技发展有限公司 | Character recognition method and device for character recognition |
CN108595428A (en) * | 2018-04-25 | 2018-09-28 | 杭州闪捷信息科技股份有限公司 | The method segmented based on bidirectional circulating neural network |
CN110750986A (en) * | 2018-07-04 | 2020-02-04 | 普天信息技术有限公司 | Neural network word segmentation system and training method based on minimum information entropy |
CN110750986B (en) * | 2018-07-04 | 2023-10-10 | 普天信息技术有限公司 | Neural network word segmentation system and training method based on minimum information entropy |
CN109388806B (en) * | 2018-10-26 | 2023-06-27 | 北京布本智能科技有限公司 | Chinese word segmentation method based on deep learning and forgetting algorithm |
CN109388806A (en) * | 2018-10-26 | 2019-02-26 | 北京布本智能科技有限公司 | A kind of Chinese word cutting method based on deep learning and forgetting algorithm |
CN109781094A (en) * | 2018-12-24 | 2019-05-21 | 上海交通大学 | Earth magnetism positioning system based on Recognition with Recurrent Neural Network |
CN109685137A (en) * | 2018-12-24 | 2019-04-26 | 上海仁静信息技术有限公司 | A kind of topic classification method, device, electronic equipment and storage medium |
CN109800438B (en) * | 2019-02-01 | 2020-03-31 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN109800438A (en) * | 2019-02-01 | 2019-05-24 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN109948149A (en) * | 2019-02-28 | 2019-06-28 | 腾讯科技(深圳)有限公司 | A kind of file classification method and device |
CN110473534A (en) * | 2019-07-12 | 2019-11-19 | 南京邮电大学 | A kind of nursing old people conversational system based on deep neural network |
CN111160009A (en) * | 2019-12-30 | 2020-05-15 | 北京理工大学 | Sequence feature extraction method based on tree-shaped grid memory neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105740226A (en) | Method for implementing Chinese segmentation by using tree neural network and bilateral neural network | |
CN107168957A (en) | A kind of Chinese word cutting method | |
CN105930314B (en) | System and method is generated based on coding-decoding deep neural network text snippet | |
CN107516041B (en) | WebShell detection method and system based on deep neural network | |
CN108009154B (en) | Image Chinese description method based on deep learning model | |
CN106502985B (en) | neural network modeling method and device for generating titles | |
CN109101235A (en) | A kind of intelligently parsing method of software program | |
CN107291836B (en) | Chinese text abstract obtaining method based on semantic relevancy model | |
CN104699797B (en) | A kind of web page data structured analysis method and device | |
CN111177394A (en) | Knowledge map relation data classification method based on syntactic attention neural network | |
CN105938485A (en) | Image description method based on convolution cyclic hybrid model | |
CN109213975B (en) | Twitter text representation method based on character level convolution variation self-coding | |
CN104778224B (en) | A kind of destination object social networks recognition methods based on video semanteme | |
CN106547735A (en) | The structure and using method of the dynamic word or word vector based on the context-aware of deep learning | |
CN111538848A (en) | Knowledge representation learning method fusing multi-source information | |
CN111858932A (en) | Multiple-feature Chinese and English emotion classification method and system based on Transformer | |
CN106844327B (en) | Text coding method and system | |
CN109359297A (en) | A kind of Relation extraction method and system | |
CN106919557A (en) | A kind of document vector generation method of combination topic model | |
CN107092594B (en) | Bilingual recurrence self-encoding encoder based on figure | |
CN105956158B (en) | The method that network neologisms based on massive micro-blog text and user information automatically extract | |
JP2018067199A (en) | Abstract generating device, text converting device, and methods and programs therefor | |
CN111625276B (en) | Code abstract generation method and system based on semantic and grammar information fusion | |
CN113487024A (en) | Alternate sequence generation model training method and method for extracting graph from text | |
CN112560456A (en) | Generation type abstract generation method and system based on improved neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
DD01 | Delivery of document by public notice | ||
DD01 | Delivery of document by public notice |
Addressee: Nanjing University Document name: the First Notification of an Office Action |
|
WD01 | Invention patent application deemed withdrawn after publication | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20160706 |