CN105740226A - Method for implementing Chinese segmentation by using tree neural network and bilateral neural network - Google Patents


Info

Publication number
CN105740226A
CN105740226A
Authority
CN
China
Prior art keywords
sentence
term memory
long term
neural networks
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610037336.4A
Other languages
Chinese (zh)
Inventor
黄积杨
赵志宏
张冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN201610037336.4A priority Critical patent/CN105740226A/en
Publication of CN105740226A publication Critical patent/CN105740226A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention relates to a method, a system, a device and a computer program for implementing Chinese word segmentation using a tree neural network and a bidirectional neural network. The method comprises: converting each character of an input sentence into a character vector to form a first input sequence, which serves as the input of a three-layer long short-term memory (LSTM) neural network, i.e. the tree neural network, while a sentence vector is used as the initial value of each hidden layer; generating a second input sequence and passing it to the bidirectional LSTM neural network, again with the sentence vector as the initial value of each hidden layer; generating a third input sequence and passing it to a logSoftMax layer, i.e. a multi-classification layer; and finally generating the segmentation labeling sequence.

Description

Implementing Chinese word segmentation using a tree neural network and a bidirectional neural network
Technical field
The invention belongs to the field of natural language processing and relates to a method for implementing Chinese word segmentation using a tree neural network and a bidirectional neural network.
Background technology
Conventional Chinese word segmentation techniques include character-by-character traversal, segmentation by dictionary matching, full segmentation, and segmentation based on character-frequency statistics; these are all algorithm-based approaches. Traditional methods also include two well-known model-based approaches, the hidden Markov model and the conditional random field model, both of which map an input sequence to a target sequence; of the two, the conditional random field generally performs better than the hidden Markov model. With the growth of computing power and the maturation of neural network models, a method using a tree neural network and a bidirectional neural network to implement Chinese word segmentation is proposed here.
Summary of the invention
It is an object of the invention to propose, at least to some extent, a method for implementing Chinese word segmentation based on neural networks, and to illustrate how the segmentation labeling sequence corresponding to an input sentence is generated.
In order to achieve the above object, the technical solution adopted by the present invention is: obtain the input sentence; convert each character in the sentence into a character vector as the first input; pass the first input to the three-layer long short-term memory (LSTM) neural network, i.e. the tree neural network, to produce the second input, thereby extracting phrase and semantic information; pass the second input to the bidirectional LSTM neural network, whose hidden-layer initial states are initialized in a special way, to produce the third input, thereby extracting character context information; and pass the third input to the logSoftMax layer, i.e. the multi-classification layer, to obtain the final segmentation labeling sequence. In order to obtain tree-structured information, each network must first be trained individually before the whole neural network is trained.
Details of some embodiments of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects and advantages of the method of implementing Chinese word segmentation using tree and bidirectional neural networks will be apparent from the description, drawings and claims.
Accompanying drawing explanation
Fig. 1 shows the overall neural network structure.
Fig. 2 shows part of the three-layer long short-term memory (LSTM) neural network.
Fig. 3 shows a bidirectional LSTM neural network.
Detailed description of the invention
In the following, the whole technical solution and the whole neural network are explained clearly and completely in conjunction with the accompanying drawings of the present invention.
The present disclosure provides a technical solution for Chinese word segmentation based on neural networks, comprising four parts: converting the sentence into vectors; training the three-layer long short-term memory (LSTM) neural network, i.e. the tree neural network part; training the bidirectional LSTM neural network part; and training the whole neural network.
Fig. 1 illustrates the whole flow from the input sentence to the output of the final segmentation labeling sequence. The input sentence is an example input to the system that converts a sentence into a sequence of character vectors; the systems, components and techniques described below can be implemented within it.
Characters are converted into character vectors, which can be obtained in two ways. 1) Treat the character vectors as parameters included within the neural network, so that they are learned while the whole network is trained. However, vectors obtained this way exhibit no obvious relationship between similar Chinese characters, and may show no necessary connection at all. 2) Use a mature neural network to pre-train a vector library, such as word2vec or GloVe; vectors trained by these two neural network algorithms exhibit certain linear relationships, or clear non-linear relationships, between similar characters or similar words, so the similar characters of a given character can be found through its vector. So that the character vectors carry more semantics, the present invention adopts GloVe to train a 300-dimensional vector library.
Count the number N of characters in the corpus, represent each character with a one-hot vector (a vector of dimension N in which exactly one position is 1 and all others are 0), find the vector corresponding to each character through its one-hot index, and finally convert the sentence into a vector representation.
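As an illustrative sketch of this lookup (the vocabulary, embedding dimension and values below are invented for demonstration; the patent itself uses 300-dimensional GloVe vectors):

```python
# Build a toy vocabulary and one-hot vectors, then look up dense
# character vectors from a stand-in pre-trained embedding table.
vocab = ["我", "爱", "中", "文"]          # N = 4 characters in this toy corpus
N = len(vocab)
index = {ch: i for i, ch in enumerate(vocab)}

def one_hot(ch):
    """Dimension-N vector: 1 at the character's position, 0 elsewhere."""
    v = [0] * N
    v[index[ch]] = 1
    return v

# Stand-in for a GloVe-style table; real entries would be 300-dimensional.
embedding = {
    "我": [0.1, 0.2], "爱": [0.3, 0.1],
    "中": [0.5, 0.4], "文": [0.2, 0.6],
}

def sentence_to_vectors(sentence):
    # The one-hot index selects the row; equivalently a direct dict lookup.
    return [embedding[ch] for ch in sentence]

print(one_hot("中"))                 # [0, 0, 1, 0]
print(sentence_to_vectors("中文"))   # [[0.5, 0.4], [0.2, 0.6]]
```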
Fig. 2 shows part of the three-layer LSTM neural network. Each LSTM layer consists of 100 standard LSTM (long short-term memory) nodes. A standard LSTM mainly processes variable-length sequences and solves the long-distance dependency problem; it includes three gates: an input gate, a forget gate and an output gate. Using a multi-layer LSTM neural network is equivalent to defining a tree neural network.
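The three gates of a standard LSTM node can be sketched as follows (a single-unit, scalar toy cell with invented weights; the real layers use weight matrices over 100 such nodes):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a standard single-unit LSTM cell.
    w holds per-gate weights (w_x, w_h, bias); real cells use matrices."""
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])    # input gate
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])    # forget gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])    # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])  # candidate
    c = f * c_prev + i * g      # cell state carries long-range information
    h = o * math.tanh(c)        # hidden state exposed to the next layer
    return h, c

# Illustrative weights; training would determine these values.
w = {k: (0.5, 0.25, 0.0) for k in ("i", "f", "o", "g")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:      # a toy variable-length sequence
    h, c = lstm_step(x, h, c, w)
print(round(h, 4))
```

The forget gate is what lets the cell state persist across long distances: when f is near 1 and i near 0, c is carried forward almost unchanged.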
So that the three-layer LSTM network acquires its tree function, the input used to train this network is the sentence vector sequence, and the target is the sequence representation of the syntactic parse tree corresponding to the input sentence. For example: input = {"use a tree neural network and a bidirectional neural network to implement Chinese word segmentation"}, target = {"(ROOT (IP (VP (VP (VV use) (NP (NP (NN tree) (NN neural) (NN network)) (CC and) (NP (ADJP (JJ bidirectional)) (NP (NN neural) (NN network))))) (VP (VV implement) (NP (NN Chinese) (NN segmentation))))))"}. When this network is trained individually, a linear transformation layer and a logSoftMax layer must be added to it, so that the output of the 100-node standard LSTM can correspond to the tree-structured sequence representation, which is equivalent to encoding and decoding. The initial hidden state of a traditional LSTM network is all zeros or very small random numbers; for the initial state of this three-layer LSTM network, the present invention instead adopts sentence2vec (a neural network algorithm that converts a sentence into a sentence vector) to generate a vector representing the input sentence. The sentence vector is converted into a vector of the same dimension as the hidden layer by multiplying it by a matrix parameter, and the matrix parameter is obtained by training the whole neural network.
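The projection of the sentence vector into the hidden-layer dimension described above amounts to a single matrix-vector multiplication (the dimensions and matrix values below are illustrative, not trained parameters):

```python
# Project a sentence2vec-style sentence vector into the hidden-state
# dimension with a learned matrix, as the patent does for layer init.

def mat_vec(m, v):
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

sentence_vec = [0.2, -0.1, 0.4]    # toy 3-dim stand-in for a sentence2vec output
proj = [[1.0, 0.0, 0.5],           # learned while training the whole network
        [0.0, 2.0, 0.0]]           # hidden size 2 in this sketch

h0 = mat_vec(proj, sentence_vec)   # initial hidden state for one layer
print(h0)                          # [0.4, -0.2]
```

The same projected vector initializes every hidden layer of the three-layer network, replacing the usual all-zero or small-random initialization.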
Fig. 3 shows the bidirectional LSTM (BIDIRECTIONAL-LSTM) neural network. A bidirectional LSTM network consists of one recurrent LSTM network that passes information front-to-back and one that passes information back-to-front; each recurrent LSTM network is composed of LSTM memory units of a specified length and block count, and the maximum sequence length adopted here is 100. Each unit includes an input gate, a forget gate and an output gate, i.e. a standard LSTM memory unit. A bidirectional LSTM network can capture information on both sides of each character, and therefore captures semantics better. The initial states of the hidden layers at the two ends of a standard BIDIRECTIONAL-LSTM are usually all zeros or very small random numbers; as with the three-layer LSTM network above, the present invention adopts sentence2vec (a neural network algorithm that converts a sentence into a sentence vector) to generate a vector representing the input sentence, converts the sentence vector into a vector of the same dimension as the hidden layer by multiplying it by a matrix parameter, and obtains the matrix parameter by training the whole neural network.
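The bidirectional arrangement can be sketched as two passes over the same sequence whose states are paired per position (the running-sum "cell" below is a deliberately trivial stand-in for an LSTM unit):

```python
# Run one recurrence left-to-right and one right-to-left over the same
# sequence, then pair the states so each position sees both contexts.

def run_forward(xs):
    h, out = 0.0, []
    for x in xs:                 # front-to-back pass
        h = h + x
        out.append(h)
    return out

def run_backward(xs):
    h, out = 0.0, []
    for x in reversed(xs):       # back-to-front pass
        h = h + x
        out.append(h)
    return list(reversed(out))   # realign with the original order

xs = [1.0, 2.0, 3.0]
fwd, bwd = run_forward(xs), run_backward(xs)
combined = list(zip(fwd, bwd))   # per position: (left context, right context)
print(combined)                  # [(1.0, 6.0), (3.0, 5.0), (6.0, 3.0)]
```

At every position the paired state summarizes everything to the character's left and everything to its right, which is exactly why the bidirectional network captures both-sided context.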
For the output sequence in Fig. 1, a logSoftMax layer, i.e. the multi-classification layer, is placed on the output sequence. Each output produces a column vector of dimension 4, corresponding to the four BEMS labels, where B (Begin) denotes a word start, E (End) denotes a word end, M (Middle) denotes a word interior, and S denotes a single-character word. The maximum probability value is taken and the label at the corresponding position is found; this label is exactly the label of the character at the corresponding position of the input sentence. The same operation is performed on all outputs, finally obtaining the segmentation labeling sequence.
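The decoding step described above, passing a 4-dimensional column per character through logSoftMax and picking the maximum, can be sketched as follows (the score columns are invented, not trained network outputs):

```python
import math

TAGS = ["B", "E", "M", "S"]   # the four labels used by the patent

def log_softmax(scores):
    # Numerically stable log-softmax over one score column.
    m = max(scores)
    z = math.log(sum(math.exp(s - m) for s in scores)) + m
    return [s - z for s in scores]

def decode(columns):
    """columns: one 4-dim score vector per character; pick the
    maximum-probability tag at each position."""
    tags = []
    for col in columns:
        probs = log_softmax(col)
        tags.append(TAGS[probs.index(max(probs))])
    return "".join(tags)

# Toy outputs for a 4-character sentence: two two-character words.
columns = [[2.0, 0.1, 0.1, 0.1],
           [0.1, 2.0, 0.1, 0.1],
           [2.0, 0.1, 0.1, 0.1],
           [0.1, 2.0, 0.1, 0.1]]
print(decode(columns))        # BEBE
```

Since log-softmax is monotonic, taking the argmax of the raw scores would give the same tags; the normalization matters only when probabilities are needed for training.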
To train the bidirectional LSTM (BIDIRECTIONAL-LSTM) network and the logSoftMax layer, i.e. the classification layer, individually, the input sentence is first passed through the three-layer LSTM network trained above, and that output serves as the input of the bidirectional LSTM network; the target is the segmentation labeling sequence corresponding to the sentence.
The above is the complete explanation of the whole neural network structure and processing procedure. Finally, the whole neural network must be trained before use. The input is a sentence and the target is a segmentation labeling sequence, e.g. input = {"use a tree neural network and a bidirectional neural network to implement Chinese word segmentation"}, target = {"BEBEBEBESBEBEBEBEBEBE"}. In use, one only needs to input a sentence, and the segmentation labeling sequence is output.
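For completeness, recovering segmented words from a BMES labeling sequence such as the network produces can be sketched as follows (the example sentence and labels are illustrative):

```python
# B starts a word, M continues it, E ends it, S is a single-character word.

def segment(sentence, tags):
    words, current = [], ""
    for ch, tag in zip(sentence, tags):
        if tag == "S":
            words.append(ch)          # single-character word
        elif tag == "B":
            current = ch              # open a new multi-character word
        elif tag == "M":
            current += ch             # continue the open word
        else:                         # "E": close the open word
            words.append(current + ch)
            current = ""
    return words

print(segment("中文分词", "BEBE"))    # ['中文', '分词']
```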
Although this specification contains some specific implementation details, these should not be construed as limiting the scope of any invention or of what may be claimed; they are merely descriptions of features of specific embodiments. All equivalent changes made according to the idea of the present invention shall be covered by the protection scope of the present invention.

Claims (10)

1. A method of using tree and bidirectional neural networks to implement Chinese word segmentation, comprising the following steps: obtaining an input sentence, the input sentence comprising a plurality of inputs conforming to grammatical order; using a language model to convert each character in the sentence into a character vector as a first input sequence; passing the first input sequence to a three-layer long short-term memory (LSTM) neural network, i.e. the tree neural network, while generating a sentence vector from the input sentence as the initialization input of each hidden layer of the three-layer LSTM network; training the three-layer LSTM network to produce a second input sequence; passing the second input sequence to a bidirectional LSTM network, while generating a sentence vector from the input sentence as the initialization input of the bidirectional LSTM hidden layers, to produce a third input sequence; and passing the third input sequence to a logSoftMax layer, i.e. a multi-classification layer, to produce the segmentation labeling sequence of the input sentence.
2. The method according to claim 1, wherein the input sentence is a variable-length input sentence not exceeding a specified length.
3. The method according to claim 1, wherein the language model refers to a neural network model that converts characters or words into vectors.
4. The method according to any one of claims 1 to 3, wherein processing the input sentence includes: replacing unrecognized items in the input sentence with a designated marker to produce a modified input sentence.
5. The method according to claim 1, wherein the sentence vector refers to a vector representation of the input sentence produced by a mature neural network model.
6. The method according to claim 1, wherein the initialization input of the hidden layers includes the front-to-back and back-to-front initial states of the bidirectional LSTM hidden layers, and the initial state of each layer of the three-layer LSTM network, all of which adopt the sentence vector of the sentence.
7. The method according to any one of claims 1 to 6, further comprising: using stochastic gradient descent to train the three-layer LSTM network and the bidirectional LSTM network.
8. The method according to any one of claims 1 to 7, wherein the input sentence is grammatically consistent, and the segmentation labeling sequence is a character string combined from 4 kinds of labels.
9. The method according to claim 8, wherein the 4 kinds of labels refer to BMES, where B (Begin) denotes a word start, E (End) denotes a word end, M (Middle) denotes a word interior, and S (Single) denotes a single-character word.
10. The method according to claim 1, wherein training the three-layer LSTM network refers to adding an extra linear transformation layer and a logSoftMax layer, taking the vector representation of the sentence as input and the sequence representation of the syntactic parse tree of the sentence as target, and training the network parameters.
CN201610037336.4A 2016-01-15 2016-01-15 Method for implementing Chinese segmentation by using tree neural network and bilateral neural network Pending CN105740226A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610037336.4A CN105740226A (en) 2016-01-15 2016-01-15 Method for implementing Chinese segmentation by using tree neural network and bilateral neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610037336.4A CN105740226A (en) 2016-01-15 2016-01-15 Method for implementing Chinese segmentation by using tree neural network and bilateral neural network

Publications (1)

Publication Number Publication Date
CN105740226A true CN105740226A (en) 2016-07-06

Family

ID=56246271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610037336.4A Pending CN105740226A (en) 2016-01-15 2016-01-15 Method for implementing Chinese segmentation by using tree neural network and bilateral neural network

Country Status (1)

Country Link
CN (1) CN105740226A (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372107A (en) * 2016-08-19 2017-02-01 中兴通讯股份有限公司 Generation method and device of natural language sentence library
CN106919646A (en) * 2017-01-18 2017-07-04 南京云思创智信息科技有限公司 Chinese text summarization generation system and method
CN107145483A (en) * 2017-04-24 2017-09-08 北京邮电大学 A kind of adaptive Chinese word cutting method based on embedded expression
CN107193865A (en) * 2017-04-06 2017-09-22 上海奔影网络科技有限公司 Natural language is intended to understanding method and device in man-machine interaction
CN107463928A (en) * 2017-07-28 2017-12-12 顺丰科技有限公司 Word sequence error correction algorithm, system and its equipment based on OCR and two-way LSTM
CN107480680A (en) * 2017-07-28 2017-12-15 顺丰科技有限公司 Method, system and the equipment of text information in identification image based on OCR and Bi LSTM
CN107608970A (en) * 2017-09-29 2018-01-19 百度在线网络技术(北京)有限公司 part-of-speech tagging model generating method and device
CN107797986A (en) * 2017-10-12 2018-03-13 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on LSTM CNN
CN107844475A (en) * 2017-10-12 2018-03-27 北京知道未来信息技术有限公司 A kind of segmenting method based on LSTM
CN107894975A (en) * 2017-10-12 2018-04-10 北京知道未来信息技术有限公司 A kind of segmenting method based on Bi LSTM
CN107894976A (en) * 2017-10-12 2018-04-10 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on Bi LSTM
CN107943783A (en) * 2017-10-12 2018-04-20 北京知道未来信息技术有限公司 A kind of segmenting method based on LSTM CNN
CN107967252A (en) * 2017-10-12 2018-04-27 北京知道未来信息技术有限公司 A kind of segmenting method based on Bi-LSTM-CNN
CN107977354A (en) * 2017-10-12 2018-05-01 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on Bi-LSTM-CNN
CN107992467A (en) * 2017-10-12 2018-05-04 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on LSTM
CN108090070A (en) * 2016-11-22 2018-05-29 北京高地信息技术有限公司 A kind of Chinese entity attribute abstracting method
CN108595428A (en) * 2018-04-25 2018-09-28 杭州闪捷信息科技股份有限公司 The method segmented based on bidirectional circulating neural network
WO2018232699A1 (en) * 2017-06-22 2018-12-27 腾讯科技(深圳)有限公司 Information processing method and related device
CN109388806A (en) * 2018-10-26 2019-02-26 北京布本智能科技有限公司 A kind of Chinese word cutting method based on deep learning and forgetting algorithm
CN109685137A (en) * 2018-12-24 2019-04-26 上海仁静信息技术有限公司 A kind of topic classification method, device, electronic equipment and storage medium
CN109781094A (en) * 2018-12-24 2019-05-21 上海交通大学 Earth magnetism positioning system based on Recognition with Recurrent Neural Network
CN109791731A (en) * 2017-06-22 2019-05-21 北京嘀嘀无限科技发展有限公司 A kind of method and system for estimating arrival time
CN109800438A (en) * 2019-02-01 2019-05-24 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN109871843A (en) * 2017-12-01 2019-06-11 北京搜狗科技发展有限公司 Character identifying method and device, the device for character recognition
CN109948149A (en) * 2019-02-28 2019-06-28 腾讯科技(深圳)有限公司 A kind of file classification method and device
CN110019784A (en) * 2017-09-29 2019-07-16 北京国双科技有限公司 A kind of file classification method and device
CN110473534A (en) * 2019-07-12 2019-11-19 南京邮电大学 A kind of nursing old people conversational system based on deep neural network
CN110750986A (en) * 2018-07-04 2020-02-04 普天信息技术有限公司 Neural network word segmentation system and training method based on minimum information entropy
CN111160009A (en) * 2019-12-30 2020-05-15 北京理工大学 Sequence feature extraction method based on tree-shaped grid memory neural network
US10769522B2 (en) 2017-02-17 2020-09-08 Wipro Limited Method and system for determining classification of text
US11200269B2 (en) 2017-06-15 2021-12-14 Microsoft Technology Licensing, Llc Method and system for highlighting answer phrases

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101566998A (en) * 2009-05-26 2009-10-28 华中师范大学 Chinese question-answering system based on neural network
CN105068998A (en) * 2015-07-29 2015-11-18 百度在线网络技术(北京)有限公司 Translation method and translation device based on neural network model
CN105185374A (en) * 2015-09-11 2015-12-23 百度在线网络技术(北京)有限公司 Prosodic hierarchy annotation method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101566998A (en) * 2009-05-26 2009-10-28 华中师范大学 Chinese question-answering system based on neural network
CN105068998A (en) * 2015-07-29 2015-11-18 百度在线网络技术(北京)有限公司 Translation method and translation device based on neural network model
CN105185374A (en) * 2015-09-11 2015-12-23 百度在线网络技术(北京)有限公司 Prosodic hierarchy annotation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XINCHI CHEN等: "Long Short-Term Memory Neural Networks for Chinese Word Segmentation", 《PROCEEDINGS OF THE 2015 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING》 *

Cited By (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372107B (en) * 2016-08-19 2020-01-17 中兴通讯股份有限公司 Method and device for generating natural language sentence library
CN106372107A (en) * 2016-08-19 2017-02-01 中兴通讯股份有限公司 Generation method and device of natural language sentence library
CN108090070A (en) * 2016-11-22 2018-05-29 北京高地信息技术有限公司 A kind of Chinese entity attribute abstracting method
CN108090070B (en) * 2016-11-22 2021-08-24 湖南四方天箭信息科技有限公司 Chinese entity attribute extraction method
CN106919646A (en) * 2017-01-18 2017-07-04 南京云思创智信息科技有限公司 Chinese text summarization generation system and method
CN106919646B (en) * 2017-01-18 2020-06-09 南京云思创智信息科技有限公司 Chinese text abstract generating system and method
US10769522B2 (en) 2017-02-17 2020-09-08 Wipro Limited Method and system for determining classification of text
CN107193865A (en) * 2017-04-06 2017-09-22 上海奔影网络科技有限公司 Natural language is intended to understanding method and device in man-machine interaction
CN107193865B (en) * 2017-04-06 2020-03-10 上海奔影网络科技有限公司 Natural language intention understanding method and device in man-machine interaction
CN107145483A (en) * 2017-04-24 2017-09-08 北京邮电大学 A kind of adaptive Chinese word cutting method based on embedded expression
CN107145483B (en) * 2017-04-24 2018-09-04 北京邮电大学 A kind of adaptive Chinese word cutting method based on embedded expression
US11200269B2 (en) 2017-06-15 2021-12-14 Microsoft Technology Licensing, Llc Method and system for highlighting answer phrases
CN109791731A (en) * 2017-06-22 2019-05-21 北京嘀嘀无限科技发展有限公司 A kind of method and system for estimating arrival time
WO2018232699A1 (en) * 2017-06-22 2018-12-27 腾讯科技(深圳)有限公司 Information processing method and related device
US10789415B2 (en) 2017-06-22 2020-09-29 Tencent Technology (Shenzhen) Company Limited Information processing method and related device
US11079244B2 (en) 2017-06-22 2021-08-03 Beijing Didi Infinity Technology And Development Co., Ltd. Methods and systems for estimating time of arrival
CN107480680A (en) * 2017-07-28 2017-12-15 顺丰科技有限公司 Method, system and the equipment of text information in identification image based on OCR and Bi LSTM
CN107463928A (en) * 2017-07-28 2017-12-12 顺丰科技有限公司 Word sequence error correction algorithm, system and its equipment based on OCR and two-way LSTM
CN110019784B (en) * 2017-09-29 2021-10-15 北京国双科技有限公司 Text classification method and device
CN107608970A (en) * 2017-09-29 2018-01-19 百度在线网络技术(北京)有限公司 part-of-speech tagging model generating method and device
CN110019784A (en) * 2017-09-29 2019-07-16 北京国双科技有限公司 A kind of file classification method and device
CN107608970B (en) * 2017-09-29 2024-04-26 百度在线网络技术(北京)有限公司 Part-of-speech tagging model generation method and device
CN107943783A (en) * 2017-10-12 2018-04-20 北京知道未来信息技术有限公司 A kind of segmenting method based on LSTM CNN
CN107894976A (en) * 2017-10-12 2018-04-10 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on Bi LSTM
CN107967252A (en) * 2017-10-12 2018-04-27 北京知道未来信息技术有限公司 A kind of segmenting method based on Bi-LSTM-CNN
CN107894975A (en) * 2017-10-12 2018-04-10 北京知道未来信息技术有限公司 A kind of segmenting method based on Bi LSTM
CN107977354A (en) * 2017-10-12 2018-05-01 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on Bi-LSTM-CNN
CN107844475A (en) * 2017-10-12 2018-03-27 北京知道未来信息技术有限公司 A kind of segmenting method based on LSTM
CN107797986B (en) * 2017-10-12 2020-12-11 北京知道未来信息技术有限公司 LSTM-CNN-based mixed corpus word segmentation method
CN107992467A (en) * 2017-10-12 2018-05-04 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on LSTM
CN107797986A (en) * 2017-10-12 2018-03-13 北京知道未来信息技术有限公司 A kind of mixing language material segmenting method based on LSTM CNN
CN109871843A (en) * 2017-12-01 2019-06-11 北京搜狗科技发展有限公司 Character identifying method and device, the device for character recognition
CN109871843B (en) * 2017-12-01 2022-04-08 北京搜狗科技发展有限公司 Character recognition method and device for character recognition
CN108595428A (en) * 2018-04-25 2018-09-28 杭州闪捷信息科技股份有限公司 The method segmented based on bidirectional circulating neural network
CN110750986A (en) * 2018-07-04 2020-02-04 普天信息技术有限公司 Neural network word segmentation system and training method based on minimum information entropy
CN110750986B (en) * 2018-07-04 2023-10-10 普天信息技术有限公司 Neural network word segmentation system and training method based on minimum information entropy
CN109388806B (en) * 2018-10-26 2023-06-27 北京布本智能科技有限公司 Chinese word segmentation method based on deep learning and forgetting algorithm
CN109388806A (en) * 2018-10-26 2019-02-26 北京布本智能科技有限公司 A kind of Chinese word cutting method based on deep learning and forgetting algorithm
CN109781094A (en) * 2018-12-24 2019-05-21 上海交通大学 Earth magnetism positioning system based on Recognition with Recurrent Neural Network
CN109685137A (en) * 2018-12-24 2019-04-26 上海仁静信息技术有限公司 A kind of topic classification method, device, electronic equipment and storage medium
CN109800438B (en) * 2019-02-01 2020-03-31 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN109800438A (en) * 2019-02-01 2019-05-24 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN109948149A (en) * 2019-02-28 2019-06-28 腾讯科技(深圳)有限公司 A kind of file classification method and device
CN110473534A (en) * 2019-07-12 2019-11-19 南京邮电大学 A kind of nursing old people conversational system based on deep neural network
CN111160009A (en) * 2019-12-30 2020-05-15 北京理工大学 Sequence feature extraction method based on tree-shaped grid memory neural network

Similar Documents

Publication Publication Date Title
CN105740226A (en) Method for implementing Chinese segmentation by using tree neural network and bilateral neural network
CN107168957A (en) A kind of Chinese word cutting method
CN105930314B (en) System and method is generated based on coding-decoding deep neural network text snippet
CN107516041B (en) WebShell detection method and system based on deep neural network
CN108009154B (en) Image Chinese description method based on deep learning model
CN106502985B (en) neural network modeling method and device for generating titles
CN109101235A (en) A kind of intelligently parsing method of software program
CN107291836B (en) Chinese text abstract obtaining method based on semantic relevancy model
CN104699797B (en) A kind of web page data structured analysis method and device
CN111177394A (en) Knowledge map relation data classification method based on syntactic attention neural network
CN105938485A (en) Image description method based on convolution cyclic hybrid model
CN109213975B (en) Twitter text representation method based on character level convolution variation self-coding
CN104778224B (en) A kind of destination object social networks recognition methods based on video semanteme
CN106547735A (en) The structure and using method of the dynamic word or word vector based on the context-aware of deep learning
CN111538848A (en) Knowledge representation learning method fusing multi-source information
CN111858932A (en) Multiple-feature Chinese and English emotion classification method and system based on Transformer
CN106844327B (en) Text coding method and system
CN109359297A (en) A kind of Relation extraction method and system
CN106919557A (en) A kind of document vector generation method of combination topic model
CN107092594B (en) Bilingual recurrence self-encoding encoder based on figure
CN105956158B (en) The method that network neologisms based on massive micro-blog text and user information automatically extract
JP2018067199A (en) Abstract generating device, text converting device, and methods and programs therefor
CN111625276B (en) Code abstract generation method and system based on semantic and grammar information fusion
CN113487024A (en) Alternate sequence generation model training method and method for extracting graph from text
CN112560456A (en) Generation type abstract generation method and system based on improved neural network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
DD01 Delivery of document by public notice
DD01 Delivery of document by public notice

Addressee: Nanjing University

Document name: the First Notification of an Office Action

WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160706