CN109214000A - Neural-network Khmer entity recognition method based on topic-model word vectors - Google Patents
Neural-network Khmer entity recognition method based on topic-model word vectors
- Publication number: CN109214000A (application CN201810965632.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Abstract
The present invention relates to a neural-network method for Khmer entity recognition based on topic-model word vectors, and belongs to the field of natural language processing. The method first obtains a Khmer text corpus and preprocesses it; a topic model is then constructed, the trained topic model assigns a topic number to each word of the text, and this topic number is treated as a pseudo-word. The preprocessed text and the pseudo-words obtained above are placed into the same corpus file, and a skip-gram model is trained on it so that the word vector of each word in the text and the corresponding topic vector are obtained simultaneously. The word vectors and topic vectors obtained in the above steps are concatenated to obtain topical word vectors; finally, the topical word vectors are fed as input features into the pre-built deep learning model to perform entity recognition for Khmer. The present invention effectively alleviates the polysemy and homophone-ambiguity problems present in text, and its recognition accuracy for Khmer named entities is high.
Description
Technical field
The present invention relates to a neural-network method for Khmer entity recognition based on topic-model word vectors, and belongs to the field of natural language processing technology.
Background technique
With the rapid development of the modern economy, exchanges and cooperation between China and the countries of Southeast Asia have become increasingly frequent, and economic, cultural, and academic exchanges with the Kingdom of Cambodia in particular are growing year by year. Against this background of ever-closer ties between China and Cambodia, paying attention to and studying Cambodian culture has become especially important, yet the language barrier makes this task difficult. The demand for natural language processing techniques that address these difficulties is therefore growing ever stronger.
Cambodian, also known as Khmer, belongs to the Khmer branch of the Mon-Khmer subfamily of the Austroasiatic language family and is used as the official language of Cambodia. Borrowing of foreign words is very common in Khmer: the language developed on the basis of Old Khmer, absorbed many Pali and Sanskrit words, and was also influenced by the languages of neighboring countries such as Thai, Chinese, Vietnamese, and Lao. As a result, Khmer exhibits a wide variety of word-formation patterns. Since Khmer has the oldest written history among the languages of Southeast Asia, it has high research value. At present, however, research on Khmer at home and abroad focuses mainly on culture; owing to the particularities of the language, work on morphological analysis of low-resource languages such as Khmer, and on named entity recognition in particular, remains extremely limited. This research is therefore of great significance for political and economic analysis of Cambodia and for monitoring public opinion.
Named entity recognition (NER) is a basic task in natural language processing and a precondition for research in many natural language application fields. NER was first proposed as a subtask at MUC-6 (the Message Understanding Conference). The task is mainly to identify the proper names and meaningful numeral phrases that appear in text and to classify them. Its scope has expanded from the earliest entity recognition (person names, place names, organization names) to the present refinement of entity recognition in text and the recognition of temporal expressions (dates, times) and numerical expressions (currency values, percentages, etc.). Since entities such as quantities, times, dates, and currencies can usually be recognized well with pattern matching, while person names, place names, and organization names are comparatively complex, research in recent years has focused mainly on these latter types. NER is an important research topic in information extraction and has wide applications in natural language processing fields such as information retrieval, machine translation, and question answering systems.
Summary of the invention
The present invention provides a neural-network Khmer entity recognition method based on topic-model word vectors, aiming to solve the low recognition accuracy of Khmer named entities and the polysemy and homophone-ambiguity problems encountered in Khmer entity recognition.
The technical scheme of the present invention is a neural-network Khmer entity recognition method based on topic-model word vectors. A Khmer text corpus is first obtained and preprocessed; a topic model is then constructed on the preprocessed text; the trained topic model assigns a topic number to each word of the text, and this topic number is treated as a pseudo-word. The preprocessed text and the pseudo-words obtained above are placed into the same corpus file, and a skip-gram model is trained on it so that the word vector of each word in the text and the corresponding topic vector are obtained simultaneously. The word vectors and topic vectors obtained in the above steps are concatenated to obtain topical word vectors; finally, the topical word vectors are fed as input features into the pre-built deep learning model to perform entity recognition for Khmer.
The specific steps of the method are as follows:
Step1: a Khmer text corpus is first obtained from paper texts and Khmer websites using a web crawler; the text is then segmented, punctuation marks are filtered out, and stop words are removed, yielding a monolingual Khmer text corpus ready for use;
Step2: an HDP (Hierarchical Dirichlet Process) topic model is constructed on the preprocessed text; the trained topic model assigns a topic number to each word of the text, and this topic number is treated as a pseudo-word;
Step3: a skip-gram model is constructed over the preprocessed text; the preprocessed text and the pseudo-words obtained above are placed into the same corpus file, and the skip-gram model is trained on it so that the word vector of each word in the text and the corresponding topic vector are obtained simultaneously;
Step4: the word vectors and topic vectors obtained in the above steps are concatenated to obtain topical word vectors;
Step5: finally, the topical word vectors are fed as input features into the pre-built deep learning model to perform entity recognition for Khmer.
The specific sub-steps of Step2 are as follows:
Step2.1: the preprocessed text is divided into N documents, each document d ∈ {1, 2, …, N};
Step2.2: to construct the HDP topic model, it is assumed that the topics of all documents come from a common base distribution H; a Dirichlet process with concentration parameter α and base distribution H then serves as the prior;
Step2.3: a distribution G0 is first drawn from this prior and serves as the prior over the topic distributions of the individual documents, i.e. G0 ~ DP(α, H);
Step2.4: G0 is then used together with the concentration parameter γ to construct another Dirichlet process, from which a topic distribution Gd is drawn as the topic distribution of the d-th document, i.e. Gd ~ DP(γ, G0);
Step2.5: from the topic distribution Gd of the d-th document, the topic θdi of the i-th word is drawn, and a word xdi is finally generated from that topic; after iteration the topic assignment of each word is obtained, and this topic assignment is recorded as a pseudo-word.
The specific sub-steps of Step3 are as follows:
Step3.1: a skip-gram model is constructed over the preprocessed text; each word in the preprocessed text is denoted w, and the pseudo-word carrying the topic number obtained from the topic model is denoted z; the text words and their topic pseudo-words are placed into one text in pairs, i.e. the input is D = {wi, zi} = {w1, z1, …, wi, zi, …, wM, zM};
Step3.2: given the above input, the objective function of the skip-gram model is:
L(D) = (1/M) ∑_{i=1}^{M} ∑_{−k ≤ c ≤ k, c ≠ 0} [ log Pr(w_{i+c} | w_i) + log Pr(w_{i+c} | z_i) ]
where M is the number of words input to the model and k is the window size for predicting the context.
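To make the joint input D = {wi, zi} of Step3.1 and the windowed prediction of Step3.2 concrete, the sketch below merges a toy sentence with its topic pseudo-words and enumerates the (center, context) pairs a skip-gram trainer would consume. The tokens and topic ids are invented for illustration; in practice a library skip-gram implementation (for example gensim's Word2Vec with sg=1) would be trained on the merged corpus, yielding a vector per word and a vector per topic in one pass.

```python
# Toy preprocessed sentence (romanized placeholders) and the topic
# numbers assigned to each word by the topic model (Step2).
words = ["phnom", "penh", "bank", "river", "bank"]
topics = [4, 4, 7, 2, 2]  # hypothetical topic ids

# Merge words and topic pseudo-words into one corpus, pair by pair:
# D = {w1, z1, ..., wM, zM}
pseudo = ["TOPIC_%d" % t for t in topics]
merged = [tok for pair in zip(words, pseudo) for tok in pair]

def skipgram_pairs(tokens, k=2):
    """(center, context) pairs with window size k: each center token
    predicts every token within k positions of it."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(merged, k=2)
# Both words and pseudo-words act as centers, so training produces the
# word vectors and the topic vectors simultaneously.
print(len(merged), len(pairs))  # → 10 34
```

Note how the ambiguous word "bank" appears once next to TOPIC_7 and once next to TOPIC_2: the pseudo-words carry the sense distinction that the plain word form loses, which is the mechanism the method relies on to mitigate polysemy.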
The specific sub-steps of Step4 are as follows:
Step4.1: the word vector of each word in the text obtained above is denoted w, and the topic vector of the word obtained in Step3 is denoted z;
Step4.2: the word vector w and the topic vector z of the word are concatenated with the ⊕ operator, i.e. w_z = w ⊕ z, which yields the required topical word vector w_z.
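The cascade w_z = w ⊕ z of Step4.2 is plain vector concatenation; the following minimal sketch uses hypothetical low-dimensional vectors (real embeddings would have tens to hundreds of dimensions):

```python
# Hypothetical word vector (from skip-gram) and topic vector (of the
# word's topic pseudo-word); the dimensions are illustrative only.
w = [0.1, -0.3, 0.7, 0.2]   # word vector of some word
z = [0.9, 0.0, -0.1]        # topic vector of its pseudo-word

wz = w + z  # w ⊕ z: concatenation, dimension len(w) + len(z)
print(wz)   # → [0.1, -0.3, 0.7, 0.2, 0.9, 0.0, -0.1]
```

Concatenation (rather than, say, averaging) keeps the word identity and the topic context in separate coordinates, so the downstream model can weight them independently.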
The specific sub-steps of Step5 are as follows:
Step5.1: the topical word vector features obtained above are used as input features (x1, x2, …, xn) and fed into the CRF model, giving:
P(y | x) = (1/Z) exp( ∑_j ∑_m λ_j t_j(y_{m+1}, y_m, x, m) + ∑_k ∑_m μ_k s_k(y_m, x, m) )
where t_j(y_{m+1}, y_m, x, m) is a transition feature function defined on two adjacent label positions of the observation sequence, used to characterize the correlation between adjacent label variables and the influence of the observation sequence on them; s_k(y_m, x, m) is a state feature function defined at label position m of the observation sequence, used to characterize the influence of the observation sequence on the label variable; λ_j and μ_k are parameters, and Z is the normalization factor; the labeling probability of the sequence y is thus obtained, realizing named entity recognition for Khmer.
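The CRF probability of Step5.1 can be computed exactly on a toy example by brute-force enumeration of all label sequences. The transition feature t_j, state feature s_k, and the weights λ_j = 0.8 and μ_k = 1.5 below are hypothetical stand-ins chosen only to exercise the formula; a real implementation would learn many such features over the topical-word-vector inputs.

```python
import itertools
import math

labels = ["O", "ENT"]          # toy tag set
x = ["phnom", "penh", "bank"]  # observation sequence (topical word
                               # vectors would stand in for raw tokens)

# Hypothetical state feature s_k(y_m, x, m), weight mu_k = 1.5.
def s_entity_like(y_m, x, m):
    return 1.0 if y_m == "ENT" and x[m] != "bank" else 0.0

# Hypothetical transition feature t_j(y_{m+1}, y_m, x, m), weight 0.8.
def t_same_label(y_next, y_prev, x, m):
    return 1.0 if y_next == y_prev else 0.0

def score(y):
    s = sum(1.5 * s_entity_like(y[m], x, m) for m in range(len(x)))
    s += sum(0.8 * t_same_label(y[m + 1], y[m], x, m)
             for m in range(len(x) - 1))
    return s

# Z: normalization factor over every possible label sequence y.
seqs = list(itertools.product(labels, repeat=len(x)))
Z = sum(math.exp(score(y)) for y in seqs)

def prob(y):
    """P(y | x) = exp(score(y)) / Z, the labeling probability."""
    return math.exp(score(y)) / Z

best = max(seqs, key=prob)
print(best, round(prob(best), 3))  # → ('ENT', 'ENT', 'ENT') 0.552
```

Brute-force enumeration is exponential in sequence length and is used here only to verify the formula; practical CRF decoders use the Viterbi algorithm and compute Z with forward-backward dynamic programming.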
The beneficial effects of the present invention are:
1. The present invention provides a method suited to solving the Khmer entity recognition problem; it effectively alleviates the polysemy and homophone-ambiguity problems present in text, and its recognition accuracy for Khmer named entities is high;
2. The present invention provides strong support for subsequent Khmer work such as syntactic analysis, sentence analysis, information extraction, information retrieval, and machine translation.
Detailed description of the invention
Fig. 1 is the flowchart of the present invention.
Specific embodiment
Embodiment 1: as shown in Fig. 1, a neural-network Khmer entity recognition method based on topic-model word vectors. A Khmer text corpus is first obtained and preprocessed; a topic model is then constructed on the preprocessed text; the trained topic model assigns a topic number to each word of the text, and this topic number is treated as a pseudo-word. The preprocessed text and the pseudo-words obtained above are placed into the same corpus file, and a skip-gram model is trained on it so that the word vector of each word in the text and the corresponding topic vector are obtained simultaneously. The word vectors and topic vectors obtained in the above steps are concatenated to obtain topical word vectors; finally, the topical word vectors are fed as input features into the pre-built deep learning model to perform entity recognition for Khmer.
Further, the specific steps of the method are as follows:
Step1: a Khmer text corpus is first obtained from paper texts and Khmer websites using a web crawler; the text is then segmented, punctuation marks are filtered out, and stop words are removed, yielding a monolingual Khmer text corpus ready for use;
Step2: an HDP topic model is constructed on the preprocessed text; the trained topic model assigns a topic number to each word of the text, and this topic number is treated as a pseudo-word;
Step3: a skip-gram model is constructed over the preprocessed text; the preprocessed text and the pseudo-words obtained above are placed into the same corpus file, and the skip-gram model is trained on it so that the word vector of each word in the text and the corresponding topic vector are obtained simultaneously;
Step4: the word vectors and topic vectors obtained in the above steps are concatenated to obtain topical word vectors;
Step5: finally, the topical word vectors are fed as input features into the pre-built deep learning model to perform entity recognition for Khmer.
Further, the specific sub-steps of Step2 are as follows:
Step2.1: the preprocessed text is divided into N documents, each document d ∈ {1, 2, …, N};
Step2.2: to construct the HDP topic model, it is assumed that the topics of all documents come from a common base distribution H; a Dirichlet process with concentration parameter α and base distribution H then serves as the prior;
Step2.3: a distribution G0 is first drawn from this prior and serves as the prior over the topic distributions of the individual documents, i.e. G0 ~ DP(α, H);
Step2.4: G0 is then used together with the concentration parameter γ to construct another Dirichlet process, from which a topic distribution Gd is drawn as the topic distribution of the d-th document, i.e. Gd ~ DP(γ, G0);
Step2.5: from the topic distribution Gd of the d-th document, the topic θdi of the i-th word is drawn, and a word xdi is finally generated from that topic; after iteration the topic assignment of each word is obtained, and this topic assignment is recorded as a pseudo-word.
Further, the specific sub-steps of Step3 are as follows:
Step3.1: a skip-gram model is constructed over the preprocessed text; each word in the preprocessed text is denoted w, and the pseudo-word carrying the topic number obtained from the topic model is denoted z; the text words and their topic pseudo-words are placed into one text in pairs, i.e. the input is D = {wi, zi} = {w1, z1, …, wi, zi, …, wM, zM};
Step3.2: given the above input, the objective function of the skip-gram model is:
L(D) = (1/M) ∑_{i=1}^{M} ∑_{−k ≤ c ≤ k, c ≠ 0} [ log Pr(w_{i+c} | w_i) + log Pr(w_{i+c} | z_i) ]
where M is the number of words input to the model and k is the window size for predicting the context.
Further, the specific sub-steps of Step4 are as follows:
Step4.1: the word vector of each word in the text obtained above is denoted w, and the topic vector of the word obtained in Step3 is denoted z;
Step4.2: the word vector w and the topic vector z of the word are concatenated with the ⊕ operator, i.e. w_z = w ⊕ z, which yields the required topical word vector w_z.
Further, the specific sub-steps of Step5 are as follows:
Step5.1: the topical word vector features obtained above are used as input features (x1, x2, …, xn) and fed into the deep learning model (here the deep learning model is a CRF model), giving:
P(y | x) = (1/Z) exp( ∑_j ∑_m λ_j t_j(y_{m+1}, y_m, x, m) + ∑_k ∑_m μ_k s_k(y_m, x, m) )
where t_j(y_{m+1}, y_m, x, m) is a transition feature function defined on two adjacent label positions of the observation sequence, used to characterize the correlation between adjacent label variables and the influence of the observation sequence on them; s_k(y_m, x, m) is a state feature function defined at label position m of the observation sequence, used to characterize the influence of the observation sequence on the label variable; λ_j and μ_k are parameters, and Z is the normalization factor; the labeling probability of the sequence y is thus obtained, realizing named entity recognition for Khmer.
The embodiments of the present invention have been explained in detail above with reference to the accompanying drawing, but the present invention is not limited to the above embodiments; within the scope of knowledge possessed by those of ordinary skill in the art, various changes may also be made without departing from the concept of the present invention.
Claims (6)
1. A neural-network Khmer entity recognition method based on topic-model word vectors, characterized in that: a Khmer text corpus is first obtained and preprocessed; a topic model is then constructed on the preprocessed text; the trained topic model assigns a topic number to each word of the text, and this topic number is treated as a pseudo-word; the preprocessed text and the pseudo-words obtained above are placed into the same corpus file, and a skip-gram model is trained on it so that the word vector of each word in the text and the corresponding topic vector are obtained simultaneously; the word vectors and topic vectors obtained in the above steps are concatenated to obtain topical word vectors; finally, the topical word vectors are fed as input features into the pre-built deep learning model to perform entity recognition for Khmer.
2. The neural-network Khmer entity recognition method based on topic-model word vectors according to claim 1, characterized in that the specific steps of the method are as follows:
Step1: a Khmer text corpus is first obtained from paper texts and Khmer websites using a web crawler; the text is then segmented, punctuation marks are filtered out, and stop words are removed, yielding a monolingual Khmer text corpus ready for use;
Step2: an HDP topic model is constructed on the preprocessed text; the trained topic model assigns a topic number to each word of the text, and this topic number is treated as a pseudo-word;
Step3: a skip-gram model is constructed over the preprocessed text; the preprocessed text and the pseudo-words obtained above are placed into the same corpus file, and the skip-gram model is trained on it so that the word vector of each word in the text and the corresponding topic vector are obtained simultaneously;
Step4: the word vectors and topic vectors obtained in the above steps are concatenated to obtain topical word vectors;
Step5: finally, the topical word vectors are fed as input features into the pre-built deep learning model to perform entity recognition for Khmer.
3. The neural-network Khmer entity recognition method based on topic-model word vectors according to claim 2, characterized in that the specific sub-steps of Step2 are as follows:
Step2.1: the preprocessed text is divided into N documents, each document d ∈ {1, 2, …, N};
Step2.2: to construct the HDP topic model, it is assumed that the topics of all documents come from a common base distribution H; a Dirichlet process with concentration parameter α and base distribution H then serves as the prior;
Step2.3: a distribution G0 is first drawn from this prior and serves as the prior over the topic distributions of the individual documents, i.e. G0 ~ DP(α, H);
Step2.4: G0 is then used together with the concentration parameter γ to construct another Dirichlet process, from which a topic distribution Gd is drawn as the topic distribution of the d-th document, i.e. Gd ~ DP(γ, G0);
Step2.5: from the topic distribution Gd of the d-th document, the topic θdi of the i-th word is drawn, and a word xdi is finally generated from that topic; after iteration the topic assignment of each word is obtained, and this topic assignment is recorded as a pseudo-word.
4. The neural-network Khmer entity recognition method based on topic-model word vectors according to claim 2, characterized in that the specific sub-steps of Step3 are as follows:
Step3.1: a skip-gram model is constructed over the preprocessed text; each word in the preprocessed text is denoted w, and the pseudo-word carrying the topic number obtained from the topic model is denoted z; the text words and their topic pseudo-words are placed into one text in pairs, i.e. the input is D = {wi, zi} = {w1, z1, …, wi, zi, …, wM, zM};
Step3.2: given the above input, the objective function of the skip-gram model is:
L(D) = (1/M) ∑_{i=1}^{M} ∑_{−k ≤ c ≤ k, c ≠ 0} [ log Pr(w_{i+c} | w_i) + log Pr(w_{i+c} | z_i) ]
where M is the number of words input to the model and k is the window size for predicting the context.
5. The neural-network Khmer entity recognition method based on topic-model word vectors according to claim 2, characterized in that the specific sub-steps of Step4 are as follows:
Step4.1: the word vector of each word in the text obtained above is denoted w, and the topic vector of the word obtained in Step3 is denoted z;
Step4.2: the word vector w and the topic vector z of the word are concatenated with the ⊕ operator, i.e. w_z = w ⊕ z, which yields the required topical word vector w_z.
6. The neural-network Khmer entity recognition method based on topic-model word vectors according to claim 2, characterized in that the specific sub-steps of Step5 are as follows:
Step5.1: the topical word vector features obtained above are used as input features (x1, x2, …, xn) and fed into the CRF model, giving:
P(y | x) = (1/Z) exp( ∑_j ∑_m λ_j t_j(y_{m+1}, y_m, x, m) + ∑_k ∑_m μ_k s_k(y_m, x, m) )
where t_j(y_{m+1}, y_m, x, m) is a transition feature function defined on two adjacent label positions of the observation sequence, used to characterize the correlation between adjacent label variables and the influence of the observation sequence on them; s_k(y_m, x, m) is a state feature function defined at label position m of the observation sequence, used to characterize the influence of the observation sequence on the label variable; λ_j and μ_k are parameters, and Z is the normalization factor; the labeling probability of the sequence y is thus obtained, realizing named entity recognition for Khmer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810965632.XA CN109214000A (en) | 2018-08-23 | 2018-08-23 | A kind of neural network card language entity recognition method based on topic model term vector |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109214000A true CN109214000A (en) | 2019-01-15 |
Family
ID=64989087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810965632.XA Pending CN109214000A (en) | 2018-08-23 | 2018-08-23 | A kind of neural network card language entity recognition method based on topic model term vector |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109214000A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110231347A1 (en) * | 2010-03-16 | 2011-09-22 | Microsoft Corporation | Named Entity Recognition in Query |
CN104268200A (en) * | 2013-09-22 | 2015-01-07 | 中科嘉速(北京)并行软件有限公司 | Unsupervised named entity semantic disambiguation method based on deep learning |
CN105224521A (en) * | 2015-09-28 | 2016-01-06 | 北大方正集团有限公司 | Key phrases extraction method and use its method obtaining correlated digital resource and device |
CN106980609A (en) * | 2017-03-21 | 2017-07-25 | 大连理工大学 | A kind of name entity recognition method of the condition random field of word-based vector representation |
CN107861947A (en) * | 2017-11-07 | 2018-03-30 | 昆明理工大学 | A kind of method of the card language name Entity recognition based on across language resource |
Non-Patent Citations (2)
Title |
---|
YANG LIU et al.: "Topical Word Embeddings", AAAI'15: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence * |
LIU Shaoyu: "Research on Key Technologies of Entity Relation Extraction", China Master's Theses Full-text Database, Information Science and Technology * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112069826A (en) * | 2020-07-15 | 2020-12-11 | 浙江工业大学 | Vertical domain entity disambiguation method fusing topic model and convolutional neural network |
CN112069826B (en) * | 2020-07-15 | 2021-12-07 | 浙江工业大学 | Vertical domain entity disambiguation method fusing topic model and convolutional neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109446404B (en) | Method and device for analyzing emotion polarity of network public sentiment | |
WO2021114745A1 (en) | Named entity recognition method employing affix perception for use in social media | |
CN104391942B (en) | Short essay eigen extended method based on semantic collection of illustrative plates | |
CN109635297B (en) | Entity disambiguation method and device, computer device and computer storage medium | |
CN110727880B (en) | Sensitive corpus detection method based on word bank and word vector model | |
CN109670041A (en) | A kind of band based on binary channels text convolutional neural networks is made an uproar illegal short text recognition methods | |
CN106776538A (en) | The information extracting method of enterprise's noncanonical format document | |
WO2019228466A1 (en) | Named entity recognition method, device and apparatus, and storage medium | |
CN106095749A (en) | A kind of text key word extracting method based on degree of depth study | |
CN105095190B (en) | A kind of sentiment analysis method combined based on Chinese semantic structure and subdivision dictionary | |
CN106598940A (en) | Text similarity solution algorithm based on global optimization of keyword quality | |
CN109800310A (en) | A kind of electric power O&M text analyzing method based on structuring expression | |
CN109960727B (en) | Personal privacy information automatic detection method and system for unstructured text | |
CN106611041A (en) | New text similarity solution method | |
CN103324626A (en) | Method for setting multi-granularity dictionary and segmenting words and device thereof | |
CN111476036A (en) | Word embedding learning method based on Chinese word feature substrings | |
CN111274814A (en) | Novel semi-supervised text entity information extraction method | |
CN111191463A (en) | Emotion analysis method and device, electronic equipment and storage medium | |
CN107894975A (en) | A kind of segmenting method based on Bi LSTM | |
CN112287240A (en) | Case microblog evaluation object extraction method and device based on double-embedded multilayer convolutional neural network | |
CN112084308A (en) | Method, system and storage medium for text type data recognition | |
CN111061873B (en) | Multi-channel text classification method based on Attention mechanism | |
CN110502759B (en) | Method for processing Chinese-Yue hybrid network neural machine translation out-of-set words fused into classification dictionary | |
Seeha et al. | ThaiLMCut: Unsupervised pretraining for Thai word segmentation | |
Tianxiong et al. | Identifying chinese event factuality with convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190115 |