CN112651245A - Sequence annotation model and sequence annotation method - Google Patents
Sequence annotation model and sequence annotation method
- Publication number
- CN112651245A (application number CN202011577267.9A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- model
- vector
- lstm
- elmo
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides a sequence labeling model and a sequence labeling method. When the model performs a sequence labeling task, ELMo word vectors are first added to the input layer as additional features, so that the representation of each character is the concatenation of its character vector and its ELMo representation. Second, in the BiLSTM network layer, in addition to a forward LSTM network that learns the historical features of each character, the sequence is also fed in reverse order into a backward LSTM network that learns the subsequent features of each character; the context features of the characters are concatenated and fed into a CRF layer. Finally, a conditional random field performs joint modeling to obtain the globally optimal label sequence. The method achieves good performance on Chinese named entity recognition datasets such as Boson and LDC2009, improving the average F1 value by 4.95%.
Description
Technical Field
The invention relates to a sequence labeling model and a sequence labeling method, and belongs to the field of computer application and natural language processing.
Background
Named Entity Recognition (NER), also called "proper name recognition", refers to identifying entities with specific meaning in text, mainly including names of people, places and organizations, proper nouns, and the like. Early named entity recognition was mostly rule-based, but because language structure itself is uncertain, formulating unified and complete rules is difficult. The rule-based approach requires constructing specific rule templates; the features used include statistical information, punctuation marks, keywords, position words, head words and the like, with pattern and string matching as the main means, and it depends heavily on the construction of knowledge bases and dictionaries. For different domains, experts must rewrite the rules, which is costly; rule construction takes a long time, portability is poor, and domain-specific knowledge bases must be built as aids to improve the system's recognition ability.
Traditional named entity recognition methods mostly adopt supervised machine learning models, such as hidden Markov models, maximum entropy, support vector machines and conditional random fields. The maximum entropy model is well structured and general, but its training is computationally complex, and explicit normalization must be computed at considerable cost. Conditional random fields perform well in word segmentation and named entity recognition and provide a labeling framework with flexible features and global optimization, but suffer from slow convergence and long training times. Statistics-based methods depend heavily on feature selection: the features that most influence the task must be analyzed, selected from the text and added to a feature template; effective features must be chosen by statistically analyzing the linguistic and semantic information contained in the training corpus, and strong features must be continually discovered in it. In word2vec (2013) and GloVe (2014), each word corresponds to a single vector, which cannot represent polysemous words.
In view of the above, it is necessary to provide a sequence annotation model to solve the above problems.
Disclosure of Invention
The invention aims to provide a sequence annotation model for improving the performance of named entity recognition.
In order to achieve the above object, the present invention provides a sequence annotation model, wherein the sequence annotation model adopts the basic framework of the BiLSTM-CRF model and adds word vectors of the pre-trained language model ELMo as additional features, and the sequence annotation model comprises:
an input layer: for receiving an input sentence consisting of n characters (w_1 w_2 … w_n), mapping each character in the sentence into a vector sequence by querying a word-vector table, and introducing the ELMo word vector as an additional feature into the input layer;
a BiLSTM network layer: comprising a forward long short-term memory network (LSTM) and a backward LSTM, wherein the forward LSTM computes the representation of the sequence from front to back and the backward LSTM computes the representation of the same sequence from back to front; the BiLSTM network layer receives the vector sequence obtained from the input layer as input so as to obtain the context features of each character;
a CRF layer: for receiving the output of the BiLSTM network layer and introducing a transition matrix for globally optimal decoding of the vector sequence of the entire sentence.
As a further refinement of the present invention, the sentences input into the input layer contain a variable number of characters.
As a further improvement of the present invention, the language model ELMo includes an English ELMo model and a Chinese ELMo model. In the English ELMo model, the representation of an English word combines the word's own vector with a representation obtained by convolving its characters through a convolutional neural network; in the Chinese ELMo model, the representation of each character is encoded directly.
As a further improvement of the invention, the language model ELMo is a bidirectional LSTM language model comprising a forward language model LSTM and a backward language model LSTM; after pre-training the bidirectional LSTM language model, the language model ELMo uses as the word representation the set R_k = {x_k^LM, h→_{k,j}^LM, h←_{k,j}^LM | j = 1, …, L}, where x_k^LM is the vector of the character itself, and h→_{k,j}^LM and h←_{k,j}^LM are the outputs of the forward language model LSTM and the backward language model LSTM respectively, yielding 2L+1 representations.
Another object of the invention is to provide a sequence labeling method for better implementing the above sequence labeling model.
In order to achieve the above object, the present invention provides a sequence labeling method, which is applied to the above sequence labeling model, and specifically includes the following steps:
step 1, a characteristic representation stage: the input layer sets (w) the input sentence content through a random word vector or a word vector table initialized with a pre-trained word vector1w2...wn) Character w inkMapping to a sequence of vectors (v)1v2...vn) While introducing the word vector of ELMo as an additional feature, so that the character representation of each character is a concatenation of its character vector and the word vector representation of ELMo, i.e., wt=[vt,et],t∈[1,n]Then the vector sequence of the whole sentence input is (v)1v2...vn);
Step 2, the encoding stage: for the vector sequence (v_1 v_2 … v_n) output by the feature representation stage, the forward language model LSTM obtains the probability of w_k given (w_1 w_2 … w_{k-1}); the probability of the vector sequence of the whole sentence is then obtained through the probability formula, the objective function is obtained by combining this with the formula of the backward language model LSTM, and the bidirectional LSTM language model then performs the encoding to obtain the feature vector h_t of each character in the whole sentence;
Step 3, the decoding stage: a transition matrix A_ij is defined to represent the score of moving from label i to label j; based on the feature vector h_t of each character, the score of each label is computed as o_t = W_o h_t + b_o; the score of the vector sequence (v_1 v_2 … v_n) of the whole sentence is then computed, and the globally most likely label sequence is selected with reference to the transition matrix A_ij.
As a further improvement of the present invention, the probability formula in step 2 is:
As a further improvement of the present invention, the formula of the backward language model LSTM in step 2 is: p(w_1, w_2, …, w_n) = ∏_{k=1}^{n} p(w_k | w_{k+1}, w_{k+2}, …, w_n).
As a further improvement of the present invention, the objective function in step 2 is: max Σ_{k=1}^{n} [ log p(w_k | w_1, …, w_{k-1}; Θ_x, Θ→_LSTM, Θ_s) + log p(w_k | w_{k+1}, …, w_n; Θ_x, Θ←_LSTM, Θ_s) ],
wherein Θ_x and Θ_s denote the parameters of the ELMo word-vector (character convolutional neural network) representation and of the softmax layer, respectively.
As a further improvement of the present invention, the formulas used by the bidirectional LSTM language model encoding in step 2 include:
i_t = σ(W_ii x_t + b_ii + W_hi h_{t-1} + b_hi)
f_t = σ(W_if x_t + b_if + W_hf h_{t-1} + b_hf)
g_t = tanh(W_ig x_t + b_ig + W_hg h_{t-1} + b_hg)
o_t = σ(W_io x_t + b_io + W_ho h_{t-1} + b_ho)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)
where W and b are parameters of the bidirectional LSTM language model, σ is the sigmoid function, ⊙ is element-wise multiplication, i_t, f_t, o_t denote the input, forget and output gates at time t, and c_t, h_t, g_t denote the cell state, output state and new candidate state at time t.
As a further improvement of the invention, the score of the vector sequence (v_1 v_2 … v_n) of the whole sentence in step 3 is: s(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} o_{i, y_i}. Defining y_0 and y_{n+1} as the beginning and ending tags of the sentence, the probability of a tag sequence for the whole sentence is: p(y | X) = exp(s(X, y)) / Σ_{y′ ∈ Y_X} exp(s(X, y′)).
the log probability of maximizing the correct tag sequence is:
wherein Y_X represents all possible tag sequences; the most likely tag sequence is output at prediction time through y* = argmax_{y′ ∈ Y_X} s(X, y′).
The beneficial effects of the invention are: the sequence labeling model accelerates the convergence of named entity recognition, shortens the training time, improves the performance of the sequence labeling model for named entity recognition, and improves the accuracy of named entity recognition.
Drawings
FIG. 1 is a schematic structural diagram of a sequence annotation model according to the present invention.
Fig. 2 is a schematic structural view of the ELMo model in fig. 1.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The invention provides a sequence annotation model oriented to fine-grained named entity recognition. The pre-trained language model ELMo is added as a feature to the BiLSTM-CRF basic framework for named entity recognition; on the basis of large-scale unsupervised data, pre-trained context-based word vectors improve the performance of the named entity recognition model, and combining natural language processing and machine learning methods improves the accuracy of named entity recognition.
As shown in fig. 1, the entire model includes: an input layer, a BiLSTM network layer, and a CRF layer.
An input layer: for receiving an input sentence consisting of n characters (w_1 w_2 … w_n), mapping each character in the sentence into a vector sequence by querying the word-vector table, and introducing the ELMo word vector as an additional feature into the input layer. The sentences input into the input layer may contain a variable number of characters. Specifically, the characters contained in the sentence are mapped into a vector sequence through the word-vector table, ELMo word vectors are added at the input layer, and the representation of each character, the concatenation of its word vector and its ELMo representation, is fed into the BiLSTM layer.
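As a minimal sketch of this input-layer concatenation (toy dimensions and randomly initialized tables, purely for illustration; real character vectors and ELMo vectors are far larger), the mapping w_t = [v_t, e_t] can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 5-character vocabulary, 4-dim character vectors,
# 3-dim ELMo vectors (real models use far larger dimensions).
char_table = rng.normal(size=(5, 4))  # word-vector table, one row per character
elmo_vecs = rng.normal(size=(3, 3))   # ELMo vectors for one 3-character sentence

sentence = [2, 0, 4]  # character indices (w_1 w_2 w_3)

# Each character representation is the concatenation w_t = [v_t, e_t].
inputs = np.stack([
    np.concatenate([char_table[c], elmo_vecs[t]])
    for t, c in enumerate(sentence)
])

print(inputs.shape)  # (3, 7): n characters, char-dim + ELMo-dim
```

The resulting (n, d_char + d_elmo) matrix is what the BiLSTM layer receives as input.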
A BiLSTM network layer (Bi-directional Long Short-Term Memory): a bidirectional long short-term memory neural network layer comprising a forward LSTM and a backward LSTM; the forward LSTM computes the representation of the sequence from front to back, the backward LSTM computes the representation of the same sequence from back to front, and the BiLSTM network layer receives the vector sequence obtained at the input layer as input so as to obtain the context features of each character. Specifically, the LSTM model at time t is composed of the input word w_t, the cell state c_t, the temporary (candidate) cell state g_t, the hidden state h_t, the forget gate f_t, the input (memory) gate i_t and the output gate o_t. The computation of the LSTM can be summarized as follows: by forgetting old information and memorizing new information in the cell state, information useful for later time steps is passed on and useless information is discarded, and the hidden state h_t is output at each time step; the forgetting, memorizing and outputting are controlled by the forget gate f_t, the input gate i_t and the output gate o_t, which are computed from the hidden state h_{t-1} of the previous time step and the current input w_t.
A CRF layer: for receiving the output of the BiLSTM network layer and introducing a transition matrix for globally optimal decoding of the vector sequence of the entire sentence. In particular, the CRF layer can add constraints on the final predicted tags to ensure that they are legal; these constraints are learned automatically by the CRF layer during training. The CRF layer receives the output scores of the BiLSTM layer as input, adds a transition score matrix, and selects the globally optimal label sequence according to the scores.
The language model ELMo includes an English ELMo model and a Chinese ELMo model. In the English ELMo model, the representation of an English word combines the word's own vector with a representation obtained by convolving its characters through a CNN; in the Chinese ELMo model, the representation of each character is encoded directly.
As shown in fig. 2, the language model ELMo is a bidirectional LSTM language model comprising a forward language model LSTM and a backward language model LSTM, and the objective function is the joint maximum likelihood of the language models in both directions. After pre-training the bidirectional LSTM language model, ELMo uses as the word representation a combination E_k = γ Σ_{j=0}^{L} s_j h_{k,j}^LM, a weighted summation over the intermediate layers of the bidirectional language model; in the simplest case, the representation of the highest layer alone can be used as ELMo. When a supervised NLP task is then performed, ELMo is concatenated as a feature to the word-vector input of the task-specific model, or to the representation at the model's highest layer. Unlike traditional word vectors, where each word corresponds to only one vector, ELMo uses the pre-trained bidirectional language model to obtain a context-dependent representation of the current word from the concrete input; that is, representations of the same word differ in different contexts, and these representations are then added as features into the specific supervised NLP model.
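The layer combination E_k = γ Σ_j s_j h_{k,j} can be sketched in numpy as follows (a toy illustration with made-up layer vectors and L = 1; the function name `elmo_combine` is an assumption for the example, not part of any library):

```python
import numpy as np

def elmo_combine(layer_reps, s_weights, gamma=1.0):
    """Collapse the 2L+1 per-character layer representations into one
    vector: E_k = gamma * sum_j softmax(s)_j * h_{k,j}."""
    s = np.exp(s_weights - s_weights.max())
    s /= s.sum()                                 # softmax over the layers
    return gamma * np.einsum('j,jd->d', s, layer_reps)

# For L = 1 there are 2L+1 = 3 representations per character.
layers = np.array([
    [1.0, 0.0],   # x_k: the character's own vector
    [0.0, 1.0],   # forward layer-1 LSTM state
    [1.0, 1.0],   # backward layer-1 LSTM state
])

e_k = elmo_combine(layers, np.zeros(3))  # uniform weights -> plain average
print(e_k)  # approximately [0.667, 0.667]
```

With learned, non-uniform weights the softmax lets the downstream task emphasize whichever layer is most useful.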
In order to better realize the sequence labeling model, the invention also provides a sequence labeling method, which specifically comprises the following steps:
step 1, a characteristic representation stage: the input layer sets (w) the input sentence content through a random word vector or a word vector table initialized with a pre-trained word vector1w2...wn) Is mapped to a vector sequence (v)1v2...vn) While introducing the word vector of ELMo as an additional feature, so that the character representation of each character is a concatenation of its character vector and the word vector representation of ELMo, i.e., wt=[vt,et],t∈[1,n]Then the vector sequence of the whole sentence input is (v)1v2...vn);
Step 2, the encoding stage: for the vector sequence (v_1 v_2 … v_n) output by the feature representation stage, the forward language model LSTM obtains the probability of each character w_k given (w_1 w_2 … w_{k-1}); the probability of the vector sequence of the whole sentence is then calculated through the probability formula, the objective function is obtained by combining this with the formula of the backward language model, and the bidirectional LSTM language model then performs the encoding to obtain the feature vector h_t of each character in the whole sentence;
Step 3, the decoding stage: a transition matrix A_ij is defined, representing the score from label i to label j; based on each character's feature vector h_t, the score of each label is computed as o_t = W_o h_t + b_o; the score of the whole-sentence vector sequence (v_1 v_2 … v_n) is then computed, and the globally most likely tag sequence is selected with reference to the transition matrix.
In step 2, specifically, for a sequence of n characters (w_1 w_2 … w_n), the forward language model models the probability of w_k conditioned on (w_1 w_2 … w_{k-1}), and the probability of the whole sentence sequence can be calculated as: p(w_1, w_2, …, w_n) = ∏_{k=1}^{n} p(w_k | w_1, …, w_{k-1}). The backward language model is similar to the forward one; the input sequence only needs to be reversed, i.e. each character is predicted from its following context: p(w_1, w_2, …, w_n) = ∏_{k=1}^{n} p(w_k | w_{k+1}, …, w_n).
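The two factorizations can be checked with a toy numerical sketch (the per-position conditional probabilities below are made up; a real model would produce them with the forward and backward LSTMs):

```python
import math

def sentence_log_prob(cond_probs):
    """log p(w_1..w_n) as the sum of per-position conditional log
    probabilities, matching the product formulas above."""
    return sum(math.log(p) for p in cond_probs)

# Hypothetical conditionals for one 3-character sentence.
fwd = [0.5, 0.25, 0.5]   # p(w_k | w_1..w_{k-1}) from the forward LM
bwd = [0.5, 0.5, 0.25]   # p(w_k | w_{k+1}..w_n) from the backward LM

log_p_forward = sentence_log_prob(fwd)    # log(1/16)
log_p_backward = sentence_log_prob(bwd)   # log(1/16)

# The bidirectional training objective maximizes this sum over the corpus.
objective = log_p_forward + log_p_backward
```

Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow on long sentences.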
since the bi-directional language model consists of a forward language model and a backward language model, the goal of model optimization is to maximize the sum of forward and backward language model probabilities:
wherein Θ_x and Θ_s denote the parameters of the ELMo word-vector (character convolutional neural network) representation and of the softmax layer, respectively. ELMo is a combination of the BiLSTM network layers: for each character w_k, an L-layer bidirectional language model yields 2L+1 representations, R_k = {x_k^LM, h→_{k,j}^LM, h←_{k,j}^LM | j = 1, …, L}, where x_k^LM is the vector of the character itself, and h→_{k,j}^LM and h←_{k,j}^LM are the outputs of the forward and backward language model LSTMs. To apply ELMo to the Chinese NER task, all layers of R_k are collapsed into a single vector, E_k = E(R_k; Θ_e), so that ELMo provides one vector, combining the 2L+1 representations, for each input character. Second, in the BiLSTM layer, the calculation proceeds as follows:
i_t = σ(W_ii x_t + b_ii + W_hi h_{t-1} + b_hi)
f_t = σ(W_if x_t + b_if + W_hf h_{t-1} + b_hf)
g_t = tanh(W_ig x_t + b_ig + W_hg h_{t-1} + b_hg)
o_t = σ(W_io x_t + b_io + W_ho h_{t-1} + b_ho)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)
where W and b are parameters of the LSTM cell, σ is the sigmoid function, ⊙ is element-wise multiplication, i_t, f_t, o_t denote the input gate, forget gate and output gate at time t, and c_t, h_t, g_t denote the cell state, output state and new candidate state at time t. Given a sentence (w_1 w_2 … w_n) containing n characters, the outputs of the forward and backward LSTMs are concatenated to obtain the feature-vector representation of the character at time t: h_t = [hl_t, hr_t].
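A single step of these gate equations can be sketched in numpy (hypothetical small dimensions and randomly initialized parameters; the dictionary keys mirror the W/b names above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step implementing the gate equations above."""
    i = sigmoid(p['Wii'] @ x_t + p['bii'] + p['Whi'] @ h_prev + p['bhi'])
    f = sigmoid(p['Wif'] @ x_t + p['bif'] + p['Whf'] @ h_prev + p['bhf'])
    g = np.tanh(p['Wig'] @ x_t + p['big'] + p['Whg'] @ h_prev + p['bhg'])
    o = sigmoid(p['Wio'] @ x_t + p['bio'] + p['Who'] @ h_prev + p['bho'])
    c = f * c_prev + i * g          # new cell state c_t
    h = o * np.tanh(c)              # output state h_t
    return h, c

rng = np.random.default_rng(1)
d_in, d_h = 4, 3                    # toy input and hidden sizes
p = {}
for gate in 'ifgo':                 # input, forget, candidate, output
    p['Wi' + gate] = rng.normal(scale=0.1, size=(d_h, d_in))
    p['Wh' + gate] = rng.normal(scale=0.1, size=(d_h, d_h))
    p['bi' + gate] = np.zeros(d_h)
    p['bh' + gate] = np.zeros(d_h)

h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), p)
```

In the BiLSTM layer, this step is run left-to-right and right-to-left over the sentence, and the two hidden states at each position are concatenated to form h_t = [hl_t, hr_t].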
In step 3, specifically in the CRF layer, the outputs o_t are not used to predict each label independently from the features h_t; instead, joint modeling is performed with the CRF. From the character features h_t, the score of each label is computed as o_t = W_o h_t + b_o. Defining a transition matrix A_ij representing the score from label i to label j, the score of the vector sequence (v_1 v_2 … v_n) of the whole sentence is s(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} o_{i, y_i}, where y_0 and y_{n+1} are the beginning and ending tags of the sentence. The probability of a tag sequence for the whole sentence is p(y | X) = exp(s(X, y)) / Σ_{y′ ∈ Y_X} exp(s(X, y′)). During training, the log probability of the correct tag sequence is maximized: log p(y | X) = s(X, y) − log Σ_{y′ ∈ Y_X} exp(s(X, y′)), where Y_X represents all possible tag sequences. At decoding time, the most likely tag sequence is output through y* = argmax_{y′ ∈ Y_X} s(X, y′).
In summary, the invention provides a sequence labeling model and, through the sequence labeling method, improves the performance of the sequence labeling model for named entity recognition, thereby improving the accuracy of named entity recognition. The model incorporates ELMo; unlike earlier models in which one word corresponds to one fixed vector, the pre-trained ELMo is not merely a vector lookup table but a full trained model. In use, a sentence or sentence fragment is fed into the model, which infers the word vector of each word from the running text, so that polysemous words can be understood in combination with their preceding and following context.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.
Claims (10)
1. A sequence annotation model, characterized in that the basic framework adopted by the sequence annotation model is the BiLSTM-CRF model, with word vectors of the pre-trained language model ELMo added as additional features, the sequence annotation model comprising:
an input layer: for receiving an input sentence consisting of n characters (w_1 w_2 … w_n), mapping each character in the sentence into a vector sequence by querying a word-vector table, and introducing the ELMo word vector as an additional feature into the input layer;
a BiLSTM network layer: comprising a forward long short-term memory network (LSTM) and a backward LSTM, wherein the forward LSTM computes the representation of the sequence from front to back and the backward LSTM computes the representation of the same sequence from back to front; the BiLSTM network layer receives the vector sequence obtained from the input layer as input so as to obtain the context features of each character;
CRF layer: for receiving the output of the BiLSTM network layer and introducing a transition matrix for global optimal decoding of the vector sequence of the entire sentence.
2. The sequence annotation model of claim 1, wherein: the sentences input into the input layer contain a variable number of characters.
3. The sequence annotation model of claim 1, wherein: the language model ELMo comprises an English ELMo model and a Chinese ELMo model; in the English ELMo model, the representation of an English word combines the word's own vector with a representation obtained by convolving its characters through a convolutional neural network; in the Chinese ELMo model, the representation of each character is encoded directly.
4. The sequence annotation model of claim 1, wherein: the language model ELMo is a bidirectional LSTM language model comprising a forward language model LSTM and a backward language model LSTM; after pre-training the bidirectional LSTM language model, the language model ELMo uses as the word representation the set R_k = {x_k^LM, h→_{k,j}^LM, h←_{k,j}^LM | j = 1, …, L}, where x_k^LM is the vector of the character itself, and h→_{k,j}^LM and h←_{k,j}^LM are the outputs of the forward language model LSTM and the backward language model LSTM respectively, yielding 2L+1 representations.
5. A sequence annotation method is applied to the sequence annotation model of any one of claims 1 to 4, and comprises the following specific steps:
step 1, a characteristic representation stage: the input layer sets (w) the input sentence content through a random word vector or a word vector table initialized with a pre-trained word vector1w2...wn) Character mapping ofIs emitted as a sequence of vectors (v)1v2...vn) While introducing the word vector of ELMo as an additional feature, so that the character representation of each character is a concatenation of its character vector and the word vector representation of ELMo, i.e., wt=[vt,et],t∈[1,n]Then the vector sequence of the whole sentence input is (v)1v2...vn);
Step 2, the encoding stage: for the vector sequence (v_1 v_2 … v_n) output by the feature representation stage, the forward language model LSTM obtains the probability of w_k given (w_1 w_2 … w_{k-1}); the probability of the vector sequence of the whole sentence is then obtained through the probability formula, the objective function is obtained by combining this with the formula of the backward language model LSTM, and the bidirectional LSTM language model then performs the encoding to obtain the feature vector h_t of each character in the whole sentence;
Step 3, the decoding stage: a transition matrix A_ij is defined to represent the score of moving from label i to label j; based on the feature vector h_t of each character, the score of each label is computed as o_t = W_o h_t + b_o; the score of the vector sequence (v_1 v_2 … v_n) of the whole sentence is then computed, and the globally most likely label sequence is selected with reference to the transition matrix A_ij.
9. The sequence annotation method of claim 8, wherein the formulas used by the bidirectional LSTM language model encoding in step 2 include:
i_t = σ(W_ii x_t + b_ii + W_hi h_{t-1} + b_hi)
f_t = σ(W_if x_t + b_if + W_hf h_{t-1} + b_hf)
g_t = tanh(W_ig x_t + b_ig + W_hg h_{t-1} + b_hg)
o_t = σ(W_io x_t + b_io + W_ho h_{t-1} + b_ho)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)
10. The sequence annotation method of claim 9, wherein the score of the vector sequence (v_1 v_2 … v_n) of the whole sentence in step 3 is: s(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} o_{i, y_i};
defining y_0 and y_{n+1} as the beginning and ending tags of the sentence, the probability of a tag sequence for the whole sentence is: p(y | X) = exp(s(X, y)) / Σ_{y′ ∈ Y_X} exp(s(X, y′));
and the maximized log probability of the correct tag sequence is: log p(y | X) = s(X, y) − log Σ_{y′ ∈ Y_X} exp(s(X, y′)).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011577267.9A CN112651245A (en) | 2020-12-28 | 2020-12-28 | Sequence annotation model and sequence annotation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011577267.9A CN112651245A (en) | 2020-12-28 | 2020-12-28 | Sequence annotation model and sequence annotation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112651245A true CN112651245A (en) | 2021-04-13 |
Family
ID=75363348
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011577267.9A Pending CN112651245A (en) | 2020-12-28 | 2020-12-28 | Sequence annotation model and sequence annotation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112651245A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106569998A (en) * | 2016-10-27 | 2017-04-19 | 浙江大学 | Text named entity recognition method based on Bi-LSTM, CNN and CRF |
CN108460013A (en) * | 2018-01-30 | 2018-08-28 | 大连理工大学 | A kind of sequence labelling model based on fine granularity vocabulary representation model |
CN109117472A (en) * | 2018-11-12 | 2019-01-01 | 新疆大学 | A kind of Uighur name entity recognition method based on deep learning |
CN115114924A (en) * | 2022-06-17 | 2022-09-27 | 珠海格力电器股份有限公司 | Named entity recognition method, device, computing equipment and storage medium |
-
2020
- 2020-12-28 CN CN202011577267.9A patent/CN112651245A/en active Pending
Non-Patent Citations (3)
Title |
---|
MATTHEW E. PETERS et al.: "Deep contextualized word representations", Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, page 2227 *
ZHANG Dong et al.: "Chinese Named Entity Recognition Based on Context-Dependent Character Vectors", Computer Science, pages 1-12 *
HU Wanting et al.: "An Organization Name Recognition Method Based on an Improved ELMO Model", Computer Technology and Development, pages 25-29 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108460013B (en) | Sequence labeling model and method based on fine-grained word representation model | |
CN108268444B (en) | Chinese word segmentation method based on bidirectional LSTM, CNN and CRF | |
CN109657239B (en) | Chinese named entity recognition method based on attention mechanism and language model learning | |
CN112100351A (en) | Method and equipment for constructing intelligent question-answering system through question generation data set | |
CN110609891A (en) | Visual dialog generation method based on context awareness graph neural network | |
CN109408812A (en) | A method of the sequence labelling joint based on attention mechanism extracts entity relationship | |
CN110737758A (en) | Method and apparatus for generating a model | |
CN112487820B (en) | Chinese medical named entity recognition method | |
CN110555084A (en) | remote supervision relation classification method based on PCNN and multi-layer attention | |
CN114943230B (en) | Method for linking entities in Chinese specific field by fusing common sense knowledge | |
CN112101031B (en) | Entity identification method, terminal equipment and storage medium | |
CN113204611A (en) | Method for establishing reading understanding model, reading understanding method and corresponding device | |
CN112183083A (en) | Abstract automatic generation method and device, electronic equipment and storage medium | |
CN114153971A (en) | Error-containing Chinese text error correction, identification and classification equipment | |
CN114239574A (en) | Miner violation knowledge extraction method based on entity and relationship joint learning | |
CN113743099A (en) | Self-attention mechanism-based term extraction system, method, medium and terminal | |
CN112966073A (en) | Short text matching method based on semantics and shallow features | |
CN113326702A (en) | Semantic recognition method and device, electronic equipment and storage medium | |
CN111145914A (en) | Method and device for determining lung cancer clinical disease library text entity | |
CN114510946A (en) | Chinese named entity recognition method and system based on deep neural network | |
CN114443813A (en) | Intelligent online teaching resource knowledge point concept entity linking method | |
US20210303777A1 (en) | Method and apparatus for fusing position information, and non-transitory computer-readable recording medium | |
CN113641809A (en) | XLNET-BiGRU-CRF-based intelligent question answering method | |
WO2023116572A1 (en) | Word or sentence generation method and related device | |
CN116680407A (en) | Knowledge graph construction method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||