CN112651245A - Sequence annotation model and sequence annotation method - Google Patents

Sequence annotation model and sequence annotation method

Info

Publication number
CN112651245A
Authority
CN
China
Prior art keywords
sequence
model
vector
lstm
elmo
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011577267.9A
Other languages
Chinese (zh)
Inventor
王进
章韵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202011577267.9A priority Critical patent/CN112651245A/en
Publication of CN112651245A publication Critical patent/CN112651245A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention provides a sequence labeling model and a sequence labeling method. When the model performs a sequence labeling task, ELMo word vectors are first added to the input layer as additional features, so that the representation of each character is the concatenation of its character vector and its ELMo representation. Next, in the BiLSTM network layer, in addition to a forward LSTM network that learns the historical features of each character, the sequence is also fed in reverse order into a backward LSTM network to learn the subsequent features of each character; the context features of the characters are concatenated and input into the CRF layer. Finally, a conditional random field performs joint modeling to obtain the globally optimal label sequence. The method of the invention achieves good performance on Chinese named entity recognition data sets such as Boson and LDC2009, and the average F1 value is improved by 4.95%.

Description

Sequence annotation model and sequence annotation method
Technical Field
The invention relates to a sequence labeling model and a sequence labeling method, and belongs to the field of computer application and natural language processing.
Background
Named Entity Recognition (NER), also called "proper name recognition", refers to recognizing entities with specific meaning in text, mainly including names of people, places, organizations, proper nouns, and so on. Early named entity recognition was mostly rule-based, but because language structure itself is uncertain, formulating unified and complete rules is difficult. Rule-based methods require constructing specific rule templates; the adopted features include statistical information, punctuation marks, keywords, position words, head words, and so on, with pattern and string matching as the main means, and they depend heavily on the construction of knowledge bases and dictionaries. For different fields, experts must rewrite the rules, so the cost is high, the rule-building cycle is long, portability is poor, and knowledge bases for different fields must be built as an aid to improve the system's recognition capability.
Traditional named entity recognition methods mostly adopt supervised machine learning models, such as hidden Markov models, maximum entropy, support vector machines, and conditional random fields. The maximum entropy model has a rigorous structure and good generality, but training is computationally complex, and the explicit normalization it requires makes the computational overhead large. Conditional random fields perform well in word segmentation and named entity recognition and provide a labeling framework with flexible features and global optimization, but they suffer from slow convergence and long training time. Statistics-based methods depend heavily on feature selection: the features with the greatest influence on the task must be analyzed and selected from the text and added to a feature template, effective feature selection is performed by counting and analyzing the linguistic and semantic information contained in the training corpus, and strong features must be continually discovered from it. In word2vec (2013) and GloVe (2014), each word corresponds to a single vector, so such a vector cannot represent a polysemous word.
In view of the above, it is necessary to provide a sequence annotation model to solve the above problems.
Disclosure of Invention
The invention aims to provide a sequence annotation model for improving the performance of named entity recognition.
In order to achieve the above object, the present invention provides a sequence annotation model, wherein the sequence annotation model adopts the basic framework of the BiLSTM-CRF model and adds the word vectors of the pre-trained language model ELMo as additional features, and the sequence annotation model comprises:
an input layer: for inputting a sentence consisting of n characters (w_1 w_2 ... w_n), mapping each character in the sentence into a vector sequence by querying a word vector table, and introducing the ELMo word vector as an additional feature into the input layer;
BiLSTM network layer: comprising a forward long short-term memory network LSTM and a backward long short-term memory network LSTM, wherein the forward LSTM computes the representation of the sequence from front to back, the backward LSTM computes the representation of the same sequence from back to front, and the BiLSTM network layer receives the vector sequence obtained from the input layer as input so as to obtain the context features of each character;
CRF layer: for receiving the output of the BiLSTM network layer and introducing a transition matrix for global optimal decoding of the vector sequence of the entire sentence.
As a further refinement of the present invention, the sentences input into the input layer contain a variable number of characters.
As a further improvement of the present invention, the language model ELMo includes an English ELMo model and a Chinese ELMo model; in the English ELMo model, the representation of an English word combines the word vector of the English word with a word representation obtained by convolving its characters with a convolutional neural network, while in the Chinese ELMo model the representation of the character is encoded directly.
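As a non-limiting illustration (not part of the claimed model), the following Python (PyTorch) sketch shows the character-convolution idea used by the English ELMo model to build a word representation from its characters; the class name, vocabulary size, and dimensions are assumptions introduced only for this example.

import torch
import torch.nn as nn

class CharCNNWordEncoder(nn.Module):
    # Sketch: convolve over a word's character embeddings and max-pool,
    # yielding one word representation (sizes are illustrative assumptions).
    def __init__(self, n_chars=262, char_dim=16, n_filters=128, kernel=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=kernel, padding=1)

    def forward(self, char_ids):                   # (batch, word_len)
        x = self.char_emb(char_ids)                # (batch, word_len, char_dim)
        x = self.conv(x.transpose(1, 2))           # (batch, n_filters, word_len)
        return torch.relu(x).max(dim=2).values     # max-pool over the characters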
As a further improvement of the invention, the language model ELMo is a bidirectional LSTM language model, and the bidirectional LSTM language model comprises a forward language model LSTM and a backward language model LSTM; after pre-training the bidirectional LSTM language model, the language model ELMo uses the formula
R_k = { x_k^LM, →h_{k,j}^LM, ←h_{k,j}^LM | j = 1, ..., L }
as the word representation, wherein x_k^LM is a vector representing the character itself, and →h_{k,j}^LM and ←h_{k,j}^LM represent the forward language model LSTM and the backward language model LSTM representations, respectively, which together give 2L+1 representations.
Another object of the invention is to provide a sequence labeling method for better realizing the above sequence annotation model.
In order to achieve the above object, the present invention provides a sequence labeling method, which is applied to the above sequence labeling model, and specifically includes the following steps:
step 1, a characteristic representation stage: the input layer sets (w) the input sentence content through a random word vector or a word vector table initialized with a pre-trained word vector1w2...wn) Character w inkMapping to a sequence of vectors (v)1v2...vn) While introducing the word vector of ELMo as an additional feature, so that the character representation of each character is a concatenation of its character vector and the word vector representation of ELMo, i.e., wt=[vt,et],t∈[1,n]Then the vector sequence of the whole sentence input is (v)1v2...vn);
Step 2, an encoding stage: for the vector sequence (v_1 v_2 ... v_n) of the whole sentence output by the feature representation stage, the probability of each character w_k is obtained in the forward language model LSTM conditioned on (w_1 w_2 ... w_{k-1}); the probability of the vector sequence of the whole sentence is then obtained through a probability formula, the objective function is obtained by combining this with the formula of the backward language model LSTM, and the bidirectional LSTM language model is then used for encoding to obtain the feature vector h_t of each character in the whole sentence;
Step 3, a decoding stage: a transition matrix A_ij is defined to represent the score from label i to label j; from the feature vector h_t of each character, the score of each label is computed as o_t = W_o h_t + b_o; the score of the vector sequence (v_1 v_2 ... v_n) of the whole sentence is then computed, and the globally most likely tag sequence is selected by referring to the transition matrix A_ij.
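As a non-limiting illustration of how the three stages fit together, the following Python (PyTorch) sketch wires the feature representation (character vector concatenated with an ELMo vector), the BiLSTM encoding, and the per-label scoring consumed by the CRF layer; the class name, default dimensions, and the assumption that ELMo vectors are precomputed are all illustrative, and the CRF decoding itself is sketched after the decoding stage further below.

import torch
import torch.nn as nn

class ElmoBiLstmCrfTagger(nn.Module):
    # Sketch of the pipeline: input layer -> BiLSTM layer -> scores for the CRF layer.
    def __init__(self, vocab_size=5000, char_dim=100, elmo_dim=256,
                 hidden=128, n_tags=13):
        super().__init__()
        self.char_emb = nn.Embedding(vocab_size, char_dim)       # word vector table
        self.bilstm = nn.LSTM(char_dim + elmo_dim, hidden,
                              batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hidden, n_tags)                 # o_t = W_o h_t + b_o
        self.transitions = nn.Parameter(torch.zeros(n_tags, n_tags))  # A_ij

    def forward(self, char_ids, elmo_vecs):
        w = torch.cat([self.char_emb(char_ids), elmo_vecs], dim=-1)   # w_t = [v_t, e_t]
        h, _ = self.bilstm(w)                                         # context features h_t
        return self.emit(h), self.transitions       # inputs to CRF decoding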
As a further improvement of the present invention, the probability formula in step 2 is:
p(w_1, w_2, ..., w_n) = ∏_{k=1}^{n} p(w_k | w_1, w_2, ..., w_{k-1})
as a further improvement of the present invention, the formula of the backward language model LSTM in step 2 is:
p(w_1, w_2, ..., w_n) = ∏_{k=1}^{n} p(w_k | w_{k+1}, w_{k+2}, ..., w_n)
as a further improvement of the present invention, the objective function in step 2 is:
∑_{k=1}^{n} [ log p(w_k | w_1, ..., w_{k-1}; Θ_x, Θ_LSTM→, Θ_s) + log p(w_k | w_{k+1}, ..., w_n; Θ_x, Θ_LSTM←, Θ_s) ]
wherein Θ_x and Θ_s are the parameters of the ELMo word vectors (the convolutional neural network) and the parameters of the softmax layer, respectively.
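As a non-limiting illustration, the following Python (PyTorch) sketch computes the quantity maximized by this objective, given per-character prediction scores assumed to come from the pre-trained forward and backward LSTM language models; the function names and tensor shapes are assumptions.

import torch
import torch.nn.functional as F

def lm_log_likelihood(logits, targets):
    # One direction: sum_k log p(w_k | context); logits is (n, vocab_size),
    # targets holds the gold next-character ids (assumed shapes).
    log_probs = F.log_softmax(logits, dim=-1)
    return log_probs[torch.arange(targets.size(0)), targets].sum()

def bilm_objective(fwd_logits, bwd_logits, targets):
    # Sum of the forward and backward language-model log-likelihoods,
    # the objective maximized during ELMo pre-training.
    return lm_log_likelihood(fwd_logits, targets) + lm_log_likelihood(bwd_logits, targets)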
As a further improvement of the present invention, the formulas used for encoding by the bidirectional LSTM language model in step 2 include:
i_t = σ(W_ii x_t + b_ii + W_hi h_{t-1} + b_hi)
f_t = σ(W_if x_t + b_if + W_hf h_{t-1} + b_hf)
g_t = tanh(W_ig x_t + b_ig + W_hg h_{t-1} + b_hg)
o_t = σ(W_io x_t + b_io + W_ho h_{t-1} + b_ho)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(C_t)
where W and b are parameters of the bidirectional LSTM language model, σ is the sigmoid function, ⊙ denotes element-wise multiplication, i_t, f_t, o_t are the input, forget, and output gates at time t, and C_t, h_t, g_t are the cell state, output state, and new state at time t.
As a further improvement of the invention, the score of the vector sequence (v_1 v_2 ... v_n) of the whole sentence in step 3 is:
s(X, y) = ∑_{i=0}^{n} A_{y_i, y_{i+1}} + ∑_{i=1}^{n} o_{i, y_i}
definition of y0And yn+1Is the beginning and ending tags of a sentence, the probability of the vector sequence of the entire sentence is:
Figure BDA0002864276680000047
the log probability of maximizing the correct tag sequence is:
Figure BDA0002864276680000051
wherein Y_X represents all possible tag sequences; the most likely tag sequence is output by the prediction
y* = argmax_{y' ∈ Y_X} s(X, y').
The invention has the beneficial effects that the sequence labeling model accelerates the convergence of named entity recognition, shortens the training time, improves the performance of the sequence labeling model for named entity recognition, and improves the accuracy of named entity recognition.
Drawings
FIG. 1 is a schematic structural diagram of a sequence annotation model according to the present invention.
FIG. 2 is a schematic structural diagram of the ELMo model in FIG. 1.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The invention provides a sequence annotation model oriented to fine-grained named entity recognition. The pre-trained language model ELMo is added as a feature to the BiLSTM-CRF named entity recognition framework; on the basis of large-scale unsupervised data, pre-trained context-based word vectors are used to improve the performance of the named entity recognition model, and methods from natural language processing and machine learning are combined to improve the accuracy of named entity recognition.
As shown in fig. 1, the entire model includes an input layer, a BiLSTM network layer, and a CRF layer.
An input layer: for inputting a sentence consisting of n characters (w_1 w_2 ... w_n); each character in the sentence is mapped into a vector sequence by querying the word vector table, and the ELMo word vector is introduced into the input layer as an additional feature. The sentences input into the input layer contain a variable number of characters. Specifically, the characters contained in the sentence are mapped into a vector sequence through the word vector table, ELMo word vectors are added in the input layer, and the representation of each character, namely the concatenation of its character vector and its ELMo representation, is input into the BiLSTM layer.
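As a non-limiting illustration, the following Python (PyTorch) sketch shows the input-layer operation described above, concatenating the word-vector-table embedding of each character with its ELMo vector; the table size, dimensions, and the assumption that ELMo vectors are precomputed per sentence are illustrative only.

import torch
import torch.nn as nn

# Character vector table (sizes are illustrative assumptions).
char_table = nn.Embedding(num_embeddings=5000, embedding_dim=100)

def input_layer(char_ids, elmo_vectors):
    v = char_table(char_ids)                      # (n, 100) character vectors v_t
    return torch.cat([v, elmo_vectors], dim=-1)   # w_t = [v_t, e_t]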
BiLSTM network layer (Bi-directional Long Short-Term Memory): the bidirectional long short-term memory network layer comprises a forward LSTM and a backward LSTM; the forward LSTM computes the representation of the sequence from front to back, the backward LSTM computes the representation of the same sequence from back to front, and the BiLSTM network layer receives the vector sequence obtained from the input layer as input so as to obtain the context features of each character. Specifically, at time t the LSTM model consists of the input word w_t, the cell state C_t, the temporary cell state C̃_t, the hidden state h_t, the forget gate f_t, the memory gate i_t, and the output gate o_t. The computation of the LSTM can be summarized as follows: by forgetting part of the cell state and memorizing new information, information useful for later time steps is passed on, useless information is discarded, and a hidden state h_t is output at each time step; the forgetting, memorizing, and outputting are controlled by the forget gate f_t, the memory gate i_t, and the output gate o_t, which are computed from the hidden state h_{t-1} at the previous time step and the current input w_t.
CRF layer: for receiving the output of the BiLSTM network layer and introducing a transition matrix for globally optimal decoding of the vector sequence of the entire sentence. In particular, the CRF layer can add constraints to the final predicted tags to ensure that the predicted tag sequence is valid; these constraints are learned automatically by the CRF layer during training on the training data. The CRF layer receives the output scores of the BiLSTM layer as input, adds a transition score matrix, and selects the globally optimal label sequence according to the scores.
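As a non-limiting illustration of the role of the transition matrix, the following Python (PyTorch) sketch scores one candidate tag sequence by adding the emission scores from the BiLSTM layer and the transition scores A_ij; start and end transitions are omitted for brevity, and all shapes are assumptions.

import torch

def sentence_score(emissions, transitions, tags):
    # emissions: (n, n_tags) per-character label scores o_t;
    # transitions: (n_tags, n_tags) matrix A_ij; tags: list of n tag indices.
    score = emissions[0, tags[0]]
    for t in range(1, len(tags)):
        score = score + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    return score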
The language model ELMo includes an English ELMo model and a Chinese ELMo model. In the English ELMo model, the representation of an English word is formed from the word vector of the word together with a representation obtained by convolving its characters with a CNN; in the Chinese ELMo model, the representation of the character is encoded directly.
As shown in fig. 2, the language model ELMo is a bidirectional LSTM language model comprising a forward language model LSTM and a backward language model LSTM, and its objective function is the maximum likelihood of the two directional language models. After pre-training the bidirectional LSTM language model, ELMo uses the set
R_k = { x_k^LM, →h_{k,j}^LM, ←h_{k,j}^LM | j = 1, ..., L }
as the word representation, a combination of the intermediate layers of the bidirectional language model; in the simplest case, the representation of the highest layer can be used as the ELMo vector. When a supervised NLP task is then performed, the ELMo vector is concatenated as a feature to the word vector input of the task-specific model or to the representation of the model's highest layer. Unlike traditional word vectors, where each word corresponds to only one vector, ELMo uses a pre-trained bidirectional language model and obtains the context-dependent representation of the current word from that language model for the specific input; that is, the same word has different representations in different contexts, and these representations are then added as features to the specific supervised NLP model.
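As a non-limiting illustration, the following Python (PyTorch) sketch combines the layer representations of the bidirectional language model into a single ELMo vector for one token using softmax-normalized task weights, one common way of collapsing the layers (using only the top layer is the simplest special case); the argument names and shapes are assumptions.

import torch

def elmo_representation(layer_outputs, weights, gamma=1.0):
    # layer_outputs: list of per-layer vectors for one token (assumed shapes);
    # weights: one scalar weight per layer, normalized with a softmax.
    stacked = torch.stack(layer_outputs)                 # (n_layers, dim)
    norm = torch.softmax(weights, dim=0).unsqueeze(-1)   # (n_layers, 1)
    return gamma * (norm * stacked).sum(dim=0)           # single ELMo vector e_t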
In order to better realize the sequence labeling model, the invention also provides a sequence labeling method, which specifically comprises the following steps:
step 1, a characteristic representation stage: the input layer sets (w) the input sentence content through a random word vector or a word vector table initialized with a pre-trained word vector1w2...wn) Is mapped to a vector sequence (v)1v2...vn) While introducing the word vector of ELMo as an additional feature, so that the character representation of each character is a concatenation of its character vector and the word vector representation of ELMo, i.e., wt=[vt,et],t∈[1,n]Then the vector sequence of the whole sentence input is (v)1v2...vn);
Step 2, an encoding stage: for the vector sequence (v_1 v_2 ... v_n) of the whole sentence output by the feature representation stage, the probability of each character w_k is obtained in the forward language model LSTM conditioned on (w_1 w_2 ... w_{k-1}); the probability of the vector sequence of the whole sentence is then calculated through a probability formula, the objective function is obtained by combining this with the formula of the backward language model, and the bidirectional LSTM language model is then used for encoding to obtain the feature vector h_t of each character in the whole sentence;
Step 3, a decoding stage: a transition matrix A_ij is defined to represent the score from label i to label j; from each character feature vector h_t, the score of each label is computed as o_t = W_o h_t + b_o; the score of the vector sequence (v_1 v_2 ... v_n) of the whole sentence is then computed, and the globally most likely tag sequence is selected by referring to the transition matrix.
In step 2, specifically, for a sequence of n characters (w_1 w_2 ... w_n), the forward language model models w_k conditioned on (w_1 w_2 ... w_{k-1}), so the probability of the whole sentence sequence can be calculated as:
p(w_1, w_2, ..., w_n) = ∏_{k=1}^{n} p(w_k | w_1, w_2, ..., w_{k-1})
the backward language model is similar to the forward language model, and only the input sequence needs to be inverted, i.e. the context is predicted by:
Figure BDA0002864276680000081
since the bi-directional language model consists of a forward language model and a backward language model, the goal of model optimization is to maximize the sum of forward and backward language model probabilities:
Figure BDA0002864276680000082
wherein Θ_x and Θ_s are the parameters of the ELMo word vectors (the convolutional neural network) and the parameters of the softmax layer, respectively. ELMo is a combination of the BiLSTM network layers: for each character w_k, an L-layer bidirectional language model yields 2L+1 representations:
R_k = { x_k^LM, →h_{k,j}^LM, ←h_{k,j}^LM | j = 1, ..., L }
wherein x_k^LM is a vector representing the character itself, and →h_{k,j}^LM and ←h_{k,j}^LM are the representations from the forward language model LSTM and the backward language model LSTM, respectively, which together give the 2L+1 representations. To apply ELMo to the Chinese NER task, all layers of R_k are collapsed into a single vector, E_k = E(R_k; θ_e), so ELMo provides one vector, computed from the 2L+1 representations, for each input character. Secondly, in the BiLSTM layer, the computation is as follows:
i_t = σ(W_ii x_t + b_ii + W_hi h_{t-1} + b_hi)
f_t = σ(W_if x_t + b_if + W_hf h_{t-1} + b_hf)
g_t = tanh(W_ig x_t + b_ig + W_hg h_{t-1} + b_hg)
o_t = σ(W_io x_t + b_io + W_ho h_{t-1} + b_ho)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(C_t)
where W and b are parameters in the LSTM cell, σ is the sigmoid function, ⊙ denotes element-wise multiplication, i_t, f_t, o_t are the input gate, forget gate, and output gate at time t, and C_t, h_t, g_t are the cell state, output state, and new state at time t. Given a sentence (w_1 w_2 ... w_n) containing n characters, the outputs of the forward LSTM and the backward LSTM are concatenated to obtain the feature vector representation of the character at time t: h_t = [h_lt, h_rt].
In step 3, specifically, in the CRF layer, the output o_t computed from the feature h_t is not used to predict each label independently; instead, joint modeling is performed using the CRF. From the character feature h_t, the score of each label is computed as o_t = W_o h_t + b_o. Defining a transition matrix A_ij that represents the score of moving from tag i to tag j, the score of the vector sequence (v_1 v_2 ... v_n) of the entire sentence is
s(X, y) = ∑_{i=0}^{n} A_{y_i, y_{i+1}} + ∑_{i=1}^{n} o_{i, y_i}
where y_0 and y_{n+1} are the beginning and ending tags of the sentence. The probability of the vector sequence of the entire sentence is:
p(y | X) = exp(s(X, y)) / ∑_{y' ∈ Y_X} exp(s(X, y'))
During training, the log probability of the correct tag sequence is maximized:
log p(y | X) = s(X, y) - log ∑_{y' ∈ Y_X} exp(s(X, y'))
where Y_X represents all possible tag sequences. When decoding, the most likely tag sequence is output by the prediction
y* = argmax_{y' ∈ Y_X} s(X, y').
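As a non-limiting illustration of the decoding step, the following Python (PyTorch) sketch performs Viterbi decoding over the label scores and the transition matrix to select the globally best tag sequence; start and end transitions are omitted for brevity, and all shapes are assumptions. Training would instead maximize the log probability given above, which additionally requires the log-sum-exp over all tag sequences.

import torch

def viterbi_decode(emissions, transitions):
    # emissions: (n, n_tags) label scores o_t; transitions: (n_tags, n_tags) matrix A_ij.
    n, n_tags = emissions.shape
    score = emissions[0].clone()
    backpointers = []
    for t in range(1, n):
        total = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, idx = total.max(dim=0)        # best previous tag for each current tag
        backpointers.append(idx)
    best = [int(score.argmax())]
    for idx in reversed(backpointers):
        best.append(int(idx[best[-1]]))
    return list(reversed(best))              # most likely tag sequence y*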
In summary, the invention provides a sequence labeling model and, through the sequence labeling method, improves the performance of the sequence labeling model for named entity recognition, thereby improving the accuracy of named entity recognition. The model incorporates ELMo: unlike earlier approaches in which each word corresponds to a single vector, what is pre-trained in ELMo is not merely a word-to-vector mapping but a full trained model. In use, a sentence or a fragment of a sentence is fed into the model, and the model infers the word vector of each word from the surrounding text, so that polysemous words can be understood in combination with their preceding and following context.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.

Claims (10)

1. A sequence annotation model, characterized in that the basic framework adopted by the sequence annotation model is a BiLSTM-CRF model, and word vectors of a pre-trained language model ELMo are added as additional features, and the sequence annotation model comprises:
an input layer: for inputting a sentence consisting of n characters (w_1 w_2 ... w_n), mapping each character in the sentence into a vector sequence by querying a word vector table, and introducing the ELMo word vector as an additional feature into the input layer;
BiLSTM network layer: comprising a forward long short-term memory network LSTM and a backward long short-term memory network LSTM, wherein the forward LSTM computes the representation of the sequence from front to back, the backward LSTM computes the representation of the same sequence from back to front, and the BiLSTM network layer receives the vector sequence obtained from the input layer as input so as to obtain the context features of each character;
CRF layer: for receiving the output of the BiLSTM network layer and introducing a transition matrix for global optimal decoding of the vector sequence of the entire sentence.
2. The sequence annotation model of claim 1, wherein: the sentences input into the input layer contain a variable number of characters.
3. The sequence annotation model of claim 1, wherein: the language model ELMo comprises an English ELMo model and a Chinese ELMo model, wherein in the English ELMo model, an English word expression comprises a word vector of the English word and a word expression for performing convolution on characters in English through a convolution neural network; in the Chinese ELMo model, the representation of the character is directly encoded.
4. The sequence annotation model of claim 1, wherein: the language model ELMo is a bidirectional LSTM language model, and the bidirectional LSTM language model comprises a forward language model LSTM and a backward language model LSTM; after pre-training the bidirectional LSTM language model, the language model ELMo uses the formula
R_k = { x_k^LM, →h_{k,j}^LM, ←h_{k,j}^LM | j = 1, ..., L }
as the word representation, wherein x_k^LM is a vector representing the character itself, and →h_{k,j}^LM and ←h_{k,j}^LM represent the forward language model LSTM and the backward language model LSTM representations, respectively, which together give 2L+1 representations.
5. A sequence annotation method is applied to the sequence annotation model of any one of claims 1 to 4, and comprises the following specific steps:
step 1, a characteristic representation stage: the input layer sets (w) the input sentence content through a random word vector or a word vector table initialized with a pre-trained word vector1w2...wn) Character mapping ofIs emitted as a sequence of vectors (v)1v2...vn) While introducing the word vector of ELMo as an additional feature, so that the character representation of each character is a concatenation of its character vector and the word vector representation of ELMo, i.e., wt=[vt,et],t∈[1,n]Then the vector sequence of the whole sentence input is (v)1v2...vn);
Step 2, an encoding stage: for the vector sequence (v_1 v_2 ... v_n) of the whole sentence output by the feature representation stage, the probability of w_k is obtained in the forward language model LSTM conditioned on (v_1 v_2 ... v_{k-1}); the probability of the vector sequence of the whole sentence is then obtained through a probability formula, the objective function is obtained by combining this with the formula of the backward language model LSTM, and the bidirectional LSTM language model is then used for encoding to obtain the feature vector h_t of each character in the whole sentence;
Step 3, a decoding stage: a transition matrix A_ij is defined to represent the score from label i to label j; from the feature vector h_t of each character, the score of each label is computed as o_t = W_o h_t + b_o; the score of the vector sequence (v_1 v_2 ... v_n) of the whole sentence is then computed, and the globally most likely tag sequence is selected by referring to the transition matrix A_ij.
6. The sequence annotation method of claim 5, wherein the probability formula in step 2 is:
p(w_1, w_2, ..., w_n) = ∏_{k=1}^{n} p(w_k | w_1, w_2, ..., w_{k-1})
7. the sequence annotation method of claim 6, wherein the formula of the backward language model LSTM in step 2 is:
p(w_1, w_2, ..., w_n) = ∏_{k=1}^{n} p(w_k | w_{k+1}, w_{k+2}, ..., w_n)
8. the sequence annotation method of claim 7, wherein the objective function in step 2 is:
∑_{k=1}^{n} [ log p(w_k | w_1, ..., w_{k-1}; Θ_x, Θ_LSTM→, Θ_s) + log p(w_k | w_{k+1}, ..., w_n; Θ_x, Θ_LSTM←, Θ_s) ]
wherein Θ_x and Θ_s are the parameters of the ELMo word vectors (the convolutional neural network) and the parameters of the softmax layer, respectively.
9. The sequence annotation method of claim 8, wherein the formula encoded in step 2 by using the bi-directional LSTM language model comprises:
i_t = σ(W_ii x_t + b_ii + W_hi h_{t-1} + b_hi)
f_t = σ(W_if x_t + b_if + W_hf h_{t-1} + b_hf)
g_t = tanh(W_ig x_t + b_ig + W_hg h_{t-1} + b_hg)
o_t = σ(W_io x_t + b_io + W_ho h_{t-1} + b_ho)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(C_t)
where W and b are parameters of the bidirectional LSTM language model, σ is the sigmoid function, ⊙ denotes element-wise multiplication, i_t, f_t, o_t are the input, forget, and output gates at time t, and C_t, h_t, g_t are the cell state, output state, and new state at time t.
10. The sequence annotation method of claim 9, wherein the score of the vector sequence (v_1 v_2 ... v_n) of the whole sentence in step 3 is:
s(X, y) = ∑_{i=0}^{n} A_{y_i, y_{i+1}} + ∑_{i=1}^{n} o_{i, y_i}
wherein y_0 and y_{n+1} are defined as the beginning and ending tags of the sentence, and the probability of the vector sequence of the entire sentence is:
p(y | X) = exp(s(X, y)) / ∑_{y' ∈ Y_X} exp(s(X, y'))
the log probability of the correct tag sequence, which is maximized, is:
log p(y | X) = s(X, y) - log ∑_{y' ∈ Y_X} exp(s(X, y'))
wherein Y_X represents all possible tag sequences; the most likely tag sequence is output by the prediction
y* = argmax_{y' ∈ Y_X} s(X, y').
CN202011577267.9A 2020-12-28 2020-12-28 Sequence annotation model and sequence annotation method Pending CN112651245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011577267.9A CN112651245A (en) 2020-12-28 2020-12-28 Sequence annotation model and sequence annotation method

Publications (1)

Publication Number Publication Date
CN112651245A true CN112651245A (en) 2021-04-13

Family

ID=75363348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011577267.9A Pending CN112651245A (en) 2020-12-28 2020-12-28 Sequence annotation model and sequence annotation method

Country Status (1)

Country Link
CN (1) CN112651245A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN108460013A (en) * 2018-01-30 2018-08-28 大连理工大学 A kind of sequence labelling model based on fine granularity vocabulary representation model
CN109117472A (en) * 2018-11-12 2019-01-01 新疆大学 A kind of Uighur name entity recognition method based on deep learning
CN115114924A (en) * 2022-06-17 2022-09-27 珠海格力电器股份有限公司 Named entity recognition method, device, computing equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MATTHEW E. PETERS et al.: "Deep contextualized word representations", Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, page 2227 *
张栋 et al.: "Chinese Named Entity Recognition Based on Context-Dependent Character Vectors" (基于上下文相关字向量的中文命名实体识别), Computer Science (计算机科学), pages 1-12 *
胡万亭 et al.: "An Organization Name Recognition Method Based on an Improved ELMo Model" (一种基于改进ELMO模型的组织机构名识别方法), Computer Technology and Development (计算机技术与发展), pages 25-29 *

Similar Documents

Publication Publication Date Title
CN108460013B (en) Sequence labeling model and method based on fine-grained word representation model
CN108268444B (en) Chinese word segmentation method based on bidirectional LSTM, CNN and CRF
CN109657239B (en) Chinese named entity recognition method based on attention mechanism and language model learning
CN112100351A (en) Method and equipment for constructing intelligent question-answering system through question generation data set
CN110609891A (en) Visual dialog generation method based on context awareness graph neural network
CN109408812A (en) A method of the sequence labelling joint based on attention mechanism extracts entity relationship
CN110737758A (en) Method and apparatus for generating a model
CN112487820B (en) Chinese medical named entity recognition method
CN110555084A (en) remote supervision relation classification method based on PCNN and multi-layer attention
CN114943230B (en) Method for linking entities in Chinese specific field by fusing common sense knowledge
CN112101031B (en) Entity identification method, terminal equipment and storage medium
CN113204611A (en) Method for establishing reading understanding model, reading understanding method and corresponding device
CN112183083A (en) Abstract automatic generation method and device, electronic equipment and storage medium
CN114153971A (en) Error-containing Chinese text error correction, identification and classification equipment
CN114239574A (en) Miner violation knowledge extraction method based on entity and relationship joint learning
CN113743099A (en) Self-attention mechanism-based term extraction system, method, medium and terminal
CN112966073A (en) Short text matching method based on semantics and shallow features
CN113326702A (en) Semantic recognition method and device, electronic equipment and storage medium
CN111145914A (en) Method and device for determining lung cancer clinical disease library text entity
CN114510946A (en) Chinese named entity recognition method and system based on deep neural network
CN114443813A (en) Intelligent online teaching resource knowledge point concept entity linking method
US20210303777A1 (en) Method and apparatus for fusing position information, and non-transitory computer-readable recording medium
CN113641809A (en) XLNET-BiGRU-CRF-based intelligent question answering method
WO2023116572A1 (en) Word or sentence generation method and related device
CN116680407A (en) Knowledge graph construction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination