CN111611802A - Multi-field entity identification method - Google Patents
- Publication number
- CN111611802A (application CN202010437407.6A)
- Authority
- CN
- China
- Prior art keywords
- model
- sequence
- label
- data
- labeling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a multi-domain entity recognition method. The patent makes two main innovations: 1. for cross-domain scenarios in which the target domain has no manually labeled data at all, weakly labeled target-domain data are constructed quickly and automatically; 2. partial annotation learning is applied to the cross-domain named entity recognition task. Advantages: even when the target domain has no manually labeled data, the domain-adaptation capability of the source-domain model is effectively improved, and target-domain entity recognition performance is improved while data labeling costs are reduced.
Description
Technical Field
The invention relates to the field of entity recognition, and in particular to a multi-domain entity recognition method.
Background
Named entity recognition refers to identifying entities with specific meanings in text. In recent years, neural network methods have greatly improved performance on named entity recognition tasks. In practical applications, however, when the input text comes from a domain different from that of the training corpus, deep neural network models often generalize poorly.
The main difficulties of cross-domain named entity recognition are: 1) entity names are diverse, and many entities that never appear in the source domain can appear in the target domain; 2) language usage differs from the standard expression of the news domain, and the data distribution of each domain's corpus differs: for example, social media text is heavily colloquial, while medical text contains many technical terms.
Current cross-domain named entity recognition methods fall roughly into two categories: 1) learning domain-independent features within a multi-task learning framework; 2) initializing the target-domain model with parameters trained on the source domain and then continuing training on target-domain data.
1. Cross-domain named entity recognition based on multi-task learning
The model is mainly divided into three parts: 1) word vector representation layer: converts the input words/phrases into continuous vector representations; 2) feature extraction layer: obtains each word's score for each label through a bidirectional long short-term memory network and a linear transformation; 3) prediction layer: predicts the output label sequence for the current input.
To extract features that are domain-independent but task-relevant, this method shares the word vector representation layer and the feature extraction layer between the source-domain and target-domain models. The CRF layer is not shared, since different domains may output different label sets. The model is then trained separately on the manually labeled data of the source domain and of the target domain. Experiments show that joint training with layers shared across the two domains effectively extracts domain-independent features and thus improves target-domain entity recognition.
2. Cross-domain named entity recognition based on parameter initialization
The method comprises the following steps:
1. and training in a source field with large-scale manual labeling data to obtain a model A.
2. Model B has the same model structure, and is initialized using the parameters of model a.
3. And continuing training the model B on the limited manual labeling data of the target field, and fitting the characteristics of the target field.
Experiments prove that the method can effectively improve the entity recognition performance of the target field, and the entity recognition performance of the fine-tuned model B to the target field is obviously superior to that of the model A.
The conventional techniques have the following problems:
1. Manually annotated corpora in the target domain are required. In practice, large-scale high-quality labeled corpora are expensive to obtain. Moreover, there are many specialized subdomains, and each new domain requires labeling a certain amount of data, which is very costly. When the target domain has no labeled data, most existing domain-transfer techniques cannot be applied effectively.

2. Unlabeled target-domain data go unused. Large-scale unlabeled data are cheap to acquire and contain rich semantic information, yet most existing domain-transfer techniques do not exploit them.
Disclosure of Invention
The technical problem addressed by the invention is to provide a multi-domain entity recognition method that, when the target domain has no manually labeled data, automatically generates high-quality weakly labeled target-domain data and models it, thereby improving target-domain named entity recognition performance.
In order to solve this technical problem, the invention provides a multi-domain entity recognition method comprising the following steps: to reduce the transfer difficulty caused by differing data distributions, two methods label the unlabeled target-domain corpus simultaneously; labels on which both methods agree are kept with high confidence, and uncertain positions receive a special label, yielding weakly labeled target-domain data; because the weakly labeled corpus contains uncertain labels, an ordinary CRF layer cannot model it, so partial annotation learning is applied to model the weakly labeled corpus;
automatic labeling:
searching, with an external entity dictionary and a forward maximum matching mechanism, for entities that may appear in the text; labeling successfully matched spans as entities and labeling unmatched characters "O";

training a model on source-domain data and using it directly to label the unlabeled target-domain text, as the result of the second automatic labeling method;

comparing the labeling results of the two methods and keeping labels on which they agree; labeling conflicting positions "U", meaning "Unknown", i.e. the label of that character is uncertain and may be any possible label; the result is the final weakly labeled target-domain corpus;
named entity identification based on local labeling:
the model treats the recognition task as a sequence labeling task: the input is a Chinese character sequence and the output is a label sequence;

in the model, the input character sequence is first encoded by a bidirectional long short-term memory network (BiLSTM) to construct features, which are then combined and fed to a partial CRF layer for label prediction; the whole model has three main parts: 1) word vector representation layer: represents the input character string as continuous vectors via an embedding table; 2) feature extraction layer: obtains each character's score for each label through the BiLSTM and a linear transformation; 3) prediction layer: uses a partial CRF to predict the output label sequence for the current input;

the model has two states, training and prediction; during training, the system computes a label sequence for each input training sentence, which at first certainly differs greatly from the correct label sequence, i.e. the initial model performs poorly; the model then computes a loss from its own prediction and the correct answer and updates the parameters by backpropagation, aiming to minimize the loss; as training progresses, the model predicts the label sequence better and better until performance peaks.
In one embodiment, the labels take the BIOES form, where B-XX denotes the first Chinese character of an entity of category XX, E-XX the last Chinese character, I-XX a middle character, and S-XX a single-character entity of category XX; all other Chinese characters are labeled "O".
In one embodiment, the word vector representation layer converts discrete input Chinese characters into continuous vector representations; a mapping table stores the vector representation of each Chinese character; the vectors may be initialized with random numbers or set to pre-trained word vectors; during training, the table contents are model parameters optimized together with the other parameters; a given sentence C = <c1, c2, ..., cn> is mapped to the vector sequence <x1, x2, ..., xn>.
In one embodiment, the feature extraction layer encodes the input vector sequence with a bidirectional long short-term memory network to obtain feature representations; a unidirectional LSTM encodes only past information, not future information; to take both contexts into account, forward and backward LSTMs encode the sentence simultaneously; for the t-th Chinese character, the forward and backward LSTMs produce hidden representations that are concatenated into the final hidden state h_t; the score P of each label for each character is then computed as:

P = W_mlp · h_t + b_mlp

where W_mlp and b_mlp are model parameters.
In one embodiment, in the prediction layer, the label at some positions of the partially labeled data may take several values, so a sentence may have more than one correct label sequence; for example, if the partial label data of a sentence is ({B}, {B, I, E, O, S}, {B, I, E, O, S}, {O}, {O}, {O}, {O}), there are 5 × 5 = 25 correct label sequences;
given sentence C ═<c1,c2,...,cn>If the corresponding tag sequence y is equal to<y1,y2,...,yn>Then define the sentence score as:
where A is a matrix of recorded transfer scores, Ai,jRepresents the score for a transition from label i to label j; p is the output of the classification layer,indicating that the ith position is marked with a label yiA fraction of (d);
define Y_L as the set of all correct sequences; the score of the set Y_L is defined as:

score(Y_L) = log Σ_{y ∈ Y_L} exp(score(C, y))

where Y_C denotes the set of all possible sequences when the input is C;

the loss function also applies to fully labeled data: when the set Y_L has size 1, there is only one correct sequence, corresponding to the fully labeled case; the model can therefore process fully labeled and partially labeled data simultaneously.
In one embodiment, during training we want to maximize the total probability of the correct sequences; the loss function is therefore defined as:

loss = -log ( Σ_{y ∈ Y_L} exp(score(C, y)) / Σ_{y ∈ Y_C} exp(score(C, y)) )
in one embodiment, the sequence with the highest score is solved as a model prediction result by using a Viterbi algorithm during testing.
Based on the same inventive concept, the present application also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods when executing the program.
Based on the same inventive concept, the present application also provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of any of the methods.
Based on the same inventive concept, the present application further provides a processor for executing a program, wherein the program executes to perform any one of the methods.
The beneficial effects of the invention are:

Even when the target domain has no manually labeled data at all, the domain-adaptation capability of the source-domain model is effectively improved, and target-domain entity recognition performance is improved while data labeling costs are reduced.
Drawings
Fig. 1 is a schematic diagram of a domain migration method based on multitask learning in the background of the invention.
FIG. 2 is a partially labeled example diagram of the multi-domain entity identification method of the present invention.
Detailed Description
The present invention is further described below with reference to the figures and specific embodiments so that those skilled in the art can better understand and practice it; the embodiments do not limit the invention.
This patent makes the following two main innovations:

1. For cross-domain scenarios in which the target domain has no manually labeled data at all, weakly labeled target-domain data are constructed quickly and automatically.

2. Partial annotation learning is applied to the cross-domain named entity recognition task.
To reduce the transfer difficulty caused by differing data distributions, two methods label the unlabeled target-domain corpus simultaneously; labels on which both methods agree are kept with high confidence, and uncertain positions receive a special label, yielding weakly labeled target-domain data. Because the weakly labeled corpus contains uncertain labels, an ordinary CRF layer cannot model it, so partial annotation learning is applied.
1. Automatic labeling
1.1 entity dictionary
We use an external entity dictionary with a forward maximum matching mechanism to find entities that may appear in the text. Successfully matched spans are labeled as entities; unmatched characters are labeled "O".
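The dictionary-matching step above can be sketched as follows (a minimal illustration: the function name, the toy dictionary, and the single tag 'E' for matched characters are assumptions; the patent assigns per-category BIOES tags):

```python
def forward_max_match(text, entity_dict, max_len=6):
    """Label text via forward maximum matching against a dictionary.

    Returns one label per character: 'E' for characters inside a
    matched dictionary entity, 'O' otherwise.
    """
    labels = ['O'] * len(text)
    i = 0
    while i < len(text):
        matched = False
        # Try the longest window first, shrinking until a match is found.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in entity_dict:
                for k in range(i, j):
                    labels[k] = 'E'
                i = j
                matched = True
                break
        if not matched:
            i += 1  # no entity starts here; move on one character
    return labels
```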
1.2, Source Domain model
A model is trained on the source-domain data and used directly to label the unlabeled target-domain text; this is the result of the second automatic labeling method.
1.3, Cross-comparison
Table 1 example of automatic labeling method
The labeling results of the two methods are compared, and labels on which both agree are kept; conflicting positions are labeled "U", meaning "Unknown", i.e. the label of that character is uncertain and may be any possible label. The result is the final weakly labeled target-domain corpus. Table 1 shows each method's labeling results when transferring from the news domain to social media.
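The cross-comparison can be sketched as a per-position merge of the two automatic labelings (function and variable names are illustrative):

```python
def cross_compare(dict_labels, model_labels):
    """Keep labels the two methods agree on; mark conflicts 'U' (Unknown)."""
    return [a if a == b else 'U' for a, b in zip(dict_labels, model_labels)]

# Dictionary labeler and source-domain model disagree at position 1:
merged = cross_compare(['B-PER', 'I-PER', 'O'], ['B-PER', 'E-PER', 'O'])
# merged == ['B-PER', 'U', 'O']
```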
2. Named entity recognition based on local annotation
The model treats the recognition task as a sequence labeling task: the input is a Chinese character sequence and the output is a label sequence. The labels take the BIOES form, where B-XX denotes the first Chinese character of an entity of category XX, E-XX the last Chinese character, I-XX a middle character, and S-XX a single-character entity of category XX; all other Chinese characters are labeled "O".
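As an illustration of the BIOES scheme, entity spans can be converted to tags like so (a sketch; the span format `(start, end_exclusive, type)` is an assumption for the example):

```python
def spans_to_bioes(length, spans):
    """Convert entity spans to BIOES tags; non-entity characters get 'O'."""
    tags = ['O'] * length
    for s, e, t in spans:
        if e - s == 1:
            tags[s] = f'S-{t}'          # single-character entity
        else:
            tags[s] = f'B-{t}'          # first character
            tags[e - 1] = f'E-{t}'      # last character
            for i in range(s + 1, e - 1):
                tags[i] = f'I-{t}'      # middle characters
    return tags
```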
In the model, the input Chinese character sequence is first encoded by a bidirectional long short-term memory network (BiLSTM) to construct features, which are then combined and fed to a partial CRF layer for label prediction. The whole model has three main parts: 1) word vector representation layer: represents the input character string as continuous vectors via an embedding table; 2) feature extraction layer: obtains each character's score for each label through the BiLSTM and a linear transformation; 3) prediction layer: uses a partial CRF to predict the output label sequence for the current input.
Word vector representation layer: converts discrete input Chinese characters into continuous vector representations. We use a mapping table that stores the vector representation of each Chinese character. The vectors may be initialized with random numbers or set to pre-trained word vectors. During training, the table contents are model parameters optimized together with the other parameters. A given sentence C = <c1, c2, ..., cn> is mapped to the vector sequence <x1, x2, ..., xn>.
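A minimal sketch of the embedding lookup, assuming NumPy and a toy vocabulary (the table values here are random stand-ins; in the model they are trained parameters or pre-trained vectors):

```python
import numpy as np

# Illustrative character vocabulary and embedding table.
vocab = {'中': 0, '国': 1, '人': 2}
d = 4                                    # embedding dimension
rng = np.random.default_rng(0)
E = rng.standard_normal((len(vocab), d)) # one row per character

def embed(sentence):
    """Map sentence C = <c1, ..., cn> to the vector sequence <x1, ..., xn>."""
    return np.stack([E[vocab[c]] for c in sentence])

X = embed('中国')
assert X.shape == (2, d)
```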
Feature extraction layer: based on the input vector sequence, we encode with a bidirectional long short-term memory network (LSTM) to obtain feature representations. A unidirectional LSTM encodes only past information, not future information; to take both contexts into account, we apply forward and backward LSTMs to the sentence simultaneously. For the t-th Chinese character, the forward and backward LSTMs produce hidden representations that are concatenated into the final hidden state h_t. The score P of each label for each character is then computed as:

P = W_mlp · h_t + b_mlp

where W_mlp and b_mlp are model parameters.
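The linear scoring step P = W_mlp · h_t + b_mlp can be sketched with NumPy, using random stand-ins for the concatenated BiLSTM hidden states (all sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, L = 5, 8, 9            # sentence length, LSTM hidden size, label count
# Stand-ins for the concatenated forward/backward hidden states h_t:
H = rng.standard_normal((n, 2 * d))
W_mlp = rng.standard_normal((2 * d, L))
b_mlp = np.zeros(L)
P = H @ W_mlp + b_mlp        # P[i, j]: score of label j at character i
assert P.shape == (n, L)
```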
Prediction layer: in partially labeled data, the label at some positions may take several values, so a sentence may have more than one correct label sequence. As shown in FIG. 2, if the partial label data of a sentence is ({B}, {B, I, E, O, S}, {B, I, E, O, S}, {O}, {O}, {O}, {O}), there are 5 × 5 = 25 correct label sequences.
Given a sentence C = <c1, c2, ..., cn> and a corresponding label sequence y = <y1, y2, ..., yn>, the sentence score is defined as:

score(C, y) = Σ_{i=2..n} A[y_{i-1}, y_i] + Σ_{i=1..n} P[i, y_i]

where A is a matrix of transition scores, A[i, j] is the score of transitioning from label i to label j, P is the output of the classification layer, and P[i, y_i] is the score of assigning label y_i to position i.
Define Y_L as the set of all correct sequences; the score of the set Y_L is defined as:

score(Y_L) = log Σ_{y ∈ Y_L} exp(score(C, y))
during the training process, we want to maximize the probability of the sum of all correct sequence scores. Therefore, the loss function is defined as follows:
wherein, YCRepresenting the set of all possible sequences for the case where the input is C.
The loss function still applies to fully labeled data: when the set Y_L has size 1, there is only one correct sequence, which corresponds to the fully labeled case. The model can therefore process fully labeled and partially labeled data simultaneously.
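The partial CRF loss above can be illustrated by brute-force enumeration over all label sequences (workable only for tiny inputs; in practice both sums are computed with dynamic programming; function names are illustrative):

```python
import itertools
import math

def seq_score(P, A, y):
    """Emission + transition score of one label sequence y."""
    s = sum(P[i][y[i]] for i in range(len(y)))
    s += sum(A[y[i - 1]][y[i]] for i in range(1, len(y)))
    return s

def partial_crf_loss(P, A, allowed):
    """loss = -log( sum over Y_L of exp(score) / sum over Y_C of exp(score) ).

    allowed[i] is the set of labels permitted at position i; a 'U'
    position allows every label.
    """
    n, L = len(P), len(P[0])
    z_c = z_l = 0.0
    for y in itertools.product(range(L), repeat=n):
        e = math.exp(seq_score(P, A, y))
        z_c += e                                   # denominator: all sequences
        if all(y[i] in allowed[i] for i in range(n)):
            z_l += e                               # numerator: correct sequences
    return -math.log(z_l / z_c)
```

When every label is allowed at every position, Y_L equals Y_C and the loss is exactly zero; restricting the label sets makes the loss positive, which is what training then minimizes.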
At test time, the Viterbi algorithm is used to find the highest-scoring sequence as the model's prediction.
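Viterbi decoding over the emission scores P and transition scores A can be sketched as follows (a minimal implementation for illustration):

```python
def viterbi(P, A):
    """Return the highest-scoring label sequence for emission scores P
    (n x L, one row per character) and transition scores A (L x L)."""
    n, L = len(P), len(P[0])
    score = list(P[0])            # best score ending in each label so far
    back = []                     # backpointers per position
    for i in range(1, n):
        new_score, ptr = [], []
        for j in range(L):
            # Best previous label k for current label j.
            best_k = max(range(L), key=lambda k: score[k] + A[k][j])
            new_score.append(score[best_k] + A[best_k][j] + P[i][j])
            ptr.append(best_k)
        score = new_score
        back.append(ptr)
    # Trace back from the best final label.
    path = [max(range(L), key=lambda j: score[j])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```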
The model has two states, training and prediction (prediction is the actual use of the model). During training, the system computes a label sequence for each input training sentence; at first this certainly differs greatly from the correct sequence, i.e. the initial model performs poorly. The model then computes a difference (loss) between its prediction and the correct answer and updates the parameters by backpropagation, aiming to minimize the loss. As training progresses, the model predicts the label sequence better and better until performance peaks (a cyclic, iterative process).
One application scenario of the present invention is described below:
taking the migration of news domain to social media domain as an example, the enumeration steps are as follows:
1. training is carried out on manual marking data in the news field, and a model A is obtained.
2. And simultaneously, marking original texts in the social media by using the model A and the entity dictionary, and performing cross comparison to obtain the weakly marked corpus in the field of the social media.
3. And training by using local labeling learning on the weakly labeled corpus to obtain a model B.
4. The application model B labels the text in the social media field, and the performance is obviously superior to that of the model A.
The above embodiments are merely preferred embodiments given to fully illustrate the invention; the scope of the invention is not limited to them. Equivalent substitutions or modifications made by those skilled in the art on the basis of the invention fall within its protection scope, which is defined by the claims.
Claims (10)
1. A multi-domain entity recognition method, characterized by comprising the following steps: to reduce the transfer difficulty caused by differing data distributions, two methods label the unlabeled target-domain corpus simultaneously; labels on which both methods agree are kept with high confidence, and uncertain positions receive a special label, yielding weakly labeled target-domain data; because the weakly labeled corpus contains uncertain labels, an ordinary CRF layer cannot model it, so partial annotation learning is applied to model the weakly labeled corpus;
automatic labeling:
searching, with an external entity dictionary and a forward maximum matching mechanism, for entities that may appear in the text; labeling successfully matched spans as entities and labeling unmatched characters "O";

training a model on source-domain data and using it directly to label the unlabeled target-domain text, as the result of the second automatic labeling method;

comparing the labeling results of the two methods and keeping labels on which they agree; labeling conflicting positions "U", meaning "Unknown", i.e. the label of that character is uncertain and may be any possible label; the result is the final weakly labeled target-domain corpus;
named entity identification based on local labeling:
the model treats the recognition task as a sequence labeling task: the input is a Chinese character sequence and the output is a label sequence;

in the model, the input character sequence is first encoded by a bidirectional long short-term memory network (BiLSTM) to construct features, which are then combined and fed to a partial CRF layer for label prediction; the whole model has three main parts: 1) word vector representation layer: represents the input character string as continuous vectors via an embedding table; 2) feature extraction layer: obtains each character's score for each label through the BiLSTM and a linear transformation; 3) prediction layer: uses a partial CRF to predict the output label sequence for the current input;

the model has two states, training and prediction; during training, the system computes a label sequence for each input training sentence, which at first certainly differs greatly from the correct label sequence, i.e. the initial model performs poorly; the model then computes a loss from its own prediction and the correct answer and updates the parameters by backpropagation, aiming to minimize the loss; as training progresses, the model predicts the label sequence better and better until performance peaks.
2. The multi-domain entity recognition method of claim 1, wherein the labels take the BIOES form, where B-XX denotes the first Chinese character of an entity of category XX, E-XX the last Chinese character, I-XX a middle character, and S-XX a single-character entity of category XX; all other Chinese characters are labeled "O".
3. The multi-domain entity recognition method of claim 1, wherein the word vector representation layer converts discrete input Chinese characters into continuous vector representations using a mapping table that stores the vector representation of each character; the vectors may be initialized with random numbers or set to pre-trained word vectors; during training, the table contents are model parameters optimized together with the other parameters; a given sentence C = <c1, c2, ..., cn> is mapped to the vector sequence <x1, x2, ..., xn>.
4. The multi-domain entity recognition method of claim 1, wherein the feature extraction layer encodes the input vector sequence with a bidirectional long short-term memory network to obtain feature representations; a unidirectional LSTM encodes only past information, not future information; to take both contexts into account, forward and backward LSTMs encode the sentence simultaneously; for the t-th Chinese character, the forward and backward LSTMs produce hidden representations that are concatenated into the final hidden state h_t; the score P of each label for each character is then computed as:

P = W_mlp · h_t + b_mlp

where W_mlp and b_mlp are model parameters.
5. The multi-domain entity recognition method of claim 1, wherein, in the prediction layer, the label at some positions of the partially labeled data may take several values, so a sentence may have more than one correct label sequence; for example, if the partial label data of a sentence is ({B}, {B, I, E, O, S}, {B, I, E, O, S}, {O}, {O}, {O}, {O}), there are 5 × 5 = 25 correct label sequences;

given a sentence C = <c1, c2, ..., cn> and a corresponding label sequence y = <y1, y2, ..., yn>, the sentence score is defined as:

score(C, y) = Σ_{i=2..n} A[y_{i-1}, y_i] + Σ_{i=1..n} P[i, y_i]

where A is a matrix of transition scores, A[i, j] is the score of transitioning from label i to label j, P is the output of the classification layer, and P[i, y_i] is the score of assigning label y_i to position i;
define Y_L as the set of all correct sequences; the score of the set Y_L is:

score(Y_L) = log Σ_{y ∈ Y_L} exp(score(C, y))

where Y_C denotes the set of all possible sequences when the input is C;

the loss function also applies to fully labeled data: when the set Y_L has size 1, there is only one correct sequence, corresponding to the fully labeled case; the model can therefore process fully labeled and partially labeled data simultaneously.
7. The method of claim 1, wherein, at test time, the Viterbi algorithm is used to find the highest-scoring sequence as the model prediction.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the program is executed by the processor.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
10. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010437407.6A CN111611802B (en) | 2020-05-21 | 2020-05-21 | Multi-field entity identification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010437407.6A CN111611802B (en) | 2020-05-21 | 2020-05-21 | Multi-field entity identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111611802A true CN111611802A (en) | 2020-09-01 |
CN111611802B CN111611802B (en) | 2021-08-31 |
Family
ID=72195877
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010437407.6A Active CN111611802B (en) | 2020-05-21 | 2020-05-21 | Multi-field entity identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111611802B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112733911A (en) * | 2020-12-31 | 2021-04-30 | 平安科技(深圳)有限公司 | Entity recognition model training method, device, equipment and storage medium |
CN112989801A (en) * | 2021-05-11 | 2021-06-18 | 华南师范大学 | Sequence labeling method, device and equipment |
CN112989811A (en) * | 2021-03-01 | 2021-06-18 | 哈尔滨工业大学 | BilSTM-CRF-based historical book reading auxiliary system and control method thereof |
CN113392659A (en) * | 2021-06-25 | 2021-09-14 | 携程旅游信息技术(上海)有限公司 | Machine translation method, device, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107122416A (en) * | 2017-03-31 | 2017-09-01 | 北京大学 | A Chinese event extraction method |
CN110348018A (en) * | 2019-07-16 | 2019-10-18 | 苏州大学 | Method for simple event extraction using partial learning |
CN110765775A (en) * | 2019-11-01 | 2020-02-07 | 北京邮电大学 | Domain-adaptive method for named entity recognition fusing semantics and label differences |
CN111143571A (en) * | 2018-11-06 | 2020-05-12 | 马上消费金融股份有限公司 | Entity labeling model training method, entity labeling method and device |
Non-Patent Citations (1)
Title |
---|
YAOSHENG YANG et al.: "Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning", Proceedings of the 27th International Conference on Computational Linguistics * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112733911A (en) * | 2020-12-31 | 2021-04-30 | 平安科技(深圳)有限公司 | Entity recognition model training method, device, equipment and storage medium |
WO2022142122A1 (en) * | 2020-12-31 | 2022-07-07 | 平安科技(深圳)有限公司 | Method and apparatus for training entity recognition model, and device and storage medium |
CN112733911B (en) * | 2020-12-31 | 2023-05-30 | 平安科技(深圳)有限公司 | Training method, device, equipment and storage medium of entity recognition model |
CN112989811A (en) * | 2021-03-01 | 2021-06-18 | 哈尔滨工业大学 | BiLSTM-CRF-based historical book reading auxiliary system and control method thereof |
CN112989801A (en) * | 2021-05-11 | 2021-06-18 | 华南师范大学 | Sequence labeling method, device and equipment |
CN112989801B (en) * | 2021-05-11 | 2021-08-13 | 华南师范大学 | Sequence labeling method, device and equipment |
CN113392659A (en) * | 2021-06-25 | 2021-09-14 | 携程旅游信息技术(上海)有限公司 | Machine translation method, device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111611802B (en) | 2021-08-31 |
Similar Documents
Publication | Title |
---|---|
CN111611802B (en) | Multi-field entity identification method |
CN110457675B (en) | Predictive model training method and device, storage medium and computer equipment | |
CN111783462B (en) | Chinese named entity recognition model and method based on double neural network fusion | |
CN109902145B (en) | Attention mechanism-based entity relationship joint extraction method and system | |
CN109800437B (en) | Named entity recognition method based on feature fusion | |
CN110083710B (en) | Word definition generation method based on cyclic neural network and latent variable structure | |
CN111666758B (en) | Chinese word segmentation method, training device and computer readable storage medium | |
CN111046179B (en) | Text classification method for open network question in specific field | |
CN110263325B (en) | Chinese word segmentation system | |
CN112487820B (en) | Chinese medical named entity recognition method | |
CN110852089B (en) | Operation and maintenance project management method based on intelligent word segmentation and deep learning | |
CN113190656B (en) | Chinese named entity extraction method based on multi-annotation frame and fusion features | |
CN113221571B (en) | Entity relation joint extraction method based on entity correlation attention mechanism | |
CN113128203A (en) | Attention mechanism-based relationship extraction method, system, equipment and storage medium | |
RU2712101C2 (en) | Predicting the probability of occurrence of a string using a sequence of vectors |
KR101646461B1 (en) | Method for Korean dependency parsing using deep learning |
CN115238026A (en) | Medical text subject segmentation method and device based on deep learning | |
CN112699685A (en) | Named entity recognition method based on label-guided word fusion | |
CN112016299A (en) | Method and device for generating dependency syntax tree by using neural network executed by computer | |
CN116362242A (en) | Small sample slot value extraction method, device, equipment and storage medium | |
CN116680407A (en) | Knowledge graph construction method and device | |
CN113704466B (en) | Text multi-label classification method and device based on iterative network and electronic equipment | |
CN115600597A (en) | Named entity identification method, device and system based on attention mechanism and intra-word semantic fusion and storage medium | |
CN112214994B (en) | Word segmentation method, device and equipment based on multi-level dictionary and readable storage medium | |
CN114519104A (en) | Action label labeling method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||