CN114510943B - Incremental named entity recognition method based on pseudo sample replay


Info

Publication number
CN114510943B
CN114510943B
Authority
CN
China
Prior art keywords
old
word
model
knowledge
new
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210150846.8A
Other languages
Chinese (zh)
Other versions
CN114510943A (en)
Inventor
Xia Yu (夏宇)
Li Sujian (李素建)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN202210150846.8A
Publication of CN114510943A
Application granted
Publication of CN114510943B
Active legal-status Current
Anticipated expiration legal-status

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/237 Lexical tools
    • G06F 40/242 Dictionaries
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/08 Learning methods


Abstract

The invention discloses an incremental named entity recognition method based on pseudo-sample replay, which underpins knowledge graph construction and belongs to the technical field of information extraction in natural language processing. In the learning stage, given a training set that labels only the new entity types, the old model serves as a teacher and a knowledge distillation loss is added to the conventional cross-entropy loss when training the new student model. In the review stage, pseudo samples are generated as review material for the old types; the old-type knowledge is reawakened by further distillation on this review material and integrated with the new knowledge. On the review material, the student model obtained in the learning stage provides supervision signals for the new types and the teacher provides supervision signals for the old types; with both new- and old-type supervision signals available, they are used to constrain the output of the new model on the review material.

Description

Incremental named entity recognition method based on pseudo sample replay
Technical Field
The invention provides an incremental named entity recognition technique, in particular a named entity recognition method based on pseudo-sample replay; it underpins knowledge graph construction and belongs to the technical field of information extraction in natural language processing.
Background
Traditional named entity recognition refers to extracting entities of specified categories (e.g., person names, place names, organization names) from unstructured text, and is one of the important steps of information extraction. Conventional approaches are limited to extracting entities of predefined categories; in reality, however, the categories of entities to be extracted tend to expand dynamically with demand. For example, new intents are sometimes encountered in dialogue systems, introducing new entity types, which requires the model to be able to identify a dynamically expanding set of entity types. To adapt to this scenario, a simple approach is to label a dataset with all entity types seen so far and use it to train a new model; however, this approach is too demanding in annotation effort, consumes too many computational resources, and may even be infeasible when there are particularly many entity types. Monaikul et al. therefore propose a setting with low annotation and computation requirements: at each step only one dataset labeled with the new entity types is provided, and a new model is trained with the help of the old-type entity knowledge stored in the old model.
This learning paradigm is also referred to as continual learning (lifelong learning, incremental learning) and, more specifically, belongs to class-incremental continual learning. However, continual learning techniques still have a certain gap from practical application, and the biggest challenge is the problem of catastrophic forgetting, i.e., the dramatic drop in a model's performance on old tasks when it learns new tasks. The cause of catastrophic forgetting is that, unlike humans, neural networks store task knowledge in their parameters; as the network learns new entity types, it inevitably updates parameters associated with old tasks, which degrades performance on those tasks. In addition to catastrophic forgetting, class-incremental continual learning also faces the class confusion problem, meaning that the model cannot distinguish different classes well. The problem arises because samples of different categories appear in different tasks: the model sees only part of the categories each time it is trained, and never models all categories at the same time.
Because there is no unified benchmark dataset for measuring named entity recognition in the continual learning scenario, the settings of related works are inconsistent; the setting proposed by Monaikul et al. fits practical application scenarios best. Monaikul et al. convert existing traditional named entity recognition datasets into a class-incremental setting: suppose that at step k the goal is to learn a new set of entity types E_k; the provided training dataset D_k labels only entities belonging to E_k, and entities of other, older types need not be labeled. To learn the new types without forgetting the old ones, Monaikul et al. regard the old model as a teacher and add a knowledge distillation loss to the conventional cross-entropy loss when training the new student model; the purpose of the knowledge distillation loss is to constrain the output of the student model on the old types with the output of the teacher model, so as to prevent the student model from forgetting the old knowledge. Although the above method has achieved initial success, it has the following drawback: the distillation-based approach relies on the number of old-type entities in the training dataset D_k; if D_k contains no old-type entities, it is difficult for the teacher model to distill the old knowledge into the student model.
Disclosure of Invention
In order to solve the problems of catastrophic forgetting and class confusion, the invention proposes a two-stage training framework, Learn-and-Review (L&R), which is inspired by the human learning process and introduces a "review stage" after the conventional "learning stage".
The technical scheme provided by the invention is as follows:
Referring to FIG. 1, the named entity recognition method based on pseudo-sample replay provided by the invention comprises a learning stage and a review stage. In the learning stage, given a training set that labels only the new entity types, the old model serves as a teacher and a knowledge distillation loss is added to the conventional cross-entropy loss when training the new student model. In the review stage, pseudo samples are generated as review material for the old types; the old-type knowledge is reawakened by further distillation on the review material and integrated with the new knowledge. The method specifically comprises the following steps:
1) In the learning stage, at step k, the current dataset D_k and the models M_{k-1} and G_{1:k-1} obtained in previous steps are available;
2) M_{k-1} is regarded as the teacher and the current model M̃_k as the student, and knowledge of the old entity types in M_{k-1} is distilled into M̃_k by knowledge distillation;
3) In the review stage, for each old task i ∈ {1, 2, …, k-1}, unlabeled text containing the old types E_i is generated;
4) The unlabeled text is fed to M_{k-1} and to the student M̃_k obtained in the first stage, yielding the output probability distributions P(x_i | θ_{k-1}, T) and P(x_i | θ̃_k, T) over all seen entity types;
5) The dimensions corresponding to the old seen types are taken from the output distribution of M_{k-1}, the dimensions corresponding to the new types are taken from the output distribution of M̃_k, and they are concatenated to obtain the pseudo target distribution q(x_i);
6) After the review stage, a model M_k is obtained that identifies all seen entity types E_{1:k}; the KL divergence between q(x_i) and the output distribution of M_k is calculated as the distillation loss:
L_pseudo = Σ_i KL( q(x_i) ∥ P(x_i | θ_k, T) )
7) Each word in dataset D_k is divided into two categories: words with an entity tag and words without one. For words with entity tags, the cross-entropy loss between the output of M_k and the entity tag is calculated:
L_CE = -Σ_{i: y_i ≠ O} log p(y_i | x_i; θ_k)
For words tagged O, the KL divergence between the output distribution of M_{k-1} and the output distribution of M_k is calculated:
L_KD = Σ_{i: y_i = O} KL( P(x_i | θ_{k-1}, T) ∥ P(x_i | θ_k, T) )
where P(x_i | θ_{k-1}, T) and P(x_i | θ_k, T) denote the output distributions of M_{k-1} and M_k respectively, and T denotes the temperature used in distillation to obtain smoother probability distributions;
8) The weighted sum of the three loss functions gives the total loss function in the review stage:
L_total = α·L_CE + β·L_KD + γ·L_pseudo
The invention generates old-type unlabeled text as review material, uses the student model obtained in the learning stage to provide new-type supervision signals on this material and the teacher to provide old-type supervision signals; with both new- and old-type supervision signals available, they are used to constrain the output of the new model on the review material.
Drawings
FIG. 1 is the overall framework of the invention;
FIG. 2 shows the dataset statistics;
FIG. 3 shows the main experimental results.
Detailed Description
The invention comprises a main model (M) for named entity recognition and a generator (G) for generating pseudo samples.
The main model: named entity recognition is typically modeled as a sequence labeling task, i.e., each word is assigned a tag. The main model of the invention consists of a feature extractor and a classification layer: the feature extractor adopts the pre-trained language model BERT-base, and the classification layer is a linear layer with softmax. Given a word sequence [x_1, x_2, ..., x_L] of length L and the tag [y_1, y_2, ..., y_L] of each word, the hidden vector [h_1, h_2, ..., h_L] of each word is first obtained through the feature extractor, the hidden vectors are then mapped to the label space [z_1, z_2, ..., z_L] through the linear layer, and the probability [p_1, p_2, ..., p_L] of each word over all types is obtained through softmax:
z_i = W h_i + b
p_i = softmax(z_i)
where W ∈ R^{m×d}, d is the hidden vector size of the pre-trained language model (d = 768), b ∈ R^m, and m is the size of the tag set, which depends on the tag scheme employed; the invention adopts the BIO scheme, so m = 2n+1, where n is the number of entity types and increases dynamically at each step.
The training objective of the main model is the cross-entropy loss, which encourages the model to correctly predict the tag of each word:
L_CE(θ) = -Σ_{i=1}^{L} log p(y_i | x_i; θ)
where p(y_i | x_i; θ) is the probability that word x_i belongs to tag y_i, and θ denotes all trainable parameters.
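For concreteness, the following is a minimal PyTorch sketch of such a tagging model (BERT-base feature extractor, linear classification layer with softmax, and the cross-entropy objective), assuming the Hugging Face transformers library; the class and function names are illustrative and not part of the invention.

    import torch.nn as nn
    from transformers import BertModel

    class NERTagger(nn.Module):
        def __init__(self, num_labels, bert_name="bert-base-cased"):
            super().__init__()
            self.encoder = BertModel.from_pretrained(bert_name)   # feature extractor
            hidden = self.encoder.config.hidden_size              # d = 768 for BERT-base
            self.classifier = nn.Linear(hidden, num_labels)       # z_i = W h_i + b

        def forward(self, input_ids, attention_mask):
            h = self.encoder(input_ids=input_ids,
                             attention_mask=attention_mask).last_hidden_state
            return self.classifier(h)                              # logits over the tag set

    def cross_entropy_loss(logits, labels, ignore_index=-100):
        # Encourages the model to predict the correct tag of every word.
        return nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=ignore_index)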
The generator is a language model consisting of an embedding layer, an LSTM layer and a classifier. Given a word sequence [x_1, x_2, ..., x_L] of length L, the word vector of each word is first obtained through the embedding layer; the invention adopts the FastText word vectors disclosed in Joulin A, Grave E, Bojanowski P, et al. Fasttext.zip: Compressing text classification models [J] (arXiv preprint arXiv:1612.03651, 2016). A hidden vector [h_1, h_2, ..., h_L] incorporating context information is then obtained through the LSTM layer, and finally the probability of the next word is obtained through a linear layer with softmax:
z_i = W h_i + b
P(x_{i+1} | x_1, ..., x_i) = [softmax(z_i)]_{index(x_{i+1})}
where z_i ∈ R^V, V is the size of the dictionary, determined by the dataset, and index(x_i) denotes the position of word x_i in the dictionary.
The training objective of the generator is a language modeling loss that minimizes the negative log-likelihood of predicting the next word:
L_LM = -Σ_i log P(x_{i+1} | x_1, ..., x_i)
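The following is a minimal sketch of such an LSTM language-model generator together with a simple sampling routine for producing unlabeled pseudo text; the FastText initialization of the embedding layer is omitted, and all dimensions and names are illustrative assumptions.

    import torch
    import torch.nn as nn

    class LMGenerator(nn.Module):
        def __init__(self, vocab_size, emb_dim=300, hidden_dim=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)   # may be initialized from FastText
            self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, vocab_size)    # z_i = W h_i + b

        def forward(self, input_ids):
            h, _ = self.lstm(self.embed(input_ids))
            return self.head(h)                              # logits over the dictionary

        @torch.no_grad()
        def sample(self, bos_id, max_len=30, temperature=1.0):
            # Stochastic generation of one unlabeled pseudo sentence.
            ids = [bos_id]
            for _ in range(max_len):
                logits = self.forward(torch.tensor([ids]))[0, -1] / temperature
                ids.append(torch.multinomial(torch.softmax(logits, dim=-1), 1).item())
            return ids

    def lm_loss(logits, input_ids):
        # Negative log-likelihood of predicting each next word.
        return nn.functional.cross_entropy(
            logits[:, :-1].reshape(-1, logits.size(-1)), input_ids[:, 1:].reshape(-1))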
Learning stage of the invention
Assume that at step k the available resources include the current dataset D_k and the models M_{k-1} and G_{1:k-1} obtained in previous steps. The goal of the learning stage is to obtain a model M̃_k that can identify all entity types seen so far, E_{1:k} = E_1 ∪ E_2 ∪ … ∪ E_k.
First, the current model M̃_k is initialized with the parameters of M_{k-1}, and its linear layer is extended to accommodate the new number of entity types. Specifically, it is extended from d × (2n+1) to d × (2n+2m+1), where n = |E_{1:k-1}| and m = |E_k| denote the number of old types and the number of new types, respectively.
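A sketch of this extension of the classification layer, copying the weights of the old label dimensions and leaving the 2m new dimensions randomly initialized, could look as follows (names are illustrative):

    import torch
    import torch.nn as nn

    def extend_classifier(old_linear, num_new_types):
        d = old_linear.in_features
        old_out = old_linear.out_features          # 2n + 1 under the BIO scheme
        new_out = old_out + 2 * num_new_types      # 2n + 2m + 1
        new_linear = nn.Linear(d, new_out)
        with torch.no_grad():
            # Copy old weights and biases into the first 2n+1 output dimensions.
            new_linear.weight[:old_out] = old_linear.weight
            new_linear.bias[:old_out] = old_linear.bias
        return new_linear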
Secondly, the invention regards M_{k-1} as the teacher and M̃_k as the student, and distills the knowledge of the old entity types in M_{k-1} into M̃_k by knowledge distillation. Specifically, each word in the dataset can be divided into two categories: one with an entity tag and one without (tag O). For words with entity tags, the invention calculates the cross-entropy loss between the output of M̃_k and the entity tag:
L_CE = -Σ_{i: y_i ≠ O} log p(y_i | x_i; θ̃_k)
Words tagged O may in fact be entities of an old type, but under the present setting this information is not labeled; for these words the invention calculates the KL divergence between the output distribution of M_{k-1} and the output distribution of M̃_k:
L_KD = Σ_{i: y_i = O} KL( P(x_i | θ_{k-1}, T) ∥ P(x_i | θ̃_k, T) )
where P(x_i | θ_{k-1}, T) and P(x_i | θ̃_k, T) denote the output distributions of M_{k-1} and M̃_k respectively, and T is the temperature used in distillation to obtain smoother probability distributions, set to 2 in the invention. To make the dimensions of the two output distributions identical, the invention pads the class dimension of the output of M_{k-1} with a small constant and then renormalizes it.
In summary, the total loss function of the learning stage is a weighted sum of the two loss functions:
L_learn = α·L_CE + β·L_KD
where the values of α and β are both set to 1.
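A sketch of the learning-stage loss computed per sentence is given below: cross-entropy on entity-tagged words plus a temperature-scaled KL term on O-tagged words, with the teacher distribution padded by a small constant over the new label dimensions and renormalized. The tensor layout, the O-tag id and the padding constant are illustrative assumptions.

    import torch.nn.functional as F

    def learning_stage_loss(student_logits, teacher_logits, labels,
                            o_tag_id=0, T=2.0, alpha=1.0, beta=1.0, eps=1e-8):
        # student_logits: [N, 2n+2m+1]; teacher_logits: [N, 2n+1]; labels: [N]
        entity_mask = labels != o_tag_id
        o_mask = ~entity_mask

        # Cross-entropy on words carrying an entity tag.
        ce = (F.cross_entropy(student_logits[entity_mask], labels[entity_mask])
              if entity_mask.any() else student_logits.new_zeros(()))

        # Pad the teacher distribution to the student's label dimension and renormalize.
        p_teacher = F.softmax(teacher_logits / T, dim=-1)
        pad = student_logits.size(-1) - teacher_logits.size(-1)
        p_teacher = F.pad(p_teacher, (0, pad), value=eps)
        p_teacher = p_teacher / p_teacher.sum(dim=-1, keepdim=True)

        # KL(teacher || student) on O-tagged words.
        log_p_student = F.log_softmax(student_logits / T, dim=-1)
        kd = (F.kl_div(log_p_student[o_mask], p_teacher[o_mask], reduction="batchmean")
              if o_mask.any() else student_logits.new_zeros(()))

        return alpha * ce + beta * kd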
Review stage of the invention
The purpose of the review stage is to reawaken the old-type knowledge and integrate it with the new-type knowledge by further distillation on old-type pseudo samples, yielding the final model M_k of step k.
First, for each old task i ∈ {1, 2, …, k-1}, the invention uses G_i to generate unlabeled text containing the old types E_i.
Secondly, the invention feeds this unlabeled text into M_{k-1} and into the student M̃_k obtained in the first stage, obtaining their output probability distributions P(x_i | θ_{k-1}, T) and P(x_i | θ̃_k, T) over all seen entity types.
The invention then takes the dimensions corresponding to the old types from the output distribution of M_{k-1} and the dimensions corresponding to the new types from the output distribution of M̃_k, and concatenates them to obtain the pseudo target distribution q(x_i).
Then, the KL divergence between q(x_i) and the output distribution of M_k is calculated as the distillation loss:
L_pseudo = Σ_i KL( q(x_i) ∥ P(x_i | θ_k, T) )
The losses of the learning stage are still calculated on D_k, now with the model being trained in the review stage:
L_CE and L_KD as defined above.
In summary, the total loss function of the review stage is a weighted sum of three loss functions:
L_review = α·L_CE + β·L_KD + γ·L_pseudo
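A sketch of the review-stage distillation on the generated review material follows: the teacher's probabilities over the old label dimensions and the learning-stage student's probabilities over the new label dimensions are concatenated into the target q, which constrains the final model M_k via a KL term. The renormalization of q and the tensor shapes are implementation assumptions.

    import torch
    import torch.nn.functional as F

    def review_distill_loss(final_logits, teacher_logits, student_logits,
                            num_old_dims, T=2.0):
        # final_logits, student_logits: [N, 2n+2m+1]; teacher_logits: [N, 2n+1]
        p_teacher = F.softmax(teacher_logits / T, dim=-1)   # supervision for the old types
        p_student = F.softmax(student_logits / T, dim=-1)   # supervision for the new types

        # Concatenate old-type dims from the teacher with new-type dims from the student.
        q = torch.cat([p_teacher[:, :num_old_dims],
                       p_student[:, num_old_dims:]], dim=-1)
        q = q / q.sum(dim=-1, keepdim=True)                 # renormalize the spliced target

        # KL(q || P(x | theta_k, T)) on the review material.
        return F.kl_div(F.log_softmax(final_logits / T, dim=-1), q,
                        reduction="batchmean")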
The invention is implemented with reference to the details provided by Monaikul et al., using BERT-base as the feature extractor and Hugging Face's PyTorch library as the programming framework. The program is run on a single GeForce RTX 3090 graphics card with a batch size of 32, a maximum sentence length of 128, a maximum of 20 training epochs, and early stopping with a patience of 3 epochs; Adam is used as the optimizer with a learning rate of 5e-5, the weights of the loss functions are set to 1, the generator in L&R generates 3000 samples by default, and 6 and 8 task sequences are sampled for CoNLL-03 and OntoNotes-5.0, respectively.
Preliminary experiments found that a significant improvement can be achieved even with a single-layer LSTM model as the generator, with an average running time of 10 min per task and a model size of about 50 MB per task.
The invention uses two common named entity recognition datasets: CoNLL-03, described in Sang E F, De Meulder F. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition [J] (arXiv preprint cs/0306050, 2003), and OntoNotes-5.0, described in Hovy E, Marcus M, Palmer M, et al. OntoNotes: the 90% solution [C] (Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, 2006: 57-60). CoNLL-03 comprises four entity types: person (PER), location (LOC), organization (ORG) and miscellaneous (MISC). Following Monaikul et al., the invention selects six representative entity types of OntoNotes-5.0, including person (PER), geo-political entity (GPE), organization (ORG), cardinal (CARD), and nationalities and religious political groups (NORP).
The invention adopts the following setting to simulate the data accumulation process in reality, performing the following operations on samples of the original dataset to construct the training/validation set of the kth task: for a sentence [x_1, x_2, …, x_L] of the original training/validation set and its tags [y_1, y_2, …, y_L], y_i is replaced with O if its entity type does not belong to E_k; the replaced tag sequence is denoted [ŷ_1, ŷ_2, …, ŷ_L], and if the replaced tags are not all O, the sentence is added to the training/validation set of the kth task. When constructing the test set of the kth task, E_k in the above procedure is replaced with E_{1:k}.
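As an illustration, a sketch of this construction under the BIO tag format ("B-TYPE"/"I-TYPE"/"O") is shown below; the data representation as (words, tags) pairs is an assumption.

    def build_task_split(sentences, allowed_types):
        # sentences: list of (words, tags) pairs; allowed_types: E_k (or E_{1:k} for the test set).
        task_data = []
        for words, tags in sentences:
            new_tags = [t if t != "O" and t.split("-", 1)[1] in allowed_types else "O"
                        for t in tags]
            if any(t != "O" for t in new_tags):   # keep only sentences that still contain entities
                task_data.append((words, new_tags))
        return task_data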
After the above operations, the statistics of the training/validation/test sets of each task are as shown in FIG. 2.
Following Monaikul et al., to evaluate the average performance of the model over all seen types, the macro-average F1 is used and the results over the sampled task sequences are averaged, defined as follows:
Macro-F1_k = (1/R) Σ_{r=1}^{R} (1/|E_{1:k}^r|) Σ_{e ∈ E_{1:k}^r} F1_{k,e}^r
where E_{1:k}^r denotes all entity types seen up to step k under the rth task sequence, F1_{k,e}^r denotes the F1 value of entity type e at step k in the rth task sequence, and R is the number of sampled task sequences.
To evaluate the model more comprehensively, the invention also measures the robustness of the model to the task order; the metric adopted is the error bound (EB), defined as follows:
EB = z · σ / √n
where z is the confidence coefficient at a given confidence level and σ is the standard deviation calculated over n different task orders; a lower error bound indicates lower sensitivity to the task order.
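For reference, a sketch of computing the two metrics is given below; the per-type F1 values are assumed to be available, and the confidence coefficient z = 1.96 (95% confidence) is an assumption.

    import math
    import statistics

    def macro_f1_at_step(per_sequence_f1):
        # per_sequence_f1[r]: dict mapping each entity type seen up to step k
        # in task sequence r to its F1 value.
        per_seq = [sum(f1s.values()) / len(f1s) for f1s in per_sequence_f1]
        return sum(per_seq) / len(per_seq)

    def error_bound(per_sequence_scores, z=1.96):
        sigma = statistics.stdev(per_sequence_scores)
        return z * sigma / math.sqrt(len(per_sequence_scores))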
The invention compares the proposed method against ExtendNER proposed by Monaikul et al. as the baseline, and also selects the "multi-task" setting, in which a single model is trained jointly on all seen types, to measure the upper-bound performance.
As shown in FIG. 3, the first and third rows show that the L&R framework proposed by the invention exceeds ExtendNER at every step on both datasets, and the improvement grows as the number of steps increases, because the method of the invention improves the result of each step and thereby alleviates the error propagation caused by distillation. In addition to the cumulative improvement, the invention also brings an immediate improvement right after each step completes the "review stage": the fifth row shows the performance of the model before the "review stage", the fourth row shows the performance after the "review stage", and the difference between them is the immediate improvement brought by the "review stage". The second and fourth rows of FIG. 3 also give the error bound of the models; the error bound of L&R is lower, indicating that the model of the invention is less sensitive to the task order.

Claims (1)

1. The incremental named entity recognition method is characterized by comprising a learning stage and a review stage, wherein in the learning stage, given a training set that labels only new entity types, the old model serves as a teacher and a knowledge distillation loss is added to the conventional cross-entropy loss when training the new student model; in the review stage, pseudo samples are generated as review material for the old types, and the old-type knowledge is reawakened by further distillation on the review material and integrated with the new knowledge; the method comprises the following specific steps:
1) In the learning stage, at step k, the current dataset D_k and the models M_{k-1} and G_{1:k-1} obtained in previous steps are available, where M is the main model and G is the generator for generating pseudo samples;
The main model consists of a feature extractor and a classification layer, wherein the feature extractor adopts the pre-trained language model BERT-base and the classification layer is a linear layer with softmax; given a word sequence [x_1, x_2, ..., x_L] of length L and the tag [y_1, y_2, ..., y_L] of each word, the hidden vector [h_1, h_2, ..., h_L] of each word is first obtained through the feature extractor, the hidden vectors are then mapped to the label space [z_1, z_2, ..., z_L] through the linear layer, and the probability [p_1, p_2, ..., p_L] of each word over all types is obtained through softmax:
z_i = W h_i + b
p_i = softmax(z_i)
where W ∈ R^{m×d}, d is the hidden vector size of the pre-trained language model (d = 768), b ∈ R^m, and m is the size of the tag set, depending on the tag scheme employed;
2) M_{k-1} is regarded as the teacher and the current model M̃_k as the student, and knowledge of the old entity types in M_{k-1} is distilled into M̃_k by knowledge distillation;
3) In the review stage, for each old task i ∈ {1, 2, …, k-1}, the generator G generates unlabeled text containing the old types E_i; the generator is a language model consisting of an embedding layer, an LSTM layer and a classifier; given a word sequence [x_1, x_2, ..., x_L] of length L, the word vector of each word is obtained through the embedding layer, the hidden vector [h_1, h_2, ..., h_L] incorporating context information is then obtained through the LSTM layer, and finally the probability of the next word is obtained through a linear layer with softmax:
z_i = W h_i + b
P(x_{i+1} | x_1, ..., x_i) = [softmax(z_i)]_{index(x_{i+1})}
where z_i ∈ R^V, V is the size of the dictionary, determined by the dataset, and index(x_i) denotes the position of word x_i in the dictionary;
the training objective of the generator is a language modeling loss that minimizes the negative log-likelihood of predicting the next word:
L_LM = -Σ_i log P(x_{i+1} | x_1, ..., x_i);
4) The unlabeled text is fed to M_{k-1} and to the student M̃_k obtained in the learning stage, yielding the output probability distributions P(x_i | θ_{k-1}, T) and P(x_i | θ̃_k, T) over all seen entity types, where θ denotes all trainable parameters and T is the temperature used in distillation to obtain smoother probability distributions;
5) The dimensions corresponding to the old types are taken from the output distribution of M_{k-1}, the dimensions corresponding to the new types are taken from the output distribution of M̃_k, and they are concatenated to obtain the pseudo target distribution q(x_i);
6) After the review stage, a model M_k is obtained that identifies all seen entity types E_{1:k}; the KL divergence between q(x_i) and the output distribution of M_k is calculated as the distillation loss:
L_pseudo = Σ_i KL( q(x_i) ∥ P(x_i | θ_k, T) )
7) Each word in dataset D_k is divided into two categories: one with an entity tag and one without; for words with entity tags, the cross-entropy loss between the output of M_k and the entity tag is calculated:
L_CE = -Σ_{i: y_i ≠ O} log p(y_i | x_i; θ_k)
where p(y_i | x_i; θ_k) is the probability that word x_i belongs to tag y_i;
for words without entity tags, the KL divergence between the output distribution of M_{k-1} and the output distribution of M_k is calculated:
L_KD = Σ_{i: y_i = O} KL( P(x_i | θ_{k-1}, T) ∥ P(x_i | θ_k, T) )
where P(x_i | θ_{k-1}, T) and P(x_i | θ_k, T) denote the output distributions of M_{k-1} and M_k respectively;
8) The weighted sum of the three loss functions gives the total loss function in the review stage:
L_total = α·L_CE + β·L_KD + γ·L_pseudo.
CN202210150846.8A 2022-02-18 2022-02-18 Incremental named entity recognition method based on pseudo sample replay Active CN114510943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210150846.8A CN114510943B (en) 2022-02-18 2022-02-18 Incremental named entity recognition method based on pseudo sample replay

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210150846.8A CN114510943B (en) 2022-02-18 2022-02-18 Incremental named entity recognition method based on pseudo sample replay

Publications (2)

Publication Number Publication Date
CN114510943A CN114510943A (en) 2022-05-17
CN114510943B true CN114510943B (en) 2024-05-28

Family

ID=81552221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210150846.8A Active CN114510943B (en) 2022-02-18 2022-02-18 Incremental named entity recognition method based on pseudo sample replay

Country Status (1)

Country Link
CN (1) CN114510943B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103853710A (en) * 2013-11-21 2014-06-11 北京理工大学 Coordinated training-based dual-language named entity identification method
CN107203511A (en) * 2017-05-27 2017-09-26 中国矿业大学 A kind of network text named entity recognition method based on neural network probability disambiguation
CN111783462A (en) * 2020-06-30 2020-10-16 大连民族大学 Chinese named entity recognition model and method based on dual neural network fusion
CN112257447A (en) * 2020-10-22 2021-01-22 北京众标智能科技有限公司 Named entity recognition system and recognition method based on deep network AS-LSTM
CN112633002A (en) * 2020-12-29 2021-04-09 上海明略人工智能(集团)有限公司 Sample labeling method, model training method, named entity recognition method and device
CN113408288A (en) * 2021-06-29 2021-09-17 广东工业大学 Named entity identification method based on BERT and BiGRU-CRF

Also Published As

Publication number Publication date
CN114510943A (en) 2022-05-17


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant