CN114036948B - Named entity identification method based on uncertainty quantification - Google Patents
- Publication number: CN114036948B (application CN202111246467.0A)
- Authority: CN (China)
- Prior art keywords: entity, uncertainty, model, representation, probability
- Legal status: Active (the legal status is an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06F40/295 — Handling natural language data; Natural language analysis; Recognition of textual entities; Phrasal analysis; Named entity recognition
- G06N3/044 — Neural networks; Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Neural networks; Combinations of networks
- G06N3/047 — Neural networks; Probabilistic or stochastic networks
- G06N3/08 — Neural networks; Learning methods
Abstract
The invention discloses a named entity recognition method based on uncertainty quantification, comprising the following steps. Step 1: collect a sample set localized to entity positions and build a candidate-entity detection model. Step 2: for the entities in the sample set, use BiLSTM and self-attention network structures, suited to long-range dependencies in text, to obtain representations of the entity's context features and of the entity's own features, respectively. Step 3: learn an uncertainty quantification model for entities using contrast loss and parameter sharing, yielding an uncertainty value for each entity. Step 4: convert each entity's uncertainty value into a dropout probability and, given a threshold, remove samples whose uncertainty exceeds it. Step 5: using the dropout probabilities from step 4, train a new named entity recognition model by introducing the Monte Carlo dropout training scheme from Bayesian neural networks.
Description
Technical Field
The invention relates to the technical field of computer applications, and in particular to a new method that quantifies the uncertainty of an entity's context features and of the entity's own features and uses it for named entity recognition.
Background
In machine learning there is always unavoidable uncertainty, of two main kinds: model uncertainty and data uncertainty. Model uncertainty stems from whether the chosen structure and model parameters can best describe the data distribution; data uncertainty stems from the fact that, even when the data are observed and evaluated accurately, noise remains in how the data were generated. For the supervised task of named entity recognition in particular, uncertainty in the supervision information itself can strongly affect the final recognition result.
In recent years, with the introduction of Bayesian neural networks, quantifying uncertainty has become possible. In computer vision, Bayesian uncertainty quantification has been applied to semantic segmentation and monocular depth estimation, where experimental comparisons showed that quantifying model and data uncertainty with a Bayesian neural network improves performance on both tasks. Bayesian neural networks have since been used to quantify uncertainty in natural language processing as well: experiments on sentiment analysis, named entity recognition, and language modeling showed that Bayesian uncertainty quantification improves all three tasks.
Although Bayesian neural networks have been used to quantify uncertainty in named entity recognition, the factors influencing that uncertainty and the strategies for quantifying it remain poorly defined for this task, and the resulting uncertainty estimates lack interpretability.
Disclosure of Invention
The invention aims to overcome these shortcomings of the prior art by providing a named entity recognition method based on uncertainty quantification.
This aim is achieved by the following technical scheme.
A named entity recognition method based on uncertainty quantification comprises the following steps:
Step 1: collect a sample set localized to entity positions and build a candidate-entity detection model.
Step 2: for the entities in the sample set, use BiLSTM and self-attention network structures, suited to long-range dependencies in text, to obtain representations of the entity's context features and of the entity's own features, respectively.
Step 3: learn an uncertainty quantification model for entities using contrast loss and parameter sharing, yielding an uncertainty value for each entity.
Step 4: convert each entity's uncertainty value into a dropout probability and, given a threshold, remove samples whose uncertainty exceeds it.
Step 5: using the dropout probabilities from step 4, train a new named entity recognition model by introducing the Monte Carlo dropout training scheme from Bayesian neural networks.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
1. The invention provides a new named entity recognition method based on uncertainty quantification. It identifies the main source of uncertainty in named entity recognition, namely the ambiguity between an entity's own features and its context features, quantifies entity uncertainty with a context-entity contrast loss, and introduces the Monte Carlo dropout training scheme of Bayesian neural networks to train the recognition model.
2. Once an entity's quantified uncertainty exceeds a given threshold, the sample is removed. This spares the model from samples whose features are too difficult to learn, and thereby accelerates the convergence of model training.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 shows the entity uncertainty measurement model of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and the specific examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In this embodiment, the execution environment of the uncertainty-quantification-based named entity recognition method is a server with a 3.0 GHz CPU, an NVIDIA GPU, and 16 GB of memory, and the program for quantifying uncertainty in named entity recognition is written in Python. This realizes the new method of quantifying uncertainty in named entity recognition from the ambiguity of entity context features and entity self-features; other execution environments may also be used and are not described here.
FIG. 1 is a flowchart of a new method for identifying named entities based on uncertainty quantization, which comprises the following steps:
Step 101: collecting and constructing a sample set for locating the entity position, and constructing a detection model of a candidate entity;
Step 102: for each sequence s_i = (s_{i,1}, ..., s_{i,n}) in the sample set, where s_{i,t}, ..., s_{i,t+l} is an entity, an ELMo model is used to obtain word-level word vectors, while a CNN is applied to the character-level features within each word to obtain its character-level word vector; the concatenation of the two is the word-vector representation e_{i,j} of the sample sequence. Next, BiLSTM, commonly used in sequence tasks, produces the hidden representation h_{i,j} = BiLSTM(e_{i,j}) of each word, and a self-attention layer assigns each hidden state a weight, giving the weighted hidden representation of each word in the sequence. Finally, using the position information of the entity in the sequence, the weighted hidden states are pooled into the context feature representation of the entity and the entity self-feature representation, respectively. The attention weights are computed as:

α_{i,j} = Attention(h_{i,j})
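The split into entity and context representations described in step 102 can be sketched in plain Python. This is an illustrative sketch under assumptions, not the patented implementation: the ELMo/CNN embeddings and the BiLSTM itself are omitted, the hidden states and attention scores are taken as given, and the pooling is assumed to be an attention-weighted average over the entity span versus the remaining context positions (the function names are hypothetical):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def pool_entity_and_context(hidden, scores, span):
    """Split a token sequence into an entity representation and a context
    representation by attention-weighted averaging.

    hidden : list of hidden-state vectors, one per token (from the BiLSTM)
    scores : one unnormalised attention score per token
    span   : (start, end) token indices of the entity, end exclusive
    """
    start, end = span
    dim = len(hidden[0])
    weights = softmax(scores)

    def weighted_sum(indices):
        if not indices:
            return [0.0] * dim
        # renormalise the attention weights within the selected positions
        total = sum(weights[i] for i in indices)
        return [sum(weights[i] * hidden[i][d] for i in indices) / total
                for d in range(dim)]

    entity_idx = list(range(start, end))
    context_idx = [i for i in range(len(hidden)) if i < start or i >= end]
    return weighted_sum(entity_idx), weighted_sum(context_idx)
```

In the full method the `hidden` vectors would come from a BiLSTM over the concatenated ELMo and character-CNN embeddings, and `scores` from a learned self-attention layer.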
Step 103: this step constructs the uncertainty measurement model of the entity, which is obtained with a context-entity-self contrast loss. The context features obtained in step 102 serve as negative examples and the entity's own features obtained in step 102 as positive examples; each is fed into one of two models that share parameters. During learning, the higher the probability of predicting the correct category from the entity's own features, the better, and the lower the probability of predicting the correct category from the entity's context features, the better. Model 2 is computed in the same way as model 1, with the correspondingly inverted form of the loss. Finally, given a candidate entity and its context, with the class of the candidate entity being c, the model outputs a probability, and 1 minus that probability value is the uncertainty measure of the given candidate entity:

v_{i,t} = 1 - P(c | candidate entity, context)
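The final uncertainty measure, 1 minus the probability assigned to the correct class, can be illustrated with a minimal sketch. The logits and the `class_probability` helper here are hypothetical stand-ins for the output of the parameter-shared model described above:

```python
import math

def class_probability(logits, label):
    """Softmax probability that the model assigns to the gold class."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return exps[label] / sum(exps)

def entity_uncertainty(entity_logits, label):
    """Uncertainty v = 1 - P(correct class), as defined in the text."""
    return 1.0 - class_probability(entity_logits, label)
```

A confident, correct prediction yields an uncertainty near 0; a prediction that spreads probability away from the gold class yields an uncertainty near 1.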
Step 104: compute the uncertainty of the entity context features and the entity's own features (a value in the interval [0, 1]). Given a threshold, remove the samples whose uncertainty exceeds it. The threshold may be chosen by cross-validation.
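The sample removal in step 104 reduces to a one-line selection; a sketch with a hypothetical function name, assuming one uncertainty value per sample:

```python
def filter_by_uncertainty(samples, uncertainties, threshold):
    """Keep only the samples whose uncertainty does not exceed the threshold."""
    return [s for s, v in zip(samples, uncertainties) if v <= threshold]
```

The threshold itself would be tuned, e.g. by cross-validation as the text suggests, rather than fixed in advance.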
Step 105: convert the uncertainty value of each entity in the sentence into a dropout probability; in the final training stage, train the final named entity recognition model using the Monte Carlo dropout training scheme of Bayesian neural networks.
The specific conversion from uncertainty value to dropout probability can be chosen per task. The simplest way is to use the uncertainty value directly as the dropout probability; alternatively, restrict the dropout range to [0, 0.5] and map the uncertainty value linearly into that interval.
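Both conversion options above can be captured by a single linear map: with `low=0, high=1` it reduces to using the uncertainty directly, while the default `[0, 0.5]` range matches the second option. A sketch (the function name is hypothetical):

```python
def uncertainty_to_dropout(v, low=0.0, high=0.5):
    """Linearly map an uncertainty value v in [0, 1] to a dropout
    probability in [low, high] (default [0, 0.5], as in the text)."""
    v = min(max(v, 0.0), 1.0)   # clamp to the valid uncertainty range
    return low + v * (high - low)
```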
The invention is not limited to the embodiments described above. The description of specific embodiments is intended to describe and illustrate the technical scheme of the invention and is illustrative only, not limiting. Those skilled in the art can make numerous specific modifications without departing from the spirit of the invention and the scope of the claims, and all such modifications fall within the protection scope of the invention.
Claims (1)
1. A named entity recognition method based on uncertainty quantification, characterized by comprising the following steps:
Step 1: collect a sample set localized to entity positions and build a candidate-entity detection model.
Step 2: for the entities in the sample set, use BiLSTM and self-attention network structures, suited to long-range dependencies in text, to obtain representations of the entity's context features and of the entity's own features, respectively. For each sequence s_i = (s_{i,1}, ..., s_{i,n}) in the sample set, where s_{i,t}, ..., s_{i,t+l} is an entity, obtain word-level word vectors with an ELMo model and, for the character-level features within each word, obtain the word's character-level word vector with a CNN; the concatenation of the two is the word-vector representation e_{i,j} of the sample sequence. Next, obtain the hidden representation h_{i,j} = BiLSTM(e_{i,j}) of each word with a BiLSTM, add self-attention to obtain the weight α_{i,j} = Attention(h_{i,j}) and the weighted hidden representation of each word in the sequence, and finally, using the position information of the entity in the sequence, obtain the context feature representation of the entity and the entity self-feature representation, respectively.
Step 3: learn an uncertainty quantification model for entities using contrast loss and parameter sharing, yielding an uncertainty value for each entity. The context features obtained in step 2 serve as negative examples and the entity's own features obtained in step 2 as positive examples; each is fed into one of two parameter-shared models with its corresponding loss function. Finally, given a candidate entity and its context, with the class of the candidate entity being c, a probability is output, and 1 minus that probability value is the uncertainty measure v_{i,t} of the given candidate entity.
Step 4: convert each entity's uncertainty value into a dropout probability and, given a threshold, remove samples whose uncertainty exceeds it.
Step 5: using the dropout probabilities from step 4, train a new named entity recognition model by introducing the Monte Carlo dropout training scheme from Bayesian neural networks.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202111246467.0A (CN114036948B) | 2021-10-26 | 2021-10-26 | Named entity identification method based on uncertainty quantification |
Publications (2)
| Publication Number | Publication Date |
| --- | --- |
| CN114036948A | 2022-02-11 |
| CN114036948B | 2024-05-31 |
Family
ID=80135412
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN202111246467.0A (Active) | Named entity identification method based on uncertainty quantification | 2021-10-26 | 2021-10-26 |
Country Status (1)
| Country | Link |
| --- | --- |
| CN | CN114036948B |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN107203511A * | 2017-05-27 | 2017-09-26 | 中国矿业大学 | Network text named entity recognition method based on neural network probability disambiguation |
| CN111709241A * | 2020-05-27 | 2020-09-25 | 西安交通大学 | Named entity identification method oriented to the network security field |
| CN111950269A * | 2020-08-21 | 2020-11-17 | 清华大学 | Text statement processing method and device, computer equipment, and storage medium |
| CN112733541A * | 2021-01-06 | 2021-04-30 | 重庆邮电大学 | BERT-BiGRU-IDCNN-CRF named entity identification method based on an attention mechanism |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN107391485A * | 2017-07-18 | 2017-11-24 | 中译语通科技(北京)有限公司 | Korean named entity recognition method based on maximum entropy and a neural network model |
| US11880411B2 * | 2020-02-12 | 2024-01-23 | Home Depot Product Authority, LLC | Named entity recognition in search queries |

2021-10-26: application CN202111246467.0A filed (CN); patent CN114036948B granted and active.
Similar Documents
| Publication | Title |
| --- | --- |
| CN110442878B | Translation method, training method and device of machine translation model, and storage medium |
| CN112541122A | Recommendation model training method and device, electronic equipment, and storage medium |
| CN106340297A | Speech recognition method and system based on cloud computing and confidence calculation |
| JP6172317B2 | Method and apparatus for mixed model selection |
| CN108399434B | Analysis and prediction method for high-dimensional time series data based on feature extraction |
| CN112699998B | Time series prediction method and device, electronic equipment, and readable storage medium |
| CN108009571A | New transductive semi-supervised data classification method and system |
| WO2020173270A1 | Method and device for parsing data, and computer storage medium |
| CN114298050A | Model training method, entity relation extraction method, device, medium, and equipment |
| CN111161238A | Image quality evaluation method and device, electronic device, and storage medium |
| CN106448660B | Natural language fuzzy boundary determination method introducing big data analysis |
| CN114298299A | Model training method, device, equipment, and storage medium based on curriculum learning |
| CN114036948B | Named entity identification method based on uncertainty quantification |
| CN117155806A | Communication base station traffic prediction method and device |
| CN116542139A | Method and device for predicting the surface roughness of liquid jet polishing |
| CN114757310B | Emotion recognition model and its training method, device, equipment, and readable storage medium |
| CN112347776A | Medical data processing method and device, storage medium, and electronic equipment |
| CN112541557B | Training method and device for a generative adversarial network, and electronic equipment |
| CN113792776B | Interpretation method for deep learning models in network security anomaly detection |
| CN115577290A | Distribution network fault classification and source localization method based on deep learning |
| CN114692615A | Few-shot semantic graph recognition method for low-resource languages |
| Kim et al. | The use of discriminative belief tracking in POMDP-based dialogue systems |
| CN114416941A | Generation method and device of a dialogue knowledge point determination model fusing a knowledge graph |
| CN113035363A | Probability-density-weighted mixed sampling method for genetic metabolic disease screening data |
| CN115099240B | Text generation model training method and device, and text generation method and device |
Legal Events
| Code | Title |
| --- | --- |
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |