CN116384401A - Named entity recognition method based on prompt learning - Google Patents
- Publication number: CN116384401A
- Application number: CN202310399388.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a named entity recognition method based on prompt learning. The method uses the text representation model ConSERT to calculate the similarity between a text sequence and candidate sample templates, selects the most similar candidate sample template and splices it onto the text sequence as context, encodes the result with a Transformer-1 encoder, maps the encoder output into entity boundary discrimination vectors through a linear mapping layer, and obtains candidate entity boundary predicted values through a conditional random field to obtain candidate entity fragments. Candidate entity segment separators are then inserted into the text sequence according to the candidate entity boundary predicted values to construct an entity boundary perception template input, which is encoded with a Transformer-2 encoder; the character vectors within each candidate entity segment are averaged to obtain candidate entity segment vectors. Finally, the segment vectors are mapped into candidate entity category discrimination vectors through a linear mapping layer, and candidate entity category predicted values are obtained by using a softmax function, yielding the identified named entities. The invention improves the accuracy of named entity recognition.
Description
Technical Field
The invention relates to a computer natural language processing technology, in particular to a named entity recognition method based on prompt learning.
Background
Named entity recognition is a fundamental research task in natural language processing that aims to detect entity boundaries in text and classify entities into categories. It is a necessary and key preprocessing step for many natural language processing tasks, and the quality of its results directly influences downstream tasks such as relation extraction. Therefore, performing named entity recognition efficiently and accurately can effectively improve the performance of other natural language processing tasks.
In recent years, with the development of deep pre-trained language models, sequence labeling architectures based on deep pre-trained language models such as BERT [1], XLNet [2] and ERNIE [3] have made breakthrough progress on the named entity recognition task by exploiting large-scale labeled data. For example, document [4] uses a BERT pre-trained language model for text representation learning and extracts features with an iterated dilated convolutional network and a long short-term memory network, achieving excellent performance on multiple data sets. Document [5] obtains enhanced word embeddings by using BERT on top of a BiLSTM-CRF model and realizes named entity recognition based on the enhanced word embeddings. Document [6] utilizes ALBERT with a deep multi-network collaboration mechanism, effectively improving the accuracy of named entity recognition. Document [7] uses stroke features for named entity recognition, taking stroke sequences as input and improving the ELMo model. However, in low-resource scenes that lack large-scale labeled data, such as military defense and medical imaging, these methods all suffer to some degree from the representation collapse problem (i.e., the feature vectors derived from a pre-trained language model are of lower quality in low-resource scenes), so named entities cannot be recognized accurately and efficiently.
[1] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[2] Yang Z, Dai Z, Yang Y, et al. XLNet: Generalized autoregressive pretraining for language understanding[J]. Advances in Neural Information Processing Systems, 2019, 32.
[3] Zhang Z, Han X, Liu Z, et al. ERNIE: Enhanced language representation with informative entities[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 1441-1451.
[4] Chang Y, Kong L, Jia K, et al. Chinese named entity recognition method based on BERT[C]// 2021 IEEE International Conference on Data Science and Computer Application (ICDSCA). IEEE, 2021: 294-299.
[5] Jia B, Wu Z, Wu B, et al. Enhanced character embedding for Chinese named entity recognition[J]. Measurement and Control, 2020, 53(9-10): 1669-1681.
[6] Yao L, Huang H, Wang K W, et al. Fine-grained mechanical Chinese named entity recognition based on ALBERT-AttBiLSTM-CRF and transfer learning[J]. Symmetry, 2020, 12(12): 1986.
[7] Luo Ling, Yang Zhihao, Song Yawen, et al. Chinese electronic medical record named entity recognition study based on stroke ELMo and multi-task learning[J]. 2020, 43(10): 15.
Disclosure of Invention
The invention aims to provide a named entity recognition method based on prompt learning, so as to solve problems such as representation collapse in low-resource scenes.
The technical solution for realizing the purpose of the invention is as follows: a named entity recognition method based on prompt learning, comprising the following steps:
Step 1, calculating the similarity between a text sequence and candidate sample templates by using the text representation model ConSERT, selecting the most similar candidate sample template and splicing it onto the text sequence as context, and encoding with a Transformer-1 encoder;
Step 2, mapping the output of the Transformer-1 encoder into entity boundary discrimination vectors through a linear mapping layer, and obtaining candidate entity boundary predicted values through a conditional random field to obtain candidate entity fragments;
Step 3, inserting the candidate entity segment separator "/" into the text sequence by using the candidate entity boundary predicted values, constructing the entity boundary perception template input, encoding with a Transformer-2 encoder, and averaging the character vectors within each candidate entity segment to obtain candidate entity segment vectors;
Step 4, mapping the candidate entity segment vectors into candidate entity category discrimination vectors through a linear mapping layer, and obtaining candidate entity category predicted values by using a softmax function to obtain the identified named entities.
Further, in step 1, the similarity between the text sequence and each candidate sample template is calculated by using the text representation model ConSERT, the most similar candidate sample template is selected and spliced onto the text sequence as context, and the result is encoded with the Transformer-1 encoder. The specific formulas are as follows:

X̃ = argmax_{X̂ ∈ D} ConSERT(X, X̂)   (1)

[H; H̃] = Transformer-1([X; SEP; X̃])   (2)

wherein X̂ = {x̂_1, /, x̂_2, /, …, /, x̂_t} represents a candidate sample template, t represents the candidate sample template length, "/" represents a separator, x̂_t represents the t-th candidate entity fragment of X̂, X = {x_1, x_2, …, x_n} represents the text sequence, n represents the text sequence length, ConSERT(·,·) calculates the similarity between two text sequences, D represents the candidate sample template set, X̃ represents the selected sample template, H = {h_1, h_2, …, h_n} represents the encoded output of the text sequence X, and H̃ represents the encoded output of the sample template X̃.
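As an illustrative sketch of step 1, the template-selection idea can be approximated with plain cosine similarity over sentence-embedding vectors. The embedding vectors, template strings, and `[SEP]` splicing below are simplified stand-ins for the ConSERT scorer and the Transformer-1 input format, not the patent's trained model:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two sentence-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_template(query_vec, template_vecs, templates):
    """Pick the candidate sample template whose embedding is most similar
    to the input text sequence (a stand-in for ConSERT scoring)."""
    scores = [cosine_sim(query_vec, tv) for tv in template_vecs]
    return templates[int(np.argmax(scores))]

def build_encoder_input(text, template, sep="[SEP]"):
    """Splice the selected template onto the text sequence as context."""
    return f"{text} {sep} {template}"

# Toy embeddings: the second template is closer to the query.
query = np.array([1.0, 0.0, 0.0])
tvecs = [np.array([0.0, 1.0, 0.0]), np.array([0.9, 0.1, 0.0])]
templates = ["Luoyang/peony/nice", "Harbin/ice lantern/pretty"]
chosen = select_template(query, tvecs, templates)
print(build_encoder_input("Harbin is cold in winter", chosen))
```

In the full method the spliced string would then be fed to the Transformer-1 encoder to obtain H and H̃.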
Further, in step 2, the output of the Transformer-1 encoder is mapped into entity boundary discrimination vectors through a linear mapping layer, and candidate entity boundary predicted values are obtained through a conditional random field, thereby obtaining the candidate entity fragments. The specific formulas are as follows:

o_i = W·h_i + b   (3)

ŷ^b = argmax_{y ∈ T^n} Σ_{i=1}^{n} ( o_i[y_i] + W_{y_{i-1}→y_i} + b_{y_{i-1}→y_i} )   (4)

wherein h_i is the output of the Transformer-1 encoder for the i-th character, o_i represents the entity boundary discrimination vector of the i-th character, W and b represent trainable parameters, ŷ_i^b represents the candidate entity boundary predicted value of the i-th character, y_i represents a possible candidate entity boundary value of the i-th character, T = {B, I, E, S} represents the candidate entity boundary value set, and W_{y_{i-1}→y_i} and b_{y_{i-1}→y_i} represent trainable parameters modeling the transfer from y_{i-1} to y_i.
Further, in step 3, the candidate entity segment separator "/" is inserted into the text sequence by utilizing the candidate entity boundary predicted values, the entity boundary perception template input is constructed and encoded with the Transformer-2 encoder, and the character vectors within each candidate entity segment are averaged to obtain the candidate entity segment vectors. The specific formulas are as follows:

X′ = {w_1, /, w_2, /, …, /, w_m}   (5)

s_j = Average(Transformer-2(X′)[w_j]), j = 1, 2, …, m   (6)

wherein X′ represents the entity boundary perception template input, m represents the number of candidate entity fragments, w_m represents the m-th candidate entity segment obtained from the candidate entity boundary predicted values, "/" represents the candidate entity segment separator, Average(·) averages the character vectors within a candidate entity segment, and s_j represents the j-th candidate entity segment vector.
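Step 3 performs two mechanical operations: inserting "/" before each character tagged B or S (except the first character), and averaging the character vectors inside each segment. Both can be sketched as follows; the characters, tags, and one-dimensional vectors are contrived examples, not real encoder outputs:

```python
import numpy as np

def insert_separators(chars, tags):
    """Insert '/' before each B- or S-tagged character (except at the
    start) to build the entity-boundary-aware template input."""
    out = []
    for ch, tag in zip(chars, tags):
        if tag in ("B", "S") and out:
            out.append("/")
        out.append(ch)
    return out

def segment_vectors(char_vecs, tags):
    """Average the character vectors within each candidate entity segment
    (a segment ends on an E or S tag)."""
    segs, cur = [], []
    for vec, tag in zip(char_vecs, tags):
        cur.append(vec)
        if tag in ("E", "S"):
            segs.append(np.mean(cur, axis=0))
            cur = []
    if cur:  # trailing segment without a closing tag
        segs.append(np.mean(cur, axis=0))
    return segs

chars = list("ABCDEFG")
tags = ["B", "I", "E", "B", "E", "S", "S"]
print("".join(insert_separators(chars, tags)))  # ABC/DE/F/G
```

In the full method the separator-augmented sequence is re-encoded by Transformer-2 before the per-segment averaging.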
Further, in step 4, the candidate entity segment vectors are mapped into candidate entity category discrimination vectors through a linear mapping layer, and candidate entity category predicted values are obtained by using the softmax function, thereby obtaining the identified named entities. The specific formula is as follows:

ŷ_j^c = softmax(W_c·s_j + b_c)   (7)

wherein W_c and b_c represent trainable parameters, s_j represents the j-th candidate entity fragment vector, and ŷ_j^c represents the category predicted value of the j-th candidate entity fragment.
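Step 4 is a linear layer followed by softmax. In the sketch below the category set and the weight matrix are hypothetical, chosen so the toy segment vector maps to the LOC class; a real W_c and b_c would be learned during training:

```python
import numpy as np

LABELS = ["LOC", "PER", "ORG", "O"]  # illustrative category set

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify_segment(seg_vec, W_c, b_c):
    """Map a candidate entity segment vector to a category distribution
    (linear layer + softmax) and return the argmax label."""
    probs = softmax(W_c @ seg_vec + b_c)
    return LABELS[int(probs.argmax())], probs

# Toy weights that route the first feature dimension to LOC.
seg = np.array([1.0, 0.0])
W_c = np.array([[2.0, 0.0], [0.0, 2.0], [0.5, 0.5], [0.0, 0.0]])
b_c = np.zeros(4)
label, probs = classify_segment(seg, W_c, b_c)
print(label)  # LOC
```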
Compared with the prior art, the invention has the following remarkable advantages: prompt learning is integrated; additional prior knowledge is provided by the sample example template and the entity boundary perception template to generate entity-aware representations; the representation collapse problem of mainstream named entity recognition methods in low-resource scenes is effectively avoided; and the accuracy of named entity recognition is improved.
Drawings
FIG. 1 is a flow chart of a named entity recognition method based on prompt learning;
FIG. 2 is a diagram of a named entity recognition model based on prompt learning.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
A named entity recognition method based on prompt learning comprises the following steps:
Step 1, calculating the similarity between the text sequence and the candidate sample templates by using the text representation model ConSERT, selecting the most similar candidate sample template and splicing it onto the text sequence as context, and encoding with the Transformer-1 encoder. The specific formulas are as follows:

X̃ = argmax_{X̂ ∈ D} ConSERT(X, X̂)   (1)

[H; H̃] = Transformer-1([X; SEP; X̃])   (2)

wherein X̂ = {x̂_1, /, x̂_2, /, …, /, x̂_t} represents a candidate sample template, t represents the candidate sample template length, "/" represents a separator, x̂_t represents the t-th candidate entity fragment of X̂, X = {x_1, x_2, …, x_n} represents the text sequence, n represents the text sequence length, ConSERT(·,·) calculates the similarity between two text sequences, D represents the candidate sample template set, X̃ represents the selected sample template, H = {h_1, h_2, …, h_n} represents the encoded output of the text sequence X, and H̃ represents the encoded output of the sample template X̃.
Step 2, mapping the output of the Transformer-1 encoder into entity boundary discrimination vectors through a linear mapping layer, and obtaining candidate entity boundary predicted values through a conditional random field to obtain the candidate entity fragments. The specific formulas are as follows:

o_i = W·h_i + b   (3)

ŷ^b = argmax_{y ∈ T^n} Σ_{i=1}^{n} ( o_i[y_i] + W_{y_{i-1}→y_i} + b_{y_{i-1}→y_i} )   (4)

wherein h_i is the output of the Transformer-1 encoder for the i-th character, o_i represents the entity boundary discrimination vector of the i-th character, W and b represent trainable parameters, ŷ_i^b represents the candidate entity boundary predicted value of the i-th character, y_i represents a possible candidate entity boundary value of the i-th character, T = {B, I, E, S} represents the candidate entity boundary value set, and W_{y_{i-1}→y_i} and b_{y_{i-1}→y_i} represent trainable parameters modeling the transfer from y_{i-1} to y_i.
Step 3, inserting the candidate entity segment separator "/" into the text sequence by utilizing the candidate entity boundary predicted values, constructing the entity boundary perception template input, encoding with the Transformer-2 encoder, and averaging the character vectors within each candidate entity segment to obtain the candidate entity segment vectors. The specific formulas are as follows:

X′ = {w_1, /, w_2, /, …, /, w_m}   (5)

s_j = Average(Transformer-2(X′)[w_j]), j = 1, 2, …, m   (6)

wherein X′ represents the entity boundary perception template input, m represents the number of candidate entity fragments, w_m represents the m-th candidate entity segment obtained from the candidate entity boundary predicted values, "/" represents the candidate entity segment separator, Average(·) averages the character vectors within a candidate entity segment, and s_j represents the j-th candidate entity segment vector.
Step 4, mapping the candidate entity segment vectors into candidate entity category discrimination vectors through a linear mapping layer, and obtaining candidate entity category predicted values by using the softmax function to obtain the identified named entities. The specific formula is as follows:

ŷ_j^c = softmax(W_c·s_j + b_c)   (7)

wherein W_c and b_c represent trainable parameters, s_j represents the j-th candidate entity fragment vector, and ŷ_j^c represents the category predicted value of the j-th candidate entity fragment.
Examples
To verify the effectiveness of the inventive protocol, the following experiments were performed.
Given the text sequence [Harbin is cold in winter], the named entity is "Harbin" with category LOC. The method of the invention is adopted to recognize the named entity in this text sequence; the specific implementation steps are as follows:
Step 1.1, calculating the similarity between the text sequence [Harbin is cold in winter] and all templates in the candidate sample template set D by using the ConSERT(·) function, and obtaining the most similar sample template [Luoyang/peony/nice].
Step 1.2, splicing the text sequence [Harbin is cold in winter] and [Luoyang/peony/nice] to obtain [Harbin is cold in winter SEP Luoyang/peony/nice], and inputting it into the Transformer-1 encoder, wherein SEP represents the inter-sentence separator, to obtain H = [h_1, h_2, …, h_8].
Step 2, mapping H into entity boundary discrimination vectors through the linear mapping layer, and obtaining the candidate entity boundary predicted values through the conditional random field.
Step 3, for the text sequence [Harbin is cold in winter], adding a separator "/" before each character whose candidate entity boundary predicted value is B (ignoring the first character of the text sequence) and before each character whose predicted value is S, thereby obtaining the entity boundary perception template input [Harbin/winter/good/cold]; encoding it with the Transformer-2 encoder and averaging the character vectors within each candidate entity segment to obtain the candidate entity segment vectors s_j.
Step 4, mapping s_j into candidate entity category discrimination vectors through the linear mapping layer, obtaining the candidate entity category predicted values ŷ_j^c by using the softmax function, and acquiring the recognized LOC entity "Harbin".
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination of these technical features that involves no contradiction should be considered within the scope of this description.
The above examples represent only a few embodiments of the present application, and while they are described in detail, they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.
Claims (8)
1. A named entity recognition method based on prompt learning, characterized by comprising the following steps:
step 1, calculating the similarity between a text sequence and candidate sample templates by using the text representation model ConSERT, selecting the most similar candidate sample template and splicing it onto the text sequence as context, and encoding with a Transformer-1 encoder;
step 2, mapping the output of the Transformer-1 encoder into entity boundary discrimination vectors through a linear mapping layer, and obtaining candidate entity boundary predicted values through a conditional random field to obtain candidate entity fragments;
step 3, inserting the candidate entity segment separator "/" into the text sequence by using the candidate entity boundary predicted values, constructing the entity boundary perception template input, encoding with a Transformer-2 encoder, and averaging the character vectors within each candidate entity segment to obtain candidate entity segment vectors;
step 4, mapping the candidate entity segment vectors into candidate entity category discrimination vectors through a linear mapping layer, and obtaining candidate entity category predicted values by using a softmax function to obtain the identified named entities.
2. The named entity recognition method based on prompt learning according to claim 1, characterized in that in step 1, the similarity between the text sequence and the candidate sample templates is calculated by using the text representation model ConSERT, the most similar candidate sample template is selected and spliced onto the text sequence as context, and the result is encoded with the Transformer-1 encoder; the specific formulas are as follows:

X̃ = argmax_{X̂ ∈ D} ConSERT(X, X̂)   (1)

[H; H̃] = Transformer-1([X; SEP; X̃])   (2)

wherein X̂ = {x̂_1, /, x̂_2, /, …, /, x̂_t} represents a candidate sample template, t represents the candidate sample template length, "/" represents a separator, x̂_t represents the t-th candidate entity fragment of X̂, X = {x_1, x_2, …, x_n} represents the text sequence, n represents the text sequence length, ConSERT(·,·) calculates the similarity between two text sequences, D represents the candidate sample template set, X̃ represents the selected sample template, H = {h_1, h_2, …, h_n} represents the encoded output of the text sequence X, and H̃ represents the encoded output of the sample template X̃.
3. The named entity recognition method based on prompt learning according to claim 2, characterized in that in step 2, the output of the Transformer-1 encoder is mapped into entity boundary discrimination vectors through a linear mapping layer, and candidate entity boundary predicted values are obtained through a conditional random field to obtain candidate entity fragments; the specific formulas are as follows:

o_i = W·h_i + b   (3)

ŷ^b = argmax_{y ∈ T^n} Σ_{i=1}^{n} ( o_i[y_i] + W_{y_{i-1}→y_i} + b_{y_{i-1}→y_i} )   (4)

wherein h_i is the output of the Transformer-1 encoder for the i-th character, o_i represents the entity boundary discrimination vector of the i-th character, W and b represent trainable parameters, ŷ_i^b represents the candidate entity boundary predicted value of the i-th character, y_i represents a possible candidate entity boundary value of the i-th character, T = {B, I, E, S} represents the candidate entity boundary value set, and W_{y_{i-1}→y_i} and b_{y_{i-1}→y_i} represent trainable parameters modeling the transfer from y_{i-1} to y_i.
4. The named entity recognition method based on prompt learning according to claim 3, characterized in that in step 3, the candidate entity segment separator "/" is inserted into the text sequence by using the candidate entity boundary predicted values, the entity boundary perception template input is constructed and encoded with the Transformer-2 encoder, and the character vectors within each candidate entity segment are averaged to obtain candidate entity segment vectors; the specific formulas are as follows:

X′ = {w_1, /, w_2, /, …, /, w_m}   (5)

s_j = Average(Transformer-2(X′)[w_j]), j = 1, 2, …, m   (6)

wherein X′ represents the entity boundary perception template input, m represents the number of candidate entity fragments, w_m represents the m-th candidate entity segment obtained from the candidate entity boundary predicted values, "/" represents the candidate entity segment separator, Average(·) averages the character vectors within a candidate entity segment, and s_j represents the j-th candidate entity segment vector.
5. The named entity recognition method based on prompt learning according to claim 4, characterized in that in step 4, the candidate entity segment vectors are mapped into candidate entity category discrimination vectors through a linear mapping layer, and candidate entity category predicted values are obtained by using a softmax function to obtain the identified named entities; the specific formula is as follows:

ŷ_j^c = softmax(W_c·s_j + b_c)   (7)

wherein W_c and b_c represent trainable parameters, s_j represents the j-th candidate entity fragment vector, and ŷ_j^c represents the category predicted value of the j-th candidate entity fragment.
6. A named entity recognition system based on prompt learning, characterized in that it performs named entity recognition based on prompt learning according to the method of any one of claims 1-5.
7. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements named entity recognition based on prompt learning according to the method of any one of claims 1-5 when executing the computer program.
8. A computer-readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements named entity recognition based on prompt learning according to the method of any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310399388.6A CN116384401A (en) | 2023-04-14 | 2023-04-14 | Named entity recognition method based on prompt learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116384401A true CN116384401A (en) | 2023-07-04 |
Family
ID=86976723
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117034942A (en) * | 2023-10-07 | 2023-11-10 | 之江实验室 | Named entity recognition method, device, equipment and readable storage medium |
CN117034942B (en) * | 2023-10-07 | 2024-01-09 | 之江实验室 | Named entity recognition method, device, equipment and readable storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |