CN114462391B - Nested entity identification method and system based on contrast learning - Google Patents
Nested entity identification method and system based on contrast learning
- Publication number: CN114462391B
- Application number: CN202210247571.XA
- Authority: CN (China)
- Legal status: Active
Classifications
- G06F40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars (G06F40/20 Natural language analysis; G06F40/205 Parsing)
- G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches (G06F18/24 Classification techniques)
Abstract
The invention provides a nested entity classification method and system based on contrastive learning. A target nested entity classification model for nested entity classification is obtained in two stages: the first stage learns entity representations by contrastive learning, and the second stage adopts the fragment method. Because the first stage has already learned the characteristics of the samples, the proportion of negative samples can be reduced, model convergence is accelerated, the model results are more stable, and entity boundaries are distinguished more clearly.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a nested entity identification method and system based on contrastive learning.
Background
Current nested entity recognition technology mainly uses two methods: first, sequence labeling, which decodes a sentence multiple times so as to identify its nested entities; second, the fragment method, which converts entity recognition into fragment classification by enumerating all fragments of a sentence and classifying each of them.
Compared with sequence labeling, the fragment method has a lower miss rate and is therefore widely adopted. However, the number of negative samples to be considered during training is very large: a sentence of n characters yields n(n+1)/2 fragments, so a 100-character sentence, for example, produces 5,050 candidates. This causes sample imbalance, slows model convergence, and hurts training efficiency; especially for long sequences, it fails to meet the timeliness requirements of a deployed model.
Disclosure of Invention
In view of the above technical problems, embodiments of the invention provide a nested entity classification method and system based on contrastive learning, so as to solve at least one of these problems.
The technical scheme adopted by the invention is as follows:
An embodiment of the invention provides a nested entity classification method based on contrastive learning, comprising the following steps:
S1, acquiring the input sentence data tables; wherein the j-th row of sentence data table i contains $(X^{ij}, L^{ij})$, with $X^{ij} = (X_1^{ij}, X_2^{ij}, \ldots, X_{n_{ij}}^{ij})$, where $X_k^{ij}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L^{ij} = \{(E_1^{ij}, T_1^{ij}), (E_2^{ij}, T_2^{ij}), \ldots, (E_{m_{ij}}^{ij}, T_{m_{ij}}^{ij})\}$, where $E_r^{ij}$ is the r-th entity in the j-th sentence of sentence data table i, $T_r^{ij}$ is the actual entity type corresponding to $E_r^{ij}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_i$, and $P_i$ is the number of sentences in sentence data table i; N is the number of sentence data tables;
S2, for sentence j in sentence data table i, executing the following operations:
S201, encoding sentence j twice with a pre-trained language model to obtain a first characterization vector $h1^{ij} = (h1_1^{ij}, h1_2^{ij}, \ldots, h1_{n_{ij}}^{ij})$ and a second characterization vector $h2^{ij} = (h2_1^{ij}, h2_2^{ij}, \ldots, h2_{n_{ij}}^{ij})$, where $h1_k^{ij}$ and $h2_k^{ij}$ are the characterizations obtained from the first and second encodings of $X_k^{ij}$;
S202, obtaining
$$Loss1_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right)}{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right) + \sum_{s \neq j} \sum_{t} \exp\left(sim(B1_r^{ij}, B2_t^{is})/\tau\right)}$$
where $B1_r^{ij}$ and $B2_r^{ij}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1^{ij}$ and $h2^{ij}$, respectively; $B2_t^{is}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $sim(\cdot, \cdot)$ denotes cosine similarity;
S203, obtaining
$$Loss2_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right)}{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right) + \sum_{p \neq j} \sum_{q} \exp\left(sim(B1_r^{ij}, B1_q^{ip})/\tau\right)}$$
where $B1'_i$ is the entity representation of any entity of the same type as the r-th entity, taken from the sentences of sentence data table i other than sentence j, and $B1_q^{ip}$ is the first entity representation of an entity q whose type differs from that of the r-th entity, in the p-th sentence, other than sentence j, of sentence data table i;
S204, optimizing $\tau$ and the dropout of the pre-trained language model so that $Loss1_r^{ij}$ and $Loss2_r^{ij}$ are minimized;
S205, setting j = j + 1; if $j \leq P_i$, executing S2; otherwise, executing S3;
S3, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S4, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S5, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain a target nested entity classification model;
S6, classifying input sentences by using the target nested entity classification model.
The invention also provides a nested entity classification system based on contrastive learning, comprising a server and a database in communication connection, the server comprising a processor and a memory storing a computer program, and the database storing N sentence data tables; the j-th row of sentence data table i contains $(X^{ij}, L^{ij})$, with $X^{ij} = (X_1^{ij}, X_2^{ij}, \ldots, X_{n_{ij}}^{ij})$, where $X_k^{ij}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L^{ij} = \{(E_1^{ij}, T_1^{ij}), (E_2^{ij}, T_2^{ij}), \ldots, (E_{m_{ij}}^{ij}, T_{m_{ij}}^{ij})\}$, where $E_r^{ij}$ is the r-th entity in the j-th sentence of sentence data table i, $T_r^{ij}$ is the actual entity type corresponding to $E_r^{ij}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_i$, and $P_i$ is the number of sentences in sentence data table i;
the processor is configured to execute the computer program to implement the following steps:
S10, for sentence j in sentence data table i, executing the following operations:
S101, encoding sentence j twice with the pre-trained language model BERT to obtain a first characterization vector $h1^{ij} = (h1_1^{ij}, h1_2^{ij}, \ldots, h1_{n_{ij}}^{ij})$ and a second characterization vector $h2^{ij} = (h2_1^{ij}, h2_2^{ij}, \ldots, h2_{n_{ij}}^{ij})$, where $h1_k^{ij}$ and $h2_k^{ij}$ are the characterizations obtained from the first and second encodings of $X_k^{ij}$;
S102, obtaining
$$Loss1_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right)}{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right) + \sum_{s \neq j} \sum_{t} \exp\left(sim(B1_r^{ij}, B2_t^{is})/\tau\right)}$$
where $B1_r^{ij}$ and $B2_r^{ij}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1^{ij}$ and $h2^{ij}$, respectively; $B2_t^{is}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $sim(\cdot, \cdot)$ denotes cosine similarity;
S103, obtaining
$$Loss2_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right)}{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right) + \sum_{p \neq j} \sum_{q} \exp\left(sim(B1_r^{ij}, B1_q^{ip})/\tau\right)}$$
where $B1'_i$ is the entity representation of any entity of the same type as the r-th entity, taken from the sentences of sentence data table i other than sentence j, and $B1_q^{ip}$ is the first entity representation of an entity q whose type differs from that of the r-th entity, in the p-th sentence, other than sentence j, of sentence data table i;
S104, optimizing $\tau$ and the dropout of the pre-trained language model so that $Loss1_r^{ij}$ and $Loss2_r^{ij}$ are minimized;
S105, setting j = j + 1; if $j \leq P_i$, executing S10; otherwise, executing S20;
S20, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S30, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S40, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain a target nested entity classification model.
The embodiments of the invention have at least the following technical effects: the target nested entity classification model is obtained in two stages, where the first stage learns entity representations by contrastive learning and the second stage adopts the fragment method. Because the first stage has already learned the characteristics of the samples, the proportion of negative samples can be reduced, model convergence is accelerated, the model results are more stable, and entity boundaries are distinguished more clearly.
Detailed Description
In order to make the technical problems to be solved, the technical solutions, and the advantages clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below.
An embodiment of the invention provides a nested entity classification method based on contrastive learning, which can include the following steps:
S1, acquiring the input sentence data tables; wherein the j-th row of sentence data table i contains $(X^{ij}, L^{ij})$, with $X^{ij} = (X_1^{ij}, X_2^{ij}, \ldots, X_{n_{ij}}^{ij})$, where $X_k^{ij}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L^{ij} = \{(E_1^{ij}, T_1^{ij}), (E_2^{ij}, T_2^{ij}), \ldots, (E_{m_{ij}}^{ij}, T_{m_{ij}}^{ij})\}$, where $E_r^{ij}$ is the r-th entity in the j-th sentence of sentence data table i, $T_r^{ij}$ is the actual entity type corresponding to $E_r^{ij}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_i$, and $P_i$ is the number of sentences in sentence data table i; N is the number of sentence data tables.
In an embodiment of the present invention, the number of sentences in each sentence data table may be the same, i.e., $P_1 = P_2 = \cdots = P_N$.
In another embodiment of the present invention, the number of sentences in the first N-1 sentence data tables may be the same, i.e., $P_1 = P_2 = \cdots = P_{N-1} = P$, and the number of sentences in the last sentence data table is then $M - (N-1) \times P$, where M is the total number of sentences.
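For illustration only, the following Python sketch shows such a partitioning; the function name and the list-of-lists representation of the data tables are assumptions, not the patent's storage format:

```python
def partition_sentences(sentences, p):
    """Split M sentences into sentence data tables of P sentences each;
    the last table holds the remaining M - (N-1)*P sentences."""
    return [sentences[i:i + p] for i in range(0, len(sentences), p)]

# e.g. M = 10 sentences with P = 4 yields N = 3 tables of sizes 4, 4, 2
tables = partition_sentences([f"sentence {k}" for k in range(10)], 4)
```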
S2, for sentence j in sentence data table i, executing the following operations:
S201, encoding sentence j twice with a pre-trained language model to obtain a first characterization vector $h1^{ij} = (h1_1^{ij}, h1_2^{ij}, \ldots, h1_{n_{ij}}^{ij})$ and a second characterization vector $h2^{ij} = (h2_1^{ij}, h2_2^{ij}, \ldots, h2_{n_{ij}}^{ij})$, where $h1_k^{ij}$ and $h2_k^{ij}$ are the characterizations obtained from the first and second encodings of $X_k^{ij}$.
In an exemplary embodiment of the invention, the pre-trained language model may be a BERT model. Because of the random masks (dropout) inside BERT, encoding the same sentence several times produces a different result each time; this property is exploited to encode each sample twice and thereby generate the positive pairs required for contrastive learning.
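A minimal sketch of this two-pass encoding, assuming the HuggingFace transformers library; the checkpoint name and the sample sentence are illustrative:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")
model.train()  # keep dropout active so the two passes differ

inputs = tokenizer("北京大学位于北京。", return_tensors="pt")
with torch.no_grad():  # gradients disabled only for this illustration
    h1 = model(**inputs).last_hidden_state  # first characterization vector
    h2 = model(**inputs).last_hidden_state  # second characterization vector
# h1 != h2: the resampled dropout masks yield two views of the same
# sentence, which form the positive pair for contrastive learning.
```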
In another exemplary embodiment of the present invention, the pre-trained language model is a RoBERTa model.
Those skilled in the art will appreciate that methods of encoding sentences using pre-trained language models may belong to the prior art.
S202, obtaining
$$Loss1_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right)}{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right) + \sum_{s \neq j} \sum_{t} \exp\left(sim(B1_r^{ij}, B2_t^{is})/\tau\right)}$$
where $B1_r^{ij}$ and $B2_r^{ij}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1^{ij}$ and $h2^{ij}$, respectively; $B2_t^{is}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $sim(\cdot, \cdot)$ denotes cosine similarity.
In the embodiment of the invention, the loss function Loss1 is used to make the representations of corresponding entity words similar across the two encoding results.
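Under this reading, Loss1 has the standard InfoNCE form; the following sketch is an assumption about tensor shapes and how the negatives are batched, not the patent's reference implementation:

```python
import torch
import torch.nn.functional as F

def loss1(b1_r, b2_r, b2_neg, tau):
    """Pull the two encodings of the same entity together and push them
    away from entities of the other sentences in the same data table.

    b1_r, b2_r: (d,) first/second representations of the r-th entity
    b2_neg:     (K, d) second representations of entities from the other
                sentences of sentence data table i
    tau:        temperature hyperparameter
    """
    pos = F.cosine_similarity(b1_r, b2_r, dim=0) / tau
    neg = F.cosine_similarity(b1_r.unsqueeze(0), b2_neg, dim=1) / tau
    # -log( exp(pos) / (exp(pos) + sum_k exp(neg_k)) )
    return -(pos - torch.logsumexp(torch.cat([pos.view(1), neg]), dim=0))
```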
S203, obtaining
$$Loss2_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right)}{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right) + \sum_{p \neq j} \sum_{q} \exp\left(sim(B1_r^{ij}, B1_q^{ip})/\tau\right)}$$
where $B1'_i$ is the entity representation of any entity of the same type as the r-th entity, taken from the sentences of sentence data table i other than sentence j, and $B1_q^{ip}$ is the first entity representation of an entity q whose type differs from that of the r-th entity, in the p-th sentence, other than sentence j, of sentence data table i.
In the embodiment of the invention, the loss function Loss2 is used to make entity words of the same type close together and entity words of different types far apart within the current sentence data table.
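Loss2 can be sketched in the same way, with a same-type positive and different-type negatives; the shapes are again assumptions:

```python
def loss2(b1_r, b1_same, b1_diff, tau):
    """Pull b1_r toward b1_same, a (d,) representation of a same-type
    entity, and away from b1_diff, the (K, d) representations of
    different-type entities, all drawn from the other sentences of the
    current sentence data table."""
    pos = F.cosine_similarity(b1_r, b1_same, dim=0) / tau
    neg = F.cosine_similarity(b1_r.unsqueeze(0), b1_diff, dim=1) / tau
    return -(pos - torch.logsumexp(torch.cat([pos.view(1), neg]), dim=0))
```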
S204, optimizing $\tau$ and the dropout of the pre-trained language model so that $Loss1_r^{ij}$ and $Loss2_r^{ij}$ are minimized.
Those skilled in the art will appreciate that optimizing $\tau$ and dropout in the pre-trained language model so as to minimize $Loss1_r^{ij}$ and $Loss2_r^{ij}$ may be achieved with prior-art techniques.
Through S204, the values of $\tau$ and dropout after the first-stage optimization are obtained.
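The patent does not spell out how $\tau$ and dropout are searched; one plausible reading is a learnable temperature (parameterized through its logarithm so that it stays positive) updated jointly with the encoder, with the dropout rate tuned across runs. In the sketch below, `stage1_loss` and `batches` are hypothetical placeholders:

```python
log_tau = torch.nn.Parameter(torch.tensor(0.0))  # tau = exp(log_tau) > 0
optimizer = torch.optim.AdamW(list(model.parameters()) + [log_tau], lr=2e-5)

for batch in batches:  # hypothetical iterable over sentence data tables
    # hypothetical helper summing Loss1 and Loss2 over the batch's entities
    loss = stage1_loss(model, batch, tau=log_tau.exp())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```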
S205, setting j = j + 1; if $j \leq P_i$, S2 is executed; otherwise, S3 is executed.
S3, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples.
In the embodiment of the invention, the set number can be chosen based on actual needs. Specifically, each training sample is a sentence containing both positive and negative samples, where the positive samples are the entities of that sentence.
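A minimal sketch of this enumeration and sampling, where spans are inclusive (start, end) character indices and the function name is an assumption:

```python
import random

def sample_fragments(sentence, entity_spans, num_negatives):
    """Enumerate all n(n+1)/2 fragments (a, b) of a sentence, keep the
    labeled entity spans as positives, and randomly draw a set number of
    the remaining fragments as negatives."""
    n = len(sentence)
    all_spans = [(a, b) for a in range(n) for b in range(a, n)]
    non_entity = [s for s in all_spans if s not in set(entity_spans)]
    negatives = random.sample(non_entity, min(num_negatives, len(non_entity)))
    return list(entity_spans), negatives

# nested example: "北京大学" (chars 0-3) contains "北京" (chars 0-1)
positives, negatives = sample_fragments("北京大学位于北京。", [(0, 3), (0, 1)], 20)
```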
S4, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results.
The classification prediction results may include the classification result, i.e., the predicted type, of each fragment in the training set.
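The patent does not fix how a fragment is represented for classification; one common choice, shown below as an assumption, is to concatenate the boundary-token characterizations and score them against the entity types plus a non-entity class:

```python
import torch
import torch.nn as nn

class SpanClassifier(nn.Module):
    """Score each fragment against the entity types plus a non-entity class."""
    def __init__(self, hidden_size, num_types):
        super().__init__()
        self.head = nn.Linear(2 * hidden_size, num_types + 1)  # +1: non-entity

    def forward(self, h, spans):
        # h: (n, hidden_size) token characterizations of one sentence
        # spans: list of inclusive (start, end) fragment indices
        feats = torch.stack([torch.cat([h[a], h[b]]) for a, b in spans])
        return self.head(feats)  # (num_spans, num_types + 1) logits
```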
S5, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain the target nested entity classification model.
In embodiments of the present invention, the optimized pre-trained language model may be optimized based on the F1 score. Because the positive samples in the training set are labeled, i.e., the entity type of each entity is known, the true type of every fragment in a sentence is known as well, and the classification accuracy can be obtained by comparing the predicted types with the actual types. Those skilled in the art will appreciate that determining classification accuracy based on the F1 score may be prior art.
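For reference, a sketch of the usual span-level micro F1 for this kind of evaluation, assuming predictions and gold labels are given as (start, end, type) triples:

```python
def span_f1(predicted, gold):
    """Micro F1 over (start, end, type) triples."""
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)  # spans correct in both boundary and type
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)
```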
If the classification accuracy is greater than or equal to a set threshold, the current classification model is considered accurate and can serve as the target nested entity classification model; if the classification accuracy is below the threshold, $\tau$ and dropout are adjusted further until the accuracy reaches the threshold.
Because S1 and S2 learn the characteristics of the samples, the proportion of negative samples can be reduced, model convergence is accelerated, the model results are more stable, and entity boundaries are distinguished more clearly.
S6, classifying the input sentences by using the target nested entity classification model.
In practical applications, input sentences can be classified directly with the obtained target nested entity classification model.
The invention provides a nested entity classification system based on contrastive learning, which comprises a server and a database in communication connection, the server comprising a processor and a memory storing a computer program, and the database storing N sentence data tables; the j-th row of sentence data table i contains $(X^{ij}, L^{ij})$, with $X^{ij} = (X_1^{ij}, X_2^{ij}, \ldots, X_{n_{ij}}^{ij})$, where $X_k^{ij}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L^{ij} = \{(E_1^{ij}, T_1^{ij}), (E_2^{ij}, T_2^{ij}), \ldots, (E_{m_{ij}}^{ij}, T_{m_{ij}}^{ij})\}$, where $E_r^{ij}$ is the r-th entity in the j-th sentence of sentence data table i, $T_r^{ij}$ is the actual entity type corresponding to $E_r^{ij}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_i$, and $P_i$ is the number of sentences in sentence data table i;
the processor is configured to execute the computer program to implement the following steps:
S10, for sentence j in sentence data table i, executing the following operations:
S101, encoding sentence j twice with the pre-trained language model BERT to obtain a first characterization vector $h1^{ij} = (h1_1^{ij}, h1_2^{ij}, \ldots, h1_{n_{ij}}^{ij})$ and a second characterization vector $h2^{ij} = (h2_1^{ij}, h2_2^{ij}, \ldots, h2_{n_{ij}}^{ij})$, where $h1_k^{ij}$ and $h2_k^{ij}$ are the characterizations obtained from the first and second encodings of $X_k^{ij}$;
S102, obtaining
$$Loss1_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right)}{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right) + \sum_{s \neq j} \sum_{t} \exp\left(sim(B1_r^{ij}, B2_t^{is})/\tau\right)}$$
where $B1_r^{ij}$ and $B2_r^{ij}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1^{ij}$ and $h2^{ij}$, respectively; $B2_t^{is}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $sim(\cdot, \cdot)$ denotes cosine similarity;
S103, obtaining
$$Loss2_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right)}{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right) + \sum_{p \neq j} \sum_{q} \exp\left(sim(B1_r^{ij}, B1_q^{ip})/\tau\right)}$$
where $B1'_i$ is the entity representation of any entity of the same type as the r-th entity, taken from the sentences of sentence data table i other than sentence j, and $B1_q^{ip}$ is the first entity representation of an entity q whose type differs from that of the r-th entity, in the p-th sentence, other than sentence j, of sentence data table i;
S104, optimizing $\tau$ and the dropout of the pre-trained language model so that $Loss1_r^{ij}$ and $Loss2_r^{ij}$ are minimized;
S105, setting j = j + 1; if $j \leq P_i$, executing S10; otherwise, executing S20;
S20, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S30, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S40, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain a target nested entity classification model.
Further, in S40, the optimized pre-trained language model is optimized based on the F1 score.
Further, the pre-trained language model is a BERT model.
Further, the pre-trained language model is a RoBERTa model.
Further, $P_1 = P_2 = \cdots = P_N$.
For the implementation of this embodiment, reference may be made to the foregoing method embodiment.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.
Claims (10)
1. A nested entity classification method based on contrastive learning, applied in the technical field of natural language processing, characterized by comprising the following steps:
S1, acquiring the input sentence data tables; wherein the j-th row of sentence data table i contains $(X^{ij}, L^{ij})$, with $X^{ij} = (X_1^{ij}, X_2^{ij}, \ldots, X_{n_{ij}}^{ij})$, where $X_k^{ij}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L^{ij} = \{(E_1^{ij}, T_1^{ij}), (E_2^{ij}, T_2^{ij}), \ldots, (E_{m_{ij}}^{ij}, T_{m_{ij}}^{ij})\}$, where $E_r^{ij}$ is the r-th entity in the j-th sentence of sentence data table i, $T_r^{ij}$ is the actual entity type corresponding to $E_r^{ij}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_i$, and $P_i$ is the number of sentences in sentence data table i; N is the number of sentence data tables;
S2, for sentence j in sentence data table i, executing the following operations:
S201, encoding sentence j twice with a pre-trained language model to obtain a first characterization vector $h1^{ij} = (h1_1^{ij}, h1_2^{ij}, \ldots, h1_{n_{ij}}^{ij})$ and a second characterization vector $h2^{ij} = (h2_1^{ij}, h2_2^{ij}, \ldots, h2_{n_{ij}}^{ij})$, where $h1_k^{ij}$ and $h2_k^{ij}$ are the characterizations obtained from the first and second encodings of $X_k^{ij}$;
S202, obtaining
$$Loss1_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right)}{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right) + \sum_{s \neq j} \sum_{t} \exp\left(sim(B1_r^{ij}, B2_t^{is})/\tau\right)}$$
where $B1_r^{ij}$ and $B2_r^{ij}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1^{ij}$ and $h2^{ij}$, respectively; $B2_t^{is}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $sim(\cdot, \cdot)$ denotes cosine similarity;
S203, obtaining
$$Loss2_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right)}{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right) + \sum_{p \neq j} \sum_{q} \exp\left(sim(B1_r^{ij}, B1_q^{ip})/\tau\right)}$$
where $B1'_i$ is the entity representation of any entity of the same type as the r-th entity, taken from the sentences of sentence data table i other than sentence j, and $B1_q^{ip}$ is the first entity representation of an entity q whose type differs from that of the r-th entity, in the p-th sentence, other than sentence j, of sentence data table i;
S204, optimizing $\tau$ and the dropout of the pre-trained language model so that $Loss1_r^{ij}$ and $Loss2_r^{ij}$ are minimized;
S205, setting j = j + 1; if $j \leq P_i$, executing S2; otherwise, executing S3;
S3, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S4, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S5, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain a target nested entity classification model;
S6, classifying input sentences by using the target nested entity classification model.
2. The method of claim 1, wherein in S5 the optimized pre-trained language model is optimized based on the F1 score.
3. The method of claim 1, wherein the pre-trained language model is a BERT model.
4. The method of claim 1, wherein the pre-trained language model is a RoBERTa model.
5. The method of claim 1, wherein $P_1 = P_2 = \cdots = P_N$.
6. A nested entity classification system based on contrastive learning, applied in the technical field of natural language processing, characterized by comprising a server and a database in communication connection, the server comprising a processor and a memory storing a computer program, and the database storing N sentence data tables; the j-th row of sentence data table i contains $(X^{ij}, L^{ij})$, with $X^{ij} = (X_1^{ij}, X_2^{ij}, \ldots, X_{n_{ij}}^{ij})$, where $X_k^{ij}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L^{ij} = \{(E_1^{ij}, T_1^{ij}), (E_2^{ij}, T_2^{ij}), \ldots, (E_{m_{ij}}^{ij}, T_{m_{ij}}^{ij})\}$, where $E_r^{ij}$ is the r-th entity in the j-th sentence of sentence data table i, $T_r^{ij}$ is the actual entity type corresponding to $E_r^{ij}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_i$, and $P_i$ is the number of sentences in sentence data table i;
the processor is configured to execute the computer program to implement the following steps:
S10, for sentence j in sentence data table i, executing the following operations:
S101, encoding sentence j twice with the pre-trained language model BERT to obtain a first characterization vector $h1^{ij} = (h1_1^{ij}, h1_2^{ij}, \ldots, h1_{n_{ij}}^{ij})$ and a second characterization vector $h2^{ij} = (h2_1^{ij}, h2_2^{ij}, \ldots, h2_{n_{ij}}^{ij})$, where $h1_k^{ij}$ and $h2_k^{ij}$ are the characterizations obtained from the first and second encodings of $X_k^{ij}$;
S102, obtaining
$$Loss1_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right)}{\exp\left(sim(B1_r^{ij}, B2_r^{ij})/\tau\right) + \sum_{s \neq j} \sum_{t} \exp\left(sim(B1_r^{ij}, B2_t^{is})/\tau\right)}$$
where $B1_r^{ij}$ and $B2_r^{ij}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1^{ij}$ and $h2^{ij}$, respectively; $B2_t^{is}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $sim(\cdot, \cdot)$ denotes cosine similarity;
S103, obtaining
$$Loss2_r^{ij} = -\log \frac{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right)}{\exp\left(sim(B1_r^{ij}, B1'_i)/\tau\right) + \sum_{p \neq j} \sum_{q} \exp\left(sim(B1_r^{ij}, B1_q^{ip})/\tau\right)}$$
where $B1'_i$ is the entity representation of any entity of the same type as the r-th entity, taken from the sentences of sentence data table i other than sentence j, and $B1_q^{ip}$ is the first entity representation of an entity q whose type differs from that of the r-th entity, in the p-th sentence, other than sentence j, of sentence data table i;
S104, optimizing $\tau$ and the dropout of the pre-trained language model so that $Loss1_r^{ij}$ and $Loss2_r^{ij}$ are minimized;
S105, setting j = j + 1; if $j \leq P_i$, executing S10; otherwise, executing S20;
S20, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S30, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S40, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain a target nested entity classification model.
7. The system of claim 6, wherein in S40 the optimized pre-trained language model is optimized based on the F1 score.
8. The system of claim 6, wherein the pre-trained language model is a BERT model.
9. The system of claim 6, wherein the pre-trained language model is a RoBERTa model.
10. The system of claim 6, wherein $P_1 = P_2 = \cdots = P_N$.
Priority Applications
- CN202210247571.XA, filed 2022-03-14: Nested entity identification method and system based on contrast learning
Publications
- CN114462391A, published 2022-05-10
- CN114462391B, granted 2024-05-14