CN114462391B - Nested entity identification method and system based on contrastive learning

Nested entity identification method and system based on contrastive learning

Info

Publication number
CN114462391B
Authority
CN
China
Prior art keywords
entity, statement, sentence, data table, representation
Prior art date
Legal status
Active
Application number
CN202210247571.XA
Other languages
Chinese (zh)
Other versions
CN114462391A
Inventor
胡碧峰
王艳飞
胡茂海
尹光荣
Current Assignee
Workway Shenzhen Information Technology Co., Ltd.
Original Assignee
Workway Shenzhen Information Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Workway Shenzhen Information Technology Co., Ltd.
Priority to CN202210247571.XA
Publication of CN114462391A
Application granted
Publication of CN114462391B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/205 - Parsing
    • G06F 40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches


Abstract

The invention provides a nested entity classification method and system based on contrastive learning. A target nested entity classification model for nested entity classification is obtained in two stages: the first stage learns entity representations by contrastive learning, and the second stage adopts a fragment (span) method. Because the characteristics of the samples are learned in the first stage, the proportion of negative samples can be reduced, model convergence is accelerated, the model results are more stable, and entity boundaries are more clearly distinguished.

Description

Nested entity identification method and system based on contrastive learning
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a nested entity identification method and system based on contrastive learning.
Background
In current nested entity recognition technology, there are two main methods: first, a sequence labeling method, which decodes multiple times during decoding so as to identify the nested entities in a sentence; second, a fragment (span) method, which converts entity recognition into the classification of fragments, enumerating all fragments in a sentence and classifying them so as to identify nested entities.
Compared with the sequence labeling method, the fragment method misses fewer entities, so it is widely adopted. However, the number of negative samples to be considered during training is very large: assuming a sentence has n characters, n(n+1)/2 fragments are generated. This causes sample imbalance, slows model convergence, and affects training efficiency; especially when sequences are long, it fails to meet the latency requirements of a model serving online.
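To make the scale concrete, the following sketch (illustrative only; the helper name and example sentence are not taken from the patent) enumerates every contiguous fragment of a short sentence:

```python
# Illustrative only: enumerate all candidate fragments of an n-character sentence.
def enumerate_spans(sentence):
    """Return every contiguous fragment as an inclusive (start, end) pair."""
    n = len(sentence)
    return [(start, end) for start in range(n) for end in range(start, n)]

spans = enumerate_spans("南京市长江大桥")  # n = 7 characters
print(len(spans))  # 28 = 7 * 8 / 2, i.e. n(n+1)/2 candidate fragments
```

For a 100-character sentence this is already 5050 fragments, nearly all of them non-entities, which is the sample imbalance described above.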
Disclosure of Invention
In view of the above technical problems, embodiments of the present invention provide a nested entity classification method and system based on contrastive learning, so as to solve at least one of the above technical problems.
The invention adopts the following technical scheme:
the embodiment of the invention provides a nested entity classification method based on contrast learning, which comprises the following steps:
S1, acquiring input sentence data tables; wherein the j-th row of sentence data table i comprises $(X_{ij}, L_{ij})$, with $X_{ij}=(X_{ij}^{1},X_{ij}^{2},\dots,X_{ij}^{n_{ij}})$, where $X_{ij}^{k}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L_{ij}=\{(E_{ij}^{1},T_{ij}^{1}),(E_{ij}^{2},T_{ij}^{2}),\dots,(E_{ij}^{m_{ij}},T_{ij}^{m_{ij}})\}$, where $E_{ij}^{r}$ is the r-th entity in the j-th sentence of sentence data table i, $T_{ij}^{r}$ is the actual entity type corresponding to $E_{ij}^{r}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_{i}$, $P_{i}$ is the number of sentences in sentence data table i, and N is the number of sentence data tables;
S2, for sentence j in sentence data table i, executing the following operations:
S201, encoding sentence j twice with a pre-trained language model to obtain a first characterization vector $h1_{ij}=(h1_{ij}^{1},h1_{ij}^{2},\dots,h1_{ij}^{n_{ij}})$ and a second characterization vector $h2_{ij}=(h2_{ij}^{1},h2_{ij}^{2},\dots,h2_{ij}^{n_{ij}})$, respectively, wherein $h1_{ij}^{k}$ and $h2_{ij}^{k}$ are the characterizations obtained by the first and second encodings of $X_{ij}^{k}$;
S202, obtaining
$$\mathrm{Loss1}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)+\sum_{s\neq j}\sum_{t}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{is}^{t}\right)/\tau\right)}$$
wherein $B1_{ij}^{r}$ and $B2_{ij}^{r}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1_{ij}$ and $h2_{ij}$, respectively; $B2_{is}^{t}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $\mathrm{sim}(u,v)$ denotes the cosine similarity between $u$ and $v$;
S203, obtaining
$$\mathrm{Loss2}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)+\sum_{p\neq j}\sum_{q}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1_{ip}^{q}\right)/\tau\right)}$$
wherein $B1'_{i}$ is the entity representation of any entity, outside sentence j in sentence data table i, of the same type as the entity corresponding to the r-th entity representation, and $B1_{ip}^{q}$ is the first entity representation of an entity q, in the p-th sentence other than sentence j of sentence data table i, whose type differs from that of the entity corresponding to the r-th entity representation;
S204, optimizing $\tau$ and the dropout of the pre-trained language model so that $\mathrm{Loss1}_{ij}^{r}$ and $\mathrm{Loss2}_{ij}^{r}$ are minimized;
S205, setting j = j + 1; if $j \le P_{i}$, executing S2; otherwise, executing S3;
S3, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S4, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S5, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain a target nested entity classification model;
S6, classifying input sentences by using the target nested entity classification model.
The invention also provides a nested entity classification system based on contrastive learning, comprising a server and a database in communication connection, wherein the server comprises a processor and a memory storing a computer program, and N sentence data tables are stored in the database; the j-th row of sentence data table i comprises $(X_{ij}, L_{ij})$, with $X_{ij}=(X_{ij}^{1},X_{ij}^{2},\dots,X_{ij}^{n_{ij}})$, where $X_{ij}^{k}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L_{ij}=\{(E_{ij}^{1},T_{ij}^{1}),(E_{ij}^{2},T_{ij}^{2}),\dots,(E_{ij}^{m_{ij}},T_{ij}^{m_{ij}})\}$, where $E_{ij}^{r}$ is the r-th entity in the j-th sentence of sentence data table i, $T_{ij}^{r}$ is the actual entity type corresponding to $E_{ij}^{r}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_{i}$, and $P_{i}$ is the number of sentences in sentence data table i;
the processor is configured to execute the computer program to implement the following steps:
S10, for sentence j in sentence data table i, executing the following operations:
S101, encoding sentence j twice with the pre-trained language model BERT to obtain a first characterization vector $h1_{ij}=(h1_{ij}^{1},h1_{ij}^{2},\dots,h1_{ij}^{n_{ij}})$ and a second characterization vector $h2_{ij}=(h2_{ij}^{1},h2_{ij}^{2},\dots,h2_{ij}^{n_{ij}})$, respectively, wherein $h1_{ij}^{k}$ and $h2_{ij}^{k}$ are the characterizations obtained by the first and second encodings of $X_{ij}^{k}$;
S102, obtaining
$$\mathrm{Loss1}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)+\sum_{s\neq j}\sum_{t}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{is}^{t}\right)/\tau\right)}$$
wherein $B1_{ij}^{r}$ and $B2_{ij}^{r}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1_{ij}$ and $h2_{ij}$, respectively; $B2_{is}^{t}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $\mathrm{sim}(u,v)$ denotes the cosine similarity between $u$ and $v$;
S103, obtaining
$$\mathrm{Loss2}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)+\sum_{p\neq j}\sum_{q}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1_{ip}^{q}\right)/\tau\right)}$$
wherein $B1'_{i}$ is the entity representation of any entity, outside sentence j in sentence data table i, of the same type as the entity corresponding to the r-th entity representation, and $B1_{ip}^{q}$ is the first entity representation of an entity q, in the p-th sentence other than sentence j of sentence data table i, whose type differs from that of the entity corresponding to the r-th entity representation;
S104, optimizing $\tau$ and the dropout of the pre-trained language model so that $\mathrm{Loss1}_{ij}^{r}$ and $\mathrm{Loss2}_{ij}^{r}$ are minimized;
S105, setting j = j + 1; if $j \le P_{i}$, executing S10; otherwise, executing S20;
S20, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S30, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S40, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain the target nested entity classification model.
The embodiment of the invention has at least the following technical effects: the target nested entity classification model for nested entity classification is obtained in two stages, where the first stage learns entity representations by contrastive learning and the second stage adopts the fragment method. Because the characteristics of the samples are learned in the first stage, the proportion of negative samples can be reduced, model convergence is accelerated, the model results are more stable, and entity boundaries are more clearly distinguished.
Detailed Description
To make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below.
An embodiment of the invention provides a nested entity classification method based on contrastive learning, which may include the following steps:
S1, acquiring input sentence data tables; wherein the j-th row of sentence data table i comprises $(X_{ij}, L_{ij})$, with $X_{ij}=(X_{ij}^{1},X_{ij}^{2},\dots,X_{ij}^{n_{ij}})$, where $X_{ij}^{k}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L_{ij}=\{(E_{ij}^{1},T_{ij}^{1}),(E_{ij}^{2},T_{ij}^{2}),\dots,(E_{ij}^{m_{ij}},T_{ij}^{m_{ij}})\}$, where $E_{ij}^{r}$ is the r-th entity in the j-th sentence of sentence data table i, $T_{ij}^{r}$ is the actual entity type corresponding to $E_{ij}^{r}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_{i}$, $P_{i}$ is the number of sentences in sentence data table i, and N is the number of sentence data tables.
In an embodiment of the present invention, the number of sentences in each sentence data table may be the same, i.e., $P_1 = P_2 = \dots = P_N$.
In another embodiment of the present invention, the number of sentences in the first N-1 sentence data tables may be the same, i.e., $P_1 = P_2 = \dots = P_{N-1} = P$, and the number of sentences in the last sentence data table equals $M - (N-1) \times P$, where M is the total number of sentences; a partitioning of this kind is sketched below.
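As an illustration of this partitioning (a minimal sketch; the helper name is hypothetical and not taken from the patent):

```python
# Hypothetical sketch: split M annotated sentences into data tables of P
# sentences each; the last table keeps the remaining M - (N-1) * P sentences.
def build_sentence_tables(sentences, p):
    return [sentences[i:i + p] for i in range(0, len(sentences), p)]

tables = build_sentence_tables(list(range(10)), p=4)  # M = 10, P = 4
print([len(t) for t in tables])  # [4, 4, 2] -> N = 3 data tables
```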
S2, for sentence j in sentence data table i, executing the following operations:
S201, encoding sentence j twice with a pre-trained language model to obtain a first characterization vector $h1_{ij}=(h1_{ij}^{1},h1_{ij}^{2},\dots,h1_{ij}^{n_{ij}})$ and a second characterization vector $h2_{ij}=(h2_{ij}^{1},h2_{ij}^{2},\dots,h2_{ij}^{n_{ij}})$, respectively, wherein $h1_{ij}^{k}$ and $h2_{ij}^{k}$ are the characterizations obtained by the first and second encodings of $X_{ij}^{k}$.
In an exemplary embodiment of the invention, the pre-trained language model may be a BERT model. Because the random mask mechanism in BERT causes even repeated encodings of the same sentence to differ, this characteristic is exploited to encode each sample twice and thereby generate the positive samples required for contrastive learning.
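A minimal sketch of this twice-encoding trick, assuming the Hugging Face transformers library and the bert-base-chinese checkpoint (neither is prescribed by the patent):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")
model.train()  # keep dropout active so two passes over one sentence differ

inputs = tokenizer("张三在北京大学读书", return_tensors="pt")
h1 = model(**inputs).last_hidden_state  # first characterization vector
h2 = model(**inputs).last_hidden_state  # second pass yields a different view
print(torch.allclose(h1, h2))  # False: the two encodings form a positive pair
```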
In another exemplary embodiment of the present invention, the pre-trained language model is a RoBERTa model.
Those skilled in the art will appreciate that methods of encoding sentences using pre-trained language models may belong to the prior art.
S202, obtaining
$$\mathrm{Loss1}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)+\sum_{s\neq j}\sum_{t}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{is}^{t}\right)/\tau\right)}$$
wherein $B1_{ij}^{r}$ and $B2_{ij}^{r}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1_{ij}$ and $h2_{ij}$, respectively; $B2_{is}^{t}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $\mathrm{sim}(u,v)$ denotes the cosine similarity between $u$ and $v$.
In the embodiment of the invention, the loss function Loss1 makes the representations of the same entity word in the two encoding results similar.
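One way to realize Loss1 is the standard InfoNCE form sketched below (an assumption-laden sketch: entity representations are taken as fixed-size vectors, and the function name and temperature value are illustrative):

```python
import torch
import torch.nn.functional as F

def loss1(b1_r, b2_r, b2_neg, tau=0.05):
    """Pull the two encodings of one entity together and push the entity away
    from second encodings of entities in other sentences of the data table."""
    pos = F.cosine_similarity(b1_r, b2_r, dim=-1) / tau                 # sim(B1, B2)/tau
    neg = F.cosine_similarity(b1_r.unsqueeze(0), b2_neg, dim=-1) / tau  # negatives
    logits = torch.cat([pos.view(1), neg])
    return torch.logsumexp(logits, dim=0) - logits[0]  # -log softmax of the positive

print(loss1(torch.randn(768), torch.randn(768), torch.randn(12, 768)))
```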
S203, obtaining
$$\mathrm{Loss2}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)+\sum_{p\neq j}\sum_{q}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1_{ip}^{q}\right)/\tau\right)}$$
wherein $B1'_{i}$ is the entity representation of any entity, outside sentence j in sentence data table i, of the same type as the entity corresponding to the r-th entity representation, and $B1_{ip}^{q}$ is the first entity representation of an entity q, in the p-th sentence other than sentence j of sentence data table i, whose type differs from that of the entity corresponding to the r-th entity representation.
In the embodiment of the invention, the loss function Loss2 makes entity words of the same type close and entity words of different types far apart within the current sentence data table.
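Loss2 admits the same form with supervised positives and negatives (again a sketch, reusing torch and F from the previous block; the pairing of one same-type positive against different-type negatives follows the description, everything else is illustrative):

```python
def loss2(b1_r, b1_same_type, b1_diff_type, tau=0.05):
    """A same-type entity from another sentence is the positive; entities of
    other types in the same data table are the negatives."""
    pos = F.cosine_similarity(b1_r, b1_same_type, dim=-1) / tau
    neg = F.cosine_similarity(b1_r.unsqueeze(0), b1_diff_type, dim=-1) / tau
    logits = torch.cat([pos.view(1), neg])
    return torch.logsumexp(logits, dim=0) - logits[0]
```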
S204, optimizing $\tau$ and the dropout of the pre-trained language model so that $\mathrm{Loss1}_{ij}^{r}$ and $\mathrm{Loss2}_{ij}^{r}$ are minimized.
Those skilled in the art will appreciate that optimizing $\tau$ and dropout in the pre-trained language model so as to minimize $\mathrm{Loss1}_{ij}^{r}$ and $\mathrm{Loss2}_{ij}^{r}$ may be implemented with prior-art techniques.
Through S204, the values of $\tau$ and dropout optimized in the first stage can be obtained.
S205, setting j = j + 1; if $j \le P_{i}$, executing S2; otherwise, executing S3.
S3, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples.
In the embodiment of the invention, the set number can be chosen based on actual needs. Specifically, each training sample is a sentence containing both positive and negative samples, where the positive samples are the entities in the sentence.
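A hedged sketch of this second-stage sampling (the helper name and the "O" label for non-entity fragments are assumptions, not patent terminology):

```python
import random

def sample_fragments(sentence, entities, num_negatives=5):
    """Keep all labeled entities as positives and draw only a set number of
    non-entity fragments as negatives, instead of all n(n+1)/2 of them."""
    n = len(sentence)
    entity_spans = {(s, e) for s, e, _ in entities}
    candidates = [(s, e) for s in range(n) for e in range(s, n)
                  if (s, e) not in entity_spans]
    negatives = random.sample(candidates, min(num_negatives, len(candidates)))
    return entities, [(s, e, "O") for s, e in negatives]

pos, neg = sample_fragments("张三在北京大学读书", [(0, 1, "PER"), (3, 6, "ORG")])
print(pos, neg)  # 2 positives, 5 sampled negatives
```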
S4, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results.
The classification prediction results may include a classification result, i.e., a predicted type, for each fragment in the training set.
S5, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain the target nested entity classification model.
In embodiments of the present invention, the optimized pre-trained language model may be optimized based on the F1 score. Because the positive samples in the training set are labeled, i.e., the entity type of each entity is known, the fragments of a sentence that belong to no entity type are also known, so classification accuracy can be obtained by comparing the predicted types with the actual types. Those skilled in the art will appreciate that determining classification accuracy based on the F1 score may belong to the prior art.
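A minimal way to score the predictions, assuming micro F1 over (start, end, type) triples (the patent names only the F1 score, not this exact formulation):

```python
def span_f1(predicted, actual):
    """Micro F1 over (start, end, type) triples for one batch of sentences."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)  # spans with both boundaries and type correct
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(span_f1([(0, 1, "PER"), (3, 5, "ORG")], [(0, 1, "PER"), (3, 6, "ORG")]))  # 0.5
```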
If the classification accuracy is greater than or equal to a set threshold, the current classification model is considered accurate and can be used as the target nested entity classification model; if the classification accuracy is below the threshold, $\tau$ and dropout continue to be adjusted until the accuracy reaches the threshold.
Because S1 and S2 learn the characteristics of the samples, the proportion of negative samples can be reduced, model convergence is accelerated, the model results are more stable, and entity boundaries are more clearly distinguished.
S6, classifying the input sentences by using the target nested entity classification model.
In practical applications, input sentences can be classified directly with the obtained target nested entity classification model.
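In that setting, inference can reduce to typing every fragment of the input (a hypothetical usage sketch: model.predict is an assumed interface, and enumerate_spans is the helper sketched in the Background discussion):

```python
def classify_sentence(model, sentence):
    """Type every candidate fragment with the trained model; keep entity spans."""
    results = []
    for start, end in enumerate_spans(sentence):      # all candidate fragments
        label = model.predict(sentence, start, end)   # assumed model interface
        if label != "O":                              # drop non-entity fragments
            results.append((sentence[start:end + 1], label))
    return results
```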
The invention also provides a nested entity classification system based on contrastive learning, comprising a server and a database in communication connection, wherein the server comprises a processor and a memory storing a computer program, and N sentence data tables are stored in the database; the j-th row of sentence data table i comprises $(X_{ij}, L_{ij})$, with $X_{ij}=(X_{ij}^{1},X_{ij}^{2},\dots,X_{ij}^{n_{ij}})$, where $X_{ij}^{k}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L_{ij}=\{(E_{ij}^{1},T_{ij}^{1}),(E_{ij}^{2},T_{ij}^{2}),\dots,(E_{ij}^{m_{ij}},T_{ij}^{m_{ij}})\}$, where $E_{ij}^{r}$ is the r-th entity in the j-th sentence of sentence data table i, $T_{ij}^{r}$ is the actual entity type corresponding to $E_{ij}^{r}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_{i}$, and $P_{i}$ is the number of sentences in sentence data table i;
the processor is configured to execute the computer program to implement the following steps:
S10, for sentence j in sentence data table i, executing the following operations:
S101, encoding sentence j twice with the pre-trained language model BERT to obtain a first characterization vector $h1_{ij}=(h1_{ij}^{1},h1_{ij}^{2},\dots,h1_{ij}^{n_{ij}})$ and a second characterization vector $h2_{ij}=(h2_{ij}^{1},h2_{ij}^{2},\dots,h2_{ij}^{n_{ij}})$, respectively, wherein $h1_{ij}^{k}$ and $h2_{ij}^{k}$ are the characterizations obtained by the first and second encodings of $X_{ij}^{k}$;
S102, obtaining
$$\mathrm{Loss1}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)+\sum_{s\neq j}\sum_{t}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{is}^{t}\right)/\tau\right)}$$
wherein $B1_{ij}^{r}$ and $B2_{ij}^{r}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1_{ij}$ and $h2_{ij}$, respectively; $B2_{is}^{t}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $\mathrm{sim}(u,v)$ denotes the cosine similarity between $u$ and $v$;
S103, obtaining
$$\mathrm{Loss2}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)+\sum_{p\neq j}\sum_{q}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1_{ip}^{q}\right)/\tau\right)}$$
wherein $B1'_{i}$ is the entity representation of any entity, outside sentence j in sentence data table i, of the same type as the entity corresponding to the r-th entity representation, and $B1_{ip}^{q}$ is the first entity representation of an entity q, in the p-th sentence other than sentence j of sentence data table i, whose type differs from that of the entity corresponding to the r-th entity representation;
S104, optimizing $\tau$ and the dropout of the pre-trained language model so that $\mathrm{Loss1}_{ij}^{r}$ and $\mathrm{Loss2}_{ij}^{r}$ are minimized;
S105, setting j = j + 1; if $j \le P_{i}$, executing S10; otherwise, executing S20;
S20, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S30, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S40, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain the target nested entity classification model.
Further, in S40, the optimized pre-trained language model is optimized based on the F1 score.
Further, the pre-trained language model is a BERT model.
Further, the pre-trained language model is a RoBERTa model.
Further, $P_1 = P_2 = \dots = P_N$.
For the implementation of this embodiment, reference may be made to the foregoing embodiments.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (10)

1. A nested entity classification method based on contrastive learning, applied to the technical field of natural language processing, characterized by comprising the following steps:
S1, acquiring input sentence data tables; wherein the j-th row of sentence data table i comprises $(X_{ij}, L_{ij})$, with $X_{ij}=(X_{ij}^{1},X_{ij}^{2},\dots,X_{ij}^{n_{ij}})$, where $X_{ij}^{k}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L_{ij}=\{(E_{ij}^{1},T_{ij}^{1}),(E_{ij}^{2},T_{ij}^{2}),\dots,(E_{ij}^{m_{ij}},T_{ij}^{m_{ij}})\}$, where $E_{ij}^{r}$ is the r-th entity in the j-th sentence of sentence data table i, $T_{ij}^{r}$ is the actual entity type corresponding to $E_{ij}^{r}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_{i}$, $P_{i}$ is the number of sentences in sentence data table i, and N is the number of sentence data tables;
S2, for sentence j in sentence data table i, executing the following operations:
S201, encoding sentence j twice with a pre-trained language model to obtain a first characterization vector $h1_{ij}=(h1_{ij}^{1},h1_{ij}^{2},\dots,h1_{ij}^{n_{ij}})$ and a second characterization vector $h2_{ij}=(h2_{ij}^{1},h2_{ij}^{2},\dots,h2_{ij}^{n_{ij}})$, respectively, wherein $h1_{ij}^{k}$ and $h2_{ij}^{k}$ are the characterizations obtained by the first and second encodings of $X_{ij}^{k}$;
S202, obtaining
$$\mathrm{Loss1}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)+\sum_{s\neq j}\sum_{t}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{is}^{t}\right)/\tau\right)}$$
wherein $B1_{ij}^{r}$ and $B2_{ij}^{r}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1_{ij}$ and $h2_{ij}$, respectively; $B2_{is}^{t}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $\mathrm{sim}(u,v)$ denotes the cosine similarity between $u$ and $v$;
S203, obtaining
$$\mathrm{Loss2}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)+\sum_{p\neq j}\sum_{q}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1_{ip}^{q}\right)/\tau\right)}$$
wherein $B1'_{i}$ is the entity representation of any entity, outside sentence j in sentence data table i, of the same type as the entity corresponding to the r-th entity representation, and $B1_{ip}^{q}$ is the first entity representation of an entity q, in the p-th sentence other than sentence j of sentence data table i, whose type differs from that of the entity corresponding to the r-th entity representation;
S204, optimizing $\tau$ and the dropout of the pre-trained language model so that $\mathrm{Loss1}_{ij}^{r}$ and $\mathrm{Loss2}_{ij}^{r}$ are minimized;
S205, setting j = j + 1; if $j \le P_{i}$, executing S2; otherwise, executing S3;
S3, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S4, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S5, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain a target nested entity classification model;
S6, classifying input sentences by using the target nested entity classification model.
2. The method of claim 1, wherein in S5 the optimized pre-trained language model is optimized based on the F1 score.
3. The method of claim 1, wherein the pre-trained language model is a BERT model.
4. The method of claim 1, wherein the pre-trained language model is a RoBERTa model.
5. The method of claim 1, wherein $P_1 = P_2 = \dots = P_N$.
6. A nested entity classification system based on contrastive learning, applied to the technical field of natural language processing, characterized by comprising a server and a database in communication connection, wherein the server comprises a processor and a memory storing a computer program, and N sentence data tables are stored in the database; the j-th row of sentence data table i comprises $(X_{ij}, L_{ij})$, with $X_{ij}=(X_{ij}^{1},X_{ij}^{2},\dots,X_{ij}^{n_{ij}})$, where $X_{ij}^{k}$ is the k-th character in the j-th sentence of sentence data table i, k takes values 1 to $n_{ij}$, and $n_{ij}$ is the number of characters in the j-th sentence of sentence data table i; $L_{ij}=\{(E_{ij}^{1},T_{ij}^{1}),(E_{ij}^{2},T_{ij}^{2}),\dots,(E_{ij}^{m_{ij}},T_{ij}^{m_{ij}})\}$, where $E_{ij}^{r}$ is the r-th entity in the j-th sentence of sentence data table i, $T_{ij}^{r}$ is the actual entity type corresponding to $E_{ij}^{r}$, r takes values 1 to $m_{ij}$, and $m_{ij}$ is the number of entities in the j-th sentence of sentence data table i; i takes values 1 to N, j takes values 1 to $P_{i}$, and $P_{i}$ is the number of sentences in sentence data table i;
the processor is configured to execute the computer program to implement the following steps:
S10, for sentence j in sentence data table i, executing the following operations:
S101, encoding sentence j twice with the pre-trained language model BERT to obtain a first characterization vector $h1_{ij}=(h1_{ij}^{1},h1_{ij}^{2},\dots,h1_{ij}^{n_{ij}})$ and a second characterization vector $h2_{ij}=(h2_{ij}^{1},h2_{ij}^{2},\dots,h2_{ij}^{n_{ij}})$, respectively, wherein $h1_{ij}^{k}$ and $h2_{ij}^{k}$ are the characterizations obtained by the first and second encodings of $X_{ij}^{k}$;
S102, obtaining
$$\mathrm{Loss1}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{ij}^{r}\right)/\tau\right)+\sum_{s\neq j}\sum_{t}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B2_{is}^{t}\right)/\tau\right)}$$
wherein $B1_{ij}^{r}$ and $B2_{ij}^{r}$ are the first and second entity representations of the r-th entity in the entity representation vectors corresponding to $h1_{ij}$ and $h2_{ij}$, respectively; $B2_{is}^{t}$ is the second entity representation of the t-th entity in the s-th sentence, other than sentence j, of sentence data table i; $\tau$ is a temperature hyperparameter; and $\mathrm{sim}(u,v)$ denotes the cosine similarity between $u$ and $v$;
S103, obtaining
$$\mathrm{Loss2}_{ij}^{r}=-\log\frac{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1'_{i}\right)/\tau\right)+\sum_{p\neq j}\sum_{q}\exp\left(\mathrm{sim}\left(B1_{ij}^{r},B1_{ip}^{q}\right)/\tau\right)}$$
wherein $B1'_{i}$ is the entity representation of any entity, outside sentence j in sentence data table i, of the same type as the entity corresponding to the r-th entity representation, and $B1_{ip}^{q}$ is the first entity representation of an entity q, in the p-th sentence other than sentence j of sentence data table i, whose type differs from that of the entity corresponding to the r-th entity representation;
S104, optimizing $\tau$ and the dropout of the pre-trained language model so that $\mathrm{Loss1}_{ij}^{r}$ and $\mathrm{Loss2}_{ij}^{r}$ are minimized;
S105, setting j = j + 1; if $j \le P_{i}$, executing S10; otherwise, executing S20;
S20, enumerating the fragments of each sentence, randomly extracting a set number of non-entity fragments as negative samples, and obtaining a training set comprising N training samples;
S30, inputting the training set into the optimized pre-trained language model and classifying the types of the entities in each sentence to obtain classification prediction results;
S40, optimizing the optimized pre-trained language model based on the classification prediction results and the actual entity types in each sentence to obtain the target nested entity classification model.
7. The system of claim 6, wherein in S40 the optimized pre-trained language model is optimized based on the F1 score.
8. The system of claim 6, wherein the pre-trained language model is a BERT model.
9. The system of claim 6, wherein the pre-trained language model is a RoBERTa model.
10. The system of claim 6, wherein $P_1 = P_2 = \dots = P_N$.
CN202210247571.XA (filed 2022-03-14): Nested entity identification method and system based on contrastive learning. Status: Active. Granted as CN114462391B.

Priority Applications (1)

Application Number: CN202210247571.XA (granted as CN114462391B) · Priority Date: 2022-03-14 · Filing Date: 2022-03-14 · Title: Nested entity identification method and system based on contrastive learning


Publications (2)

Publication Number · Publication Date
CN114462391A: 2022-05-10
CN114462391B: 2024-05-14

Family

ID=81417788

Family Applications (1)

Application Number: CN202210247571.XA (Active, granted as CN114462391B) · Priority Date: 2022-03-14 · Filing Date: 2022-03-14 · Title: Nested entity identification method and system based on contrastive learning

Country Status (1)

CN: CN114462391B


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN111753545A * 2020-06-19 2020-10-09 科大讯飞(苏州)科技有限公司 Nested entity recognition method and device, electronic equipment and storage medium
CN113886571A * 2020-07-01 2022-01-04 北京三星通信技术研究有限公司 Entity identification method, entity identification device, electronic equipment and computer readable storage medium
WO2022005188A1 * 2020-07-01 2022-01-06 Samsung Electronics Co., Ltd. Entity recognition method, apparatus, electronic device and computer readable storage medium
CN112487812A * 2020-10-21 2021-03-12 上海旻浦科技有限公司 Nested entity identification method and system based on boundary identification
CN112347785A * 2020-11-18 2021-02-09 湖南国发控股有限公司 Nested entity recognition system based on multitask learning
CN113869051A * 2021-09-22 2021-12-31 西安理工大学 Named entity identification method based on deep learning

Also Published As

Publication number · Publication date
CN114462391A: 2022-05-10


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant