CN116644751A - Cross-domain named entity identification method, equipment, storage medium and product based on span comparison learning - Google Patents

Info

Publication number
CN116644751A
Authority
CN
China
Prior art keywords
domain
span
loss
learning
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310621806.1A
Other languages
Chinese (zh)
Inventor
王也
史宸枭
韩启龙
宋洪涛
刘鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Longming Technology Co ltd
Harbin Engineering University
Original Assignee
Harbin Longming Technology Co ltd
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Longming Technology Co ltd, Harbin Engineering University
Priority to CN202310621806.1A
Publication of CN116644751A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0895 Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

A cross-domain named entity recognition method, device, storage medium and product based on span contrastive learning belong to the technical field of named entity recognition and address domain shift and the poor learning of span boundary information. The method comprises: constructing a cross-domain named entity recognition model based on span contrastive learning using a pre-trained language model, adversarial training, contrastive learning, KL divergence and related techniques; converting the sequence labels into a global boundary matrix using entity boundary information; performing span-level contrastive learning and computing the KL divergence; and continuously updating the relevant model parameters during training. Entity boundary information is fully taken into account and more domain-invariant information is learned, which greatly improves cross-domain named entity recognition performance. The method is suitable for cross-domain named entity recognition.

Description

Cross-domain named entity recognition method, device, storage medium and product based on span contrastive learning
Technical Field
The application relates to the technical field of named entity recognition, in particular to cross-domain named entity recognition.
Background
Named entity recognition (NER) is the task of automatically recognizing named entities in text, such as person names, place names and organization names, and classifying them into types. Deep learning models have outperformed traditional machine learning methods in feature extraction depth and model performance, but they require large amounts of labeled data. NER is therefore difficult when data resources are scarce, as in certain domains and languages. Domain adaptation is an important part of transfer learning, and domain shift is a common problem in domain adaptation. Domain shift refers to the degradation of a model's performance when it is transferred from one domain to another because the training set and the test set do not follow the same underlying distribution.
A wide range of algorithms have been proposed to alleviate domain shift, such as domain-adversarial neural networks (DANN) and distribution matching. However, these algorithms all have problems. For DANN, the instability of the joint optimization process requires extensive hyper-parameter tuning. Distribution matching algorithms, in turn, find it difficult to preserve the model's discriminative ability on the target task while attempting instance-level alignment. There is therefore a need for a stable and efficient solution that learns both domain invariance and instance-level matching for unsupervised domain adaptation.
In recent self-supervised learning (SSL) research, contrastive learning (CL) has proven to be an effective way to learn instance-level representations using proxy tasks defined on the raw data. From a domain adaptation perspective, however, constructing cross-domain positive samples and aligning domain-agnostic pairs have received little attention in the literature. Previous work focused on designing label-preserving text transformations, such as back-translation, synonym substitution, dropout, and combinations thereof.
Disclosure of Invention
The invention addresses domain shift and the poor learning of span boundary information in cross-domain named entity recognition, and provides a cross-domain named entity recognition method, device, storage medium and product based on span contrastive learning.
The invention is realized by the following technical solution. In one aspect, the invention provides a cross-domain named entity recognition method based on span contrastive learning, comprising the following steps:
Step 1, acquiring a source domain data set and a target domain data set, preprocessing the data sets, and dividing them into a training set and a test set;
Step 2, constructing a cross-domain named entity recognition model based on span contrastive learning, specifically comprising:
Step 2.1, obtaining embedded representations of the source domain data and the target domain data, and assigning the corresponding domain labels to the source domain and the target domain;
Step 2.2, constructing domain-confused augmented samples: inputting the source domain and target domain embeddings obtained in step 2.1 into the pre-trained language model BERT, generating adversarial samples with the projected gradient descent (PGD) method, and classifying the domain under the adversarial attack;
Step 2.3, generating global boundary prediction matrices, specifically comprising:
inputting the source domain embeddings into BERT and using the output to construct a global boundary prediction matrix with GlobalPointer; inputting the source domain embeddings together with the domain-confused augmented samples generated in step 2.2 into BERT and using the output to construct, with GlobalPointer, a global boundary prediction matrix that includes the adversarial perturbation;
Step 3, training the cross-domain named entity recognition model constructed in step 2, specifically comprising:
Step 3.1, calculating the named entity recognition loss on the source domain with a cross-entropy loss function, using the global boundary prediction matrix obtained from the source domain embeddings in step 2.3;
Step 3.2, calculating the contrastive learning loss from the similarity and dissimilarity of the vectors of all entity spans contained in the two global boundary prediction matrices obtained in step 2.3;
Step 3.3, calculating the KL divergence loss over all entity spans contained in the two global boundary prediction matrices obtained in step 2.3, so that the generated adversarial samples are more consistent with the distribution predicted by the model;
Step 3.4, combining the loss functions of steps 3.1, 3.2 and 3.3 into a joint loss function, updating the model parameters to optimize it, and obtaining by training the optimal cross-domain named entity recognition model based on span contrastive learning;
Step 4, inputting the target domain test set into the model trained and optimized in step 3, and calculating the scores of the target domain entities.
Further, step 2.2 specifically includes:
assume a source data set with n labeled examples, $D_S = \{x_i, y_i\}_{i=1,\dots,n}$, where $x_i$ is a token sequence and $y_i$ is the label of $x_i$; the data of the source data set are sampled independently and identically distributed (i.i.d.) from the source domain;
and a target data set with m unlabeled examples, $D_T = \{x_j\}_{j=1,\dots,m}$, where $x_j$ is a token sequence; the data of the target data set are sampled i.i.d. from the target domain;
the model aims to learn a function $f(x; \theta_f, \theta_y): x \to C$, whose input is a token sequence and whose output is the corresponding label, where $\theta_f$ denotes the parameters of the pre-trained language model, $\theta_y$ denotes the parameters of class label prediction, and $C$ is the label set;
$\mathcal{L}$ is the loss of the model on the classification task, and the purpose of model learning is to minimize this loss; the specific formula is as follows:
wherein $(x, y) \sim D_S$ indicates that both the sequence and the label come from the source domain; within a single domain, adversarial training is a min-max problem that maximizes the inner loss and minimizes the outer loss;
wherein $\delta$ is the generated adversarial perturbation;
wherein $\alpha_{adv}$ controls the trade-off between the two losses and is usually set to 1;
the adversarial perturbation may be generated by the following iterative step:
where $\epsilon$ is the upper bound of the adversarial perturbation, $\eta$ is the step size of the attack, $\delta_t$ is the adversarial perturbation generated at the current iteration step, the gradient is that of the classification loss at step t with respect to the input at step t, the projection operator maps the perturbation back into the specified range if it exceeds $\epsilon$, and $\|\cdot\|_F$ denotes the Frobenius norm;
generating adversarial samples with domain confusion:
wherein the adversarial attack is applied to the domain-specific loss of the learned domain classifier, $\delta_0$ is the initial adversarial perturbation, $\theta_d$ denotes the parameters of the domain classification, and $d$ is the domain label; the perturbation $\delta$ is synthesized by searching the embedding space for the extreme direction that most confuses the domain classifier, and $f(x + \delta; \theta_f)$ is the domain puzzle produced by the pre-trained language model; the gradient used here is that of the domain classification loss at step t with respect to the input at step t.
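The formulas referred to in step 2.2 are not reproduced in this text. A form consistent with the symbol definitions above, written in standard PGD adversarial-training notation, is sketched below. It is an assumption rather than the application's exact wording; in particular the gradient symbols $g^{y}_t$ and $g^{d}_t$, the projection symbol $\Pi$ and the domain loss symbol $\mathcal{L}_d$ are chosen here for illustration.

```latex
\begin{aligned}
&\min_{\theta_f,\theta_y}\ \sum_{(x,y)\sim D_S}
   \mathcal{L}\big(f(x;\theta_f,\theta_y),\,y\big)
   && \text{(classification loss)}\\
&\min_{\theta_f,\theta_y}\ \sum_{(x,y)\sim D_S}
   \max_{\|\delta\|_F\le\epsilon}
   \mathcal{L}\big(f(x+\delta;\theta_f,\theta_y),\,y\big)
   && \text{(adversarial min-max)}\\
&\min_{\theta_f,\theta_y}\ \sum_{(x,y)\sim D_S}\Big[
   \mathcal{L}\big(f(x;\theta_f,\theta_y),y\big)
   +\alpha_{adv}\max_{\|\delta\|_F\le\epsilon}
   \mathcal{L}\big(f(x+\delta;\theta_f,\theta_y),y\big)\Big]
   && \text{(combined objective)}\\
&\delta_{t+1}=\Pi_{\|\delta\|_F\le\epsilon}
   \Big(\delta_t+\eta\,\frac{g^{y}_t}{\|g^{y}_t\|_F}\Big),
   \quad g^{y}_t=\nabla_{\delta}\,
   \mathcal{L}\big(f(x+\delta_t;\theta_f,\theta_y),y\big)
   && \text{(PGD step)}\\
&\delta_{t+1}=\Pi_{\|\delta\|_F\le\epsilon}
   \Big(\delta_t+\eta\,\frac{g^{d}_t}{\|g^{d}_t\|_F}\Big),
   \quad g^{d}_t=\nabla_{\delta}\,
   \mathcal{L}_d\big(f(x+\delta_t;\theta_f,\theta_d),d\big)
   && \text{(domain-confusing step)}
\end{aligned}
```

The last two lines differ only in which loss supplies the gradient: the task loss for ordinary adversarial training, the domain classifier loss for the domain puzzle.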
Further, step 2.3 specifically includes:
let $S = [s_1, s_2, \dots, s_m]$ be the possible spans in a sentence; a span $s$ is denoted $s[i:j]$, where $i$ and $j$ are its head index and tail index, respectively; the goal of named entity recognition is to identify all $s \in E$, where $E$ is the set of entity types; given a sentence $X = [x_1, x_2, \dots, x_n]$ with n tokens, each token in $X$ is first associated with its representation in the pre-trained language model, giving a sentence representation matrix $H \in \mathbb{R}^{n \times v}$, where $v$ is the dimension:
$h_1, h_2, \dots, h_n = \mathrm{BERT}(x_1, x_2, \dots, x_n)$
after obtaining the sentence representation $H$, the span representation may be calculated using two feed-forward layers that depend on the start and end indices of the span:
$q_{i,\alpha} = W_{q,\alpha} h_i + b_{q,\alpha}$
$k_{j,\alpha} = W_{k,\alpha} h_j + b_{k,\alpha}$
wherein $q_{i,\alpha}$ and $k_{j,\alpha}$ are the vector representations of the tokens used to identify entities of type $\alpha$, namely the start and end positions of a span $s[i:j]$ of type $\alpha$; $W_{q,\alpha}$ and $W_{k,\alpha}$ are the weights applied to $h_i$ and $h_j$; and $b_{q,\alpha}$ and $b_{k,\alpha}$ are the biases; the score of span $s[i:j]$ belonging to type $\alpha$ is calculated as follows:
a scoring function is calculated for each span, and the global boundary prediction matrix is generated from the scoring function;
wherein $R_i$ and $R_j$ are both orthogonal matrices.
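The scoring formula itself is not reproduced in this text. In the published GlobalPointer formulation, which matches the definitions of $q_{i,\alpha}$, $k_{j,\alpha}$ and the orthogonal matrices above, the score takes the following form; treating it as the application's formula is an assumption.

```latex
s_\alpha(i,j)
  = \big(R_i\,q_{i,\alpha}\big)^{\top}\big(R_j\,k_{j,\alpha}\big)
  = q_{i,\alpha}^{\top}\,R_i^{\top}R_j\,k_{j,\alpha}
  = q_{i,\alpha}^{\top}\,R_{j-i}\,k_{j,\alpha}
```

The last equality holds when the rotation matrices satisfy $R_i^{\top} R_j = R_{j-i}$, so the score depends on the positions only through the offset $j - i$.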
Further, step 3.1 specifically includes:
calculating the score of each entity with the scoring function obtained in step 2.3;
the cross-entropy loss function is set as follows:
wherein $q$ and $k$ denote the start index and end index of a span, respectively, $P_\alpha$ is the set of spans that are entities of type $\alpha$, $Q_\alpha$ is the set of spans that are not entities or whose entity type is not $\alpha$, and $s_\alpha(q, k)$ is the score of the span as an entity of type $\alpha$; spans satisfying $s_\alpha(q, k) > 0$ are output as entities of type $\alpha$.
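The loss formula is not reproduced in this text. The multi-label cross-entropy used in the published GlobalPointer work fits the definitions of $P_\alpha$, $Q_\alpha$ and $s_\alpha(q,k)$ given above, including the decision threshold at 0; it is shown here as a likely form, not as a confirmed transcription.

```latex
\mathcal{L}_{ner}
  = \log\Big(1+\sum_{(q,k)\in P_\alpha} e^{-s_\alpha(q,k)}\Big)
  + \log\Big(1+\sum_{(q,k)\in Q_\alpha} e^{\,s_\alpha(q,k)}\Big)
```

The first term pushes the scores of true entity spans above 0 and the second pushes all other spans below 0, which keeps the loss balanced even though non-entity spans vastly outnumber entity spans.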
Further, in step 3.2, for an input sentence, each entity span is represented as a vector, and the contrastive loss is calculated from the similarity and dissimilarity of the vectors of all entity spans contained in the sentence;
the contrastive learning loss function is calculated as follows:
wherein $N$ is the maximum sentence length, $M$ is the number of negative examples, $span(i, j)$ is the span representation, $span(i, j)^+$ is the positive example for the current sentence, namely the augmentation of the source domain data produced by adversarial training, $span(i, j)^-$ is a negative example for the current sentence, namely a span whose label differs from that of the current token, and cosine similarity ($\cos$) is used to measure the distance between the original sample and the positive and negative samples.
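The contrastive loss formula is not reproduced in this text. One common InfoNCE-style form that uses exactly the quantities defined above is sketched below. The temperature $\tau$ and the summation over all spans with $i \le j$ are assumptions introduced for illustration.

```latex
\mathcal{L}_{cl}
  = -\sum_{1\le i\le j\le N}
    \log
    \frac{\exp\!\big(\cos(span(i,j),\,span(i,j)^{+})/\tau\big)}
         {\exp\!\big(\cos(span(i,j),\,span(i,j)^{+})/\tau\big)
          +\sum_{m=1}^{M}\exp\!\big(\cos(span(i,j),\,span(i,j)^{-}_{m})/\tau\big)}
```

Minimizing this loss raises the cosine similarity between a span and its adversarially augmented counterpart while lowering it for the $M$ negative spans.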
Further, in step 3.3, the loss function calculation process of the KL divergence is as follows:
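The KL divergence formula is not reproduced in this text. A generic form is given below under stated assumptions: $P_\alpha(i,j)$ denotes the model's normalized prediction for span $(i,j)$ from the global boundary prediction matrix of the original source input, and $P^{adv}_\alpha(i,j)$ the corresponding prediction from the matrix that includes the adversarial perturbation. Both symbols, the normalization and the direction of the divergence are illustrative choices, not taken from the application.

```latex
\mathcal{L}_{kl}
  = \sum_{\alpha}\ \sum_{1\le i\le j\le N}
    P_\alpha(i,j)\,
    \log\frac{P_\alpha(i,j)}{P^{adv}_\alpha(i,j)}
```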
Further, step 3.4 specifically includes:
training the whole model end to end as a neural network, wherein the model comprises four loss functions: the named entity recognition task loss on the source domain, the domain classifier loss, the contrastive learning loss, and the KL divergence loss;
adding the loss functions to obtain the loss of the cross-domain named entity recognition model based on span contrastive learning, and training the loss functions jointly;
wherein $\alpha$, $\lambda$ and $\beta$ are hyper-parameters that control the weights of the individual losses.
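The joint loss formula is not reproduced in this text. With four losses and three weights, a natural reading is a weighted sum in which the named entity recognition loss carries unit weight. Which hyper-parameter is attached to which loss is not stated here, so the assignment below is an assumption.

```latex
\mathcal{L}
  = \mathcal{L}_{ner}
  + \alpha\,\mathcal{L}_{d}
  + \lambda\,\mathcal{L}_{cl}
  + \beta\,\mathcal{L}_{kl}
```

Here $\mathcal{L}_{ner}$, $\mathcal{L}_{d}$, $\mathcal{L}_{cl}$ and $\mathcal{L}_{kl}$ stand for the named entity recognition, domain classifier, contrastive learning and KL divergence losses, respectively.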
In a second aspect, the present invention provides a computer device comprising a memory and a processor, the memory having stored therein a computer program which when executed by the processor performs the steps of a span contrast learning based cross-domain named entity recognition method as described above.
In a third aspect, the present invention provides a computer-readable storage medium having stored therein a plurality of computer instructions for causing a computer to perform a span-based contrast learning cross-domain named entity recognition method as described above.
In a fourth aspect, the invention provides a computer program product which when executed by a processor implements a span-based contrast learning cross-domain named entity recognition method as described above.
The application has the beneficial effects that:
Addressing the problem of cross-domain named entity recognition, the application overcomes the shortcomings of the prior art: it uses a pre-trained language model, adversarial training, contrastive learning and related techniques, fully considers and mines domain-invariant features, and provides a cross-domain named entity recognition method based on span contrastive learning.
1. Entity boundary information is introduced and the sequence labels are converted into a global boundary matrix, which represents the sentence-level target labels, so that the model can learn explicit span boundary information. In cross-domain learning, both distribution matching and instance-based matching have limitations, whereas contrastive learning can learn domain invariance without labels in the target domain.
2. Contrastive learning is used to reduce the domain shift problem, with adversarial learning confusing the domain knowledge learned by the model.
3. In addition, the KL divergence is used to bring the adversarial samples closer to the distribution predicted by the model, further improving model performance.
The method is suitable for cross-domain named entity recognition.
Drawings
In order to more clearly illustrate the technical solution of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of the cross-domain named entity recognition method based on span contrastive learning;
FIG. 2 is a model diagram of the cross-domain named entity recognition method based on span contrastive learning;
FIG. 3 is a schematic diagram of domain puzzle samples;
FIG. 4 is a diagram of multi-head recognition of nested entities;
FIG. 5 is a schematic diagram of positive sampling;
FIG. 6 is a schematic diagram of negative sampling.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended to illustrate the present invention, and should not be construed as limiting it.
In a first embodiment, a cross-domain named entity recognition method based on span contrastive learning comprises the following steps:
Step 1, acquiring a source domain data set and a target domain data set, preprocessing the data sets, and dividing them into a training set and a test set;
Step 2, constructing a cross-domain named entity recognition model based on span contrastive learning, specifically comprising:
Step 2.1, obtaining embedded representations of the source domain data and the target domain data, and assigning the corresponding domain labels to the source domain and the target domain;
Step 2.2, constructing domain-confused augmented samples: inputting the source domain and target domain embeddings obtained in step 2.1 into the pre-trained language model BERT, generating adversarial samples with the projected gradient descent (PGD) method, and classifying the domain under the adversarial attack;
Step 2.3, generating global boundary prediction matrices, specifically comprising:
inputting the source domain embeddings into BERT and using the output to construct a global boundary prediction matrix with GlobalPointer; inputting the source domain embeddings together with the domain-confused augmented samples generated in step 2.2 into BERT and using the output to construct, with GlobalPointer, a global boundary prediction matrix that includes the adversarial perturbation;
Step 3, training the cross-domain named entity recognition model constructed in step 2, specifically comprising:
Step 3.1, calculating the named entity recognition loss on the source domain with a cross-entropy loss function, using the global boundary prediction matrix obtained from the source domain embeddings in step 2.3;
Step 3.2, calculating the contrastive learning loss from the similarity and dissimilarity of the vectors of all entity spans contained in the two global boundary prediction matrices obtained in step 2.3;
Step 3.3, calculating the KL divergence loss over all entity spans contained in the two global boundary prediction matrices obtained in step 2.3, so that the generated adversarial samples are more consistent with the distribution predicted by the model;
Step 3.4, combining the loss functions of steps 3.1, 3.2 and 3.3 into a joint loss function, updating the model parameters to optimize it, and obtaining by training the optimal cross-domain named entity recognition model based on span contrastive learning;
Step 4, inputting the target domain test set into the model trained and optimized in step 3, and calculating the scores of the target domain entities.
In this embodiment, addressing the problem of cross-domain named entity recognition, the shortcomings of the prior art are overcome: a pre-trained language model, adversarial training, contrastive learning and related techniques are used, domain-invariant features are fully considered and mined, and a cross-domain named entity recognition method based on span contrastive learning is provided.
First, entity boundary information is introduced and the sequence labels are converted into a global boundary matrix, which represents the sentence-level target labels, so that the model can learn explicit span boundary information. In cross-domain learning, both distribution matching and instance-based matching have limitations, whereas contrastive learning can learn domain invariance without labels in the target domain.
Second, contrastive learning is used to reduce the domain shift problem, with adversarial learning confusing the domain knowledge learned by the model.
Third, the KL divergence is used to bring the adversarial samples closer to the distribution predicted by the model, further improving model performance.
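The conversion of sequence labels into a global boundary matrix can be sketched as follows. This is a minimal illustration, assuming BIO-style tags and one n x n matrix per entity type in which cell (i, j) is 1 when tokens i through j form an entity. The function names `bio_to_spans` and `boundary_matrices` are hypothetical and not taken from the application.

```python
from typing import Dict, List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collect (start, end, type) triples from BIO tags; `end` is inclusive."""
    spans: List[Tuple[int, int, str]] = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            spans.append((start, i - 1, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def boundary_matrices(tags: List[str], types: List[str]) -> Dict[str, List[List[int]]]:
    """Build one n x n 0/1 matrix per entity type (assumed layout)."""
    n = len(tags)
    mats = {t: [[0] * n for _ in range(n)] for t in types}
    for s, e, t in bio_to_spans(tags):
        mats[t][s][e] = 1  # row = head index, column = tail index
    return mats


tags = ["B-PER", "I-PER", "O", "B-LOC"]
mats = boundary_matrices(tags, ["PER", "LOC"])
```

Because every (head, tail) pair has its own cell, nested or overlapping entities of different types can be labeled without conflict, which a flat tag sequence cannot express.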
In a second embodiment, the cross-domain named entity recognition method based on span contrastive learning of the first embodiment is further defined; in this embodiment, step 2.2 is further defined and specifically comprises:
assume a source data set with n labeled examples, $D_S = \{x_i, y_i\}_{i=1,\dots,n}$, where $x_i$ is a token sequence and $y_i$ is the label of $x_i$; the data of the source data set are sampled independently and identically distributed (i.i.d.) from the source domain;
and a target data set with m unlabeled examples, $D_T = \{x_j\}_{j=1,\dots,m}$, where $x_j$ is a token sequence; the data of the target data set are sampled i.i.d. from the target domain;
the model aims to learn a function $f(x; \theta_f, \theta_y): x \to C$, whose input is a token sequence and whose output is the corresponding label, where $\theta_f$ denotes the parameters of the pre-trained language model, $\theta_y$ denotes the parameters of class label prediction, and $C$ is the label set;
$\mathcal{L}$ is the loss of the model on the classification task, and the purpose of model learning is to minimize this loss; the specific formula is as follows:
wherein $(x, y) \sim D_S$ indicates that both the sequence and the label come from the source domain; within a single domain, adversarial training is a min-max problem that maximizes the inner loss and minimizes the outer loss;
wherein $\delta$ is the generated adversarial perturbation;
wherein $\alpha_{adv}$ controls the trade-off between the two losses and is usually set to 1;
the adversarial perturbation may be generated by the following iterative step:
where $\epsilon$ is the upper bound of the adversarial perturbation, $\eta$ is the step size of the attack, $\delta_t$ is the adversarial perturbation generated at the current iteration step, the gradient is that of the classification loss at step t with respect to the input at step t, the projection operator maps the perturbation back into the specified range if it exceeds $\epsilon$, and $\|\cdot\|_F$ denotes the Frobenius norm;
generating adversarial samples with domain confusion:
wherein the adversarial attack is applied to the domain-specific loss of the learned domain classifier, $\delta_0$ is the initial adversarial perturbation, $\theta_d$ denotes the parameters of the domain classification, and $d$ is the domain label; the perturbation $\delta$ is synthesized by searching the embedding space for the extreme direction that most confuses the domain classifier, and $f(x + \delta; \theta_f)$ is the domain puzzle produced by the pre-trained language model; the gradient used here is that of the domain classification loss at step t with respect to the input at step t.
In this embodiment, the inner maximization may be solved by the projected gradient descent (PGD) method, under the assumption that the loss function is locally linear. The advantage of PGD is that it relies only on the model itself and can generate adversarial samples of varying strength, thereby improving the generalization ability of the model on unseen data. PGD attacks in small steps. Specifically, forward and backward propagation are performed step by step; at each step the perturbation is calculated from the gradient $g_{adv}$ and the new perturbation $\delta$ is accumulated onto the embedding layer, and if the perturbation exceeds the given range it is projected back into that range. Finally, the $g_{adv}$ obtained in the last step is accumulated onto the original gradient; that is, the original gradient is updated with the $g_{adv}$ corresponding to the perturbation accumulated over t steps.
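The step-by-step PGD procedure can be illustrated with a toy example. The sketch below is not the application's implementation: it assumes a single embedding vector `x`, a hypothetical linear domain classifier with weights `w` and bias `b` in place of BERT plus a classification head, and an L2 ball of radius `eps` as the allowed range.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss(x: List[float], delta: List[float], w: List[float], b: float, d: int) -> float:
    """Binary cross-entropy of the toy domain classifier on x + delta."""
    z = sum(wi * (xi + di) for wi, xi, di in zip(w, x, delta)) + b
    p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)
    return -(d * math.log(p) + (1 - d) * math.log(1.0 - p))


def pgd_perturbation(x: List[float], w: List[float], b: float, d: int,
                     eps: float = 1.0, eta: float = 0.3, steps: int = 5) -> List[float]:
    """Ascend the domain loss in normalized-gradient steps, projecting onto the eps-ball."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        z = sum(wi * (xi + di) for wi, xi, di in zip(w, x, delta)) + b
        grad = [(sigmoid(z) - d) * wi for wi in w]  # d(loss)/d(delta) for this toy model
        norm = math.sqrt(sum(g * g for g in grad))
        if norm == 0.0:
            break
        delta = [di + eta * g / norm for di, g in zip(delta, grad)]  # ascent step
        dnorm = math.sqrt(sum(di * di for di in delta))
        if dnorm > eps:  # perturbation left the allowed range: map it back
            delta = [di * eps / dnorm for di in delta]
    return delta


x, w, b, d = [1.0, -2.0, 0.5], [0.8, -0.5, 0.3], 0.1, 1
delta = pgd_perturbation(x, w, b, d)
loss_clean = domain_loss(x, [0.0] * len(x), w, b, d)
loss_adv = domain_loss(x, delta, w, b, d)
```

In this toy setting the perturbed input `x + delta` plays the role of a domain puzzle: the domain classifier's loss on it is higher than on the clean input, while the perturbation stays within the bound `eps`.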
As shown in FIG. 3, domain puzzles strengthen the domain invariance of the model and so help it adapt to unseen data and domains. During training, the model is confused by discarding domain-related information, which makes it hard to tell data from different domains apart, and the source (target) data are pulled closer to their corresponding domain puzzles to reduce the domain discrepancy.
In a third embodiment, the cross-domain named entity recognition method based on span contrastive learning of the first embodiment is further defined; in this embodiment, step 2.3 is further defined and specifically comprises:
let $S = [s_1, s_2, \dots, s_m]$ be the possible spans in a sentence; a span $s$ is denoted $s[i:j]$, where $i$ and $j$ are its head index and tail index, respectively; the goal of named entity recognition is to identify all $s \in E$, where $E$ is the set of entity types; given a sentence $X = [x_1, x_2, \dots, x_n]$ with n tokens, each token in $X$ is first associated with its representation in the pre-trained language model, giving a sentence representation matrix $H \in \mathbb{R}^{n \times v}$, where $v$ is the dimension:
$h_1, h_2, \dots, h_n = \mathrm{BERT}(x_1, x_2, \dots, x_n)$
after obtaining the sentence representation $H$, the span representation may be calculated using two feed-forward layers that depend on the start and end indices of the span:
$q_{i,\alpha} = W_{q,\alpha} h_i + b_{q,\alpha}$
$k_{j,\alpha} = W_{k,\alpha} h_j + b_{k,\alpha}$
wherein $q_{i,\alpha}$ and $k_{j,\alpha}$ are the vector representations of the tokens used to identify entities of type $\alpha$, namely the start and end positions of a span $s[i:j]$ of type $\alpha$; $W_{q,\alpha}$ and $W_{k,\alpha}$ are the weights applied to $h_i$ and $h_j$; and $b_{q,\alpha}$ and $b_{k,\alpha}$ are the biases; the score of span $s[i:j]$ belonging to type $\alpha$ is calculated as follows:
a scoring function is calculated for each span, and the global boundary prediction matrix is generated from the scoring function;
wherein $R_i$ and $R_j$ are both orthogonal matrices.
In this embodiment, as shown in FIG. 4, multi-head recognition of nested entities generates all possible entity spans.
In the attention mechanism, position encoding takes two forms: absolute position encoding and relative position encoding. Although absolute position encoding can add position information to a word vector, that information is tied to a fixed position and cannot express the context around it. To exploit boundary information, a relative position encoding satisfying $R_i^{\top} R_j = R_{j-i}$ is applied to the entity representations, making the model more sensitive to the relative positions of entities and thereby improving entity recognition performance. In this way, a scoring function can be calculated for each span, and the global boundary prediction matrix is generated from the scoring function.
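A small sketch can make the relative-position property concrete. It assumes rotary position encoding as used in the published GlobalPointer work, with the start and end vectors `Q[i]` and `K[j]` for one entity type already computed by the two feed-forward layers; the helper names `rope` and `span_scores` and the frequency base of 10000 are illustrative assumptions.

```python
import math
from typing import List


def rope(vec: List[float], pos: int, base: float = 10000.0) -> List[float]:
    """Rotate consecutive pairs of `vec` by position-dependent angles (len must be even)."""
    d = len(vec)
    out: List[float] = []
    for p in range(0, d, 2):
        theta = pos * base ** (-p / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = vec[p], vec[p + 1]
        out += [a * c - b * s, a * s + b * c]
    return out


def span_scores(Q: List[List[float]], K: List[List[float]]) -> List[List[float]]:
    """Score every span (i, j) with i <= j; cells below the diagonal are masked."""
    n = len(Q)
    rq = [rope(q, i) for i, q in enumerate(Q)]
    rk = [rope(k, j) for j, k in enumerate(K)]
    neg_inf = float("-inf")
    return [[sum(a * b for a, b in zip(rq[i], rk[j])) if i <= j else neg_inf
             for j in range(n)] for i in range(n)]


q = [1.0, 2.0, 0.5, -1.0]
k = [0.3, -0.7, 1.2, 0.4]
S = span_scores([q, q, q], [k, k, k])
```

With identical `q` and `k` at every position, the spans (0, 1) and (1, 2) receive the same score because they share the offset j - i = 1, which is the relative-position behaviour described above. Masking the lower triangle reflects that a span cannot end before it starts.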
In a fourth embodiment, the cross-domain named entity recognition method based on span contrastive learning of the first embodiment is further defined; in this embodiment, step 3.1 is further defined and specifically comprises:
calculating the score of each entity with the scoring function obtained in step 2.3;
the cross-entropy loss function is set as follows:
wherein $q$ and $k$ denote the start index and end index of a span, respectively, $P_\alpha$ is the set of spans that are entities of type $\alpha$, $Q_\alpha$ is the set of spans that are not entities or whose entity type is not $\alpha$, and $s_\alpha(q, k)$ is the score of the span as an entity of type $\alpha$; spans satisfying $s_\alpha(q, k) > 0$ are output as entities of type $\alpha$.
In this embodiment, the score of each entity is calculated with the scoring function obtained in step 2.3. To address class imbalance in the classification problem, the cross-entropy loss function is designed to help the model learn the boundary information of each training instance.
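The behaviour of such a class-balanced loss can be sketched in a few lines. The example assumes the multi-label cross-entropy from the published GlobalPointer work, log(1 + sum over positives of exp(-s)) + log(1 + sum over negatives of exp(s)), which matches the definitions of $P_\alpha$ and $Q_\alpha$ above; the application's exact formula is not reproduced in this text, and the function name is hypothetical.

```python
import math
from typing import List, Set, Tuple


def logsumexp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def global_pointer_loss(scores: List[List[float]], positives: Set[Tuple[int, int]]) -> float:
    """Loss for one entity type; `positives` holds the (start, end) pairs in P_alpha."""
    n = len(scores)
    pos_terms = [0.0]  # the 0.0 entry contributes the "1 +" inside the log
    neg_terms = [0.0]
    for i in range(n):
        for j in range(i, n):  # only valid spans, start <= end
            if (i, j) in positives:
                pos_terms.append(-scores[i][j])
            else:
                neg_terms.append(scores[i][j])
    return logsumexp(pos_terms) + logsumexp(neg_terms)


gold = {(0, 1)}
uninformed = global_pointer_loss([[0.0, 0.0], [0.0, 0.0]], gold)
confident = global_pointer_loss([[-10.0, 10.0], [0.0, -10.0]], gold)
```

The loss is near zero only when the gold span scores well above 0 and every other span scores well below 0, which ties in with the rule that spans with a score greater than 0 are output as entities.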
In a fifth embodiment, the present embodiment is further defined by a span comparison learning-based cross-domain named entity recognition method according to the fourth embodiment, where step 3.2 is further defined, and specifically includes:
in step 3.2, for an input sentence, each entity span is represented as a vector, and the similarity and dissimilarity of the vectors of all entity spans contained in the input sentence are calculated to calculate a contrast loss;
the loss function calculation process for contrast learning is as follows:
wherein: n is the maximum length of the sentence, M is the number of negative examples, span (i, j) is the span representation, span (i, j) + Is the positive example of the current sentence, namely, the data enhancement of the source domain data countermeasure training, span (i, j) - The method is a negative example of the current sentence, namely, the span different from the current token label, and the cos cosine similarity is used for calculating the distance between the original sample and the positive and negative samples.
In this embodiment, contrastive learning is applied at the span level: similar spans are pulled closer together and dissimilar spans are pushed apart, so that the model learns more entity-span-invariant information. As shown in FIG. 5, for positive samples the model encodes a source-domain span and the corresponding domain-puzzle span close together in the representation space, gradually pulling the examples toward the domain decision boundary as training progresses. For negative samples drawn across domains, however, the contrastive loss would push the negative samples of the source domain and the target domain away from each other, as in the left half of FIG. 6, so that same-class samples from different domains are also pushed apart, which contradicts the goal of pulling the different domains closer together. To avoid this cross-domain repulsion, samples from a different domain are excluded from the negative sample set.
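The contrastive loss formula is not reproduced in this text. The following is a minimal sketch, assuming an InfoNCE-style loss over cosine similarities for a single anchor span. The temperature `tau`, the function names, and the way span vectors and domain labels are passed in are all illustrative assumptions. The filter on `negative_domains` illustrates the exclusion of cross-domain negatives described above.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def span_contrastive_loss(anchor, positive, negatives, negative_domains,
                          anchor_domain, tau=0.1):
    """InfoNCE-style loss for one anchor span (assumed form).

    anchor:           span(i, j) vector
    positive:         span(i, j)^+ vector (e.g. the domain-puzzle counterpart)
    negatives:        list of span(i, j)^- vectors
    negative_domains: domain label of each negative (0 = source, 1 = target)
    anchor_domain:    domain label of the anchor
    """
    # Exclude negatives that come from a different domain than the anchor.
    kept = [n for n, d in zip(negatives, negative_domains) if d == anchor_domain]
    pos = cosine(anchor, positive) / tau
    logits = np.array([pos] + [cosine(anchor, n) / tau for n in kept])
    # -log( exp(pos) / sum(exp(logits)) )
    return float(np.logaddexp.reduce(logits) - pos)
```

In a full model this value would be averaged over all spans of a sentence; here a single anchor is shown to keep the sketch short.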
In a sixth embodiment, the span comparison learning-based cross-domain named entity recognition method of the fifth embodiment is further defined, wherein step 3.3 specifically comprises:
in step 3.3, the loss function calculation process of the KL divergence is as follows:
in this embodiment, the KL divergence loss is calculated over all entity spans contained in the two global boundary prediction matrices obtained in step 2.3, so that the generated adversarial samples are more consistent with the distribution predicted by the model itself;
in adversarial training, the training data are perturbed to generate adversarial samples in order to make the model more robust. Unlike the original samples, these adversarial samples carry some noise or perturbation. To ensure that the generated adversarial samples retain a certain similarity and continuity, a hidden variable is typically introduced to control the distance between samples. The samples generated in this process are expected to approach the predicted distribution of the model itself, which makes the model more robust. The KL divergence is a commonly used measure of the difference between two distributions: the smaller the KL divergence, the closer the two distributions. Accordingly, during adversarial training the quality of the generated adversarial samples is evaluated by calculating the KL divergence between the distribution of the generated adversarial samples and the predicted distribution of the model on the original samples, so that the generated adversarial samples become more consistent with the predicted distribution of the model itself. Adversarial samples generated in this way are better suited to training the model and improve its robustness. The method of this embodiment therefore adopts the KL divergence loss described above.
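The KL divergence formula is not reproduced in this text. The sketch below shows one way such a term could be computed from the two global boundary prediction matrices. Turning each score matrix into a distribution with a softmax over all spans, and the direction KL(original ‖ adversarial), are assumptions made for illustration only.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def boundary_kl(clean_scores, adv_scores):
    """KL term between the span distributions of the original and the
    perturbed boundary prediction matrices (assumed construction)."""
    p = softmax(clean_scores.ravel())   # model prediction on original sample
    q = softmax(adv_scores.ravel())     # prediction on adversarial sample
    return kl_divergence(p, q)
```

The value is zero when both matrices induce the same distribution and grows as the adversarial prediction drifts away from the original one.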
In a seventh embodiment, the span comparison learning-based cross-domain named entity recognition method of the sixth embodiment is further defined, wherein step 3.4 specifically comprises:
performing overall training with an end-to-end neural network model, wherein the model comprises four loss functions, namely the named entity recognition task loss on the source domain, the domain classifier loss, the contrastive learning loss, and the KL divergence loss;
summing the loss functions to obtain the loss of the cross-domain named entity recognition model based on span comparison learning, and training the loss functions jointly;
wherein: $\alpha$, $\lambda$, and $\beta$ are hyperparameters that control the weights of the individual losses.
In this embodiment, the loss functions are summed to obtain the loss of the cross-domain named entity recognition model based on span comparison learning; training these loss functions jointly optimizes several aspects of the model at once and thereby improves its performance and robustness.
Optimizing the joint loss function through end-to-end training yields a robust model that classifies the original samples better and produces higher-quality adversarial samples.
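The joint loss formula is not reproduced in this text, so which hyperparameter weights which term is not recoverable from it. The sketch below simply assumes that the named entity recognition loss is unweighted and that $\alpha$, $\lambda$, and $\beta$ weight the domain classifier, contrastive, and KL terms in that order. This assignment is hypothetical.

```python
def joint_loss(ner_loss, domain_loss, contrastive_loss, kl_loss,
               alpha=1.0, lam=1.0, beta=1.0):
    """Weighted sum of the four losses (the weight assignment is an assumption)."""
    return ner_loss + alpha * domain_loss + lam * contrastive_loss + beta * kl_loss
```

For example, `joint_loss(1.0, 2.0, 3.0, 4.0, alpha=0.5, lam=0.1, beta=0.25)` combines the four scalar losses into the single value that would be back-propagated.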
An eighth embodiment is Example 1 of the span comparison learning-based cross-domain named entity recognition method, which is implemented through the following steps:
step 1, acquiring a source domain data set and a target domain data set, preprocessing the data set, and dividing the data set into a training set and a testing set;
step 2, constructing a cross-domain named entity recognition model based on span comparison learning, which specifically comprises the following steps:
step 2.1, obtaining embedded representations of source domain data and target domain data, and assigning corresponding domain labels to the source domain and the target domain;
step 2.2, constructing domain confusion augmentation samples: the source-domain and target-domain embeddings obtained in step 2.1 are input into the pre-trained language model BERT, adversarial samples are generated with the Projected Gradient Descent (PGD) method, and the adversarial attack is applied to the domain classification task;
step 2.3, generating the global boundary prediction matrices: the source-domain embedding is input into BERT, and the resulting output is used to construct a global boundary prediction matrix with Global Pointer; the source-domain embedding and the embedding of the domain confusion augmentation sample generated in step 2.2 are concatenated with the function concat() and input into BERT, and the resulting output is used to construct, with Global Pointer, a global boundary prediction matrix that incorporates the adversarial perturbation;
step 3, training the cross-domain named entity recognition model based on span comparison learning constructed in step 2, which specifically comprises the following steps:
step 3.1, calculating the named entity recognition loss of the source domain with a cross-entropy loss function, using the global boundary prediction matrix obtained from the source-domain embedding in step 2.3;
step 3.2, calculating the contrastive learning loss from the similarity and dissimilarity between the vectors of all entity spans contained in the two global boundary prediction matrices obtained in step 2.3;
step 3.3, calculating the KL divergence loss over all entity spans contained in the two global boundary prediction matrices obtained in step 2.3, so that the generated adversarial samples are more consistent with the distribution predicted by the model itself;
step 3.4, combining the loss functions of step 3.1, step 3.2, and step 3.3 into a joint loss function, and updating the parameters of the model to optimize the joint loss function, thereby obtaining by training the optimal cross-domain named entity recognition model based on span comparison learning;
step 4, inputting the target-domain test set into the cross-domain named entity recognition model based on span comparison learning that was trained and optimized in step 3, and calculating the scores of the target-domain entities.
A ninth embodiment is Example 2 of the span comparison learning-based cross-domain named entity recognition method, specifically:
As shown in FIGS. 1 and 2, the invention provides a span comparison learning-based cross-domain named entity recognition method, which specifically comprises the following steps:
step 1, acquiring a source domain data set and a target domain data set, preprocessing the data set, and dividing the data set into a training set and a testing set;
the step 1 specifically comprises the following steps:
step 1.1: extracting text sequences from a source domain and a target domain dataset;
step 1.2: dividing the preprocessed data set into a training set and a testing set;
step 2, constructing a cross-domain named entity recognition model based on span comparison learning.
Step 2 specifically comprises the following steps:
step 2.1, obtaining embedded representations of the source-domain data and the target-domain data: the encoding of each label is first obtained with one-hot encoding, the source-domain and target-domain token embeddings are generated with the pre-trained language model BERT, and the two domains are numbered: the source domain is 0 and the target domain is 1.
Step 2.2, constructing domain confusion augmentation samples: the source-domain and target-domain embeddings obtained in step 2.1 are input into the pre-trained language model BERT, adversarial samples are generated with the Projected Gradient Descent (PGD) method, and the adversarial attack is applied to the domain classification task;
In step 2.2, assume a source dataset $D_S = \{x_i, y_i\}_{1,\dots,n}$ with $n$ labeled samples, sampled independently and identically distributed (i.i.d.) from the source domain. There is also a target dataset $D_T = \{x_j\}_{1,\dots,m}$ with $m$ unlabeled samples, sampled i.i.d. from the target domain, where $x_i$ and $x_j$ are token sequences and $y_i$ is the label of $x_i$. In in-domain training, the model aims to learn a function $f(x; \theta_f, \theta_y): x \to C$ whose input is a token sequence and whose output is the corresponding label, where $\theta_f$ is the parameter of the pre-trained language model, $\theta_y$ is the parameter of class label prediction, and $C$ is the label set. In a general classification task, the purpose of model learning is to minimize the loss of the model on the classification task; the specific formula is as follows:
wherein both the sequence and the label come from the source domain. Within a single domain, adversarial training is an adversarial (min-max) problem that aims to maximize the inner loss and minimize the outer loss.
Wherein: $\delta$ is the generated adversarial perturbation.
Wherein: $\alpha_{adv}$ controls the trade-off between the two losses and is usually set to 1. The inner maximization can be solved with the Projected Gradient Descent (PGD) method under the assumption that the loss function is locally linear. The advantage of PGD is that it relies only on the model itself and can generate diverse adversarial samples, thereby improving the generalization ability of the model on unseen data. PGD uses a multi-step strategy. Specifically, forward and backward propagation are performed step by step; at each step the perturbation is computed from the gradient $g_{adv}$ and accumulated onto the embedding layer, and if the accumulated perturbation $\delta$ exceeds the given range, it is projected back into that range. Finally, the $g_{adv}$ obtained at the last step is accumulated onto the original gradient; that is, the original gradient is updated with the $g_{adv}$ corresponding to the perturbation accumulated over $t$ steps. The following iterative step generates the adversarial perturbation.
Wherein: $\epsilon$ is the upper bound of the adversarial perturbation; $\eta$ is the adversarial step size; $\delta_t$ is the adversarial perturbation generated at the current iteration step; the formula further involves the gradient of the classification-task loss at step $t$ with respect to the input at step $t$, together with the gradient formula; $\Pi_{\|\delta\|_F \le \epsilon}$ denotes that a perturbation exceeding the range $\epsilon$ is projected back into the specified range; and $\|\cdot\|_F$ denotes the Frobenius norm.
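The iterative update above can be illustrated with a small, self-contained sketch. It assumes the common PGD step $\delta_{t+1} = \Pi_{\|\delta\|_F \le \epsilon}(\delta_t + \eta \, g / \|g\|_F)$, where $g$ is the gradient of the loss with respect to the perturbed input. A toy logistic classifier with weight vector `w` stands in for BERT plus the classification head, which is a deliberate simplification. All names are hypothetical.

```python
import numpy as np

def loss_and_grad(w, x, y, delta):
    """Logistic loss of a toy linear classifier on the perturbed embedding.

    w: weight vector, x: embedding vector, y: label in {-1, +1}.
    Returns the loss and its gradient with respect to delta.
    """
    margin = y * (w @ (x + delta))
    loss = np.logaddexp(0.0, -margin)            # log(1 + exp(-margin))
    grad = -y * w / (1.0 + np.exp(margin))       # d loss / d delta
    return float(loss), grad

def project(delta, eps):
    """Map delta back into the ball ||delta||_F <= eps if it lies outside."""
    norm = np.linalg.norm(delta)
    return delta if norm <= eps else delta * (eps / norm)

def pgd_perturbation(w, x, y, eps=0.5, eta=0.1, steps=5):
    """Multi-step PGD: ascend the loss, then project after every step."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        _, g = loss_and_grad(w, x, y, delta)
        delta = project(delta + eta * g / (np.linalg.norm(g) + 1e-12), eps)
    return delta
```

In the method described here the gradient would come from back-propagation through the language model rather than from a closed-form expression, but the step, normalization, and projection follow the same pattern.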
As shown in FIG. 3, domain puzzles enhance the domain invariance of the model and thus help it adapt to unseen data and domains: discarding domain-related information during training confuses the model so that it can hardly distinguish data from different domains, and pulling the source (target) data and their corresponding domain puzzles closer together reduces the domain discrepancy. To generate the domain confusion augmentation, an adversarial attack with perturbations is applied to the source/target domain classification task; using the adversarial-sample generation process described above, adversarial samples with domain confusion can be generated:
Wherein: the loss in the formula is the domain-specific loss with which the domain classifier is learned under the adversarial attack; $\delta_0$ is the initialized adversarial perturbation; $\theta_d$ is the parameter corresponding to the domain classification; and $d$ is the domain label. The perturbation $\delta$ is synthesized by searching the embedding space for the extreme direction that most confuses the domain classifier, and $f(x + \delta; \theta_f)$ is the domain puzzle produced by the pre-trained language model. The gradient term in the formula denotes the gradient of the domain classification loss at step $t$ with respect to the input at step $t$.
Step 2.3, generating the global boundary prediction matrices: the source-domain embedding is input into BERT, and the resulting output is used to construct a global boundary prediction matrix with Global Pointer; the source-domain embedding and the embedding of the domain confusion augmentation sample generated in step 2.2 are concatenated with the function concat() and input into BERT, and the resulting output is used to construct a global boundary prediction matrix that incorporates the adversarial perturbation;
as shown in FIG. 4, multi-head recognition of nested entities generates all possible entity spans. In step 2.3, assume $S = [s_1, s_2, \dots, s_m]$ is the list of possible spans in a sentence. A span $s$ is denoted $s[i:j]$, where $i$ and $j$ are its head index and tail index, respectively. The goal of named entity recognition is to identify all $s \in E$, where $E$ is the set of entity types. Given a sentence $X = [x_1, x_2, \dots, x_n]$ with $n$ tokens, each token in $X$ is first associated with its corresponding representation in the pre-trained language model, yielding a new hidden-vector output matrix $H$, where $v$ is the dimension:
$h_1, h_2, \dots, h_n = \mathrm{BERT}(x_1, x_2, \dots, x_n)$
after obtaining the sentence representation H, the span representation may be calculated using two feed-forward layers that rely on the start and end indices of the span:
$q_{i,\alpha} = W_{q,\alpha} h_i + b_{q,\alpha}$
$k_{j,\alpha} = W_{k,\alpha} h_j + b_{k,\alpha}$
wherein: $q_{i,\alpha}$ and $k_{j,\alpha}$ are vector representations of tokens used to identify entities of type $\alpha$, representing the start and end positions of a span $s[i:j]$ of type $\alpha$, respectively; $W_{q,\alpha}$ and $W_{k,\alpha}$ are the weights applied to $h_i$ and $h_j$; $b_{q,\alpha}$ and $b_{k,\alpha}$ are the biases; and the score of span $s[i:j]$ belonging to type $\alpha$ is calculated as follows:
In the attention mechanism, position encoding takes two forms: absolute position encoding and relative position encoding. Although absolute position encoding adds position information to the word vector, that information is tied to a fixed position and cannot express where the position lies relative to its context. To exploit boundary information, a relative position encoding is applied to the representations of the entities, which makes the model more sensitive to the relative positions of the entities and thereby improves entity recognition performance. In this way, the scoring function of each span can be calculated, and the global boundary prediction matrix is generated from the scoring function:
Wherein: the two position-encoding matrices in the formula are both orthogonal matrices.
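The scoring formulas are not reproduced in this text. The sketch below assumes the usual Global Pointer construction: the score of span $s[i:j]$ is the inner product of $q_{i,\alpha}$ and $k_{j,\alpha}$ after both have been rotated by a rotary position encoding, whose rotation matrices are orthogonal and make the score depend on the relative position $j - i$. The rotary base 10000, the function names, and the matrix shapes are assumptions for illustration.

```python
import numpy as np

def rotary(x):
    """Apply a rotary position encoding to x of shape (n, d), with d even.

    Row p is rotated by angles p * freq, an orthogonal transformation.
    """
    n, d = x.shape
    pos = np.arange(n)[:, None]
    freq = 10000.0 ** (-np.arange(0, d, 2) / d)
    angle = pos * freq                           # shape (n, d / 2)
    cos, sin = np.cos(angle), np.sin(angle)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def boundary_matrix(H, Wq, bq, Wk, bk):
    """Global boundary prediction matrix for ONE entity type alpha.

    H: (n, v) hidden states; Wq, Wk: (d, v) weights; bq, bk: (d,) biases.
    Returns an (n, n) matrix whose entry [i, j] scores the span s[i:j].
    """
    q = H @ Wq.T + bq                            # q_{i,alpha}
    k = H @ Wk.T + bk                            # k_{j,alpha}
    return rotary(q) @ rotary(k).T
```

Stacking one such matrix per entity type gives a tensor of shape (number of types, n, n), from which spans with a positive score can be read off.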
Step 3: training the cross-domain named entity recognition model based on span comparison learning in the step 2, wherein the method specifically comprises the following steps:
step 3.1: calculating the named entity recognition loss of the source domain with a cross-entropy loss function, using the global boundary prediction matrix obtained from the source-domain embedding in step 2.3;
the score of each entity is calculated by the scoring function obtained in step 2.4. To address the class imbalance inherent in this classification problem, a cross-entropy loss function is designed that helps the model learn the boundary information of each training support instance:
wherein: $q$ and $k$ denote the start index and end index of a span, respectively; $P_\alpha$ denotes the set of spans whose entity type is $\alpha$; $Q_\alpha$ denotes the set of spans that are not entities or whose entity type is not $\alpha$; $s_\alpha(q, k)$ is the score of span $(q, k)$ for entity type $\alpha$; and a span satisfying $s_\alpha(q, k) > 0$ is output as an entity of type $\alpha$.
Step 3.2: calculating the contrastive learning loss from the similarity and dissimilarity between the vectors of all entity spans contained in the two global boundary prediction matrices obtained in step 2.3;
in step 3.2, contrastive learning is applied at the span level: similar spans are pulled closer together and dissimilar spans are pushed apart, so that the model learns more entity-span-invariant information. As shown in FIG. 5, for positive samples the model encodes a source-domain span and the corresponding domain-puzzle span close together in the representation space, gradually pulling the examples toward the domain decision boundary as training progresses. For negative samples drawn across domains, however, the contrastive loss would push the negative samples of the source domain and the target domain away from each other, as in the left half of FIG. 6, so that same-class samples from different domains are also pushed apart, which contradicts the goal of pulling the different domains closer together. To avoid this cross-domain repulsion, samples from a different domain are excluded from the negative sample set.
For an input sentence, each entity span is represented as a vector, and the similarity and dissimilarity between the vectors of all entity spans contained in the sentence are computed to obtain the contrastive loss. The loss function of contrastive learning in step 3.2 is therefore calculated as follows:
wherein: $N$ is the maximum length of the sentence; $M$ is the number of negative examples; $span(i, j)$ is the span representation; $span(i, j)^{+}$ is the positive example of the current sentence, namely the augmented data obtained by adversarial training on the source-domain data; $span(i, j)^{-}$ is a negative example of the current sentence, namely a span whose label differs from the current token label; and the cosine similarity $\cos$ is used to measure the distance between the original sample and the positive and negative samples.
Step 3.3, calculating the KL divergence loss over all entity spans contained in the two global boundary prediction matrices obtained in step 2.3, so that the generated adversarial samples are more consistent with the distribution predicted by the model itself;
in adversarial training, the training data are perturbed to generate adversarial samples in order to make the model more robust. Unlike the original samples, these adversarial samples carry some noise or perturbation. To ensure that the generated adversarial samples retain a certain similarity and continuity, a hidden variable is typically introduced to control the distance between samples. The samples generated in this process are expected to approach the predicted distribution of the model itself, which makes the model more robust. The KL divergence is a commonly used measure of the difference between two distributions: the smaller the KL divergence, the closer the two distributions. Accordingly, during adversarial training the quality of the generated adversarial samples is evaluated by calculating the KL divergence between the distribution of the generated adversarial samples and the predicted distribution of the model on the original samples, so that the generated adversarial samples become more consistent with the predicted distribution of the model itself. Adversarial samples generated in this way are better suited to training the model and improve its robustness. The loss function of the KL divergence in step 3.3 is therefore calculated as follows:
Step 3.4, combining the loss functions of step 3.1, step 3.2, and step 3.3 into a joint loss function, and updating the parameters of the model to optimize the joint loss function, thereby obtaining by training the optimal cross-domain named entity recognition model based on span comparison learning;
overall training is performed with an end-to-end neural network model that comprises four loss functions, namely the named entity recognition task loss on the source domain, the domain classifier loss, the contrastive learning loss, and the KL divergence loss. The loss functions are summed to obtain the loss of the cross-domain named entity recognition model based on span comparison learning; training these loss functions jointly optimizes several aspects of the model at once and thereby improves its performance and robustness.
Wherein: $\alpha$, $\lambda$, and $\beta$ are hyperparameters that control the weights of the individual losses.
Optimizing the joint loss function through end-to-end training yields a robust model that classifies the original samples better and produces higher-quality adversarial samples.
Step 4, inputting the target-domain test set into the cross-domain named entity recognition model based on span comparison learning that was trained and optimized in step 3, and calculating the scores of the target-domain entities.
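Once the scores of the target-domain spans are available, entities can be read off with the decision rule $s_\alpha(q, k) > 0$ stated in step 3.1. The sketch below assumes the scores are stacked into an array of shape (number of types, n, n) and that spans whose start index exceeds their end index are ignored. The function name and the type names used in the usage note are hypothetical.

```python
import numpy as np

def decode_entities(score_matrices, type_names):
    """Return every span whose score for some entity type exceeds 0.

    score_matrices: array of shape (num_types, n, n), entry [t, q, k] being
                    the score s_alpha(q, k) of span (q, k) for type t
    type_names:     list of num_types labels
    Returns a sorted list of (start, end, type) tuples.
    """
    entities = []
    for t, scores in enumerate(score_matrices):
        for q, k in zip(*np.where(scores > 0)):
            if q <= k:                       # skip spans that end before they start
                entities.append((int(q), int(k), type_names[t]))
    return sorted(entities)
```

For a three-token sentence with two entity types, say `["PER", "LOC"]`, the function returns the list of predicted `(start, end, type)` triples. Because every span is scored independently, nested and overlapping entities are returned together.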
The invention is not limited to the embodiments described above. The above description of specific embodiments is intended to illustrate, not to limit, the technical solutions of the present invention. Those skilled in the art can make numerous specific modifications without departing from the spirit of the invention and the scope of the claims, and all such modifications fall within the scope of the invention.

Claims (10)

1. A cross-domain named entity recognition method based on span comparison learning is characterized by comprising the following steps:
step 1, acquiring a source domain data set and a target domain data set, preprocessing the data set, and dividing the data set into a training set and a testing set;
step 2, constructing a cross-domain named entity recognition model based on span comparison learning, which specifically comprises the following steps:
step 2.1, obtaining embedded representations of source domain data and target domain data, and assigning corresponding domain labels to the source domain and the target domain;
step 2.2, constructing domain confusion augmentation samples, wherein the source-domain and target-domain embeddings obtained in step 2.1 are input into the pre-trained language model BERT, adversarial samples are generated with the Projected Gradient Descent (PGD) method, and the adversarial attack is applied to the domain classification task;
Step 2.3, generating a global boundary prediction matrix, which specifically comprises the following steps:
inputting the source-domain embedding into BERT, and using the resulting output to construct a global boundary prediction matrix with Global Pointer; inputting the source-domain embedding together with the embedding of the domain confusion augmentation sample generated in step 2.2 into BERT, and using the resulting output to construct, with Global Pointer, a global boundary prediction matrix that incorporates the adversarial perturbation;
step 3, training the cross-domain named entity recognition model based on span comparison learning in the step 2, and specifically comprising the following steps:
step 3.1, calculating the named entity recognition loss of the source domain with a cross-entropy loss function, using the global boundary prediction matrix obtained from the source-domain embedding in step 2.3;
step 3.2, calculating the contrastive learning loss from the similarity and dissimilarity between the vectors of all entity spans contained in the two global boundary prediction matrices obtained in step 2.3;
step 3.3, calculating the KL divergence loss over all entity spans contained in the two global boundary prediction matrices obtained in step 2.3, so that the generated adversarial samples are more consistent with the distribution predicted by the model itself;
step 3.4, combining the loss functions of step 3.1, step 3.2, and step 3.3 into a joint loss function, and updating the parameters of the model to optimize the joint loss function, thereby obtaining by training the optimal cross-domain named entity recognition model based on span comparison learning;
and step 4, inputting the target-domain test set into the cross-domain named entity recognition model based on span comparison learning that was trained and optimized in step 3, and calculating the scores of the target-domain entities.
2. The span contrast learning-based cross-domain named entity recognition method according to claim 1, wherein step 2.2 specifically comprises:
assume a source dataset $D_S = \{x_i, y_i\}_{1,\dots,n}$ with $n$ labeled samples, wherein $x_i$ is a token sequence and $y_i$ is the label of $x_i$, the data of the source dataset being sampled independently and identically distributed from the source domain;
a target dataset $D_T = \{x_j\}_{1,\dots,m}$ with $m$ unlabeled samples, wherein $x_j$ is a token sequence, the data of the target dataset being sampled independently and identically distributed from the target domain;
the model aims to learn a function $f(x; \theta_f, \theta_y): x \to C$, wherein the input of the function is a token sequence and the output is the corresponding label; wherein $\theta_f$ is the parameter of the pre-trained language model, $\theta_y$ is the parameter of class label prediction, and $C$ is the label set;
the purpose of model learning is to minimize the loss of the model on the classification task, the specific formula being as follows:
wherein both the sequence and the label come from the source domain; within a single domain, adversarial training is an adversarial (min-max) problem that aims to maximize the inner loss and minimize the outer loss;
wherein: $\delta$ is the generated adversarial perturbation;
wherein $\alpha_{adv}$ controls the trade-off between the two losses and is typically set to 1;
the following iterative step generates the adversarial perturbation;
wherein $\epsilon$ is the upper bound of the adversarial perturbation; $\eta$ is the adversarial step size; $\delta_t$ is the adversarial perturbation generated at the current iteration step; the formula further involves the gradient of the classification-task loss at step $t$ with respect to the input at step $t$, together with the gradient formula; $\Pi_{\|\delta\|_F \le \epsilon}$ denotes that a perturbation exceeding the range $\epsilon$ is projected back into the specified range; and $\|\cdot\|_F$ denotes the Frobenius norm;
generating adversarial samples with domain confusion:
wherein the loss in the formula is the domain-specific loss with which the domain classifier is learned under the adversarial attack; $\delta_0$ is the initialized adversarial perturbation; $\theta_d$ is the parameter corresponding to the domain classification; $d$ is the domain label; the perturbation $\delta$ is synthesized by searching the embedding space for the extreme direction that most confuses the domain classifier, and $f(x + \delta; \theta_f)$ is the domain puzzle produced by the pre-trained language model; the gradient term in the formula denotes the gradient of the domain classification loss at step $t$ with respect to the input at step $t$.
3. The span contrast learning-based cross-domain named entity recognition method according to claim 1, wherein step 2.3 specifically comprises:
let $S = [s_1, s_2, \dots, s_m]$ be the list of possible spans in a sentence; a span $s$ is denoted $s[i:j]$, wherein $i$ and $j$ are its head index and tail index, respectively; the goal of named entity recognition is to identify all $s \in E$, where $E$ is the set of entity types; given a sentence $X = [x_1, x_2, \dots, x_n]$ with $n$ tokens, each token in $X$ is first associated with its corresponding representation in the pre-trained language model, thereby obtaining a sentence representation matrix $H$, where $v$ is the dimension:
$h_1, h_2, \dots, h_n = \mathrm{BERT}(x_1, x_2, \dots, x_n)$
after obtaining the sentence representation $H$, the span representation is calculated using two feed-forward layers that rely on the start and end indices of the span:
$q_{i,\alpha} = W_{q,\alpha} h_i + b_{q,\alpha}$
$k_{j,\alpha} = W_{k,\alpha} h_j + b_{k,\alpha}$
wherein: $q_{i,\alpha}$ and $k_{j,\alpha}$ are vector representations of tokens used to identify entities of type $\alpha$, representing the start and end positions of a span $s[i:j]$ of type $\alpha$, respectively; $W_{q,\alpha}$ and $W_{k,\alpha}$ are the weights applied to $h_i$ and $h_j$; $b_{q,\alpha}$ and $b_{k,\alpha}$ are the biases; and the score of span $s[i:j]$ belonging to type $\alpha$ is calculated as follows:
calculating the scoring function of each span, and generating a global boundary prediction matrix from the scoring function;
wherein: the two position-encoding matrices in the formula are both orthogonal matrices.
4. The span contrast learning-based cross-domain named entity recognition method according to claim 1, wherein the step 3.1 specifically comprises:
calculating the score of each entity through the scoring function obtained in the step 2.4;
Setting a cross entropy loss function as follows:
wherein: $q$ and $k$ denote the start index and end index of a span, respectively; $P_\alpha$ denotes the set of spans whose entity type is $\alpha$; $Q_\alpha$ denotes the set of spans that are not entities or whose entity type is not $\alpha$; $s_\alpha(q, k)$ is the score of span $(q, k)$ for entity type $\alpha$; and a span satisfying $s_\alpha(q, k) > 0$ is output as an entity of type $\alpha$.
5. The span comparison learning-based cross-domain named entity recognition method according to claim 4, wherein in step 3.2, for an input sentence, each entity span is represented as a vector, and the similarity and dissimilarity between the vectors of all entity spans contained in the sentence are computed to obtain the contrastive loss;
the loss function of contrastive learning is calculated as follows:
wherein: $N$ is the maximum length of the sentence; $M$ is the number of negative examples; $span(i, j)$ is the span representation; $span(i, j)^{+}$ is the positive example of the current sentence, namely the augmented data obtained by adversarial training on the source-domain data; $span(i, j)^{-}$ is a negative example of the current sentence, namely a span whose label differs from the current token label; and the cosine similarity $\cos$ is used to measure the distance between the original sample and the positive and negative samples.
6. The span contrast learning-based cross-domain named entity recognition method according to claim 5, wherein in step 3.3, the loss function calculation process of the KL divergence is as follows:
7. The span contrast learning-based cross-domain named entity recognition method according to claim 6, wherein the step 3.4 specifically comprises:
performing overall training with an end-to-end neural network model, wherein the model comprises four loss functions, namely the named entity recognition task loss on the source domain, the domain classifier loss, the contrastive learning loss, and the KL divergence loss;
summing the loss functions to obtain the loss of the cross-domain named entity recognition model based on span comparison learning, and training the loss functions jointly;
wherein: $\alpha$, $\lambda$, and $\beta$ are hyperparameters that control the weights of the individual losses.
8. A computer device comprising a memory and a processor, the memory having stored therein a computer program, characterized in that the processor, when running the computer program stored in the memory, performs the steps of the method of any one of claims 1 to 7.
9. A computer-readable storage medium having stored therein a plurality of computer instructions for causing a computer to perform the method of any one of claims 1 to 7.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.
CN202310621806.1A 2023-05-30 2023-05-30 Cross-domain named entity identification method, equipment, storage medium and product based on span comparison learning Pending CN116644751A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310621806.1A CN116644751A (en) 2023-05-30 2023-05-30 Cross-domain named entity identification method, equipment, storage medium and product based on span comparison learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310621806.1A CN116644751A (en) 2023-05-30 2023-05-30 Cross-domain named entity identification method, equipment, storage medium and product based on span comparison learning

Publications (1)

Publication Number Publication Date
CN116644751A true CN116644751A (en) 2023-08-25

Family

ID=87615013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310621806.1A Pending CN116644751A (en) 2023-05-30 2023-05-30 Cross-domain named entity identification method, equipment, storage medium and product based on span comparison learning

Country Status (1)

Country Link
CN (1) CN116644751A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117708601A (en) * 2024-02-06 2024-03-15 智慧眼科技股份有限公司 Similarity calculation model training method, device, equipment and storage medium
CN117708601B (en) * 2024-02-06 2024-04-26 智慧眼科技股份有限公司 Similarity calculation model training method, device, equipment and storage medium
CN117807999A (en) * 2024-02-29 2024-04-02 武汉科技大学 Domain self-adaptive named entity recognition method based on countermeasure learning
CN117807999B (en) * 2024-02-29 2024-05-10 武汉科技大学 Domain self-adaptive named entity recognition method based on countermeasure learning

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN112131404B (en) Entity alignment method in four-risk one-gold domain knowledge graph
Nickel et al. Poincaré embeddings for learning hierarchical representations
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN116644751A (en) Cross-domain named entity identification method, equipment, storage medium and product based on span comparison learning
CN115019123B (en) Self-distillation contrast learning method for remote sensing image scene classification
CN111460824B (en) Unmarked named entity identification method based on anti-migration learning
CN112733866A (en) Network construction method for improving text description correctness of controllable image
CN111738007A (en) Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
CN114372465B (en) Mixup and BQRNN-based legal naming entity identification method
CN110941734A (en) Depth unsupervised image retrieval method based on sparse graph structure
CN113220865B (en) Text similar vocabulary retrieval method, system, medium and electronic equipment
CN117725261A (en) Cross-modal retrieval method, device, equipment and medium for video text
CN113722439B (en) Cross-domain emotion classification method and system based on antagonism class alignment network
CN115994204A (en) National defense science and technology text structured semantic analysis method suitable for few sample scenes
CN117171393A (en) Multi-mode retrieval-oriented self-adaptive semi-pairing inquiry hash method
CN114817581A (en) Cross-modal Hash retrieval method based on fusion attention mechanism and DenseNet network
CN114841151A (en) Medical text entity relation joint extraction method based on decomposition-recombination strategy
CN117851567A (en) Zero sample table retrieval method based on field adaptation
CN116720498A (en) Training method and device for text similarity detection model and related medium thereof
CN113516209B (en) Comparison task adaptive learning method for few-sample intention recognition
CN116302953A (en) Software defect positioning method based on enhanced embedded vector semantic representation
CN115599392A (en) Code processing method, device, medium and electronic equipment
CN114841148A (en) Text recognition model training method, model training device and electronic equipment
CN114282537A (en) Social text-oriented cascade linear entity relationship extraction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination