CN116205217A - Small sample relation extraction method, system, electronic equipment and storage medium - Google Patents
Small sample relation extraction method, system, electronic equipment and storage medium
- Publication number
- CN116205217A CN116205217A CN202310495624.4A CN202310495624A CN116205217A CN 116205217 A CN116205217 A CN 116205217A CN 202310495624 A CN202310495624 A CN 202310495624A CN 116205217 A CN116205217 A CN 116205217A
- Authority
- CN
- China
- Prior art keywords
- concept
- sentence
- text
- small sample
- coding module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000605 extraction Methods 0.000 title claims abstract description 73
- 230000004927 fusion Effects 0.000 claims abstract description 41
- 238000000034 method Methods 0.000 claims abstract description 16
- 239000013598 vector Substances 0.000 claims description 73
- 238000012549 training Methods 0.000 claims description 47
- 238000004590 computer program Methods 0.000 claims description 9
- 239000011159 matrix material Substances 0.000 claims description 5
- 230000006870 function Effects 0.000 claims description 4
- 238000012545 processing Methods 0.000 abstract description 2
- 238000010586 diagram Methods 0.000 description 6
- 238000013507 mapping Methods 0.000 description 4
- 238000013459 approach Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 238000003062 neural network model Methods 0.000 description 2
- 239000000047 product Substances 0.000 description 2
- 230000011218 segmentation Effects 0.000 description 2
- 238000013473 artificial intelligence Methods 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000008451 emotion Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Probability & Statistics with Applications (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a small sample relation extraction method, system, electronic equipment and storage medium, and relates to the technical field of data processing. The method comprises the following steps: acquiring a target text; determining an entity relationship representation according to a small sample relation extraction model and the target text, the entity relationship representation comprising entity texts and the corresponding concepts and relationships; the small sample relation extraction model is trained with a contrast learning loss and a cross entropy loss. The small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are connected with the text concept fusion module; the concept coding module is constructed based on a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept fusion module is constructed based on a self-attention mechanism network and a similarity gate. The invention can improve the accuracy of relation extraction when samples are insufficient.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and system for extracting a small sample relationship, an electronic device, and a storage medium.
Background
The purpose of the relation extraction task is to extract relations between entities from unstructured raw text data, thereby converting the raw unstructured data into structured data that is easy to store and analyze. Relation extraction technology is widely applied in artificial intelligence fields such as knowledge graphs, automatic question answering and search engines. According to the number of training samples, relation extraction is divided into general relation extraction, open-domain relation extraction, small sample relation extraction and the like. Deep learning techniques use neural network models to automatically extract text features, and the difficulty lies in enabling the neural network model to better mine text semantic information for classification. Typically, the relation extraction task requires a large amount of training data, which in turn usually requires a large amount of manual annotation. Small sample relation extraction aims to extract the relation expressed in a given test sentence when only a few sample instances are provided.
The performance of current traditional models relies heavily on time-consuming, labor-intensive labeled data. Although some models achieve good results on common relations, as the number of training instances of a relation decreases the performance on that relation may drop dramatically, and the classifier tends to be biased toward classes with more samples. Consequently, when samples are insufficient, the accuracy of existing models in relation extraction is not high.
Disclosure of Invention
The invention aims to provide a small sample relation extraction method, system, electronic equipment and storage medium, which can improve the accuracy of relation extraction when samples are insufficient.
In order to achieve the above object, the present invention provides the following solutions:
a small sample relationship extraction method, comprising:
acquiring a target text;
determining entity relation representation according to the small sample relation extraction model and the target text; the entity relation representation comprises entity texts and corresponding concepts and relations; the small sample relation extraction model is trained with a contrast learning loss and a cross entropy loss;
the small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are connected with the text concept fusion module; the concept coding module is constructed based on a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept fusion module is constructed based on a self-attention mechanism network and a similarity gate;
the concept coding module is used for determining a plurality of concept embedding vectors of entity texts in the target texts; the sentence coding module is used for determining sentence embedded vectors of the target text; the text concept fusion module is used for determining entity relation expression according to each concept embedding vector and each sentence embedding vector.
Optionally, the determining entity relationship representation according to the small sample relationship extraction model and the target text specifically includes:
extracting concepts of the entity text of the target text according to a set concept database to obtain a plurality of candidate concepts of the entity text;
inputting each candidate concept into the concept coding module to obtain a plurality of concept embedding vectors;
inputting the target text into the sentence coding module to obtain a sentence embedded vector;
and inputting each concept embedded vector and each sentence embedded vector into the text concept fusion module to obtain entity relationship representation.
Optionally, inputting each concept embedding vector and each sentence embedding vector into the text concept fusion module to obtain an entity relationship representation, which specifically includes:
calculating the similarity between each concept embedded vector and each sentence embedded vector to obtain a plurality of similarity matrixes;
normalizing each similarity matrix by using a Softmax function to obtain each similarity score;
determining an optimal concept embedding vector corresponding to the sentence embedding vector according to a preset similarity threshold value and each similarity score value;
and splicing the optimal concept embedded vector and the sentence embedded vector to obtain entity relation expression.
Optionally, the training process of the small sample relation extraction model specifically includes:
acquiring training data; the training data comprises training texts and corresponding relation labels; the relationship label comprises entity texts in training texts and corresponding concepts and relationships;
constructing a training model based on the Bert embedded model, the skip-gram model, the self-attention mechanism network and the similarity gate;
and inputting the training data into the training model, training the training model by utilizing the contrast learning loss and the cross entropy loss, and determining the trained training model as the small sample relation extraction model.
Optionally, the setting concept database includes a YAGO3 database, a ConceptNet database, and a ConceptGraph database.
The invention also provides a small sample relation extraction system, comprising:
the text acquisition module is used for acquiring a target text;
the relation extraction module is used for determining entity relation expression according to the small sample relation extraction model and the target text; the entity relation representation comprises entity texts and corresponding concepts and relations; the small sample relation extraction model is trained with a contrast learning loss and a cross entropy loss;
the small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are connected with the text concept fusion module; the concept coding module is constructed based on a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept fusion module is constructed based on a self-attention mechanism network and a similarity gate;
the concept coding module is used for determining a plurality of concept embedding vectors of entity texts in the target texts; the sentence coding module is used for determining sentence embedded vectors of the target text; the text concept fusion module is used for determining entity relation expression according to each concept embedding vector and each sentence embedding vector.
The invention also provides an electronic device comprising a memory for storing a computer program and a processor for running the computer program to cause the electronic device to perform a small sample relation extraction method according to the above.
The present invention also provides a computer readable storage medium storing a computer program which when executed by a processor implements a small sample relationship extraction method as described above.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention discloses a small sample relation extraction method, a system, electronic equipment and a storage medium. The small sample relation extraction model is trained by comparing the learning loss and the cross entropy loss, and the similarity of the entity text and the corresponding concepts is controlled by utilizing a self-attention mechanism network and a similarity gate in the text concept fusion module, so that the model result is prevented from being prone to more types of samples, and the accuracy of sample relation extraction is improved when the samples are insufficient.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a small sample relationship extraction method according to the present invention;
FIG. 2 is a schematic diagram of the operation logic of the small sample relation extraction model in the present embodiment;
FIG. 3 is a schematic diagram of the operation logic of the concept selection gate algorithm in the present embodiment;
FIG. 4 is a schematic diagram of text fusion operation logic based on self-attention in the present embodiment;
FIG. 5 is a diagram of a relational classification module architecture based on contrast learning in the present embodiment;
FIG. 6 is a block diagram of a small sample relationship extraction system of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a small sample relation extraction method, a system, electronic equipment and a storage medium, which can improve the accuracy of sample relation extraction when samples are insufficient.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in fig. 1, the present invention provides a small sample relation extraction method, which includes:
step 100: and acquiring the target text.
Step 200: determining entity relation representation according to the small sample relation extraction model and the target text; the entity relation representation comprises entity texts and corresponding concepts and relations; wherein the small sample relation extraction model is trained with a contrast learning loss and a cross entropy loss.
Specifically, the small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are connected with the text concept fusion module; the concept coding module is constructed based on a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept fusion module is constructed based on a self-attention mechanism network and a similarity gate.
The concept coding module is used for determining a plurality of concept embedding vectors of entity texts in the target texts; the sentence coding module is used for determining sentence embedded vectors of the target text; the text concept fusion module is used for determining entity relation expression according to each concept embedding vector and each sentence embedding vector.
The training process of the small sample relation extraction model specifically comprises the following steps:
acquiring training data; the training data comprises training texts and corresponding relation labels; the relationship label comprises entity texts in training texts and corresponding concepts and relationships; constructing a training model based on the Bert embedded model, the skip-gram model, the self-attention mechanism network and the similarity gate; and inputting the training data into the training model, training the training model by utilizing the contrast learning loss and the cross entropy loss, and determining the trained training model as the small sample relation extraction model.
In this embodiment, the set concept database includes a YAGO3 database, a ConceptNet database, and a ConceptGraph database.
As a specific embodiment of step 200, it includes:
firstly, extracting concepts of the entity text of the target text according to a set concept database, and obtaining a plurality of candidate concepts of the entity text.
And then, inputting each candidate concept into the concept coding module to obtain a plurality of concept embedding vectors. And inputting the target text into the sentence coding module to obtain a sentence embedded vector.
And finally, inputting each concept embedded vector and each sentence embedded vector into the text concept fusion module to obtain entity relationship representation. The method specifically comprises the following steps:
calculating the similarity between each concept embedded vector and each sentence embedded vector to obtain a plurality of similarity matrixes; normalizing each similarity matrix by using a Softmax function to obtain each similarity score; determining an optimal concept embedding vector corresponding to the sentence embedding vector according to a preset similarity threshold value and each similarity score value; and splicing the optimal concept embedded vector and the sentence embedded vector to obtain entity relation expression.
Based on the above scheme, embodiments as shown in fig. 2-5 are provided:
As shown in fig. 2, the main operation logic of the small sample relation extraction model is as follows. For the input sentence text, the candidate concepts corresponding to the entities are obtained from the database (for example, the candidate concepts corresponding to "Bill Gates" in FIG. 2: person, billionaire and entrepreneur). The sentence is embedded with Bert, and the selected candidate concepts are embedded with a skip-gram model, yielding vector representations of the sentence and the concepts. The similarity between every concept and the sentence is then calculated from the concept vectors and the sentence vector; after softmax normalization the similarities are compared against a threshold and the reasonable concept vectors are selected. The sentence and the concepts are fused with a self-attention mechanism, and finally a supervised contrast learning loss and a cross entropy loss are introduced for training.
A key part of the above process is the introduction of external knowledge to enhance the embedded representation (i.e., the set concept database): plain-text information is clearly limited in small sample relation extraction scenarios, so introducing external auxiliary information to compensate for the limited information in the support set can improve model performance. How to extract the most useful part of the external information while avoiding the introduction of interference is a problem worth considering. In this embodiment, therefore, entity concepts are introduced as external information to enhance the prototype representation.
(1) First, concepts are intuitive and concise descriptions of entities and can easily be obtained from a set concept database (YAGO3, ConceptNet, ConceptGraph). In this embodiment they are obtained from the ConceptGraph database, a large-scale common-sense concept knowledge graph developed by Microsoft that stores entity concepts as (entity, isA, concept) triples and can provide concept knowledge for the entities in a relation extraction scheme (as in ConceptFERE); a minimal lookup sketch is given after item (2) below. The concept embeddings themselves are pre-trained concept embeddings.
(2) In addition, concepts are more abstract than each entity-specific textual description, and not only can supplement the limited information in the support set, but are also more suitable as prototypes for a class of relationships.
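As a minimal illustration of the triple-based concept lookup described in item (1) above, the following Python sketch (not the code of this embodiment; the triples and entity names are purely illustrative stand-ins for ConceptGraph entries) indexes (entity, isA, concept) triples and returns the candidate concepts of an entity mention:

```python
# Minimal lookup sketch (not the code of this embodiment). The triples and
# entity names below are purely illustrative.
from collections import defaultdict

triples = [
    ("bill gates", "isA", "person"),
    ("bill gates", "isA", "billionaire"),
    ("bill gates", "isA", "entrepreneur"),
    ("microsoft", "isA", "company"),
]

concept_index = defaultdict(list)
for head, rel, concept in triples:
    if rel == "isA":
        concept_index[head].append(concept)

def candidate_concepts(entity_mention: str):
    """Return all candidate concepts recorded for an entity mention."""
    return concept_index.get(entity_mention.lower(), [])

print(candidate_concepts("Bill Gates"))  # ['person', 'billionaire', 'entrepreneur']
```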
As shown in Table 1, knowing the concepts of the head and tail entities intuitively restricts the candidate relations of the entity pair in the sentence to a limited range, while some relations, for example "read", can be ruled out. The semantic information of the concepts thus helps the model determine the predicted relation: creator.
Table 1 concept introduction example
However, an entity may have different concepts from different aspects or angles (for example, person, billionaire and entrepreneur in the example above), and ultimately only a few of them are related to the relation in question. Therefore, when the external information is introduced, a similarity judgment gate is introduced to measure the similarity between the external concepts and the relation text. Secondly, since the sentence embedding and the pre-trained concept embeddings are not learned in the same semantic space, this embodiment adopts a self-attention mechanism to perform word-level semantic fusion of the sentence and the selected concepts, producing the final sentence representation.
Sentence embedding:
in this embodiment, a Bert pre-training model is used as text embedding for sentences of the support set. The input of Bert is a representation corresponding to each embedded word (token), and is input into the Bert pre-training model by converting the token into feature vectors. To accomplish a specific classification task, a specific classification token is inserted at the beginning of each sentence sequence entered, except for the token of the word, and the last attention layer (transducer) output corresponding to the classification token is used to aggregate the entire sequence characterization information.
Since Bert is a pre-trained model that must accommodate a wide variety of natural language tasks, its input sequence must be able to hold a single sentence (text sentiment classification, sequence labeling tasks) or a pair of sentences (text summarization, natural language inference, question answering). To let the model distinguish which part belongs to sentence A and which to sentence B, Bert adopts two mechanisms: a separator token ([SEP]) is inserted between the sentences in the token sequence to separate them, and a learnable segment embedding is added to each token to indicate whether it belongs to sentence A or sentence B.
In terms of pre-training, Bert constructs two pre-training tasks: the masked language model (Masked Language Model, MLM) and next sentence prediction (Next Sentence Prediction, NSP). The MLM task tends to extract token-level representations and therefore cannot directly provide sentence-level representations. To enable the model to understand the relationship between sentences, Bert uses the NSP task for pre-training, which simply predicts whether two sentences form a coherent context. In this way, tasks that require understanding the relationship between two sentences, such as question answering and natural language inference, can be trained well with Bert.
Based on the above two points, the Bert pre-training model can be well used as an embedding layer for NLP relation extraction.
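The following sketch illustrates how a sentence can be embedded with a pre-trained Bert model; the HuggingFace transformers interface and the bert-base-uncased checkpoint are assumptions made for illustration, since the embodiment only specifies "a Bert pre-training model":

```python
# Illustrative sketch: embedding a sentence with a pre-trained Bert model via
# the HuggingFace transformers library. The checkpoint name is an assumption.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

sentence = "Bill Gates founded Microsoft in 1975."
inputs = tokenizer(sentence, return_tensors="pt")        # adds [CLS] ... [SEP]
with torch.no_grad():
    outputs = encoder(**inputs)

cls_embedding = outputs.last_hidden_state[:, 0, :]       # (1, 768) sentence vector
token_embeddings = outputs.last_hidden_state[:, 1:-1, :] # per-token vectors
```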
Concept embedding:
for introduced concepts, conventional pre-training concept embedding is used, i.e., the embedded layer representation of the concept is learned on learning wiki encyclopedias and conceptual diagrams using a skip-gram model.
The concept selection gate algorithm as shown in fig. 3:
an entity may have different concepts in different aspects or angles, and eventually only a few concepts may be related to the relationship to which the entity belongs, while other concepts may have a negative effect on the classification of the relationship, and a more semantically similar concept needs to be selected by the model autonomously.
First, the concept embedding V_c produced by the concept embedding layer (skip-gram) and the sentence embedding V_s produced by the sentence embedding layer (Bert) are not learned in the same semantic space, so their semantic similarity cannot be compared directly. Therefore, this embodiment uses a fully connected mapping layer P to map V_c and V_s into the same space before comparing their similarity. The mapping layer P is learnable.
Secondly, after the sentence and the n candidate concepts are embedded into 768-dimensional vectors V_s and V_c by the Bert and skip-gram pre-trained models respectively, V_s and V_c are compared for similarity through the fully connected mapping layer P, yielding a similarity matrix sim_cs of length n.
Finally, the similarity values are normalized with a Softmax function to obtain similarity scores for the n candidate concepts. When a score is greater than a preset threshold, the corresponding concept is considered highly similar and relevant to the sentence, and in the subsequent self-attention based fusion module the weight of that concept is set to 1; when a score is smaller than the threshold, the concept is considered irrelevant to the sentence and liable to interfere with relation classification, and its weight is set to 0. The threshold is a model hyper-parameter.
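A minimal PyTorch sketch of the concept selection gate described above (an illustration under simplifying assumptions, not the embodiment's exact code): the learnable mapping layer P projects the concept embeddings, dot-product similarities with the sentence embedding are Softmax-normalized, and concepts scoring above the preset threshold receive weight 1 while the rest receive weight 0:

```python
# Minimal sketch of the concept selection gate (illustrative, not the
# embodiment's exact code).
import torch
import torch.nn as nn

class ConceptSelectionGate(nn.Module):
    def __init__(self, dim: int = 768, threshold: float = 0.5):
        super().__init__()
        self.proj = nn.Linear(dim, dim)   # fully connected mapping layer P (learnable)
        self.threshold = threshold        # preset threshold (model hyper-parameter)

    def forward(self, sent_emb: torch.Tensor, concept_embs: torch.Tensor):
        # sent_emb: (dim,) sentence embedding V_s; concept_embs: (n, dim) V_c
        mapped = self.proj(concept_embs)              # map concepts into the sentence space
        sim = mapped @ sent_emb                       # (n,) similarity values sim_cs
        scores = torch.softmax(sim, dim=0)            # Softmax-normalized similarity scores
        weights = (scores > self.threshold).float()   # 1 = keep concept, 0 = discard
        return weights, scores

# Example with random vectors standing in for the Bert / skip-gram embeddings.
gate = ConceptSelectionGate(threshold=0.4)
weights, scores = gate(torch.randn(768), torch.randn(3, 768))
```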
The self-attention based text concept fusion module as shown in fig. 4:
since the concept embedding and the word embedding in the sentence are not learned in the same semantic space, the present embodiment designs a self-attention based fusion module to perform word-level semantic fusion for each word in the concept and sentence. First, the embedding of all words in the sentence and the embedding of the selected concept are connected and then sent to the self-attention module. As shown in fig. 4, the self-attention module calculates a similarity value between the concept and each word in the sentence. It multiplies the concept embedding and the similarity value and then combines with the corresponding word embedding as follows:
In this computation, each word i of the sentence obtains a fused embedding after word-level semantic fusion; N is the number of words in the sentence, j indexes the positions attended to when producing the i-th embedding, and q_i, k_j and v_j are the vectors obtained from the query, key and value matrices of the self-attention module.
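A minimal PyTorch sketch of this word-level fusion, under the assumption of a single-head self-attention layer with learnable query/key/value matrices (the exact attention configuration is not specified in the text):

```python
# Minimal sketch of the word-level concept-sentence fusion (single-head
# self-attention is an assumption).
import torch
import torch.nn as nn

class ConceptWordFusion(nn.Module):
    def __init__(self, dim: int = 768):
        super().__init__()
        self.q = nn.Linear(dim, dim)   # query matrix
        self.k = nn.Linear(dim, dim)   # key matrix
        self.v = nn.Linear(dim, dim)   # value matrix

    def forward(self, word_embs: torch.Tensor, concept_emb: torch.Tensor):
        # word_embs: (N, dim) word embeddings; concept_emb: (dim,) selected concept
        seq = torch.cat([word_embs, concept_emb.unsqueeze(0)], dim=0)    # (N+1, dim)
        q, k, v = self.q(seq), self.k(seq), self.v(seq)
        attn = torch.softmax(q @ k.t() / seq.size(-1) ** 0.5, dim=-1)    # similarity weights
        fused = attn @ v                                                 # (N+1, dim)
        # add the attention-weighted information back onto the original words
        return word_embs + fused[:-1]                                    # (N, dim)

fusion = ConceptWordFusion()
fused_words = fusion(torch.randn(12, 768), torch.randn(768))
```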
Thus, the support set sentence embedding which introduces external reasonable clues is obtained.
Increasing prototype discrimination ability with contrast learning: this embodiment adopts a contrast learning method to address the problem that prototypes of different classes may lie close together in the embedding space. The specific process is as follows.
The preceding steps yield sentence embeddings that incorporate reasonable external clues. This embodiment uses the prototypical network as the backbone and introduces a contrast learning loss when constructing the relation prototypes, so as to alleviate the proximity problem described above. The overall architecture of the relation classification module based on contrast learning is shown in fig. 5; the algorithm mainly comprises the following steps:
1. Text embedding
For the sentences of the support instances and the query instances, this embodiment first feeds them into the external-concept fusion module to obtain semantically enriched sentence embeddings s_i^k and Q_j, where s_i^k denotes the vector representation of the k-th instance of the i-th relation class in the support set and Q_j denotes the vector representation of the j-th query-set instance. In addition, this embodiment also feeds the relation description text in the support set into the Bert encoder to obtain the vector representation R_i corresponding to each relation.
2. Generating relationship prototypes
For each relation in a batch, the algorithm generates a corresponding relation prototype. Whereas the earlier prototypical network crudely takes the average of the embeddings of all instances of a class as the relation prototype, this embodiment deliberately selects the more important vectors to generate the prototype. Specifically, the prototype of the i-th relation class is computed as

p_i = r_i + (1/K) * Σ_{k=1..K} s_i^k

where s_i^k is the single-sentence feature of the k-th sentence of class i in the support set, obtained by adding the hidden-layer vector of the token corresponding to the external concept to the token hidden-layer vectors in that sentence's embedded representation, and r_i is the relation feature of the i-th relation class, represented by the token hidden-layer vectors of the relation descriptor after Bert encoding. The average of the K single-sentence features is added to the relation feature to finally obtain the prototype of the i-th relation.
Then, from the embedded representation of a query instance and the prototypes of the N relations, the model calculates the probability of each candidate relation for that query instance:

P(y = n | q_j) = exp(q_j · p_n) / Σ_{n'=1..N} exp(q_j · p_{n'})

where q_j is the vector representation of the j-th query instance and p_n is the prototype vector of the n-th relation class.
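A minimal PyTorch sketch of prototype construction and query scoring as described above (an illustration; the single-sentence features are simplified and the random tensors are placeholders):

```python
# Minimal sketch of prototype construction and query scoring (illustrative).
import torch

def build_prototype(support_feats: torch.Tensor, relation_feat: torch.Tensor):
    # support_feats: (K, dim) fused single-sentence features s_i^k of class i
    # relation_feat: (dim,)   Bert-encoded relation description feature r_i
    return support_feats.mean(dim=0) + relation_feat       # prototype p_i

def relation_probabilities(query_emb: torch.Tensor, prototypes: torch.Tensor):
    # query_emb: (dim,) query representation q_j; prototypes: (N, dim)
    logits = prototypes @ query_emb                         # dot-product scores
    return torch.softmax(logits, dim=0)                     # P(relation n | query j)

# Example: N = 5 relations, K = 3 support instances each, dim = 768.
prototypes = torch.stack([build_prototype(torch.randn(3, 768), torch.randn(768))
                          for _ in range(5)])
probs = relation_probabilities(torch.randn(768), prototypes)
```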
3. Adjusting a relationship prototype based on contrast learning
As described above, when the relation types within the same training batch are similar, their prototypes are distributed close to each other in the embedding space, which may reduce the discrimination ability of the model. Therefore, after the relation prototypes are generated from the text embeddings, all the relation prototypes contained in the support set are fed into the contrast learning module, so that the prototypes become more dispersed in the embedding space and their discrimination ability is improved. It should be noted that, unlike the conventional self-supervised contrast learning approach, the relation labels of the support-set instances are used here for supervised contrast learning.
Specifically, the supervised contrast learning module uses the relation as the anchor, takes the prototype of the same class as the positive sample and the prototypes of the other relation classes in the same batch as negative samples, and aims to pull the positive sample toward the anchor while pushing the negative samples away.
For the i-th relation r_i in the support set, the model selects the corresponding positive-sample prototype p_i and the negative-sample prototypes p_n (n ≠ i), and measures similarity with dot products:

s_i^+ = r_i · p_i,    s_{i,n}^- = r_i · p_n (n ≠ i)

where s_i^+ denotes the similarity of the positive pair for the i-th relation and s_{i,n}^- denotes the similarity of a negative pair. The supervised contrast learning loss is then

L_CL = - Σ_i log [ exp(s_i^+) / ( exp(s_i^+) + Σ_{n≠i} exp(s_{i,n}^-) ) ]

where the sum over n ≠ i runs over the negative pairs between the i-th relation and the other relation classes in the batch.
Finally, dot products are computed between the query instance embedding and the support-set relation prototypes adjusted by the contrast learning module; the relation whose prototype is closest to the query instance is taken as the classification result, and a cross entropy loss is applied to optimize the overall parameters:

L_CE = - log ( exp(z_y) / Σ_n exp(z_n) )

where L_CE denotes the cross entropy loss and z_y denotes the model's score for the true relation of the query instance.
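For illustration, the cross entropy loss over the query logits can be combined with the contrast learning loss above; the weighting factor lambda_cl below is an assumption, since the text does not specify how the two losses are weighted:

```python
# Illustrative sketch of combining the two losses; lambda_cl is an assumption.
import torch
import torch.nn.functional as F

def total_loss(query_logits: torch.Tensor, query_labels: torch.Tensor,
               loss_cl: torch.Tensor, lambda_cl: float = 1.0) -> torch.Tensor:
    loss_ce = F.cross_entropy(query_logits, query_labels)   # L_CE over the scores z
    return loss_ce + lambda_cl * loss_cl

loss = total_loss(torch.randn(4, 5), torch.tensor([0, 2, 1, 4]),
                  loss_cl=torch.tensor(0.7))
```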
In this embodiment, the required operating environment is: pytorch:1.7.1, CUDA:11.0, GPU: NVIDIA GeForce RTX 3090, 24G. Training was performed in this environment.
This embodiment provides a similarity-based external information introduction module, which lets the model select external information autonomously, filtering out external noise and improving model performance. In addition, for similar relations, a contrast learning method is introduced to improve the discrimination ability of the model.
As shown in fig. 6, the present invention further provides a small sample relationship extraction system, comprising:
the text acquisition module is used for acquiring a target text;
the relation extraction module is used for determining entity relation expression according to the small sample relation extraction model and the target text; the entity relation representation comprises entity texts and corresponding concepts and relations; the small sample relation extraction model is trained with a contrast learning loss and a cross entropy loss; the small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are connected with the text concept fusion module; the concept coding module is constructed based on a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept fusion module is constructed based on a self-attention mechanism network and a similarity gate; the concept coding module is used for determining a plurality of concept embedding vectors of entity texts in the target texts; the sentence coding module is used for determining sentence embedded vectors of the target text; the text concept fusion module is used for determining entity relation expression according to each concept embedding vector and each sentence embedding vector.
The invention also provides an electronic device comprising a memory for storing a computer program and a processor for running the computer program to cause the electronic device to perform a small sample relation extraction method according to the above.
The present invention also provides a computer readable storage medium storing a computer program which when executed by a processor implements a small sample relationship extraction method as described above.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the core concept of the invention; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.
Claims (8)
1. A method for small sample relationship extraction, comprising:
acquiring a target text;
determining entity relation representation according to the small sample relation extraction model and the target text; the entity relation representation comprises entity texts and corresponding concepts and relations; the small sample relation extraction model is trained with a contrast learning loss and a cross entropy loss;
the small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are connected with the text concept fusion module; the concept coding module is constructed based on a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept fusion module is constructed based on a self-attention mechanism network and a similarity gate;
the concept coding module is used for determining a plurality of concept embedding vectors of entity texts in the target texts; the sentence coding module is used for determining sentence embedded vectors of the target text; the text concept fusion module is used for determining entity relation expression according to each concept embedding vector and each sentence embedding vector.
2. The small sample relationship extraction method according to claim 1, wherein the determining an entity relationship representation according to the small sample relationship extraction model and the target text specifically comprises:
extracting concepts of the entity text of the target text according to a set concept database to obtain a plurality of candidate concepts of the entity text;
inputting each candidate concept into the concept coding module to obtain a plurality of concept embedding vectors;
inputting the target text into the sentence coding module to obtain a sentence embedded vector;
and inputting each concept embedded vector and each sentence embedded vector into the text concept fusion module to obtain entity relationship representation.
3. The method for extracting small sample relationships according to claim 2, wherein inputting each of the concept embedding vectors and the sentence embedding vectors into the text concept fusion module to obtain an entity relationship representation, comprises:
calculating the similarity between each concept embedded vector and each sentence embedded vector to obtain a plurality of similarity matrixes;
normalizing each similarity matrix by using a Softmax function to obtain each similarity score;
determining an optimal concept embedding vector corresponding to the sentence embedding vector according to a preset similarity threshold value and each similarity score value;
and splicing the optimal concept embedded vector and the sentence embedded vector to obtain entity relation expression.
4. The small sample relationship extraction method according to claim 2, wherein the training process of the small sample relationship extraction model specifically comprises:
acquiring training data; the training data comprises training texts and corresponding relation labels; the relationship label comprises entity texts in training texts and corresponding concepts and relationships;
constructing a training model based on the Bert embedded model, the skip-gram model, the self-attention mechanism network and the similarity gate;
and inputting the training data into the training model, training the training model by utilizing the contrast learning loss and the cross entropy loss, and determining the trained training model as the small sample relation extraction model.
5. The small sample relationship extraction method of claim 2, wherein the set concept database comprises a YAGO3 database, a ConceptNet database, and a ConceptGraph database.
6. A small sample relationship extraction system, comprising:
the text acquisition module is used for acquiring a target text;
the relation extraction module is used for determining entity relation expression according to the small sample relation extraction model and the target text; the entity relation representation comprises entity texts and corresponding concepts and relations; the small sample relation extraction model is trained with a contrast learning loss and a cross entropy loss;
the small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are connected with the text concept fusion module; the concept coding module is constructed based on a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept fusion module is constructed based on a self-attention mechanism network and a similarity gate;
the concept coding module is used for determining a plurality of concept embedding vectors of entity texts in the target texts; the sentence coding module is used for determining sentence embedded vectors of the target text; the text concept fusion module is used for determining entity relation expression according to each concept embedding vector and each sentence embedding vector.
7. An electronic device comprising a memory for storing a computer program and a processor that runs the computer program to cause the electronic device to perform the small sample relationship extraction method according to any one of claims 1-5.
8. A computer readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the small sample relationship extraction method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310495624.4A CN116205217B (en) | 2023-05-05 | 2023-05-05 | Small sample relation extraction method, system, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310495624.4A CN116205217B (en) | 2023-05-05 | 2023-05-05 | Small sample relation extraction method, system, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116205217A true CN116205217A (en) | 2023-06-02 |
CN116205217B CN116205217B (en) | 2023-09-01 |
Family
ID=86508057
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310495624.4A Active CN116205217B (en) | 2023-05-05 | 2023-05-05 | Small sample relation extraction method, system, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116205217B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111209412A (en) * | 2020-02-10 | 2020-05-29 | 同方知网(北京)技术有限公司 | Method for building knowledge graph of periodical literature by cyclic updating iteration |
US10909442B1 (en) * | 2017-03-30 | 2021-02-02 | Amazon Technologies, Inc. | Neural network-based artificial intelligence system for content-based recommendations using multi-perspective learned descriptors |
CN112784589A (en) * | 2021-01-29 | 2021-05-11 | 北京百度网讯科技有限公司 | Training sample generation method and device and electronic equipment |
CN112820411A (en) * | 2021-01-27 | 2021-05-18 | 清华大学 | Medical relation extraction method and device |
CN113378573A (en) * | 2021-06-24 | 2021-09-10 | 北京华成智云软件股份有限公司 | Content big data oriented small sample relation extraction method and device |
CN114492412A (en) * | 2022-02-10 | 2022-05-13 | 湖南大学 | Entity relation extraction method for Chinese short text |
CN114880307A (en) * | 2022-06-07 | 2022-08-09 | 上海开放大学 | Structured modeling method for knowledge in open education field |
CN115688753A (en) * | 2022-09-30 | 2023-02-03 | 阿里巴巴(中国)有限公司 | Knowledge injection method and interaction system of Chinese pre-training language model |
-
2023
- 2023-05-05 CN CN202310495624.4A patent/CN116205217B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10909442B1 (en) * | 2017-03-30 | 2021-02-02 | Amazon Technologies, Inc. | Neural network-based artificial intelligence system for content-based recommendations using multi-perspective learned descriptors |
CN111209412A (en) * | 2020-02-10 | 2020-05-29 | 同方知网(北京)技术有限公司 | Method for building knowledge graph of periodical literature by cyclic updating iteration |
CN112820411A (en) * | 2021-01-27 | 2021-05-18 | 清华大学 | Medical relation extraction method and device |
CN112784589A (en) * | 2021-01-29 | 2021-05-11 | 北京百度网讯科技有限公司 | Training sample generation method and device and electronic equipment |
CN113378573A (en) * | 2021-06-24 | 2021-09-10 | 北京华成智云软件股份有限公司 | Content big data oriented small sample relation extraction method and device |
CN114492412A (en) * | 2022-02-10 | 2022-05-13 | 湖南大学 | Entity relation extraction method for Chinese short text |
CN114880307A (en) * | 2022-06-07 | 2022-08-09 | 上海开放大学 | Structured modeling method for knowledge in open education field |
CN115688753A (en) * | 2022-09-30 | 2023-02-03 | 阿里巴巴(中国)有限公司 | Knowledge injection method and interaction system of Chinese pre-training language model |
Non-Patent Citations (2)
Title |
---|
Zhou Kairui et al., "Concept-driven discriminative feature learning method for few-shot learning", CAAI Transactions on Intelligent Systems, vol. 2023, no. 01, pages 162-172 *
Ma Ang, Yu Yanhua et al., "A survey of knowledge graphs based on reinforcement learning", Journal of Computer Research and Development, vol. 2022, no. 08, pages 1694-1722 *
Also Published As
Publication number | Publication date |
---|---|
CN116205217B (en) | 2023-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021164199A1 (en) | Multi-granularity fusion model-based intelligent semantic chinese sentence matching method, and device | |
CN108984526B (en) | Document theme vector extraction method based on deep learning | |
CN111046179B (en) | Text classification method for open network question in specific field | |
CN111274790B (en) | Chapter-level event embedding method and device based on syntactic dependency graph | |
Daelemans | Memory-based lexical acquisition and processing | |
CN109800434B (en) | Method for generating abstract text title based on eye movement attention | |
CN112541356B (en) | Method and system for recognizing biomedical named entities | |
US20230069935A1 (en) | Dialog system answering method based on sentence paraphrase recognition | |
CN111414481A (en) | Chinese semantic matching method based on pinyin and BERT embedding | |
CN110298044B (en) | Entity relationship identification method | |
CN111738007A (en) | Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network | |
CN110717045A (en) | Letter element automatic extraction method based on letter overview | |
CN113360667B (en) | Biomedical trigger word detection and named entity identification method based on multi-task learning | |
CN114722805B (en) | Little sample emotion classification method based on size instructor knowledge distillation | |
CN111581964A (en) | Theme analysis method for Chinese ancient books | |
CN115510230A (en) | Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism | |
CN117217277A (en) | Pre-training method, device, equipment, storage medium and product of language model | |
CN114548117A (en) | Cause-and-effect relation extraction method based on BERT semantic enhancement | |
Zheng et al. | Pretrained domain-specific language model for general information retrieval tasks in the aec domain | |
CN114722798A (en) | Ironic recognition model based on convolutional neural network and attention system | |
CN116757195B (en) | Implicit emotion recognition method based on prompt learning | |
CN113486143A (en) | User portrait generation method based on multi-level text representation and model fusion | |
CN117436522A (en) | Biological event relation extraction method and large-scale biological event relation knowledge base construction method of cancer subject | |
CN116205217B (en) | Small sample relation extraction method, system, electronic equipment and storage medium | |
CN115391534A (en) | Text emotion reason identification method, system, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |