CN115017324A - Entity relationship extraction method, device, terminal and storage medium - Google Patents

Entity relationship extraction method, device, terminal and storage medium

Info

Publication number
CN115017324A
CN115017324A
Authority
CN
China
Prior art keywords
information
subject
training
sample
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210203644.5A
Other languages
Chinese (zh)
Inventor
张芮
彭力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd, Beijing Xiaomi Pinecone Electronic Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202210203644.5A
Publication of CN115017324A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the disclosure disclose an entity relationship extraction method, device, terminal and storage medium. The method comprises the following steps: acquiring first relation data of at least one training sample; inputting the training sample in which the first relation data is located into a first relation extraction model for recognition to obtain second relation data of the training sample; inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model; and inputting the target text into the second relation extraction model for training to obtain target relation data of the target text.

Description

Entity relationship extraction method, device, terminal and storage medium
Technical Field
The present disclosure relates to, but not limited to, the field of artificial intelligence or the field of computer technology, and in particular, to a method, an apparatus, a terminal, and a storage medium for extracting an entity relationship.
Background
Massive natural text contains a huge amount of information; extracting information from natural text and constructing relationship information can mine the value of the text and improve the utilization rate of the information. This has wide application in fields such as knowledge graph construction, information retrieval, question-answering systems, and sentiment analysis. Triple knowledge is an important component of a knowledge graph and mainly takes two forms: <entity, relationship, entity> and <entity, attribute, attribute value>. Extracting entity relationships from natural text is a form of information extraction, and entity relationship extraction may convert unstructured text information into structured triple knowledge. For example, from the natural text "A is the founder of B", the triple knowledge <B, founder, A> can be extracted.
At present, entity relationship extraction methods mainly include limited-domain relationship extraction and open-domain text relationship extraction. Limited-domain relationship extraction extracts entity relationship pairs for predefined relations; the relationship types are fixed and new relationships cannot be extracted, which limits the practical application of the extraction technique. Open-domain text relationship extraction extracts all triples present in the text without restricting the content of the relations, but suffers from problems such as low precision.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides an entity relationship extraction method, apparatus, terminal and storage medium.
According to a first aspect of the present disclosure, there is provided an entity relationship extraction method, including:
acquiring first relation data of at least one training sample;
inputting the training sample in which the first relation data is located into the first relation extraction model for recognition to obtain second relation data of the training sample;
inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model;
and inputting the target text into the second relation extraction model for training to obtain target relation data of the target text.
In some embodiments, the method comprises:
acquiring sample information of at least one training sample; wherein the sample information comprises: subject information and object information of at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, and obtaining the trained second relation extraction model;
inputting the target text into the second relation extraction model for training to obtain the target relation data of the target text, including:
and inputting the target text into the trained second relation extraction model to obtain the target relation data of the target text.
In some embodiments, one iteration of training in which the sample information is input into the second relation extraction model comprises:
inputting the subject information and the object information of at least one sample relation into the second relation extraction model, and constructing a loss value of current iteration training;
and updating the second relation extraction model based on the loss value to obtain the second relation extraction model after the current iterative training.
In some embodiments, the inputting subject information and object information of at least one of the sample relationships into the second relationship extraction model, and constructing the loss value of the current iteration training includes:
acquiring coding information of each training sample based on each training sample;
obtaining predicted subject information of the training sample based on the coding information and the subject classifier, wherein the predicted subject information comprises: predicted subject head pointer information and predicted subject tail pointer information;
obtaining predicted object information of the training samples based on the coding information and an object classifier, wherein the predicted object information includes: predicting object head pointer information and predicting object tail pointer information;
and obtaining a loss value of the current iteration training based on subject information of at least one sample relation of the training samples and corresponding predicted subject information, object information and corresponding predicted object information.
In some embodiments, the method comprises:
carrying out fusion processing on the coding information of the training sample and the subject vector representation to obtain a fused vector representation; wherein the subject vector representation is determined based on the predicted subject head pointer information and the predicted subject tail pointer information;
the obtaining of the predicted object information of the training samples based on the coding information and the object classifier includes:
and acquiring the prediction object information of the training sample based on the fusion vector characterization and the object classifier.
In some embodiments, the method comprises:
if one of the training samples includes at least two subjects, the predicted subject tail pointer information corresponding to the predicted subject head pointer information of one of the subjects is determined within a predetermined range of a position indicated by the predicted subject head pointer information of the subject.
In some embodiments, the obtaining a loss value of the current iteration training based on the subject information and the corresponding predicted subject information, object information and the corresponding predicted object information of at least one of the sample relationships of the training samples includes:
obtaining a subject loss value based on the subject information and predicted subject information of at least one of the sample relationships of at least one of the training samples;
obtaining an object loss value based on the object information and predicted object information of at least one of the sample relationships of at least one of the training samples; wherein the object loss value comprises: a first object loss value representing that the subject has an object relationship and a second object loss value representing that the subject does not have an object relationship;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the object loss value of at least one training sample.
In some embodiments, the obtaining a loss value of a current iteration training based on a sum of the subject loss value and the object loss value of at least one of the training samples includes:
obtaining an object loss value weighted by the training sample based on the object loss value and the weighting coefficient of the training sample;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the weighted object loss value of at least one training sample.
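Illustratively, the loss construction described in the above embodiments can be sketched as follows. This is a minimal sketch only, assuming binary cross-entropy over the pointer scores and a weighting coefficient `alpha` for the object loss; these names and the choice of binary cross-entropy are illustrative assumptions, not part of the original text.

```python
import torch
import torch.nn.functional as F

def iteration_loss(subj_scores, subj_labels,
                   obj_scores, obj_labels,
                   has_object_mask, alpha=1.0):
    """Sketch of the per-iteration loss: subject loss plus weighted object loss.

    subj_scores / subj_labels: predicted and gold head/tail pointer scores for subjects.
    obj_scores / obj_labels:   predicted and gold head/tail pointer scores for objects.
    has_object_mask:           1 where the subject has an object for the relation, else 0.
    alpha:                     weighting coefficient of the object loss (assumed name).
    """
    # Subject loss over the head and tail pointer predictions.
    subj_loss = F.binary_cross_entropy(subj_scores, subj_labels)

    # Object loss split into the "subject has an object" part (first object loss)
    # and the "subject has no object" part (second object loss).
    per_token = F.binary_cross_entropy(obj_scores, obj_labels, reduction="none")
    first_obj_loss = (per_token * has_object_mask).sum() / has_object_mask.sum().clamp(min=1)
    second_obj_loss = (per_token * (1 - has_object_mask)).sum() / (1 - has_object_mask).sum().clamp(min=1)
    obj_loss = first_obj_loss + second_obj_loss

    # Loss of the current iteration: subject loss plus the weighted object loss.
    return subj_loss + alpha * obj_loss
```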
In some embodiments, the obtaining the coding information of each of the training samples based on each of the training samples includes:
inputting each training sample into a pre-training model to obtain the coding information of each training sample; the pre-training model comprises the correspondence between each candidate word and its coding information.
According to a second aspect of the present disclosure, there is provided an entity relationship extraction method, the method including:
obtaining sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, and obtaining the trained second relation extraction model;
inputting the target text into the trained second relation extraction model to obtain the target relation data of the target text.
According to a third aspect of the present disclosure, there is provided an entity relationship extraction apparatus, the apparatus comprising:
the first obtaining module is used for obtaining first relation data of at least one training sample;
the first identification module is used for inputting the training sample in which the first relation data is located into the first relation extraction model for identification so as to obtain second relation data of the training sample;
the first processing module is used for inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model;
and the second processing module is used for inputting the target text into the second relation extraction model for training so as to obtain the target relation data of the target text.
In some embodiments, the apparatus comprises:
the second acquisition module is used for acquiring sample information of at least one training sample; wherein the sample information comprises: subject information and object information of at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
the first processing module is configured to input the sample information to the second relationship extraction model for iterative training until a loss function of the second relationship extraction model meets a convergence condition, so as to obtain a trained second relationship extraction model;
the second processing module is configured to input the target text into the trained second relationship extraction model to obtain the target relationship data of the target text.
In some embodiments, the first processing module is configured to input subject information and object information of at least one of the sample relationships into the second relationship extraction model, and construct a loss value of the current iteration training;
and the first processing module is used for updating the second relation extraction model based on the loss value to obtain the second relation extraction model after current iteration training.
In some embodiments, the first processing module is configured to perform the following steps:
acquiring coding information of each training sample based on each training sample;
obtaining predicted subject information of the training samples based on the coding information and a subject classifier, wherein the predicted subject information comprises: predicted subject head pointer information and predicted subject tail pointer information;
obtaining prediction object information of the training samples based on the coding information and an object classifier, wherein the prediction object information comprises: predicting object head pointer information and predicting object tail pointer information;
and obtaining a loss value of the current iteration training based on subject information of at least one sample relation of the training samples and corresponding predicted subject information, object information and corresponding predicted object information.
In some embodiments, the first processing module is configured to perform fusion processing on the coding information of the training sample and the subject vector representation to obtain a fused vector representation; wherein the subject vector representation is determined based on the predicted subject head pointer information and the predicted subject tail pointer information;
the first processing module is used for obtaining the prediction object information of the training sample based on the fusion vector characterization and the object classifier.
In some embodiments, if one of the training samples includes at least two subjects, the first processing module is configured to determine the predicted subject tail pointer information corresponding to the predicted subject head pointer information of one of the subjects within a predetermined range of the position indicated by the predicted subject head pointer information of that subject.
In some embodiments, the first processing module is configured to perform the steps of:
obtaining a subject loss value based on the subject information and predicted subject information of at least one of the sample relationships of at least one of the training samples;
obtaining an object loss value based on the object information and predicted object information of at least one of the sample relationships of at least one of the training samples; wherein the object loss values comprise: a first object loss value representing that the subject has an object relationship and a second object loss value representing that the subject does not have an object relationship;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the object loss value of at least one training sample.
In some embodiments, the first processing module is configured to obtain a weighted object loss value of the training samples based on the object loss value and the weighting coefficients of the training samples;
the first processing module is used for obtaining a loss value of current iteration training based on the sum of the subject loss value and the weighted object loss value of at least one training sample.
In some embodiments, the first processing module is configured to input each of the training samples into a pre-training model and obtain the coding information of each of the training samples; the pre-training model comprises the correspondence between each candidate word and its coding information.
According to a fourth aspect of the present disclosure, there is provided an entity relationship extraction apparatus, the apparatus comprising:
the second acquisition module is used for acquiring sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
the first processing module is used for inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, so as to obtain the trained second relation extraction model;
and the second processing module is used for inputting the target text into the trained second relation extraction model so as to obtain the target relation data of the target text.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a terminal, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: implement, when running the executable instructions, the entity relationship extraction method according to any embodiment of the present disclosure.
According to a sixth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing an executable program, where the executable program, when executed by a processor, implements the entity relationship extraction method according to any embodiment of the present disclosure.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
the embodiment of the disclosure can obtain first relation data of at least one training sample through a terminal; inputting the training sample where the first relation data is to the first relation extraction model for recognition to obtain second relation data of the training sample; inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model; and inputting the target text into the second relation extraction model for training to obtain target relation data of the target text. Thus, the embodiment of the present disclosure may introduce new relationship data (i.e., the first relationship data) to iterate out a new relationship extraction model (i.e., the second relationship extraction model) under the structural characteristics of the original relationship extraction model (i.e., the first relationship extraction model). Therefore, the precision of the second relation extraction model can be ensured, and the new relation data can be expanded; namely, when the target relation data of the target text is extracted, the extraction precision of the target relation can be ensured, and the extraction of the extensible entity relation can be realized.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart illustrating an entity relationship extraction method according to an exemplary embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating an entity relationship extraction method according to an exemplary embodiment of the present disclosure.
Fig. 3 is a flowchart illustrating an entity relationship extraction method according to an exemplary embodiment of the present disclosure.
Fig. 4 is a flowchart illustrating an entity relationship extraction method according to an exemplary embodiment of the present disclosure.
FIG. 5 is a schematic diagram illustrating a second relational extraction model according to an exemplary embodiment of the present disclosure.
Fig. 6 is a flowchart illustrating an entity relationship extraction method according to an exemplary embodiment of the present disclosure.
Fig. 7 is a block diagram illustrating an entity relationship extraction apparatus according to an exemplary embodiment of the present disclosure.
Fig. 8 is a block diagram illustrating an entity relationship extraction apparatus according to an exemplary embodiment of the present disclosure.
Fig. 9 is a block diagram illustrating an entity relationship extraction apparatus according to an exemplary embodiment of the present disclosure.
Fig. 10 is a block diagram illustrating a terminal according to an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
In order to better understand the technical solution described in any embodiment of the present disclosure, part of the related art of entity relationship extraction is first explained:
in some embodiments, the defined domain relationship extraction includes, but is not limited to, a pipeline (pipeline) method or a joint extraction method. The pipelining method divides entity extraction and relationship extraction into two independent processes; the pipeline method usually extracts entity relationship pairs first and then determines the relationship of the entity relationship pairs. The joint extraction method is used for carrying out end-to-end modeling on entity extraction and relation extraction.
In some embodiments, open-domain relationship extraction includes, but is not limited to, one of: a learning-based method, a rule-based extraction method, a sentence-based extraction method, a sequence-labeling-model-based method, and a generative-model-based method. The learning-based approach may be: extracting entity relationship pairs by constructing a self-supervised learning system; acquiring annotation data through Wikipedia and rule-based methods; and judging entity relationship pairs by constructing a classifier, identifying triples among phrases by constructing an extractor, and evaluating confidence scores of the triples based on various features by constructing an evaluator. The rule-based extraction method may determine entity relationship pairs through manually defined rules or through generic lexical and syntactic classifiers. The sentence-based extraction method may, by reconstructing sentences, convert complex text into a set of easy-to-split sentences of triples that have dependency relationships. The sequence-labeling-model-based method may directly extract predicate labels in sentences to extract triples. The generative-model-based approach may extract triples by generating them. For example, from "Zhang San, a Beijing native", the triple <Zhang San, place of birth, Beijing> can be extracted by the generative method; here, "place of birth" is an implicit predicate attribute.
Fig. 1 is a flowchart illustrating an entity relationship extraction method according to an exemplary embodiment; as shown in Fig. 1, the entity relationship extraction method includes the following steps:
step S11: acquiring first relation data of at least one training sample;
step S12: inputting the training sample in which the first relation data is located into the first relation extraction model for recognition to obtain second relation data of the training sample;
step S13: inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model;
step S14: and inputting the target text into the second relation extraction model for training to obtain target relation data of the target text.
The entity relationship extraction method disclosed by the embodiment of the disclosure is executed by a terminal. The terminal here may be various mobile devices or fixed devices. For example, the terminal may be, but is not limited to, a server, a computer, a tablet, a mobile phone, or a wearable device.
Here, the terminal may train the first relation extraction model to obtain a second relation extraction model having more relation data (including the first relation data and the second relation data); therefore, based on the second relation extraction model, the terminal can obtain more target relation data of the target text while ensuring the accuracy of the target relation data. For example, the terminal is a computer provided with a first relation extraction model, and the first relation extraction model may be trained to obtain a second relation extraction model. If the computer extracts entity relationships for a knowledge base based on the second relation extraction model, a knowledge base with more branches (i.e., entity relationships and the like) and more detail is obtained; and/or, if the computer analyzes data based on the second relation extraction model, more intelligence relations (i.e., entity relationships) are obtained, so that more intelligence is acquired. In short, the terminal can perform entity extraction on text information of massive data in various fields and extract more entity relationships, and the content contained in that text information can then be obtained and analyzed more comprehensively and completely based on these entity relationships.
Here, the training sample may be any kind of text information. For example, a training sample may be "Zhang Yier, Tang Sansi, Yang Wu and others attended the event", "A is the founder of company B", and/or "Wang Yier, born in Dalian City, Liaoning Province, the son of Wang Sansi, chairman of the CD Group", etc.
Here, both the first relation data and the second relation data may be entity relationship pairs, or may be data representing the relationship of an entity relationship pair. For example, the first relation data and the second relation data may both be triples; a triple may be, but is not limited to, <entity, relationship, entity> or <entity, attribute, attribute value>. As another example, the first relation data and the second relation data can be data of relationships such as "wife", "parent-child", and/or "place of birth".
In one embodiment, the step S11 includes: obtaining at least one piece of first relation data of at least one training sample. Here, one training sample may correspond to one piece of first relation data, or one training sample may correspond to a plurality of pieces of first relation data. In some embodiments of the present disclosure, a plurality means 2 or more.
For example, the terminal may obtain one piece of first relation data from the training sample "Zhang Yier, Tang Sansi, Yang Wu and others attended the event", such as <Zhang Yier, wife, Tang Sansi>. The terminal may obtain two pieces of first relation data from "Wang Yier, born in Dalian City, Liaoning Province, the son of Wang Sansi, chairman of the CD Group", for example <Wang Yier, place of birth, Dalian City> and <Wang Yier, father, Wang Sansi>.
In one embodiment, the step S11 includes: obtaining at least the first relation data of at least one training sample based on a knowledge base. Here, the knowledge base may be any kind of knowledge base. For example, the knowledge base includes any text with entity relationship pairs; the text may include data of a subject and an object. Exemplarily, the terminal determines, based on the training sample "Zhang Yier, Tang Sansi, Yang Wu and others attended the event" in the knowledge base, that the relation data "wife" needs to be extracted; the first relation data of the training sample can then be <Zhang Yier, wife, Tang Sansi>.
In one embodiment, the step S11 includes: acquiring relation data of at least one training sample based on a knowledge base; and formulating a predetermined rule based on the data characteristics of the relation data to filter the training samples, so as to obtain at least one piece of first relation data of the training sample. Illustratively, the terminal obtains the training sample "Zhang Yier, Tang Sansi, Yang Wu and others attended the event" based on the knowledge base; the training sample contains the two entities Zhang Yier and Tang Sansi, but it cannot by itself establish the couple relationship between Zhang Yier and Tang Sansi. The relation data "wife" may be determined based on the training sample, and, based on the characteristics of "wife", a predetermined rule such as "the text contains the relation data 'wife' or a synonym of the relation data" may be used to filter erroneous data out of the training samples; the first relation data of the training sample, such as <Zhang Yier, wife, Tang Sansi>, is then determined based on the filtered training samples.
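Illustratively, the filtering step just described can be sketched as follows. This is a minimal sketch only: the synonym table and the rule that the text must contain both entities and the relation word (or a synonym) are illustrative assumptions, not rules specified by the disclosure.

```python
# Hypothetical synonym table for a relation word; illustrative only.
RELATION_SYNONYMS = {
    "wife": ["wife", "spouse", "married to"],
}

def filter_first_relation_data(samples, subject, obj, relation):
    """Keep only samples whose text supports the (subject, relation, object) triple."""
    keywords = RELATION_SYNONYMS.get(relation, [relation])
    kept = []
    for text in samples:
        # Predetermined rule: the text must contain both entities and the
        # relation word (or one of its synonyms).
        if subject in text and obj in text and any(k in text for k in keywords):
            kept.append((subject, relation, obj, text))
    return kept
```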
Here, the first relation extraction model and the second relation extraction model are relation extraction models of the same type; the relation extraction model may be any relation extraction model. For example, the first relation extraction model and the second relation extraction model each include an encoding layer, a subject recognition layer, and an object recognition layer.
Here, the second relation extraction model is the updated first relation extraction model; the first relation extraction model is iterated based on the first relation data to obtain the updated second relation extraction model. Here, the sample relationships corresponding to the first relation extraction model include the second relation data; the sample relationships corresponding to the second relation extraction model include the first relation data and the second relation data.
Here, the second relation extraction model has the same structural characteristics as the first relation extraction model.
Illustratively, there are n pieces of second relation data, such as "occupation", "date of birth", and "parent-child relationship", in the second relation extraction model; one training sample is "Zhang Yier, Tang Sansi, Yang Wu and others attended the event; Zhang Yier is an actor who grew up in ……, born on 24/8/1988", and the first relation data "couple relationship" can be extracted based on this training sample; the n+1 pieces of relation data "occupation", "date of birth", "parent-child relationship", and "couple relationship" are used as the sample relationships of the second relation extraction model.
Illustratively, as shown in Fig. 2, an embodiment of the present disclosure provides an entity relationship extraction method, including: step S21: constructing first relation data through a knowledge base; step S22: back-labeling second relation data; step S23: iterating to obtain a second relation extraction model. Here, the step S21 may include: constructing the first relation data of the training samples based on the knowledge base. In step S22, the training sample is input into the first relation extraction model (M_n) to identify n pieces of second relation data (R_n); and the first relation data and the second relation data are used as the sample relationships (R_{n+1}) of the second relation extraction model (M_{n+1}). The step S23 includes: on the basis of M_n, updating M_n according to R_{n+1} to obtain M_{n+1}.
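Steps S21 to S23 can be summarized with a short sketch. The helper names below (build_first_relation_data, label_back, finetune) are placeholders invented for illustration; the disclosure only specifies the data flow from M_n and R_n to R_{n+1} and M_{n+1}.

```python
def expand_relation_model(model_n, training_samples, knowledge_base):
    """Sketch of the iteration in Fig. 2: build first relation data, back-label
    second relation data with the current model, then update the model."""
    # Step S21: construct first relation data of the training samples from the knowledge base.
    first_relation_data = build_first_relation_data(training_samples, knowledge_base)

    # Step S22: back-label the n existing relations (second relation data, R_n)
    # by running the current model M_n over the same samples.
    second_relation_data = label_back(model_n, training_samples)
    sample_relations = first_relation_data + second_relation_data  # R_{n+1}

    # Step S23: iterate on M_n with R_{n+1} to obtain the updated model M_{n+1}.
    model_n_plus_1 = finetune(model_n, training_samples, sample_relations)
    return model_n_plus_1
```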
Here, the target text may be any one of text information; and the target text is used for extracting the text of the entity relationship pair.
Here, the target relationship data may be pairs of entity relationships, such as triples; or the target relationship data may also be relationship data characterizing pairs of entity relationships.
In this way, in the embodiment of the present disclosure, new relationship data (i.e., the first relationship data) may be introduced by the terminal under the structural characteristics of the original relationship extraction model (i.e., the first relationship extraction model) to iterate out a new relationship extraction model (i.e., the second relationship extraction model). Therefore, the precision of the second relation extraction model can be ensured, and the new relation data can be expanded; namely, when the target relation data of the target text is extracted, the extraction precision of the target relation can be ensured, and the extraction of the extensible entity relation can be realized.
In some embodiments of the present disclosure, an entity relationship extraction method may also be: acquiring first relation data of at least one training sample; inputting the training sample in which the first relation data is located into the first relation extraction model for recognition to obtain second relation data of the training sample; and inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model. Thus, the embodiments of the present disclosure may obtain a new relation extraction model (i.e., the second relation extraction model) by introducing new relation data and training through the original relation extraction model (i.e., the first relation extraction model); the new relation extraction model can realize rapid expansion of new relations, completion of missing relation data, and the like.
As shown in fig. 3, in some embodiments, the method further comprises:
step S31: obtaining sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
step S32: inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, and obtaining the trained second relation extraction model;
the step S14 includes:
step S33: and inputting the target text into the trained second relation extraction model to obtain the target relation data of the target text.
In some embodiments, the sample relationships may be, but are not limited to, the sample relationships in the embodiments described above. In other embodiments, the sample relationships may partially overlap with the sample relationships in the embodiments described above.
Here, the subject head pointer information may indicate a position of a first word of the subject in the training sample; the subject tail pointer information may indicate the location of the last word of the subject in the training sample. The object head pointer information may indicate a position of a first word of the object in the training sample; the object tail pointer information may indicate a position of a last word of the object in the training sample. In some embodiments of the present disclosure, the word may refer to a single word; for example, a word may be, but not limited to, a Chinese character, an English word, a Korean letter, or a character or character string.
Illustratively, take the training sample "Wang Yier, born in Dalian City, Liaoning Province, the son of Wang Sansi, chairman of the CD Group" as an example; the subject may be "Wang Yier" or "the CD Group", and the object "Wang Sansi" can be recognized directly from the training sample. The head pointer "Wang" and the tail pointer "er" of the subject "Wang Yier" are at the 1st and 3rd positions of the training sample, respectively; the subject head pointer information may indicate that "Wang" is at the 1st position, and the subject tail pointer information may indicate that "er" is at the 3rd position. The head pointer "Wang" and the tail pointer "si" of the object are at the 22nd and 24th positions of the training sample, respectively, so the object head pointer information may indicate that "Wang" is at the 22nd position and the object tail pointer information may indicate that "si" is at the 24th position.
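Illustratively, head and tail pointer labels of this kind can be generated mechanically from an entity span, as in the sketch below (a simplified assumption that each entity occurs once in the sample; the function name is illustrative):

```python
def pointer_labels(text, entity):
    """Return the head/tail pointer positions (1-based) of an entity span in the text."""
    start = text.find(entity)          # position of the first word of the entity
    if start < 0:
        return None                    # entity not present in this sample
    end = start + len(entity) - 1      # position of the last word of the entity
    return start + 1, end + 1          # 1-based head and tail pointer positions
```

Applied to the original character sequence of the sample above, such a helper would return the head/tail positions (1, 3) for the subject and (22, 24) for the object.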
Here, the loss function satisfies a convergence condition, and may mean: the loss value of the loss function is less than a predetermined value. For example, the predetermined value may be 0.01, 0.02, 0.05, 0.001, or the like.
In some embodiments, the iterative training in step S32 may also be: a predetermined number of iterative training is performed. After the iterative training of the preset times, the loss value of the loss function of the second relation extraction model meets the convergence condition. For example, the predetermined number of times is 5 times, 10 times, 20 times, or the like.
In this way, in the embodiment of the present disclosure, the target relationship data of the target text may be extracted based on the trained second relationship extraction model, and the accuracy of extracting the target relationship data may be improved.
As shown in Fig. 4, in some embodiments, one iteration of training in which the sample information is input into the second relation extraction model in step S32 includes:
step S321: inputting subject information and object information of at least one sample relation into the second relation extraction model, and constructing a loss value of current iteration training;
step S322: and updating the second relation extraction model based on the loss value to obtain the second relation extraction model after the current iterative training.
Illustratively, as shown in fig. 5, the second relational extraction model includes: an encoding layer, a subject recognition layer, and an object recognition layer.
In some embodiments, the step S321 includes:
step S321A: acquiring coding information of each training sample based on each training sample;
step S321B: obtaining predicted subject information of the training samples based on the coding information and a subject classifier, wherein the predicted subject information comprises: predicted subject head pointer information and predicted subject tail pointer information;
step S321C: obtaining prediction object information of the training samples based on the coding information and an object classifier, wherein the prediction object information comprises: predicting object head pointer information and predicting object tail pointer information;
step S321D: and obtaining a loss value of the current iteration training based on subject information of at least one sample relation of the training samples and corresponding predicted subject information, object information and corresponding predicted object information.
In some embodiments, the step S321A includes:
inputting each training sample into a pre-training model to obtain the coding information of each training sample; the pre-training model comprises the correspondence between each candidate word and its coding information.
Here, the pre-training model may be any model that can enable encoding of text information. For example, the pre-training model may be, but is not limited to, a BERT model or an enhanced BERT model, etc.
Illustratively, the BERT model is a deep bidirectional text-encoding representation model proposed by Google. The BERT model comprises 6 identical stacked modules, each of which consists of 2 sublayers; the first sublayer is a multi-head self-attention layer and the second sublayer is a fully connected layer. Here, each sublayer uses a residual connection together with a normalization layer.
Here, the coding information of the training sample includes: the vector representation of each word of the training sample, or the vector representations of at least some of the words in the training sample; one vector representation is used to characterize one word in the training sample. For example, the training sample is input to the BERT model; if the training sample has 30 words and the BERT hidden dimension is 300, the output coding information consists of 30 vector representations of 300 dimensions each; the ith vector representation in the coding information may be denoted x_i, where i is an integer greater than 0.
In one embodiment, the BERT model is provided in an encoding layer of the second relational extraction model.
Of course, in other embodiments, the coding information of each training sample may also be determined directly based on each training sample and a coding rule. The pre-training model and the coding rule are not limited in the present disclosure, as long as the obtained coding information can represent the vector representation of each word in each training sample.
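As a concrete, non-normative illustration of the encoding layer, the sketch below uses the Hugging Face transformers implementation of BERT to turn a training sample into one vector representation per token; the checkpoint name is an assumption, and any encoder producing per-word vectors x_i would do.

```python
import torch
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")  # assumed checkpoint
encoder = BertModel.from_pretrained("bert-base-chinese")

def encode_sample(text):
    """Return the coding information: one vector representation per token of the sample."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    # Shape: (sequence_length, hidden_size); the i-th row is the vector x_i of the
    # i-th token (the special [CLS]/[SEP] tokens are included here).
    return outputs.last_hidden_state.squeeze(0)
```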
Here, the subject classifier includes: a subject head pointer classifier and/or a subject tail pointer classifier. The subject head pointer classifier is for determining whether a word is a predicted subject head pointer; a subject tail pointer classifier is used to determine whether a word is a predicted subject tail pointer.
In one embodiment, the step S321B includes:
determining whether the word corresponding to each vector token in the training sample is a predicted subject head pointer based on each vector token in the coding information and a subject head pointer classifier; determining the predicted subject head pointer information based on the position of each predicted subject head pointer in the training sample;
determining whether the word corresponding to each vector representation in the training sample is a predicted subject tail pointer based on each vector representation in the coding information and the subject tail pointer classifier; and determining the predicted subject tail pointer information based on the position of each predicted subject tail pointer in the training sample.
Here, the predicted subject head pointer information may be used to indicate the location of the predicted subject head pointer in the training sample; the predicted subject tail pointer information may be used to indicate the location of the predicted subject tail pointer in the training samples.
Here, one way to determine whether the word corresponding to each vector representation in the training sample is a predicted subject head pointer, based on each vector representation in the coding information and the subject head pointer classifier, is based on the formula

p_i^{start_s} = σ(W_start · x_i + b_start),

where x_i is the vector representation of the ith word in the training sample; W_start and b_start are parameters of the subject head pointer classifier; σ() is an activation function (sigmoid), and σ(W_start · x_i + b_start) maps the value of W_start · x_i + b_start to between 0 and 1; and p_i^{start_s} is the ith subject head pointer score. When the subject head pointer score p_i^{start_s} is greater than or equal to a predetermined score, the word is determined to be a predicted subject head pointer; otherwise, when the subject head pointer score p_i^{start_s} is less than the predetermined score, the word is determined not to be a predicted subject head pointer. In one embodiment, the predetermined score is 0.5.
Here, one way to determine whether the word corresponding to each vector representation in the training sample is a predicted subject tail pointer, based on each vector representation in the coding information and the subject tail pointer classifier, is based on the formula

p_i^{end_s} = σ(W_end · x_i + b_end),

where x_i is the vector representation of the ith word in the training sample; W_end and b_end are parameters of the subject tail pointer classifier; σ() is an activation function (sigmoid), and σ(W_end · x_i + b_end) maps the value of W_end · x_i + b_end to between 0 and 1; and p_i^{end_s} is the ith subject tail pointer score. When the subject tail pointer score p_i^{end_s} is greater than or equal to a predetermined score, the word is determined to be a predicted subject tail pointer; otherwise, when the subject tail pointer score p_i^{end_s} is less than the predetermined score, the word is determined not to be a predicted subject tail pointer. In one embodiment, the predetermined score is 0.5.
In one embodiment, a subject classifier is disposed in the subject recognition layer of the second relational extraction model.
As such, in embodiments of the present disclosure, it may be determined whether each word is a predicted subject head pointer or subject tail pointer based on the vector characterization and subject classifier (including subject head pointer classifier and subject tail pointer classifier) of each word of each training sample.
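A minimal PyTorch sketch of the subject head/tail pointer classifiers described by the two formulas above (a sigmoid over a linear projection of each token vector, thresholded at the predetermined score) is given below; the layer names, dimensions, and threshold default are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SubjectTagger(nn.Module):
    """Subject recognition layer: head and tail pointer classifiers."""
    def __init__(self, hidden_size=768, threshold=0.5):
        super().__init__()
        self.head = nn.Linear(hidden_size, 1)   # W_start, b_start
        self.tail = nn.Linear(hidden_size, 1)   # W_end, b_end
        self.threshold = threshold              # predetermined score

    def forward(self, x):
        # x: (seq_len, hidden_size) coding information of one training sample.
        p_head = torch.sigmoid(self.head(x)).squeeze(-1)  # p_i^{start_s}
        p_tail = torch.sigmoid(self.tail(x)).squeeze(-1)  # p_i^{end_s}
        # Positions whose score >= threshold are predicted subject head/tail pointers.
        return p_head, p_tail, p_head >= self.threshold, p_tail >= self.threshold
```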
Here, the object classifier includes: an object head pointer classifier and/or an object tail pointer classifier. The object head pointer classifier is used to determine whether a word is a predicted object head pointer; the object tail pointer classifier is used to determine whether a word is a predicted object tail pointer.
In one embodiment, the step S321C includes:
determining whether the words corresponding to the vector representations in the training samples are predicted object head pointers or not based on the vector representations in the coding information and the object head pointer classifier; and determining the predicted object head pointer information based on the predicted position of the object head pointer in the training sample;
determining whether the word corresponding to each vector representation in the training sample is a predicted object tail pointer based on each vector representation in the coding information and the object tail pointer classifier; and determining the predicted object tail pointer information based on the position of the predicted object tail pointer in the training sample.
Here, the predicted object head pointer information may be used to indicate the position of the predicted object head pointer in the training sample; and the predicted object tail pointer information is used to indicate the position of the predicted object tail pointer in the training sample.
Here, one way to determine whether the word corresponding to each vector representation in the training sample is a predicted object head pointer, based on each vector representation in the coding information and the object head pointer classifier, is based on the formula

p_i^{start_o} = σ(W_start^o · x_i + b_start^o),

where x_i is the vector representation of the ith word in the training sample; W_start^o and b_start^o are parameters of the object head pointer classifier; σ() is an activation function (sigmoid), and σ(W_start^o · x_i + b_start^o) maps the value of W_start^o · x_i + b_start^o to between 0 and 1; and p_i^{start_o} is the ith object head pointer score. When the object head pointer score p_i^{start_o} is greater than or equal to a predetermined score, the word is determined to be a predicted object head pointer; otherwise, when the object head pointer score p_i^{start_o} is less than the predetermined score, the word is determined not to be a predicted object head pointer. In one embodiment, the predetermined score is 0.5.
Here, one way to determine whether the word corresponding to each vector representation in the training sample is a predicted object tail pointer, based on each vector representation in the coding information and the object tail pointer classifier, is based on the formula

p_i^{end_o} = σ(W_end^o · x_i + b_end^o),

where x_i is the vector representation of the ith word in the training sample; W_end^o and b_end^o are parameters of the object tail pointer classifier; σ() is an activation function (sigmoid), and σ(W_end^o · x_i + b_end^o) maps the value of W_end^o · x_i + b_end^o to between 0 and 1; and p_i^{end_o} is the ith object tail pointer score. When the object tail pointer score p_i^{end_o} is greater than or equal to a predetermined score, the word is determined to be a predicted object tail pointer; otherwise, when the object tail pointer score p_i^{end_o} is less than the predetermined score, the word is determined not to be a predicted object tail pointer. In one embodiment, the predetermined score is 0.5.
In one embodiment, an object classifier is disposed in the object recognition layer of the second relational extraction model.
As such, in embodiments of the present disclosure, it may be determined whether each word is a predicted object head pointer or object tail pointer based on a vector characterization of each word and an object classifier (including an object head pointer classifier and an object tail pointer classifier) for each training sample.
In some embodiments, the method comprises:
carrying out fusion processing on the coding information of the training sample and the subject vector representation to obtain a fused vector representation; wherein the subject vector representation is determined based on the predicted subject head pointer information and the predicted subject tail pointer information;
the step S321C includes: and acquiring the prediction object information of the training sample based on the fusion vector characterization and the object classifier.
Here, a subject vector token is a vector token that characterizes the subject.
Here, the subject vector representation may be determined based on a vector representation of each word in the predicted subject.
In one embodiment, the subject vector representation may be the average vector of the vector representations of the subject head pointer and the subject tail pointer. For example, if the vector representation of the subject head pointer is x_i and the vector representation of the subject tail pointer is x_{i+n}, the subject vector representation is

v_sub = (x_i + x_{i+n}) / 2.

In another embodiment, the subject vector representation may be the average vector of the vector representations of the words in the subject. For example, the kth subject in the training sample has three words whose vector representations are x_i, x_{i+1} and x_{i+2}, respectively; the subject vector representation is then

v_sub^k = (x_i + x_{i+1} + x_{i+2}) / 3.
Here, one process of fusing the coding information of the training sample with the subject vector representation may be fusion through a conditional layer normalization formula. Exemplarily, the fusion is carried out based on the formula

x̃_i^k = CLN(x_i, v_sub^k),

where x_i is the vector representation of the ith word in the training sample, v_sub^k is the kth subject vector representation, and x̃_i^k is the fused vector representation after fusion. Here, one implementation form of x̃_i^k is:

x̃_i^k = γ_k ⊙ (x_i − μ) / σ + β_k,

where γ_k and β_k are parameters conditioned on the subject vector representation of the kth subject, for example

γ_k = γ + W_γ · v_sub^k,  β_k = β + W_β · v_sub^k;

μ and σ are respectively the mean and the variance of the vector representations of the words in the training sample, with

μ = (1/N) Σ_{i=1}^{N} x_i,

where N is the number of words of the training sample; and γ, β denote the scaling and translation parameters, whose vector dimensions are the same as the dimension of the subject vector representation v_sub^k.
Of course, in other embodiments, the fusion processing may be performed on the coding information of the training samples and the subject vector characterization in various ways, for example, the subject vector characterization may be scaled by a predetermined size and/or translated by a predetermined position; the specific fusion processing method is not limited herein.
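The conditional layer normalization fusion above can be sketched as follows. The parameterization through W_γ and W_β is an assumption consistent with the reconstructed formula rather than a verbatim reproduction of the disclosure's implementation, and the mean and variance are taken over the words of the sample as in the formula above.

```python
import torch
import torch.nn as nn

class ConditionalLayerNorm(nn.Module):
    """Fuse token vectors x_i with the k-th subject vector v_sub via conditional layer norm."""
    def __init__(self, hidden_size=768, eps=1e-6):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(hidden_size))    # base scaling parameter
        self.beta = nn.Parameter(torch.zeros(hidden_size))    # base translation parameter
        self.w_gamma = nn.Linear(hidden_size, hidden_size, bias=False)
        self.w_beta = nn.Linear(hidden_size, hidden_size, bias=False)
        self.eps = eps

    def forward(self, x, v_sub):
        # x: (seq_len, hidden_size); v_sub: (hidden_size,) subject vector representation
        # (e.g. the average of the head and tail pointer vectors of the predicted subject).
        mu = x.mean(dim=0, keepdim=True)        # mean over the N words of the sample
        sigma = x.std(dim=0, keepdim=True)      # spread over the N words of the sample
        gamma_k = self.gamma + self.w_gamma(v_sub)   # subject-conditioned scale
        beta_k = self.beta + self.w_beta(v_sub)      # subject-conditioned shift
        return gamma_k * (x - mu) / (sigma + self.eps) + beta_k  # fused vector representations
```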
In some embodiments, said obtaining said prediction object information for said training samples based on said fusion vector characterization and said object classifier comprises:
determining whether a word corresponding to each fused vector representation in the training sample is a predicted object head pointer based on the fused vector representations and the object head pointer classifier; determining object head pointer information based on the predicted position of the object head pointer in the training sample;
determining whether a word corresponding to each fused vector representation in the training sample is a predicted object tail pointer based on the fused vector representations and an object tail pointer classifier; and determining the information of the object tail pointer based on the position of the predicted object tail pointer in the training sample.
Here, one way to determine whether the word corresponding to each fused vector representation in the training sample is a predicted object head pointer, based on the fused vector representations and the object head pointer classifier, is based on the formula

p_i^{start_o} = σ(W_start^o · x̃_i^k + b_start^o),

where x̃_i^k is the fused vector representation after fusion; W_start^o and b_start^o are parameters of the object head pointer classifier; σ() is an activation function (sigmoid), and σ(W_start^o · x̃_i^k + b_start^o) maps the value of W_start^o · x̃_i^k + b_start^o to between 0 and 1; and p_i^{start_o} is the ith object head pointer score. When the object head pointer score p_i^{start_o} is greater than or equal to a predetermined score, the word is determined to be a predicted object head pointer; otherwise, when the object head pointer score p_i^{start_o} is less than the predetermined score, the word is determined not to be a predicted object head pointer. In one embodiment, the predetermined score is 0.5.
Here, one way to determine, based on the fused vector representations and the object tail pointer classifier, whether the word corresponding to each fused vector representation in the training sample is a predicted object tail pointer is to determine it based on the formula
$$p_i^{obj\_end} = \sigma\left(W_{end}^{o}\,\tilde{x}_i^{k} + b_{end}^{o}\right)$$
where $\tilde{x}_i^{k}$ is the fused vector representation after fusion; $W_{end}^{o}$ and $b_{end}^{o}$ are the parameters of the object tail pointer classifier; $\sigma(\cdot)$ is the sigmoid activation function, which maps the value of $W_{end}^{o}\tilde{x}_i^{k} + b_{end}^{o}$ to between 0 and 1; and $p_i^{obj\_end}$ is the object tail pointer score of the i-th word. When the object tail pointer score $p_i^{obj\_end}$ is greater than or equal to the predetermined score, the word is determined to be a predicted object tail pointer; otherwise, when the object tail pointer score $p_i^{obj\_end}$ is less than the predetermined score, the word is determined not to be a predicted object tail pointer. In one embodiment, the predetermined score is 0.5.
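As a hedged illustration of the object head pointer classifier and object tail pointer classifier described above, the sketch below scores each fused word vector with a sigmoid and keeps the positions whose score reaches the predetermined score of 0.5; the module and function names are assumptions.

```python
# Minimal sketch (assumption): sigmoid head/tail pointer classifiers applied to
# the fused vector representations; a word is taken as a predicted object
# head/tail pointer when its score reaches the predetermined threshold.
import torch
import torch.nn as nn


class PointerClassifier(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.start = nn.Linear(hidden_size, 1)   # object head pointer parameters
        self.end = nn.Linear(hidden_size, 1)     # object tail pointer parameters

    def forward(self, fused: torch.Tensor):
        # fused: [batch, seq_len, hidden] fused vector representations
        p_start = torch.sigmoid(self.start(fused)).squeeze(-1)  # head scores
        p_end = torch.sigmoid(self.end(fused)).squeeze(-1)      # tail scores
        return p_start, p_end


def decode_pointers(p_start, p_end, threshold: float = 0.5):
    # Positions whose score >= threshold are predicted head/tail pointers.
    heads = (p_start >= threshold).nonzero(as_tuple=True)[-1].tolist()
    tails = (p_end >= threshold).nonzero(as_tuple=True)[-1].tolist()
    return heads, tails


if __name__ == "__main__":
    fused = torch.randn(1, 16, 768)
    p_start, p_end = PointerClassifier(768)(fused)
    print(decode_pointers(p_start, p_end))
```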
In the embodiment of the disclosure, the subject vector representation and the coding information are fused, that is, the subject vector representation is fused into the vector representation of each word of the training sample, so that the perception of the subject can be enhanced; when whether the word corresponding to each fused vector representation is an object is then determined based on the fused vector representations and the object classifier, the degree of association between the subject and its objects can be improved.
When the target text is recognized based on the trained second relation extraction model, the possible subjects in the target text can be recognized, and the objects of the corresponding relations can be recognized based on those subjects; therefore, the problem of entity overlapping can be effectively solved, that is, when multiple relations exist in the target text, the recognition of the entity relationship pairs (namely the target relationship data) of all the relations in the target text can be improved.
In some embodiments, the method comprises:
if one training sample comprises at least two subjects, determining predicted subject tail pointer information corresponding to the predicted subject head pointer information of one subject within a predetermined range of a position indicated by the predicted subject head pointer information of the subject.
Here, the predetermined range means a predetermined position range. For example, the 1 st position is within a predetermined range of positions from the 3 rd position.
For example, referring to fig. 5 again, suppose there are two subjects, where the head pointers "Wang" and "Bing" of the two subjects are at the 1st and x-th positions in the training sample, respectively, and the tail pointers "Er" and "Group" are at the 3rd and y-th positions in the training sample, respectively. Based on the nearby matching principle, the subject head pointer "Wang" and the subject tail pointer "Er" can be matched into the subject "Wang Er", and the subject head pointer "Bing" and the subject tail pointer "Group" can be matched into the subject "Bing Group".
Here, when the subject head pointer and the subject tail pointer are matched based on the nearby matching principle, the rule that the position of a subject's tail pointer in the training sample is after the position of its head pointer can be followed.
Thus, in the embodiment of the present disclosure, when there are multiple subjects in one training sample, the accuracy of recognizing the subject head pointer and subject tail pointer of each subject can be improved, thereby improving the correctness of the recognized subjects.
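As a hedged illustration of the nearby matching principle, the sketch below pairs each predicted head pointer with the nearest predicted tail pointer that is not before it, following the rule that a subject's tail pointer lies at or after its head pointer.

```python
# Minimal sketch (assumption): nearest matching of head and tail pointers.
from typing import List, Tuple


def match_spans(head_positions: List[int], tail_positions: List[int]) -> List[Tuple[int, int]]:
    spans = []
    for head in sorted(head_positions):
        # candidate tails located at or after the head pointer
        candidates = [t for t in tail_positions if t >= head]
        if candidates:
            spans.append((head, min(candidates)))  # nearest tail wins
    return spans


if __name__ == "__main__":
    # e.g. heads at positions 1 and 8, tails at positions 3 and 11
    print(match_spans([1, 8], [3, 11]))  # -> [(1, 3), (8, 11)]
```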
In some embodiments, the step S321D includes:
obtaining a subject loss value based on the subject information and predicted subject information of at least one of the sample relationships of at least one of the training samples;
obtaining an object loss value based on the object information and predicted object information of at least one of the sample relationships of at least one of the training samples; wherein the object loss value comprises: a first object loss value representing that the subject has an object relationship and a second object loss value representing that the subject does not have an object relationship;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the object loss value of at least one training sample.
Here, the loss function of the second relation extraction model includes a loss value comprising: a subject loss value and an object loss value; wherein the object loss value comprises: a first object loss value and a second object loss value.
In one embodiment, one way to implement step S321D is based on
$$L = -\sum_{j=1}^{|D|}\left[\sum_{s\in T_j}\log p_{\theta}\left(s\mid x_j\right) + \sum_{r\in T_j\mid s}\log p_{\delta}\left(o\mid s,r,x_j\right) + \sum_{r\in R\setminus T_j\mid s}\log p_{\delta}\left(o_{\varnothing}\mid s,r,x_j\right)\right]$$
where |D| is the number of training samples, and |D| is an integer greater than 0; s denotes a subject and o denotes an object; $x_j$ denotes the j-th training sample, where j is an integer greater than 0; $T_j$ denotes the set of triples present in the j-th training sample; $s\in T_j$ indicates that the subject s appears in a triple of $T_j$; $T_j\mid s$ denotes the set of triples headed by the subject s, and $r\in T_j\mid s$ ranges over the relations of the triples in which the subject s appears; R is the set of all possible relations; $R\setminus T_j\mid s$ denotes the relations other than those of the triples in which the subject s appears, and $r\in R\setminus T_j\mid s$ ranges over those relations; $o_{\varnothing}$ denotes the null object used when the subject s has no object under the relation r; θ and δ are the parameters of the subject recognition layer and the object recognition layer, respectively.
For example, suppose that in one training pass of the second relation extraction model the possible triple set for each training sample is R, and one subject is "Zhang Yi". Then $-\log p_{\theta}(s\mid x_j)$ gives the subject head pointer loss value and the subject tail pointer loss value of the subject "Zhang Yi" in the j-th training sample, and $-\sum_{s\in T_j}\log p_{\theta}(s\mid x_j)$ is the subject loss value. $-\log p_{\delta}(o\mid s,r,x_j)$ gives the object head pointer loss value and the object tail pointer loss value of the object "Tang Sansi" corresponding to "Zhang Yi" in the j-th training sample, and $-\sum_{r\in T_j\mid s}\log p_{\delta}(o\mid s,r,x_j)$ is the first object loss value. $-\log p_{\delta}(o_{\varnothing}\mid s,r,x_j)$ gives the object head pointer loss value and the object tail pointer loss value for the relations under which "Zhang Yi" has no object in the j-th training sample, and $-\sum_{r\in R\setminus T_j\mid s}\log p_{\delta}(o_{\varnothing}\mid s,r,x_j)$ is the second object loss value.
Here, one implementation of obtaining the subject loss value based on the subject information and the predicted subject information of a sample relationship is to compute the subject loss value $P_{\theta}$ from the subject information $y^{s}$ and the predicted subject information $\hat{y}^{s}$. Here, $y^{s}$ may be the subject score corresponding to the subject information, and $\hat{y}^{s}$ may be the predicted subject score corresponding to the predicted subject information. Here, $y^{s}$ may also include the subject head pointer information and the subject tail pointer information; for example, $y^{s}$ is the average of the subject head pointer information and the subject tail pointer information. Likewise, $\hat{y}^{s}$ may also include the predicted subject head pointer information and the predicted subject tail pointer information; for example, $\hat{y}^{s}$ is the average of the predicted subject head pointer information and the predicted subject tail pointer information. Illustratively, $P_{\theta}^{(j)}$ is the subject loss value of the j-th training sample.
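One common instantiation of such a pointer-based subject loss is a binary cross-entropy between the labelled pointer information and the predicted pointer scores, averaged over the head and tail terms; the sketch below illustrates this under that assumption and is not necessarily the exact formula of this disclosure.

```python
# Hedged sketch (assumption): subject loss as binary cross-entropy between the
# subject information (0/1 pointer labels) and the predicted subject information
# (pointer scores), averaged over the head and tail pointers.
import torch
import torch.nn.functional as F


def subject_loss(y_start, y_end, p_start, p_end):
    # y_*: [batch, seq_len] 0/1 labels, p_*: [batch, seq_len] sigmoid scores
    loss_start = F.binary_cross_entropy(p_start, y_start)
    loss_end = F.binary_cross_entropy(p_end, y_end)
    return (loss_start + loss_end) / 2        # average of head and tail terms


if __name__ == "__main__":
    y_start = torch.zeros(1, 8); y_start[0, 1] = 1.0
    y_end = torch.zeros(1, 8);   y_end[0, 3] = 1.0
    p_start = torch.rand(1, 8);  p_end = torch.rand(1, 8)
    print(subject_loss(y_start, y_end, p_start, p_end))
```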
Here, one implementation of obtaining the object loss value based on the object information and the predicted object information of a sample relationship is based on $P_{\delta} = P_{\delta 1} + P_{\delta 2}$, where $P_{\delta}$ is the object loss value, the first object loss value $P_{\delta 1}$ is computed from the object information $y^{o_1}$ and the predicted object information $\hat{y}^{o_1}$ of the first object (the object under a relation that the subject actually has), and the second object loss value $P_{\delta 2}$ is computed from the object information $y^{o_2}$ and the predicted object information $\hat{y}^{o_2}$ of the second object (corresponding to the relations under which the subject has no object). Here, $y^{o_1}$ and $y^{o_2}$ may respectively be the object score corresponding to the object information of the first object and the object score corresponding to the object information of the second object, and $\hat{y}^{o_1}$ and $\hat{y}^{o_2}$ may respectively be the predicted object score corresponding to the predicted object information of the first object and the predicted object score corresponding to the predicted object information of the second object. Here, $y^{o_1}$ may also include the object head pointer information and the object tail pointer information of the first object, for example as the average of the object head pointer information and the object tail pointer information of the first object; $y^{o_2}$ may also include the object head pointer information and the object tail pointer information of the second object, for example as the average of the object head pointer information and the object tail pointer information of the second object; $\hat{y}^{o_1}$ may also include the predicted object head pointer information and the predicted object tail pointer information of the first object, for example as the average of the predicted object head pointer information and the predicted object tail pointer information of the first object; and $\hat{y}^{o_2}$ may also include the predicted object head pointer information and the predicted object tail pointer information of the second object, for example as the average of the predicted object head pointer information and the predicted object tail pointer information of the second object. Illustratively, $P_{\delta 1}^{(j)}$ is the first object loss value of the j-th training sample, and $P_{\delta 2}^{(j)}$ is the second object loss value of the j-th training sample.
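The split into a first object loss value and a second object loss value can be illustrated in the same hedged way; in the sketch below the boolean mask has_object, which marks the relations under which the subject actually has an object, is an assumed representation.

```python
# Hedged sketch (assumption): object loss split into a first term (relations for
# which the subject has an object) and a second term (relations for which the
# subject has no object, i.e. the label sequences are all zeros).
import torch
import torch.nn.functional as F


def object_loss(y_start, y_end, p_start, p_end, has_object):
    # y_*, p_*: [num_relations, seq_len]; has_object: [num_relations] bool mask
    per_rel = (F.binary_cross_entropy(p_start, y_start, reduction="none")
               + F.binary_cross_entropy(p_end, y_end, reduction="none")).mean(dim=-1) / 2
    first = per_rel[has_object].sum()       # relations with an object
    second = per_rel[~has_object].sum()     # relations without an object
    return first, second


if __name__ == "__main__":
    R, L = 4, 8
    y_start = torch.zeros(R, L); y_end = torch.zeros(R, L)
    y_start[0, 5] = 1.0; y_end[0, 6] = 1.0   # only relation 0 has an object
    has_object = torch.tensor([True, False, False, False])
    p_start, p_end = torch.rand(R, L), torch.rand(R, L)
    print(object_loss(y_start, y_end, p_start, p_end, has_object))
```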
Here, in each iterative training of the second relation extraction model, the parameters in the second relation extraction model are updated based on the loss value; for example, the parameters of the subject classifier and the parameters of the object classifier in the second relation extraction model may be updated, such as $W_{start}$, $b_{start}$, $W_{end}$, $b_{end}$, $W_{start}^{o}$, $b_{start}^{o}$, $W_{end}^{o}$, $b_{end}^{o}$, and the like. Therefore, when the trained second relation extraction model processes the target text, the obtained target relation data is closer to the real relation data. Here, the more iterative training rounds the second relation extraction model undergoes, the smaller the loss value of the loss function becomes, and the closer the target relation data obtained for the target text based on the trained second relation extraction model is to the real relation data.
In the embodiment of the present disclosure, a loss value can be continuously obtained from the subject information, the predicted subject information, the object information, and the predicted object information of each sample relationship in each training sample, so that the second relation extraction model can be continuously updated based on the loss value; therefore, the performance of the second relation extraction model keeps improving.
In some embodiments, the obtaining a loss value of a current iteration training based on a sum of the subject loss value and the object loss value of at least one of the training samples includes:
obtaining an object loss value weighted by the training sample based on the object loss value and the weighting coefficient of the training sample;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the weighted object loss value of at least one training sample.
Here, one way to determine the weight coefficient is as follows: the weight coefficient $lw_k$ of the k-th relation is determined from the number $n_k$ of the k-th relation in the training samples, the total number $N_R$ of all relations, the factor $\lambda = N_R / \max(n_i)$, where $\max(n_i)$ is the count of the most frequent relation among all relations, and an exponent parameter a. In one embodiment, a is greater than 1; for example, a is 2.5 or 3. It can be understood that the numbers of the different relations in the training samples are unbalanced; for example, the number of "place of birth" relations may be far larger than the number of "area" relations. Therefore, a weight coefficient $lw_k$ can be applied to each relation in the object loss value, to decrease the weight of the relations with a relatively high count and to increase the weight of the relations with a relatively low count.
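One plausible form of such a weight coefficient, under the assumption that the most frequent relation receives weight 1 and rarer relations receive larger weights damped by the exponent parameter a, is sketched below; the specific expression (max(n_i)/n_k)^(1/a) is an assumption rather than the exact formula of this disclosure.

```python
# Hedged sketch (assumption): a weight coefficient that down-weights frequent
# relations relative to rare ones; the exact formula in the publication may differ.
from collections import Counter
from typing import Dict


def relation_weights(relation_counts: Dict[str, int], a: float = 2.5) -> Dict[str, float]:
    n_max = max(relation_counts.values())
    return {rel: (n_max / n_k) ** (1.0 / a) for rel, n_k in relation_counts.items()}


if __name__ == "__main__":
    counts = Counter({"place_of_birth": 9000, "area": 300, "father": 1200})
    print(relation_weights(counts, a=2.5))
    # The weighted object loss of a sample relation k would then be lw_k * P_delta_k.
```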
Here, one way to obtain the loss value of the current iterative training based on the sum of the subject loss value and the object loss value of at least one of the training samples is
$$L = \sum_{j=1}^{|D|}\left[P_{\theta}^{(j)} + lw_{k1}\,P_{\delta 1}^{(j)} + lw_{k2}\,P_{\delta 2}^{(j)}\right]$$
where k1 and k2 are integers greater than 0; $lw_{k1}\,P_{\delta 1}^{(j)}$ is the weighted first object loss value of the j-th training sample; and $lw_{k2}\,P_{\delta 2}^{(j)}$ is the weighted second object loss value of the j-th training sample.
In the embodiment of the disclosure, the problem of unbalanced number of each relation in the training sample can be relieved by introducing a weight coefficient into the object loss value; therefore, the accuracy of the extraction of the second relation extraction model can be further improved.
As shown in fig. 6, an embodiment of the present disclosure provides an entity relationship extraction method, where the method includes:
step S41: obtaining sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
step S42: inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, and obtaining the trained second relation extraction model;
step S43: inputting the target text into the trained second relation extraction model to obtain the target relation data of the target text.
Here, the steps S41, S42, and S43 are similar to steps S31, S32, and S33, respectively; the entity relationship extraction method disclosed in the embodiments of the present disclosure may refer to any of the embodiments described above, and is not described herein again.
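As a hedged illustration of the iterative training in steps S41 and S42, the sketch below runs training rounds until the change in the loss satisfies a convergence condition; model, data_loader, and compute_loss are placeholders standing in for the second relation extraction model, the sample information, and the loss described above.

```python
# Minimal sketch (assumption): iterative training until the loss converges.
import torch


def train_until_convergence(model, data_loader, compute_loss,
                            max_epochs: int = 100, tol: float = 1e-4):
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    previous = float("inf")
    for epoch in range(max_epochs):
        total = 0.0
        for batch in data_loader:
            loss = compute_loss(model, batch)  # subject loss + weighted object loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item()
        if abs(previous - total) < tol:        # convergence condition on the loss
            break
        previous = total
    return model
```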
It should be noted that, as can be understood by those skilled in the art, the method provided in the embodiment of the present disclosure may be executed alone, or may be executed together with some methods in the embodiment of the present disclosure or some methods in the related art.
To further explain any embodiment of the present disclosure, a specific embodiment is provided below.
The embodiment of the disclosure provides an entity relationship extraction method, which is implemented based on a second relationship extraction model; the second relation extraction model comprises a coding layer, a subject recognition layer and an object recognition layer; the method comprises the following steps:
step S51: inputting at least one training sample into a coding layer for coding so as to obtain coding information of each training sample; the coding information includes a vector characterization of each word in the training sample. The coding layer may employ a BERT model structure.
Step S52: inputting the coding information into the subject recognition layer, and determining, through the subject classifier of the subject recognition layer, whether each word in each training sample is a predicted subject head pointer or a predicted subject tail pointer; and obtaining predicted subject head pointer information based on the predicted subject head pointer and predicted subject tail pointer information based on the predicted subject tail pointer. For example, whether each word in the training sample is a predicted subject head pointer can be determined by
$$p_i^{sub\_start} = \sigma\left(W_{start}\,x_i + b_{start}\right)$$
and whether each word in the training sample is a predicted subject tail pointer can be determined by
$$p_i^{sub\_end} = \sigma\left(W_{end}\,x_i + b_{end}\right)$$
where $x_i$ is the vector representation of the i-th word, $W_{start}$, $b_{start}$, $W_{end}$, $b_{end}$ are the parameters of the subject classifier, and $\sigma(\cdot)$ is the sigmoid activation function.
Here, the encoding information of the input subject recognition layer may be labeled to mark the starting position and the ending position of the subject to obtain subject information of each subject; the subject information includes: subject header pointer information and subject tail pointer information.
Here, if a plurality of subjects exist in the training sample, the subject head pointer and subject tail pointer of each subject can be matched based on the nearby matching principle.
Step S53: inputting the encoded information into the object recognition layer: obtaining the subject vector representation of each subject based on the mean of the vector representations of the words of the subject; performing fusion processing on the coding information of the training sample and the subject vector representation to obtain a fused vector representation; determining, based on the fused vector representation and the object classifier of the object recognition layer, whether each word of each training sample is a predicted object head pointer or a predicted object tail pointer; and obtaining predicted object head pointer information based on the predicted object head pointer and predicted object tail pointer information based on the predicted object tail pointer. For example, the subject vector representation can be obtained by
$$v_{sub} = \frac{x_i + x_{i+n}}{2}$$
where i and n are integers greater than 0, and the subject vector representation $v_{sub}^{k}$ of the k-th subject can be obtained in the same manner; the fused vector representation $\tilde{x}_i^{k}$ can be obtained through the conditional layer normalization formula described above; whether each word in the training sample is a predicted object head pointer can be determined by
$$p_i^{obj\_start} = \sigma\left(W_{start}^{o}\,\tilde{x}_i^{k} + b_{start}^{o}\right)$$
and whether each word in the training sample is a predicted object tail pointer can be determined by
$$p_i^{obj\_end} = \sigma\left(W_{end}^{o}\,\tilde{x}_i^{k} + b_{end}^{o}\right)$$
Here, the encoded information input to the object recognition layer may be labeled to mark the start position and the end position of the object, so as to obtain the object information of each object; the object information includes: object head pointer information and object tail pointer information.
Step S54: in the object recognition layer, a loss value is obtained based on the subject information, the predicted subject information, the object information, and the predicted object information of each sample relationship of each training sample. For example, the loss value can be obtained by
$$L = -\sum_{j=1}^{|D|}\left[\sum_{s\in T_j}\log p_{\theta}\left(s\mid x_j\right) + \sum_{r\in T_j\mid s}\log p_{\delta}\left(o\mid s,r,x_j\right) + \sum_{r\in R\setminus T_j\mid s}\log p_{\delta}\left(o_{\varnothing}\mid s,r,x_j\right)\right]$$
where the subject loss term $-\log p_{\theta}(s\mid x_j)$ can be determined based on the subject information and the predicted subject information, and the object loss terms $-\log p_{\delta}(o\mid s,r,x_j)$ and $-\log p_{\delta}(o_{\varnothing}\mid s,r,x_j)$ can be determined based on the object information and the predicted object information.
Here, one sample relationship may correspond to one subject or one object; the predicted subject information comprises predicted subject head pointer information and predicted subject tail pointer information; the prediction object information comprises prediction object head pointer information and prediction object tail pointer information.
Step S55: at the object recognition layer, a weighting factor may be applied to the object loss value to obtain an updated loss value. For example, an updated loss value is obtained based on the sum of the subject loss value of the at least one training sample and the object loss value weighted by the weighting factor. The weighting factor may be determined, as described above, from the number $n_k$ of the k-th relation, the total number $N_R$ of all relations, the factor $\lambda = N_R/\max(n_i)$, and the exponent parameter a.
For implementation of the embodiments of the present disclosure, reference may be made to the implementation manners in the embodiments described above, and details are not described herein again.
In the embodiment of the disclosure, after the target text is encoded by the encoding layer, all possible subjects in each training sample can be identified by the subject recognition layer; and after each predicted subject is fused with the coding information, the possible objects having a corresponding relation are recognized by the object recognition layer. For example, a training sample describing that "Wang Er" was born in Dalian City, Liaoning Province, that his father is "Wang Sansi", and that "Wang Sansi" is the chairman of the "Bing Group" is input. For the subject "Wang Er", the object "Dalian City, Liaoning Province" corresponding to the relation "place of birth" and the object "Wang Sansi" corresponding to the relation "father" can be predicted through the object recognition layer; for the subject "Bing Group", the object "Wang Sansi" corresponding to the relation "chairman" can be predicted. Finally, three triples of target text data are obtained: <Wang Er, place of birth, Dalian City, Liaoning Province>, <Wang Er, father, Wang Sansi>, and <Bing Group, chairman, Wang Sansi>.
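The cascade decoding illustrated by this example can be sketched as follows; predict_subjects and predict_objects are hypothetical callables standing in for the subject recognition layer and the relation-specific object recognition layer, and the token spans are placeholders.

```python
# Hedged sketch (assumption): cascade decoding at inference time. Subjects are
# extracted first; then, for each subject and each relation, object spans are
# extracted, yielding (subject, relation, object) triples.
from typing import Callable, Dict, List, Tuple


def extract_triples(
    tokens: List[str],
    predict_subjects: Callable[[], List[Tuple[int, int]]],
    predict_objects: Callable[[Tuple[int, int]], Dict[str, List[Tuple[int, int]]]],
) -> List[Tuple[str, str, str]]:
    triples = []
    for s_head, s_tail in predict_subjects():                 # all candidate subjects
        subject = " ".join(tokens[s_head : s_tail + 1])
        for relation, spans in predict_objects((s_head, s_tail)).items():
            for o_head, o_tail in spans:                      # objects per relation
                obj = " ".join(tokens[o_head : o_tail + 1])
                triples.append((subject, relation, obj))
    return triples


if __name__ == "__main__":
    tokens = ["Wang", "Er", "was", "born", "in", "Dalian"]

    def predict_subjects():
        return [(0, 1)]                          # span of "Wang Er"

    def predict_objects(subject_span):
        return {"place of birth": [(5, 5)]}      # span of "Dalian"

    print(extract_triples(tokens, predict_subjects, predict_objects))
    # [('Wang Er', 'place of birth', 'Dalian')]
```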
In some embodiments, the terminal may predefine the sample relations in the sample data (for example, 43 relation types), and input the target text into the first relation extraction model (for example, the original model of Wei et al.), the second relation extraction model (with the weight coefficient added), and the second relation extraction model (with the fusion processing added) respectively for recognition experiments; the recognition results on the target text data may be as shown in Table 1 below:
Table 1 (recognition results of the first relation extraction model and the two variants of the second relation extraction model; the table itself is provided as an image in the original document)
Here, the comprehensive metric reflects both the accuracy rate and the recall rate. As shown in Table 1, when the numbers of the sample relationships in the sample data are not uniform and the loss value of each sample relationship is optimized (for example, by weighting the object loss value with the weight coefficient), the accuracy of the second relation extraction model is improved by 0.7% and the comprehensive metric is improved by 0.2%. When the subject perception is insufficient while predicting the object and the subject vector representation is fused by the conditional layer normalization method, the accuracy of the second relation extraction model is improved by about 1% and the comprehensive metric is further improved by 0.4%.
When the description texts of the entities in a knowledge graph are extracted as triples by the second relation extraction model (with the weight coefficient added and the fusion processing added), the accuracy rate on randomly selected entity texts can reach 92.88% and the recall rate can reach 83.02%; the accuracy rate on top entity texts can reach 91.24% and the recall rate can reach 64.07%.
In an expansion framework adopting the second relation extraction model, new samples are iterated to expand the relations to 44 types, and the accuracy is improved by 0.8%; the whole process can keep the expansion time for a single new sample relation at about 2 days without affecting the accuracy of the original sample relations.
It is understood that each of the elements in Table 1 above exists independently, and they are exemplarily listed in the same table, but this does not mean that all the elements in the table must exist at the same time as shown in the table. The value of each element is independent of any other element value in Table 1. Therefore, as will be understood by those skilled in the art, the value of each element in Table 1 is an independent embodiment.
Fig. 7 provides an entity relationship extracting apparatus shown in an exemplary embodiment, which is applied to a terminal; as shown in fig. 7, the apparatus includes:
a first obtaining module 61, configured to obtain first relationship data of at least one training sample;
a first identification module 62, configured to input the training sample in which the first relationship data is located into the first relationship extraction model for identification, so as to obtain second relationship data of the training sample;
a first processing module 63, configured to input the first relationship data and the second relationship data into the first relationship extraction model for iteration, so as to update the first relationship extraction model to obtain a second relationship extraction model;
and a second processing module 64, configured to input the target text into the second relation extraction model for training, so as to obtain target relation data of the target text.
As shown in fig. 8, in some embodiments, the apparatus comprises:
a second obtaining module 65, configured to obtain sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
the first processing module 63 is configured to input the sample information to the second relationship extraction model for iterative training until a loss function of the second relationship extraction model meets a convergence condition, so as to obtain the trained second relationship extraction model;
the second processing module 64 is configured to input the target text into the trained second relationship extraction model to obtain the target relationship data of the target text.
In some embodiments, the first processing module 63 is configured to input subject information and object information of at least one of the sample relationships into the second relationship extraction model, and construct a loss value of the current iteration training;
the first processing module 63 is configured to update the second relationship extraction model based on the loss value, so as to obtain the second relationship extraction model after current iterative training.
In some embodiments, the first processing module 63 is configured to perform the following steps:
acquiring coding information of each training sample based on each training sample;
obtaining predicted subject information of the training samples based on the coding information and a subject classifier, wherein the predicted subject information comprises: predicting subject header pointer information and predicting subject suffix pointer information;
obtaining prediction object information of the training samples based on the coding information and an object classifier, wherein the prediction object information comprises: predicting object head pointer information and predicting object tail pointer information;
and obtaining a loss value of the current iteration training based on subject information of at least one sample relation of the training samples and corresponding predicted subject information, object information and corresponding predicted object information.
In some embodiments, the first processing module 63 is configured to perform fusion processing on the coding information and the subject vector characterization of the training sample to obtain a fused fusion vector characterization; wherein the subject vector representation is determined based on the predicted subject head pointer information and predicted tail pointer information;
the first processing module 63 is configured to obtain the prediction object information of the training sample based on the fusion vector characterization and the object classifier.
In some embodiments, the first processing module 63 is configured to, if one of the training samples includes at least two subjects, determine the predicted subject end pointer information corresponding to the predicted subject head pointer information of one of the subjects within a predetermined range of a position indicated by the predicted subject head pointer information of the subject.
In some embodiments, the first processing module 63 is configured to perform the following steps:
obtaining a subject loss value based on the subject information and predicted subject information of at least one of the sample relationships of at least one of the training samples;
obtaining an object loss value based on the object information and predicted object information of at least one of the sample relationships of at least one of the training samples; wherein the object loss value comprises: a first object loss value representing that the subject has an object relationship and a second object loss value representing that the subject does not have an object relationship;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the object loss value of at least one training sample.
In some embodiments, the first processing module 63 is configured to obtain a weighted object loss value of the training samples based on the object loss value and the weighting coefficients of the training samples;
the first processing module 63 is configured to obtain a loss value of the current iterative training based on a sum of the subject loss value and the weighted object loss value of at least one of the training samples.
In some embodiments, the first processing module 63 is configured to input each of the training samples into a pre-training model, and obtain the coding information of each of the training samples; the pre-training model comprises the corresponding relation between each alternative word and the coding information.
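As a hedged illustration of obtaining the coding information of a training sample from a pre-training model (for example, a BERT model structure), the sketch below uses the Hugging Face transformers library; the checkpoint name and the example sentence are assumptions for illustration.

```python
# Minimal sketch (assumption): obtaining the coding information (one vector per
# token) with a BERT-style pre-training model via the transformers library.
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
encoder = BertModel.from_pretrained("bert-base-cased")

sample = "Wang Er was born in Dalian City, Liaoning Province"  # one training sample
inputs = tokenizer(sample, return_tensors="pt")
with torch.no_grad():
    encoded = encoder(**inputs).last_hidden_state   # [1, seq_len, hidden]
print(encoded.shape)   # note: positions correspond to subword tokens
```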
As shown in fig. 9, there is provided an entity relationship extraction apparatus, the apparatus including:
a second obtaining module 65, configured to obtain sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
the first processing module 63 is configured to input the sample information to the second relationship extraction model for iterative training until a loss function of the second relationship extraction model meets a convergence condition, so as to obtain a trained second relationship extraction model;
a second processing module 64, configured to input a target text into the trained second relationship extraction model to obtain the target relationship data of the target text.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
An embodiment of the present disclosure further provides a terminal, which includes:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: and when the executable instruction is run, the entity relationship extraction method according to any embodiment of the disclosure is realized.
The memory may include various types of storage media, which are non-transitory computer storage media capable of retaining the information stored thereon after the communication device is powered down.
The processor may be connected to the memory via a bus or the like for reading the executable program stored on the memory, for example, for implementing at least one of the methods shown in fig. 1, 3, 4 and 6.
Embodiments of the present disclosure also provide a computer-readable storage medium, where an executable program is stored, where the executable program, when executed by a processor, implements the entity relationship extraction method according to any embodiment of the present disclosure. For example, at least one of the methods shown in fig. 1, 3, 4, and 6 is implemented.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
Fig. 10 is a block diagram illustrating a terminal 800 according to an example embodiment. For example, the terminal 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to fig. 10, terminal 800 can include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the terminal 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on terminal 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
A power supply component 806 provides power to the various components of the terminal 800. Power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for terminal 800.
The multimedia component 808 includes a screen providing an output interface between the terminal 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the terminal 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
Sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for terminal 800. For example, sensor assembly 814 can detect the open/closed state of device 800, the relative positioning of components, such as a display and keypad of terminal 800, sensor assembly 814 can also detect a change in position of terminal 800 or a component of terminal 800, the presence or absence of user contact with terminal 800, orientation or acceleration/deceleration of terminal 800, and a change in temperature of terminal 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
Communication component 816 is configured to facilitate communications between terminal 800 and other devices in a wired or wireless manner. The terminal 800 may access a wireless network based on a communication standard, such as WiFi, 4G or 5G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium including instructions, such as the memory 804 including instructions, executable by the processor 820 of the terminal 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (22)

1. An entity relationship extraction method, the method comprising:
acquiring first relation data of at least one training sample;
inputting the training sample where the first relation data is to the first relation extraction model for recognition to obtain second relation data of the training sample;
inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model;
and inputting the target text into the second relation extraction model for training to obtain target relation data of the target text.
2. The method according to claim 1, characterized in that it comprises:
obtaining sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, so as to obtain a trained second relation extraction model;
inputting the target text into the second relation extraction model for training to obtain the target relation data of the target text, including:
and inputting the target text into the trained second relation extraction model to obtain the target relation data of the target text.
3. The method of claim 2, wherein the inputting the sample information into the second relational extraction model for one iterative training comprises:
inputting subject information and object information of at least one sample relation into the second relation extraction model, and constructing a loss value of current iteration training;
and updating the second relation extraction model based on the loss value to obtain the second relation extraction model after the current iterative training.
4. The method of claim 3, wherein inputting subject information and object information of at least one of the sample relationships into the second relationship extraction model to construct a loss value for a current iteration of training comprises:
acquiring coding information of each training sample based on each training sample;
based on the coding information and the subject classifier, obtaining predicted subject information of the training sample, wherein the predicted subject information comprises: predicting subject heading pointer information and predicting subject suffix pointer information;
obtaining prediction object information of the training samples based on the coding information and an object classifier, wherein the prediction object information comprises: predicting object head pointer information and predicting object tail pointer information;
and obtaining a loss value of the current iteration training based on subject information of at least one sample relation of the training samples and corresponding predicted subject information, object information and corresponding predicted object information.
5. The method of claim 4, wherein the method comprises:
carrying out fusion processing on the coding information of the training sample and the subject vector representation to obtain a fused vector representation; wherein the subject vector representation is determined based on the predicted subject head pointer information and predicted tail pointer information;
the obtaining of the predicted object information of the training samples based on the coding information and the object classifier includes:
and acquiring the prediction object information of the training sample based on the fusion vector characterization and the object classifier.
6. The method of claim 4, wherein the method comprises:
if one of the training samples includes at least two subjects, the predicted subject tail pointer information corresponding to the predicted subject head pointer information of one of the subjects is determined within a predetermined range of a position indicated by the predicted subject head pointer information of the subject.
7. The method of claim 4, wherein obtaining a loss value for a current iteration of training based on subject information and corresponding predicted subject information, object information and corresponding predicted object information for at least one of the sample relationships of the training samples comprises:
obtaining a subject loss value based on the subject information and predicted subject information of at least one of the sample relationships of at least one of the training samples;
obtaining an object loss value based on the object information and predicted object information of at least one of the sample relationships of at least one of the training samples; wherein the object loss value comprises: a first object loss value representing that the subject has an object relationship and a second object loss value representing that the subject does not have an object relationship;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the object loss value of at least one training sample.
8. The method of claim 7, wherein obtaining the loss value of the current iteration training based on the sum of the subject loss value and the object loss value of at least one of the training samples comprises:
obtaining an object loss value weighted by the training sample based on the object loss value and the weighting coefficient of the training sample;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the weighted object loss value of at least one training sample.
9. The method of claim 4, wherein obtaining the encoded information for each of the training samples based on each of the training samples comprises:
inputting each training sample into a pre-training model to obtain the coding information of each training sample; the pre-training model comprises the corresponding relation between each alternative word and the coding information.
10. An entity relationship extraction method, the method comprising:
obtaining sample information of at least one training sample; wherein the sample information comprises: subject information and object information of at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, and obtaining the trained second relation extraction model;
inputting the target text into the trained second relation extraction model to obtain the target relation data of the target text.
11. An entity relationship extraction apparatus, the apparatus comprising:
the first obtaining module is used for obtaining first relation data of at least one training sample;
the first identification module is used for inputting the training sample where the first relation data is located into the first relation extraction model for identification so as to obtain second relation data of the training sample;
the first processing module is used for inputting the first relation data and the second relation data into the first relation extraction model for iteration so as to update the first relation extraction model to obtain a second relation extraction model;
and the second processing module is used for inputting the target text into the second relation extraction model for training so as to obtain the target relation data of the target text.
12. The apparatus of claim 11, wherein the apparatus comprises:
the second acquisition module is used for acquiring sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
the first processing module is configured to input the sample information to the second relationship extraction model for iterative training until a loss function of the second relationship extraction model meets a convergence condition, so as to obtain a trained second relationship extraction model;
the second processing module is configured to input the target text into the trained second relationship extraction model to obtain the target relationship data of the target text.
13. The apparatus of claim 12,
the first processing module is used for inputting the subject information and the object information of at least one sample relation into the second relation extraction model and constructing a loss value of current iterative training;
and the first processing module is used for updating the second relation extraction model based on the loss value to obtain the second relation extraction model after current iteration training.
14. The apparatus of claim 13, wherein the first processing module is configured to perform the following steps:
acquiring coding information of each training sample based on each training sample;
based on the coding information and the subject classifier, obtaining predicted subject information of the training sample, wherein the predicted subject information comprises: predicting subject heading pointer information and predicting subject suffix pointer information;
obtaining prediction object information of the training samples based on the coding information and an object classifier, wherein the prediction object information comprises: predicting object head pointer information and predicting object tail pointer information;
and obtaining a loss value of the current iteration training based on subject information of at least one sample relation of the training samples and corresponding predicted subject information, object information and corresponding predicted object information.
15. The apparatus of claim 14,
the first processing module is used for carrying out fusion processing on the coding information and the subject vector representation of the training sample to obtain a fused vector representation; wherein the subject vector representation is determined based on the predicted subject head pointer information and predicted tail pointer information;
the first processing module is used for obtaining the prediction object information of the training sample based on the fusion vector characterization and the object classifier.
16. The apparatus of claim 14,
the first processing module is configured to, if one training sample includes at least two subjects, determine predicted subject end pointer information corresponding to predicted subject head pointer information of one subject within a predetermined range of a position indicated by the predicted subject head pointer information of the subject.
17. The apparatus of claim 14, wherein the first processing module is configured to perform the following steps:
obtaining a subject loss value based on the subject information and predicted subject information of at least one of the sample relationships of at least one of the training samples;
obtaining an object loss value based on the object information and predicted object information of at least one of the sample relationships of at least one of the training samples; wherein the object loss value comprises: a first object loss value representing that the subject has an object relationship and a second object loss value representing that the subject does not have an object relationship;
and obtaining the loss value of the current iteration training based on the sum of the subject loss value and the object loss value of at least one training sample.
18. The apparatus of claim 17,
the first processing module is used for obtaining an object loss value weighted by the training sample based on the object loss value and the weighting coefficient of the training sample;
the first processing module is used for obtaining a loss value of current iteration training based on the sum of the subject loss value and the weighted object loss value of at least one training sample.
19. The apparatus of claim 14,
the first processing module is configured to input each training sample into a pre-training model to obtain the coding information of each training sample; the pre-training model comprises the corresponding relation between each alternative word and the coding information.
20. An entity relationship extraction apparatus, the apparatus comprising:
the second acquisition module is used for acquiring sample information of at least one training sample; wherein the sample information comprises: subject information and object information for at least one sample relationship; the subject information comprises subject head pointer information and subject tail pointer information; the object information comprises object head pointer information and object tail pointer information;
the first processing module is used for inputting the sample information into the second relation extraction model for iterative training until a loss function of the second relation extraction model meets a convergence condition, so as to obtain the trained second relation extraction model;
and the second processing module is used for inputting the target text into the trained second relation extraction model so as to obtain the target relation data of the target text.
21. A terminal, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: when the executable instructions are executed, the entity relationship extraction method of any one of claims 1 to 9 or claim 10 is realized.
22. A computer-readable storage medium storing an executable program, wherein the executable program when executed by a processor implements the entity relationship extraction method of any one of claims 1-9 or claim 10.
CN202210203644.5A 2022-03-03 2022-03-03 Entity relationship extraction method, device, terminal and storage medium Pending CN115017324A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210203644.5A CN115017324A (en) 2022-03-03 2022-03-03 Entity relationship extraction method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210203644.5A CN115017324A (en) 2022-03-03 2022-03-03 Entity relationship extraction method, device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN115017324A true CN115017324A (en) 2022-09-06

Family

ID=83066987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210203644.5A Pending CN115017324A (en) 2022-03-03 2022-03-03 Entity relationship extraction method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN115017324A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033717A (en) * 2022-08-12 2022-09-09 杭州恒生聚源信息技术有限公司 Triple extraction model training method, triple extraction method, device and equipment
CN115033717B (en) * 2022-08-12 2022-11-08 杭州恒生聚源信息技术有限公司 Triple extraction model training method, triple extraction method, device and equipment

Similar Documents

Publication Publication Date Title
CN108038103B (en) Method and device for segmenting text sequence and electronic equipment
CN111985240B (en) Named entity recognition model training method, named entity recognition method and named entity recognition device
CN112926339B (en) Text similarity determination method, system, storage medium and electronic equipment
CN111949802B (en) Construction method, device and equipment of knowledge graph in medical field and storage medium
CN110633577B (en) Text desensitization method and device
CN111368541B (en) Named entity identification method and device
CN113792207B (en) Cross-modal retrieval method based on multi-level feature representation alignment
CN111832316B (en) Semantic recognition method, semantic recognition device, electronic equipment and storage medium
CN109558599B (en) Conversion method and device and electronic equipment
CN108399914A (en) A kind of method and apparatus of speech recognition
CN113157910B (en) Commodity description text generation method, commodity description text generation device and storage medium
WO2023071562A1 (en) Speech recognition text processing method and apparatus, device, storage medium, and program product
CN114328838A (en) Event extraction method and device, electronic equipment and readable storage medium
CN116166843B (en) Text video cross-modal retrieval method and device based on fine granularity perception
EP3734472A1 (en) Method and device for text processing
CN112328793A (en) Comment text data processing method and device and storage medium
CN114880480A (en) Question-answering method and device based on knowledge graph
CN113158656A (en) Ironic content identification method, ironic content identification device, electronic device, and storage medium
CN112036174B (en) Punctuation marking method and device
WO2024179519A1 (en) Semantic recognition method and apparatus
CN115017324A (en) Entity relationship extraction method, device, terminal and storage medium
CN107832691B (en) Micro-expression identification method and device
CN112328809A (en) Entity classification method, device and computer readable storage medium
CN115062783B (en) Entity alignment method and related device, electronic equipment and storage medium
KR20210050484A (en) Information processing method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination