CN113033209B - Text relation extraction method and device, storage medium and computer equipment - Google Patents

Text relation extraction method and device, storage medium and computer equipment

Info

Publication number
CN113033209B
CN113033209B (application CN202110569523.8A)
Authority
CN
China
Prior art keywords
relation
candidate
question
score
sample
Prior art date
Legal status
Active
Application number
CN202110569523.8A
Other languages
Chinese (zh)
Other versions
CN113033209A (en)
Inventor
蒋海云
史树明
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110569523.8A
Publication of CN113033209A
Application granted
Publication of CN113033209B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a text relation extraction method, apparatus, storage medium, and computer device. The method comprises the following steps: acquiring a target text and a relation set for a target entity pair; predicting an initial score for each relation in the relation set through a trained relation extraction model; selecting candidate relations from the relation set according to the initial scores of the relations; inputting the candidate relations into a trained question-answering system model to obtain a question-answer score for each candidate relation; updating the scores according to the initial score and the question-answer score of each candidate relation to obtain a first updated score for each candidate relation; and predicting the semantic relation of the target entity pair in the target text according to the first updated scores. In this embodiment, the output of the relation extraction model is verified through the question-answering system model, which effectively improves the relation extraction performance of the model.

Description

Text relation extraction method and device, storage medium and computer equipment
Technical Field
The application relates to the technical field of information extraction, and in particular to a text relation extraction method and device, a storage medium, and computer equipment.
Background
Information extraction is one of the core tasks of natural language understanding. In information extraction, relationship extraction is one of the most important subtasks. Relationship extraction aims at identifying the semantic relationship of an entity pair from the text that contains the entity pair.
Traditional relation extraction mainly studied how to design effective features. In recent years, with the rise of deep learning, deep relation extraction has been widely studied. Current research focuses on how to design an effective neural network architecture that automatically extracts relation-discriminating information from text and entities. The inventors group previous work into "feature-level" and "model-level" studies. However, due to various unavoidable factors (such as limited data scale, data noise, and the difficulty of finding an optimal model architecture), it is difficult for current relation extraction to obtain significant performance improvements at the model level or the feature level.
Disclosure of Invention
The embodiment of the application provides a text relation extraction method, a text relation extraction device, a storage medium and computer equipment, which can verify the output result of a relation extraction model through a question-answering system model, effectively improve the relation extraction performance of the model and improve the accuracy of predicting semantic relations.
In a first aspect, a text relation extraction method is provided, where the method includes: acquiring a target text and a relation set of a target entity pair; predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model; selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set; inputting the candidate relations into a trained question-answering system model for processing to obtain question-answering scores corresponding to each candidate relation in the candidate relations; updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations so as to obtain a first updated score corresponding to each candidate relation in the candidate relations; and predicting the semantic relation of the target entity pair in the target text according to the first updated score.
In a second aspect, a text relation extracting apparatus is provided, the apparatus including:
the acquiring unit is used for acquiring a target text and a relation set of a target entity pair; the first prediction unit is used for predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model; the selecting unit is used for selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set; the computing unit is used for inputting the candidate relations into a trained question-answering system model for processing so as to obtain a question-answering score corresponding to each candidate relation in the candidate relations; the updating unit is used for updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations so as to obtain a first updated score corresponding to each candidate relation in the candidate relations; and the second prediction unit is used for predicting the semantic relation of the target entity pair in the target text according to the first updated score.
In a third aspect, a computer-readable storage medium is provided, where a computer program is stored, where the computer program is adapted to be loaded by a processor to execute the steps in the text relation extracting method according to any of the above embodiments.
In a fourth aspect, a computer device is provided, where the computer device includes a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the steps in the text relation extraction method according to any one of the above embodiments by calling the computer program stored in the memory.
The method comprises the steps of obtaining a target text and a relation set of a target entity pair; then, predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model; then selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set; inputting the candidate relations into a trained question-answering system model for processing to obtain a question-answering score corresponding to each candidate relation in the candidate relations; then updating the scores of all candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations to obtain a first updated score corresponding to each candidate relation in the candidate relations; and finally, predicting the semantic relation of the target entity pair in the target text according to the score after the first updating. According to the embodiment of the application, the output result of the relation extraction model is verified through the question-answering system model, the relation extraction performance of the model is effectively improved, and the accuracy of the predicted semantic relation is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1a is a schematic view of an application scenario of a text relationship extraction method provided in an application embodiment.
Fig. 1b is a first flowchart of a text relation extraction method according to an embodiment of the present application.
Fig. 1c is a schematic view of a first PR curve provided in the present application.
Fig. 1d is a schematic diagram of a second PR curve provided in the present application.
Fig. 1e is a schematic diagram of a third PR curve provided in the present application.
Fig. 1f is a schematic diagram of a fourth PR curve provided in the present application.
Fig. 2 is a second flow chart of the text relation extraction method according to the embodiment of the present application.
Fig. 3a is a schematic view of an application architecture of a blockchain network according to an embodiment of the present disclosure.
Fig. 3b is a schematic diagram of an alternative structure of a blockchain in the blockchain network 31 according to the embodiment of the present application.
Fig. 3c is a schematic functional architecture diagram of the blockchain network 31 according to the embodiment of the present disclosure.
Fig. 4 is a schematic structural diagram of a text relation extraction apparatus according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a text relation extraction method and device, computer equipment and a storage medium. Specifically, the text relation extraction method according to the embodiment of the present application may be executed by a computer device, where the computer device may be a terminal or a server.
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning.
Deep Learning (DL) is a branch of machine Learning, an algorithm that attempts to perform high-level abstraction of data using multiple processing layers that contain complex structures or consist of multiple nonlinear transformations.
Neural Networks (NN), a deep learning model that mimics the structure and function of biological Neural networks in the field of machine learning and cognitive science.
Information extraction: extracting specific event or fact information from natural language text, helping users automatically classify, extract, and reconstruct massive content. The specific event or fact information generally includes entities, relations, and events. For example, the time, place, and key people are extracted from news, or the product name, development time, and performance indicators are extracted from technical documents. Because information extraction can extract information frames and fact information of interest to users from natural language, it is widely applied in knowledge graphs, information retrieval, question-answering systems, sentiment analysis, and text mining. Information extraction mainly comprises three subtasks: entity extraction and linking, relation extraction, and event extraction. Entity extraction and linking corresponds to named entity recognition. Relation extraction, i.e., triple extraction, mainly extracts the relations between entities. Event extraction is equivalent to extracting a multivariate relation.
Relationship Extraction (RE): given an entity pair and the text containing the entity pair, the aim is to determine the semantic relationship of the entity pair based on the text. For example, given the entity pair (M nation, president candidate A) and the text "President candidate A defeats president candidate B in the most recent election, becoming the next president of M nation …", the user wishes to identify that the relation between "M nation" and "president candidate A" is "state president". In relation extraction, a set of relations, such as "state president", is typically predefined.
Relationship Classification (RC), a modeling approach for relationship extraction, that is, converting relationship extraction into a Classification problem, where each relationship corresponds to a category.
A Question-Answering system (QA): given a piece of text and a question, identifies the location of the answer within the text.
Knowledge Base Completion (KBC): in the embodiments of the present application, given a head entity and a relation, predicting the correct tail entity.
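As a minimal sketch of the relation-classification framing defined above (the scoring function here is a toy stand-in, not the trained neural model described in this application):

```python
RELATIONS = ["state president", "located in", "born in", "NA"]

def extract_relation(head, tail, text, score_fn):
    """Relation classification: score every predefined relation for the
    (head, tail) entity pair in `text`, then predict the best-scoring one."""
    scores = {r: score_fn(head, tail, text, r) for r in RELATIONS}
    return max(scores, key=scores.get), scores

# Toy scorer standing in for a trained relation extraction model.
def toy_score(head, tail, text, relation):
    return 1.0 if relation == "state president" and "president" in text else 0.1

pred, _ = extract_relation(
    "M nation", "president candidate A",
    "President candidate A ... becoming the next president of M nation",
    toy_score)
```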
According to the embodiment of the application, the relation extraction model and the question-answering system model are trained in a machine learning mode, then the output result of the trained relation extraction model is verified through the trained question-answering system model, the relation extraction performance of the model is effectively improved, and the accuracy of semantic relation prediction is improved.
Referring to fig. 1a, fig. 1a is a schematic view of an application scenario of a text relationship extraction method according to an embodiment of the present application. The text relation extraction method is executed as an example by computer equipment, wherein the computer equipment can be equipment such as a terminal or a server. The text relation extraction method comprises a training process of a relation extraction model and a question-answering system model and a prediction process of predicting semantic relation of a target entity in a target text by using the relation extraction model and the question-answering system model in a process executed by computer equipment. When the training of the model is carried out, the computer equipment carries out learning training on the relation extraction model according to the first training sample set so as to obtain a trained relation extraction model; and performing learning training on the question-answering system model according to the second training sample set to obtain a trained question-answering system model. 
When the model is used for prediction, a user can upload a target entity pair to be predicted through a client, browser client, or instant messaging client installed on the computer device. After acquiring the target entity pair uploaded by the user, the computer device acquires the target text and the relation set of the target entity pair, and then predicts an initial score for each relation in the relation set through the trained relation extraction model; selects candidate relations from the relation set according to the initial scores; inputs the candidate relations into the trained question-answering system model to obtain a question-answer score for each candidate relation; updates the scores of the candidate relations according to the initial score and the question-answer score of each candidate relation to obtain a first updated score for each candidate relation; and finally predicts the semantic relation of the target entity pair in the target text according to the first updated scores. In this embodiment of the application, the output of the relation extraction model is verified through the question-answering system model, which effectively improves the relation extraction performance of the model and the accuracy of the predicted semantic relation.
It should be noted that the training process and the actual prediction process of the relation extraction model and the question-answering system model may be completed in the server or in the terminal. When both the training process and the actual prediction process are completed in the server, and the trained relation extraction model and question-answering system model need to be used, the target entity pair to be predicted can be input to the server; after the server completes the prediction, it sends the obtained prediction result to the terminal for display.
When the training process and the actual prediction process of the model are finished in the terminal and the trained relation extraction model and the question-answering system model are required to be used, the target entity pair to be predicted can be input to the terminal, and after the actual prediction of the terminal is finished, the terminal displays the prediction result.
When the training process of the model is completed in the server and the actual prediction process is completed in the terminal, and the trained relation extraction model and question-answering system model need to be used, the target entity pair to be predicted can be input to the terminal; after the terminal completes the prediction, it displays the prediction result. Optionally, the model file trained in the server may be ported to the terminal; when prediction is needed, the target entity pair to be predicted is input to the trained model file, and the prediction result can be obtained through calculation.
The following are detailed below. It should be noted that the description sequence of the following embodiments is not intended to limit the priority sequence of the embodiments.
The embodiments of the present application provide a text relation extraction method, which may be executed by a terminal or a server, or may be executed by both the terminal and the server; the embodiment of the present application is described by taking an example in which a text relationship extraction method is executed by a server.
In order to improve the performance of relation extraction, the embodiment of the application provides a text relation extraction method that uses a question-answering system for verification at the result level. The model framework provided by the embodiment of the application can be applied to any existing relation extraction task and does not require additional background data.
Referring to fig. 1b to 1f, fig. 1b is a first flow chart of a text relationship extraction method according to an embodiment of the present application, and fig. 1c to 1f are PR curves of different baseline models and different candidate relationship selection strategies according to the embodiment of the present application. It is assumed that both the relational extraction model and the question-answering system model are trained. Given a target entity pair and a target text containing the target entity pair, a specific process for predicting the semantic relationship of the target entity pair in the target text by the text relationship extraction method in the embodiment of the present application may be as follows:
step 101, acquiring a target text and a relationship set of a target entity pair.
The target entity pair comprises a head entity and a tail entity. For example, if the user enters the target entity pair (M nation, president candidate A), the head entity is M nation, the tail entity is president candidate A, and the given target text is "President candidate A defeats president candidate B in the most recent election, becoming the next president of M nation …". The relation set is a set of predefined relations.
And step 102, predicting an initial score corresponding to each relation in the relation set through the trained relation extraction model.
After the target texts and the relation sets of the target entity pairs are obtained, the initial scores corresponding to all relations in the relation sets can be predicted through the trained relation extraction model.
Step 103, selecting a candidate relationship from the relationship set according to the initial score corresponding to each relationship in the relationship set.
Partial candidate relations are selected for subsequent verification according to the initial scores of all relations in the relation set. Generally speaking, a trained relation extraction model has some discriminative ability and assigns low initial scores to most incorrect relations. For example, the relations in the top α% and bottom β% by initial score can be selected as candidate relations for subsequent verification. Alternatively, the top k relations with the highest initial scores usually contain the correct relation, so the top k relations may be used as the candidate relations for the next verification step.
Optionally, the selecting a candidate relationship from the relationship set according to the initial score corresponding to each relationship in the relationship set includes:
and selecting a relation corresponding to the front alpha% with the highest initial score and the rear beta% with the lowest initial score from the relation set as a candidate relation, wherein alpha and beta are both natural numbers between 0 and 100.
For example, in a first candidate relationship selection strategy, all relationships in a set of relationships are first input into a relationship extraction model to predict an initial score for each relationship in the set of relationships. And then, selecting a relation corresponding to the front alpha% with the highest initial score and the rear beta% with the lowest initial score as a candidate relation for subsequent verification. For example, α ranges from 0 to 100, such as α is 10; β ranges from 0 to 100, for example, β is 20. The above examples are not intended to limit the values of α and β in the embodiments of the present application.
Optionally, the selecting a candidate relationship from the relationship set according to the initial score corresponding to each relationship in the relationship set includes:
and selecting the first k relations with the highest initial scores from the relation set as candidate relations, wherein k is a positive integer larger than 0.
For example, k =3, the first 3 relationships predicted by the relationship extraction model with the highest initial scores are selected as candidate relationships for subsequent verification.
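The two candidate selection strategies above can be sketched as follows (the score values are illustrative; α, β, and k follow the examples in the text):

```python
def select_head_tail(scores, alpha=10, beta=20):
    """Strategy 1: keep relations in the top alpha% and bottom beta% by initial score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    n_top = max(1, n * alpha // 100)      # keep at least one high-scoring relation
    n_bot = n * beta // 100
    return ranked[:n_top] + (ranked[n - n_bot:] if n_bot else [])

def select_top_k(scores, k=3):
    """Strategy 2: keep the k relations with the highest initial score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

initial = {"state president": 0.61, "located in": 0.15, "born in": 0.09,
           "spouse": 0.08, "capital of": 0.04, "founder of": 0.03}
top3 = select_top_k(initial, k=3)  # ['state president', 'located in', 'born in']
```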
And 104, inputting the candidate relations into a trained question-answering system model for processing to obtain a question-answering score corresponding to each candidate relation in the candidate relations.
Optionally, the inputting the candidate relationships into a trained question-answering system model for processing to obtain a question-answering score corresponding to each candidate relationship in the candidate relationships includes:
constructing a question corresponding to each candidate relationship in the candidate relationships based on the head entity of the target entity pair and the candidate relationships;
and predicting whether the tail entity of the target entity pair is an answer matched with the question or not through the trained question-answering system model based on the constructed question and the target text of the target entity pair so as to obtain a question-answering score corresponding to each candidate relation in the candidate relations.
Firstly, based on each relation in the candidate relations, a question corresponding to each candidate relation is constructed, and the constructed question is used as an input to the question-answering system model. Specifically, for the head entity e and the relation r of the target entity pair, the question can be constructed directly as: "The r of e is …". For example, assume the candidate relations include the two relations "national president" and "located in". For the target entity pair (M nation, president candidate A), the constructed questions are: "The national president of M nation is …" and "M nation is located in …".
Secondly, based on the constructed question, and with the target text of the target entity pair as the context, whether the tail entity of the target entity pair is an answer matching the question is predicted through the trained question-answering system model, so as to obtain a question-answer score for each candidate relation. For example, when the context is "President candidate A defeats president candidate B in the most recent election, becoming the next president of M nation …" and the question is "The national president of M nation is …", the question-answering system model will, with high probability, take the tail entity string "president candidate A" as the answer; that is, it will give the string fragment "president candidate A" a high question-answer score. However, when the question is "M nation is located in …", the question-answering system model will not take "president candidate A" as the answer; that is, it will give the string fragment "president candidate A" a low question-answer score.
In the question-answering system model, the target text serves as the representation of the context.
The question-answering system model aims to judge whether a given question can be answered from a given context, and to give the position of the answer. The embodiment of the application uses the question-answering system model as a verification model for the output of the relation extraction model. In the relation extraction task, questions and training samples need to be constructed in order to train and use the question-answering system model.
In the question construction process of the question-answering system model, given a target entity pair, a context, and a candidate relation, a question is formed by simply combining the head entity of the target entity pair with the candidate relation. There are two reasons for this: first, it reduces the difficulty of manual question construction; second, questions constructed this way can, to a certain degree, be used to judge which candidate relations are wrong. Taking the target entity pair (M nation, president candidate A) as an example, the context is "President candidate A defeated president candidate B in the most recent election, becoming the next president of M nation …". When the candidate relation is "national president", the constructed question is "The national president of M nation is …"; the question is legal, and the answer "president candidate A", i.e., the tail entity of the target entity pair, can be found in the context, so the score of the tail entity "president candidate A" as the answer will be very high. When the candidate relation is "located in", the corresponding question is "M nation is located in …"; the question is also legal, but the correct answer cannot be found in the context, so the score of the tail entity "president candidate A" as the answer will be very low. When the candidate relation is "born in", the corresponding question is "M nation was born in …"; the question itself is illegal, so it is even less likely that an answer can be found in the context. Therefore, this question construction method of simply splicing the head entity and the candidate relation is simple and effective: when the candidate relation is wrong, the question is illegal, and the question-answering system model can easily verify that the candidate relation is wrong.
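The question construction and QA-based verification described above can be sketched as follows (the English template string and the `qa_score` callable are assumptions standing in for the trained question-answering system model):

```python
def build_question(head_entity, relation):
    # Assumed English rendering of the template "the r of e is ...".
    return f"The {relation} of {head_entity} is ..."

def verify_candidates(head, tail, context, candidates, qa_score):
    """Ask the QA model, for each candidate relation, how strongly the tail
    entity answers the constructed question given the context."""
    return {r: qa_score(build_question(head, r), context, tail) for r in candidates}

# Dummy QA scorer: high score only when the question is answerable from the context.
def dummy_qa(question, context, answer):
    return 0.9 if "state president" in question and answer in context else 0.1

context = ("President candidate A defeats president candidate B in the most "
           "recent election, becoming the next president of M nation")
qa_scores = verify_candidates("M nation", "President candidate A", context,
                              ["state president", "located in"], dummy_qa)
```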
Step 105, updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations to obtain a first updated score corresponding to each candidate relation in the candidate relations.
Optionally, updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations to obtain a first updated score corresponding to each candidate relation in the candidate relations, including: and updating the initial score and the question-answer score corresponding to each candidate relation in the candidate relations based on a first preset formula so as to obtain a first updated score corresponding to each candidate relation in the candidate relations.
When score fusion of candidate relations is carried out, a target entity pair to be predicted and a target text (context) containing the target entity pair are given, and the score of each relation in the predicted relation set, produced by the trained relation extraction model, is marked as {(r1, p1), (r2, p2), …}. Corresponding score fusion schemes are provided for the two candidate relation selection strategies introduced above.
For example, in the first candidate relation selection strategy, all relations in the relation set are first input into the relation extraction model to predict an initial score for each relation in the relation set. The relations corresponding to the top α% with the highest initial scores and the bottom β% with the lowest initial scores in the relation set are then selected as candidate relations for subsequent verification. For example, α ranges from 0 to 100, such as α = 10; β ranges from 0 to 100, such as β = 20. The above examples are not intended to limit the values of α and β in the embodiments of the present application. For a selected relation (candidate relation) rj, the relation extraction model predicts the score pj of the candidate relation rj, and the question-answering system model predicts the score pj,QA; the updated score (first updated score) pj′ of the candidate relation may then be calculated using the following first preset formula:
pj′ = pj · (pj,QA)^(1/λ)   (first preset formula; reconstructed from the surrounding description, as the original equation is rendered as an image)
wherein λ is a balance factor between the relation extraction model and the question-answering system model, λ is greater than 0, for example, λ = 10.
For example, in the second candidate relation selection strategy, the top k relations with the highest initial scores predicted by the relation extraction model are selected as candidate relations for subsequent verification. For example, with k = 3, the 3 relations with the highest prediction scores of the relation extraction model are selected as candidate relations for subsequent verification. For a selected relation (candidate relation) rj, the relation extraction model predicts the score pj of the candidate relation rj, and the question-answering system model predicts the score pj,QA; the final score (first updated score) pj′ of the candidate relation may then be calculated using the following first preset formula:
pj′ = pj · (pj,QA)^(1/λ)   (first preset formula; reconstructed from the surrounding description, as the original equation is rendered as an image)
wherein λ > 0 is a balance factor between the relation extraction model and the question-answering system model.
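A minimal sketch of the score fusion step, assuming a multiplicative form in which the question-answer score pj,QA (less than 1) shrinks the candidate's score and λ > 0 controls how strongly the question-answering model is trusted (the exact first preset formula appears only as an image in the original publication):

```python
def fuse_score(p_re: float, p_qa: float, lam: float = 10.0) -> float:
    """Fuse the relation extraction score p_re with the QA score p_qa.

    Assumed multiplicative form: because p_qa < 1, the fused score is always
    narrowed relative to p_re, and a larger lam weakens the QA influence.
    """
    assert lam > 0 and 0.0 < p_qa <= 1.0
    return p_re * (p_qa ** (1.0 / lam))

fused = fuse_score(0.8, 0.5, lam=10.0)  # slightly below 0.8
```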
Step 106, predicting the semantic relation of the target entity pair in the target text according to the first updated score.
Optionally, the predicting the semantic relationship of the target entity pair in the target text according to the first updated score includes:
according to the first updated score, performing score sorting on all candidate relations;
and selecting the relation with the score larger than a given threshold value from all the candidate relations with the sorted scores as the semantic relation of the target entity pair in the target text.
Through the score fusion operation, the scores of all the candidate relations are updated. The updated scores of all the candidate relations are then sorted, and the candidate relations whose updated scores are larger than a given threshold are selected as the finally predicted semantic relations of the target entity pair in the target text. For example, the given threshold may be 0.7, 0.8, or the maximum updated score pk′.
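The sorting-and-thresholding step above can be sketched as follows (illustrative relation names and scores):

```python
def predict_relations(scored_relations, threshold=0.7):
    """Sort candidate relations by updated score and keep those above the threshold."""
    ranked = sorted(scored_relations, key=lambda rp: rp[1], reverse=True)
    return [relation for relation, score in ranked if score > threshold]

# Hypothetical updated scores for three candidate relations:
result = predict_relations([("national president", 0.92),
                            ("located", 0.31),
                            ("born in", 0.05)])
# keeps only "national president"
```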
For example, the widely used New York Times (NYT) dataset can be used to evaluate the model framework proposed by the embodiments of the present application. The NYT dataset is a widely used dataset for the distantly supervised relation extraction task. The dataset is generated by aligning the relations in Freebase (a large collaborative knowledge base consisting of metadata) with the New York Times corpus. For example, the training set includes 53 different relations, 522611 different sentences and 281270 different entity pairs. The test set comprises 172448 different sentences and 96678 different entity pairs.
For example, 5 relation extraction models may be employed as baselines. A baseline is a very basic model or solution: one typically creates a baseline and then attempts more complex solutions to achieve better results. The five relation extraction models are the CNN+ATT, PCNN+ATT, CNN+HATT, PCNN+HATT and RESIDE models. CNN+ATT and PCNN+ATT respectively adopt a convolutional network (CNN) and a piecewise convolutional network (PCNN) as encoders to encode the information of the text, and adopt a sentence-level attention mechanism to alleviate the influence of noise on relation prediction. CNN+HATT and PCNN+HATT also use a convolutional or piecewise convolutional network as the encoder, but unlike the first two models, they use a hierarchical attention mechanism to learn effective features of the text and mitigate the effects of noise. The RESIDE model uses a graph neural network to encode syntactic information of the text, thereby learning a representation of the text.
For example, Table 1, Table 2, and Fig. 1c to Fig. 1f show a performance comparison between the model framework proposed in the embodiment of the present application and the baseline models, wherein Fig. 1c to Fig. 1f show precision-recall curves under different baseline models and different candidate relation selection strategies. In the embodiment of the present application, the following evaluation criteria are adopted: the precision-recall curve (PR curve), AUC values, and Precision@N. Different precision and recall values are obtained as different thresholds are selected; when the threshold is swept from 0 to 1, a corresponding PR curve is obtained. Further, the area under the PR curve, i.e. the AUC value, can be computed; a classifier with a larger AUC value has higher accuracy. Precision@N represents the precision over the N highest-scoring predictions in the test set. In all results, "ValStrgy I" represents the first candidate relation selection strategy, and "ValStrgy II" represents the second. All results are in percent.
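The PR-curve and AUC evaluation described above can be sketched in a few lines (a simplified trapezoidal computation for illustration, not the exact tooling used in the experiments):

```python
def pr_curve(scores, labels):
    """Sweep the decision threshold over the predicted scores (highest first)
    and collect (recall, precision) points, as in the PR-curve evaluation."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    tp = fp = 0
    points = []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points

def pr_auc(points):
    """Area under the PR curve via the trapezoidal rule (precision taken as 1 at recall 0)."""
    area, prev_r, prev_p = 0.0, 0.0, 1.0
    for r, p in points:
        area += (r - prev_r) * (p + prev_p) / 2.0
        prev_r, prev_p = r, p
    return area

# A toy test set where the classifier ranks both positives first:
points = pr_curve([0.9, 0.8, 0.1], [1, 1, 0])
```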
For example, the AUC values of the different models are shown in Table 1:
TABLE 1
[Table 1: AUC values of the different models; rendered as an image in the original publication.]
For example, the Precision@N values for different models and validation strategies are shown in Table 2:
TABLE 2
[Table 2: Precision@N values for the different models and validation strategies; rendered as an image in the original publication.]
For example, from the results shown in Table 1, Table 2 and Fig. 1c to 1f, the following conclusions can be drawn:
First, after the question-answering system model is used to verify the relation extraction task, the performance of all models is effectively improved. This shows that the model framework based on the question-answering system model of the embodiment of the present application is effective for all the baseline models, because some erroneous predictions of the relation extraction model are corrected during the verification process.
Second, CNN/PCNN+ATT/HATT uses a CNN or its variants as the sentence encoder, while RESIDE uses a GNN to learn features from sentences. CNN and GNN are two representative neural network structures in deep learning, and they learn relation-related features of sentences from different angles. The model framework provided by the embodiment of the application successfully improves the performance of all baselines, which shows that the verification model can learn additional features that neither the CNN-based nor the GNN-based classifier captures; the verification model and the extraction model are thus naturally complementary.
Finally, as can be seen from Table 1, the improvement after using strategy I is more pronounced than that of strategy II. This indicates that taking the relations corresponding to the top α% with the highest initial scores and the bottom β% with the lowest initial scores as candidate relations yields more reliable verification. In particular, PCNN+HATT+ValStrgy I gives the best results on the NYT dataset. It is worth noting, however, that although strategy II is somewhat inferior to strategy I, strategy II takes much less time to verify, since it only needs to verify the top k relations of each entity pair rather than a proportion of all of them.
According to the embodiment of the application, the question-answering system model is introduced into the relation extraction task as a verification means. In order to verify the result of the relation extraction, the embodiment of the application provides two candidate relation selection strategies. Given a target text (context) and a candidate relation of a target entity pair, the embodiment of the application forms a question by simply combining the head entity of the target entity pair with the candidate relation, which is efficient and effectively helps the question-answering system model identify wrong candidate relations. In addition, the scores of the relation extraction model and the question-answering system model are fused through an effective score fusion strategy, which effectively improves the relation extraction performance of the model.
The text relation extraction method provided by the embodiment of the application can be applied to any relation extraction task, knowledge base completion and text understanding application, and effectively improves the relation extraction performance of related applications, so that the user experience of downstream applications is improved. The proposed model framework can also be applied to other information extraction tasks. For example, when performance improvement needs to be performed on knowledge base completion or slot filling tasks, the relationship extraction model can be applied to the tasks in turn as a verification model.
All the above technical solutions can be combined arbitrarily to form the optional embodiments of the present application, and are not described herein again.
Referring to fig. 2, fig. 2 is a second flow chart of the text relation extracting method according to the embodiment of the present application. The specific process of the text relation extraction method can be as follows:
step 201, learning and training the relation extraction model according to a first training sample set to obtain the trained relation extraction model.
Optionally, the performing learning training on the relationship extraction model according to the first training sample set to obtain the trained relationship extraction model includes:
obtaining a first training sample set comprising a plurality of first samples, wherein the first samples comprise positive and negative characteristics of first samples, a first sample entity pair, and texts and candidate relations of the first sample entity pair;
calculating a loss function of the relation extraction model according to the positive and negative characteristics of the first sample of each first sample in the first training sample set, the text of the first sample entity pair and a candidate relation;
and training the loss function of the relation extraction model based on a gradient descent method to obtain the trained relation extraction model.
In the embodiment of the application, a relationship extraction model and a question-answering system model need to be trained respectively.
Wherein the loss function L_RE of the relation extraction model can be expressed as the following formula (1) (reconstructed in text form; the original is rendered as an image):

L_RE = −Σ_k [ y_k · log p(s_k, r_k) + (1 − y_k) · log(1 − p(s_k, r_k)) ]   (1);

where k denotes the k-th sample in the training set, and y_k denotes the positive/negative label of the k-th sample: y_k = 1 indicates that the k-th sample is a positive sample, and y_k = 0 indicates that it is a negative sample. s_k is the context representation of the entity pair in the k-th sample, typically an m-dimensional vector, and r_k is the feature vector of the candidate relation in the k-th sample. The function p(s_k, r_k) represents the matching model between the context of the entity pair and the candidate relation, i.e. it judges whether the candidate relation is correct; the output score of the matching model lies between 0 and 1. The parameters to be learned in the relation extraction model are mainly the function p, the encoder producing s_k, and the relation feature vectors r_k.
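The loss described above is a standard binary cross-entropy over (label, matching score) pairs; as a numeric sketch with hypothetical scores:

```python
import math

def re_loss(samples):
    """L_RE of formula (1): samples are (y_k, p_k) pairs, where y_k is the
    positive/negative label and p_k = p(s_k, r_k) is the matching score in (0, 1)."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in samples)

# A confident correct prediction on one positive and one negative sample:
loss = re_loss([(1, 0.9), (0, 0.1)])  # = -2 * log(0.9)
```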
Step 202, learning and training the question-answering system model according to a second training sample set to obtain the trained question-answering system model.
In order to train the question-answering system model used to verify candidate relations, its training samples need to be constructed. In the relation extraction dataset, one sample can be represented as (entity pair, context, candidate relation). When the candidate relation is correct, the sample is a positive sample; when the candidate relation is incorrect, the sample is a negative sample. Based on this, a relation extraction sample can easily be converted into a training sample of the question-answering system model. In the embodiment of the present application, one such training sample may be represented as (question, context, answerable or not). The question is constructed from the head entity and the candidate relation, and the answer is the tail entity of the entity pair. The "answerable" label is true when the candidate relation is correct, and false otherwise.
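A minimal sketch of this sample conversion (the function and field names are hypothetical):

```python
def re_sample_to_qa_sample(head, tail, context, relation, relation_is_correct):
    """Convert a relation extraction sample (entity pair, context, candidate relation)
    into a question-answering training sample (question, context, answerable or not)."""
    return {
        "question": f"{head} {relation} ...",  # head entity spliced with the relation
        "context": context,
        "answer": tail if relation_is_correct else None,  # answer is the tail entity
        "answerable": relation_is_correct,
    }
```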
In the embodiment of the present application, the selection of the question-answering system model needs to satisfy two requirements: first, it must determine whether a question can be answered; second, it must predict the score of each string segment of the context as the answer. Therefore, the question-answering system model is built on the basis of the pre-trained model ALBERT. The role of the ALBERT model is to provide an encoding for the context and the question, i.e. to represent a piece of text as a low-dimensional dense distributed vector. The advantage of using the ALBERT pre-trained model is that knowledge in a general corpus can be effectively utilized, thereby improving the accuracy of the question-answering system.
The pre-trained model ALBERT is a reduced BERT model with far fewer parameters than the conventional BERT architecture. ALBERT overcomes the main obstacles faced when scaling pre-trained models through two parameter-reduction techniques. First, factorized embedding parameterization: ALBERT decomposes the large vocabulary embedding matrix into two small matrices, so that the size of the hidden layer is separated from the size of the vocabulary embedding; this separation makes it easier to grow the hidden layer without significantly increasing the number of vocabulary embedding parameters. Second, cross-layer parameter sharing, which prevents the number of parameters from growing with the depth of the network. Both techniques significantly reduce the parameters of BERT without significantly affecting its performance, thereby improving parameter efficiency. These parameter-reduction techniques also act as a form of regularization that makes training more stable and benefits generalization.
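The saving from factorized embedding parameterization can be illustrated with a quick parameter count (the sizes below are illustrative assumptions, roughly BERT-base-like; the exact ALBERT configuration may differ):

```python
def embedding_params(vocab_size, hidden_size, embed_size=None):
    """Vocabulary-embedding parameter count: V*H when the embedding is tied to the
    hidden size (BERT-style), or V*E + E*H under the factorized parameterization
    (ALBERT-style), with E much smaller than H."""
    if embed_size is None:
        return vocab_size * hidden_size
    return vocab_size * embed_size + embed_size * hidden_size

# Illustrative (assumed) sizes:
bert_style = embedding_params(30000, 768)         # 23,040,000 parameters
albert_style = embedding_params(30000, 768, 128)  # 3,938,304 parameters
```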
Optionally, the performing learning training on the question-answering system model according to the second training sample set to obtain the trained question-answering system model includes:
obtaining a second training sample set comprising a plurality of second samples, wherein each second sample comprises a second sample entity pair, a context of the second sample entity pair, a score of the starting position in the context of the tail entity of the second sample entity pair, and a score of the ending position in the context of the tail entity;

calculating a loss function of the question-answering system model according to the second sample entity pair of each second sample in the second training sample set, the context of the second sample entity pair, the score of the starting position of the tail entity in the context, and the score of the ending position of the tail entity in the context;
and training the loss function of the question-answering system model based on a gradient descent method to obtain the trained question-answering system model.
For example, for a question-answering system model constructed based on the pre-trained model ALBERT, the loss function L_QA can be expressed as the following formula (2) (formulas (2)-(4) are reconstructed in text form from the surrounding description; the originals are rendered as images):

L_QA = Σ_{q ∈ Q} [ L_ans(q) + L_tail(q) ]   (2);

where Q represents the training sample set of the question-answering system model, and q ∈ Q denotes a sample q belonging to the training sample set Q. L_ans(q) represents the loss function for whether sample q can be answered, and L_tail(q) represents the loss function for the question-answering system model taking the tail entity as the correct answer. Specifically, L_ans(q) and L_tail(q) are shown as formula (3) and formula (4) below:

L_ans(q) = − [ δ_q · log P_ans + (1 − δ_q) · log(1 − P_ans) ]   (3);

L_tail(q) = − [ log P_start(l_s) + log P_end(l_e) ]   (4);

where δ_q takes the value 0 or 1 and represents whether the question in sample q can be answered, and P_ans represents the answerability score of the question in sample q. l_s denotes the starting position of the tail entity in the context, and l_e denotes the ending position of the tail entity in the context. P_start(l_s) denotes the score that the correct answer starts at position l_s in the context, and P_end(l_e) denotes the score that the correct answer ends at position l_e.
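The losses described above can be sketched numerically as follows (assuming, as the description suggests, that the span loss applies only to answerable samples; the scores are hypothetical):

```python
import math

def qa_loss(samples):
    """L_QA sketch: each sample is (delta, p_ans, p_start, p_end), where delta in
    {0, 1} marks answerability, p_ans is the answerability score, and p_start /
    p_end score the correct answer's start and end positions in the context."""
    total = 0.0
    for delta, p_ans, p_start, p_end in samples:
        if delta:  # answerable: answerability term plus the span loss
            total -= math.log(p_ans)
            total -= math.log(p_start) + math.log(p_end)
        else:      # unanswerable: answerability term only
            total -= math.log(1.0 - p_ans)
    return total
```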
For convenience of distinction, the training sample set used for training the relation extraction model is defined as the first training sample set, and its training samples are defined as first samples; the training sample set used for training the question-answering system model is defined as the second training sample set, and its training samples are defined as second samples. The first and second training sample sets may be the same training sample set, or may be entirely or partially different training sample sets; the embodiment of the present application does not limit this. In the training process of the embodiment of the application, the objective functions of the relation extraction model and the question-answering system model are trained separately. Like most machine learning models, the embodiment of the present application can adopt the mini-batch stochastic gradient descent (mini-batch SGD) algorithm for parameter learning. In the prediction process, after the parameters have been learned, a new target entity pair to be predicted and a target text (context) containing the target entity pair are given, and relation extraction may then be performed using the model framework (the trained relation extraction model and question-answering system model) provided in the embodiment of the present application, so as to predict the semantic relation of the target entity pair in the target text.
Step 203, acquiring a target text and a relationship set of the target entity pair. For a detailed description of step 203, please refer to step 101, which is not described herein again.
Step 204, predicting an initial score corresponding to each relation in the relation set through the trained relation extraction model. For a detailed description of step 204, please refer to step 102, which is not described herein again.
Step 205, selecting a candidate relationship from the relationship set according to the initial score corresponding to each relationship in the relationship set. For the detailed description of step 205, please refer to step 103, which is not described herein again.
Optionally, step 205 may be implemented by step 2051 or step 2052, specifically:
step 2051, selecting a relation corresponding to the front alpha% with the highest initial score and the rear beta% with the lowest initial score from the relation set as a candidate relation, wherein alpha and beta are both natural numbers between 0 and 100.
And step 2052, selecting the first k relations with the highest initial scores from the relation set as candidate relations, wherein k is a positive integer larger than 0.
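The two selection strategies of steps 2051 and 2052 can be sketched as follows (the rounding of the percentage counts is illustrative):

```python
def select_strategy_1(scored, alpha=10, beta=20):
    """Step 2051: keep the top alpha% highest and bottom beta% lowest initial scores."""
    ranked = sorted(scored, key=lambda rp: rp[1], reverse=True)
    n_top = max(1, len(ranked) * alpha // 100)
    n_bottom = max(1, len(ranked) * beta // 100)
    return ranked[:n_top] + ranked[-n_bottom:]

def select_strategy_2(scored, k=3):
    """Step 2052: keep the top k relations with the highest initial scores."""
    return sorted(scored, key=lambda rp: rp[1], reverse=True)[:k]

# Ten hypothetical relations with initial scores 0.1 .. 1.0:
scored = [(f"r{i}", i / 10) for i in range(1, 11)]
```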
Step 206, inputting the candidate relations into a trained question-answering system model for processing to obtain a question-answer score corresponding to each candidate relation in the candidate relations. For a detailed description of step 206, please refer to step 104, which is not described herein again.
Optionally, step 206 may be implemented by steps 2061 to 2062, specifically:
Step 2061, constructing a problem corresponding to each candidate relation in the candidate relations based on the head entity of the target entity pair and the candidate relations;
step 2062, based on the constructed question and the target text of the target entity pair, predicting whether the tail entity of the target entity pair is an answer matched with the question through the trained question-answering system model so as to obtain a question-answering score corresponding to each candidate relationship in the candidate relationships.
Step 207, updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations, so as to obtain a first updated score corresponding to each candidate relation in the candidate relations. For a detailed description of step 207, please refer to step 105, which is not described herein again. After step 207, step 208 or step 209 may be selected to be executed according to the actual application.
Optionally, updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations to obtain a first updated score corresponding to each candidate relation in the candidate relations, including:
and updating the initial score and the question-answer score corresponding to each candidate relation in the candidate relations based on a first preset formula so as to obtain a first updated score corresponding to each candidate relation in the candidate relations.
When score fusion is carried out, a target entity pair to be predicted and a target text (context) containing the target entity pair are given, and the score of each relation in the predicted relation set, produced by the trained relation extraction model, is marked as {(r1, p1), (r2, p2), …}. Corresponding score fusion schemes are provided for the two candidate relation selection strategies introduced above.
In the first candidate relation selection strategy, all the relations in the relation set are first input into the relation extraction model to predict the initial score corresponding to each relation in the relation set. The relations corresponding to the top α% with the highest initial scores and the bottom β% with the lowest initial scores in the relation set are then selected as candidate relations for subsequent verification. For example, α ranges from 0 to 100, such as α = 10; β ranges from 0 to 100, such as β = 20. The above examples are not intended to limit the values of α and β in the embodiments of the present application. For a selected relation (candidate relation) rj, the relation extraction model predicts the score pj of the candidate relation rj, and the question-answering system model predicts the score pj,QA; the updated score (first updated score) pj′ of the candidate relation may then be calculated using the following first preset formula:
pj′ = pj · (pj,QA)^(1/λ)   (first preset formula; reconstructed from the surrounding description, as the original equation is rendered as an image)
wherein λ is a balance factor between the relation extraction model and the question-answering system model, λ is greater than 0, for example, λ = 10.
For example, in the second candidate relation selection strategy, the top k relations with the highest initial scores predicted by the relation extraction model are selected as candidate relations for subsequent verification. For example, with k = 3, the 3 relations with the highest prediction scores of the relation extraction model are selected as candidate relations for subsequent verification. For a selected relation (candidate relation) rj, the relation extraction model predicts the score pj of the candidate relation rj, and the question-answering system model predicts the score pj,QA; the final score (first updated score) pj′ of the candidate relation may then be calculated using the following first preset formula:
pj′ = pj · (pj,QA)^(1/λ)   (first preset formula; reconstructed from the surrounding description, as the original equation is rendered as an image)
wherein λ > 0 is a balance factor between the relation extraction model and the question-answering system model.
Step 208, predicting the semantic relation of the target entity pair in the target text according to the first updated score. For a detailed description of step 208, please refer to step 106, which is not described herein again.
Optionally, the predicting the semantic relationship of the target entity pair in the target text according to the first updated score includes:
according to the first updated score, performing score sorting on all candidate relations;
and selecting the relation with the score larger than a given threshold value from all the candidate relations with the sorted scores as the semantic relation of the target entity pair in the target text.
Step 209, obtaining initial scores corresponding to the remaining unselected relationships in the relationship set except the candidate relationship.
For example, since pj,QA is typically less than 1, the score of any selected candidate relation is narrowed to some extent. The scores pk of the remaining relations not selected for verification also need to be narrowed to a comparable extent, so the initial scores corresponding to the unselected relations in the relation set (i.e. all relations except the candidate relations) also need to be obtained, for the subsequent calculation of the second updated scores of these remaining unselected relations.
Step 210, updating the initial scores corresponding to the remaining unselected relations based on a second preset formula to obtain second updated scores corresponding to the remaining unselected relations.
For example, in the first candidate relation selection strategy, since pj,QA is typically less than 1, the score of any selected candidate relation is narrowed to some extent. The remaining unselected relations, i.e. those in the relation set other than the relations corresponding to the top α% with the highest initial scores and the bottom β% with the lowest initial scores, therefore also need to be narrowed to some extent; the updated score (second updated score) pk′ of a remaining unselected relation may be calculated using the following second preset formula (reconstructed from the surrounding description, as the original equation is rendered as an image):

pk′ = c · pk

wherein c is a constant, and the value range of c is 0 to 1; for example, the value of c is 0.9.
For example, in the second candidate relation selection strategy, since pj,QA is typically less than 1, the score of any selected candidate relation is narrowed to some extent. The remaining unselected relations, i.e. those in the relation set other than the top k relations with the highest initial scores, therefore also need to be narrowed to some extent; the final score (second updated score) pk′ of a remaining unselected relation may be calculated using the following second preset formula (reconstructed from the surrounding description, as the original equation is rendered as an image):

pk′ = c · pk

wherein c is a constant, and the value range of c is 0 to 1; for example, the value of c is 0.9.
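A minimal sketch of the second preset formula, assuming the linear form c · pk implied by the surrounding description, which shrinks the scores of the remaining unselected relations so they stay comparable with the verified candidates:

```python
def update_unselected(p_k: float, c: float = 0.9) -> float:
    """Second preset formula (assumed form): p_k' = c * p_k, with c in (0, 1)."""
    assert 0.0 < c < 1.0
    return c * p_k

shrunk = update_unselected(0.5)  # 0.9 * 0.5
```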
Step 211, predicting the semantic relation of the target entity pair in the target text according to the first updated score and the second updated score.
Optionally, the predicting the semantic relationship of the target entity pair in the target text according to the first updated score and the second updated score includes:
according to the first updated score and the second updated score, performing score sorting on all the relationships in the relationship set;
and selecting the relation with the score larger than a given threshold value from all the relations in the relation set after the scores are sorted as the semantic relation of the target entity pair in the target text.
Through the score fusion operation, the scores of all the relations in the relation set are updated. The updated scores of all the relations are then sorted, and the relations whose updated scores are larger than a given threshold are selected as the finally predicted semantic relations of the target entity pair in the target text. For example, the given threshold may be 0.7, 0.8, or the maximum updated score pk′.
All the above technical solutions can be combined arbitrarily to form the optional embodiments of the present application, and are not described herein again.
In the model framework in the embodiment of the application, the question-answering system model is introduced into the relationship extraction task, and the question-answering system model is used as a result verification model of an output result of the relationship extraction model. In the model framework, each candidate relation generated by the relation extraction model can be verified, and the accuracy is high. The proposed model framework can be effectively verified on the basis of any existing relational extraction model, and does not require additional background knowledge.
In addition, in order to improve verification efficiency, the embodiment of the application also provides two candidate-relation selection strategies. In the second strategy, the relations with the highest prediction scores in the relation extraction model are selected as the candidate relations to be verified by the question-answering system model in the next step, because this highest-scoring set usually contains the correct relations. In the first strategy, the relation extraction model first assigns initial scores to all relations; the relations with the highest and the lowest initial scores are then selected as candidate relations and passed to the question-answering system model for verification. The question-answering system model produces a question-answer score for each candidate relation, and score fusion is then performed on the basis of the initial score and the question-answer score corresponding to each candidate relation. The relations with the highest and lowest initial scores are chosen because, although the question-answering system model has certain limitations, an extreme score indicates high confidence: a highest score suggests the candidate relation is likely correct, while a lowest score suggests the corresponding candidate relation is likely wrong. Restricting verification to these high-confidence candidates improves verification efficiency.
Compared with the second candidate relationship selection strategy, the first candidate relationship selection strategy needs more computing resources, but the verification effect is more reliable.
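The two selection strategies can be sketched as below; the function names, the percentage-based cut for the first strategy, and the example scores are illustrative assumptions, not the patent's exact formulation.

```python
# Illustrative sketch of the two candidate-relation selection strategies.

def select_top_k(initial_scores, k):
    """Second strategy: keep the k relations with the highest initial scores."""
    ranked = sorted(initial_scores, key=initial_scores.get, reverse=True)
    return ranked[:k]

def select_extremes(initial_scores, alpha, beta):
    """First strategy: keep the top alpha% (likely correct) and the
    bottom beta% (likely wrong) of relations by initial score."""
    ranked = sorted(initial_scores, key=initial_scores.get, reverse=True)
    n = len(ranked)
    top = ranked[: max(1, n * alpha // 100)]
    bottom = ranked[n - max(1, n * beta // 100):]
    return top + bottom

scores = {"r1": 0.9, "r2": 0.6, "r3": 0.4, "r4": 0.1}
print(select_top_k(scores, 2))        # ['r1', 'r2']
print(select_extremes(scores, 25, 25))  # ['r1', 'r4']
```

The first strategy passes more candidates (both extremes) to the question-answering system model, which is why it costs more computing resources but verifies more reliably.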
The method comprises the steps of obtaining a target text and a relation set of a target entity pair; then, predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model; then selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set; inputting the candidate relations into a trained question-answering system model for processing to obtain a question-answering score corresponding to each candidate relation in the candidate relations; then updating the scores of all candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations to obtain a first updated score corresponding to each candidate relation in the candidate relations; and finally, predicting the semantic relation of the target entity pair in the target text according to the score after the first updating. According to the embodiment of the application, the output result of the relation extraction model is verified through the question-answering system model, the relation extraction performance of the model is effectively improved, and the accuracy of the predicted semantic relation is improved.
The embodiments of the present application may be implemented in combination with cloud technology or blockchain network technology. Cloud technology is a hosting technology that unifies a series of resources such as hardware, software, and networks in a wide area network or a local area network to implement computation, storage, processing, and sharing of data. Cloud technology is the general term for the network, information, integration, management-platform, and application technologies applied in the cloud computing business model; it can form a resource pool that is used on demand and is flexible and convenient. Background services of technical network systems, such as video websites, picture websites, and other portal websites, require a large amount of computing and storage resources, so cloud technology needs the support of cloud computing.
It should be noted that cloud computing is a computing model that distributes computing tasks over a resource pool formed by a large number of computers, so that various application systems can obtain computing power, storage space, and information services as required. The network that provides the resources is referred to as the "cloud". To the user, the resources in the "cloud" appear infinitely expandable, available at any time, usable on demand, and paid for according to use. As a basic capability provider of cloud computing, a cloud computing resource pool platform, referred to as a cloud platform for short and generally known as Infrastructure as a Service (IaaS), is established; multiple types of virtual resources are deployed in the resource pool for external clients to use selectively. The cloud computing resource pool mainly comprises computing devices (which may be virtualized machines including an operating system), storage devices, and network devices.
In order to facilitate the storage and query of the target text and relation set of the target entity pair, the initial score corresponding to each relation in the relation set, the candidate relations with their question-answer scores and first updated scores, and the semantic relation of the target entity pair in the target text, in some embodiments the text relation extraction method further includes: sending the above data to the blockchain network, so that the nodes of the blockchain network fill the data into a new block and, when consensus is reached on the new block, add the new block to the tail of the blockchain.
Next, the blockchain network in the embodiment of the present application will be explained. Referring to fig. 3a, fig. 3a is a schematic diagram of an application architecture of a blockchain network provided in the embodiment of the present application, including a blockchain network 31 (exemplarily illustrating consensus nodes 310-1 to 310-3), a certificate authority 32, a service agent 33, and a service agent 34, which are respectively described below.
The type of the blockchain network 31 is flexible and may be any of a public chain, a private chain, or a federation chain, for example. Taking a public chain as an example, computer devices of any business entity, such as a user terminal and a server (e.g., a cloud server), can access the blockchain network 31 without authorization. Taking a federation chain as an example, a computer device (e.g., a terminal/server) under the jurisdiction of a business entity can access the blockchain network 31 after obtaining authorization; in this case it becomes a client node in the blockchain network 31, where the client refers to an application client for predicting the semantic relation of a target entity pair in a target text.
In some embodiments, the client node may act as a mere observer of the blockchain network 31, i.e., it provides functionality to support a business entity in initiating transactions (e.g., for uplink storage of data or querying of data on the chain), while the functions of the consensus nodes 310 in the blockchain network 31, such as the ranking function, the consensus service, and the accounting function, may be implemented by default or selectively (e.g., depending on the specific business requirements of the business entity). In this way, the data and business processing logic of the business entity can be migrated to the blockchain network 31 to the maximum extent, and the credibility and traceability of the data and business processing process are realized through the blockchain network 31.
The consensus nodes in the blockchain network 31 receive transactions submitted from client nodes (e.g., client node 330 attributed to the business entity 33, and client node 340 attributed to the business entity 34, shown in fig. 3 a) of different business entities (e.g., business entity 33 and business entity 34, shown in fig. 3 a), perform the transactions to update the ledger or query the ledger, and various intermediate or final results of performing the transactions may be returned to the business entity's client nodes for display.
For example, the client node 330/340 may subscribe to events of interest in the blockchain network 31, such as transactions occurring in a particular organization/channel in the blockchain network 31, and the consensus node 310 pushes corresponding transaction notifications to the client node 330/340, thereby triggering corresponding business logic in the client node 330/340.
An exemplary application of the blockchain network is described below, taking as an example a plurality of business entities that access the blockchain network to manage the target text and relation set of the target entity pair, the initial score corresponding to each relation in the relation set, the candidate relations with their question-answer scores and first updated scores, and the semantic relation of the target entity pair in the target text. Referring to fig. 3a, the business entities involved in the management link, such as the business entity 33 and the business entity 34, may be clients corresponding to the text relation extraction device. Each registers with the certificate authority 32 to obtain a digital certificate, which includes the public key of the business entity and a digital signature issued by the certificate authority 32 over that public key and the identity information of the business entity. The digital certificate is attached to a transaction together with the business entity's digital signature over the transaction and sent to the blockchain network, so that the blockchain network can take the digital certificate and signature out of the transaction, verify the reliability of the message (i.e., that it has not been tampered with) and the identity of the business entity sending the message, and check the identity, for example, whether the sender has the right to initiate the transaction. Clients running on computer devices (e.g., terminals or servers) hosted by the business entity may request access to the blockchain network 31 to become client nodes.
The client node 330 of the business agent 33 is used to obtain a target text and a relationship set of a target entity pair; then, predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model; then selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set; inputting the candidate relations into a trained question-answering system model for processing to obtain a question-answering score corresponding to each candidate relation in the candidate relations; then updating the scores of all candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations to obtain a first updated score corresponding to each candidate relation in the candidate relations; and finally, predicting the semantic relation of the target entity pair in the target text according to the score after the first updating. The client node 330 of the business agent 33 is further configured to send the target text and the set of relationships of the target entity pair, the initial score corresponding to each relationship in the set of relationships, the candidate relationships, the question-answer score and the first updated score corresponding to each candidate relationship, and the semantic relationship of the target entity pair in the target text to the blockchain network 31.
The operation of sending the target text and relation set of the target entity pair, the initial score corresponding to each relation in the relation set, the candidate relations with their question-answer scores and first updated scores, and the semantic relation of the target entity pair in the target text to the blockchain network 31 may be implemented by presetting business logic in the client node 330: whenever these data are produced, the client node 330 automatically sends them to the blockchain network 31. Alternatively, a service person of the service agent 33 logs in to the client node 330, manually packages the data, and sends them to the blockchain network 31.
When sending, the client node 330 generates a transaction corresponding to the update operation according to the target text and relation set of the target entity pair, the initial score corresponding to each relation in the relation set, the candidate relations with their question-answer scores and first updated scores, and the semantic relation of the target entity pair in the target text. The transaction specifies the smart contract that needs to be invoked to implement the update operation and the parameters passed to the smart contract, and also carries the digital certificate of the client node 330 and a digital signature (for example, obtained by encrypting a digest of the transaction using the private key in the digital certificate of the client node 330); the transaction is then broadcast to the consensus nodes 310 in the blockchain network 31.
When a consensus node 310 in the blockchain network 31 receives the transaction, it verifies the digital certificate and digital signature carried by the transaction; after successful verification, it determines whether the service agent 33 has the transaction authority according to the identity carried in the transaction. If either the digital-signature verification or the authority verification fails, the transaction fails. Upon successful verification, the node appends its own digital signature (e.g., by encrypting the digest of the transaction using the private key of node 310-1) and continues to broadcast the transaction in the blockchain network 31.
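The sign-then-verify flow above can be illustrated with a simplified sketch. Real nodes use asymmetric signatures tied to the digital certificate issued by the certificate authority; because the Python standard library has no asymmetric-crypto module, an HMAC over the transaction digest stands in for the signature here, and all names and keys are illustrative assumptions.

```python
import hashlib
import hmac

# Simplified sketch of how a node might sign and verify a transaction.
# HMAC is a symmetric stand-in for the asymmetric signature a real
# blockchain node would use; do not use this as-is for signing.

def sign_transaction(tx_bytes, key):
    digest = hashlib.sha256(tx_bytes).digest()   # digest of the transaction
    return hmac.new(key, digest, hashlib.sha256).hexdigest()

def verify_transaction(tx_bytes, signature, key):
    # Recompute the signature and compare in constant time.
    return hmac.compare_digest(sign_transaction(tx_bytes, key), signature)

key = b"client-node-330-key"  # hypothetical key material
tx = b'{"op": "update", "payload": "semantic relation of target entity pair"}'
sig = sign_transaction(tx, key)
assert verify_transaction(tx, sig, key)             # untampered: passes
assert not verify_transaction(tx + b"x", sig, key)  # tampered: rejected
```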
After the consensus nodes 310 in the blockchain network 31 receive the successfully verified transaction, the transaction is filled into a new block and broadcast. When a new block is broadcast, the consensus nodes 310 in the blockchain network 31 perform a consensus process on it; if the consensus succeeds, each node adds the new block to the tail of the blockchain it stores, updates the state database according to the transaction result, and executes the transactions in the new block: for a submitted transaction that stores the target text and relation set of the target entity pair, the initial score corresponding to each relation in the relation set, the candidate relations with their question-answer scores and first updated scores, and the semantic relation of the target entity pair in the target text, these data are added to the state database.
A service person of the service agent 34 logs in to the client node 340 and inputs a query request for the target text and relation set of the target entity pair, the initial score corresponding to each relation in the relation set, the candidate relations with their question-answer scores and first updated scores, or the semantic relation of the target entity pair in the target text. The client node 340 generates a transaction corresponding to the update operation/query operation accordingly; the transaction specifies the smart contract that needs to be invoked to implement the update operation/query operation and the parameters passed to the smart contract, and also carries the digital certificate of the client node 340 and a digital signature (e.g., obtained by encrypting a digest of the transaction using the private key in the digital certificate of the client node 340); the transaction is then broadcast to the consensus nodes 310 in the blockchain network 31.
After a consensus node 310 in the blockchain network 31 receives the transaction, verifies it, fills it into a block, and reaches consensus, the newly filled block is added to the tail of the blockchain stored by the node, the state database is updated according to the transaction result, and the transactions in the new block are executed: for a submitted transaction that updates the semantic relation of the target entity pair in the target text, the key-value pair corresponding to the target entity pair in the state database is updated according to that semantic relation; for a submitted transaction that queries the semantic relation of the target entity pair in the target text, the corresponding key-value pair is queried from the state database and the transaction result is returned.
As an example of the blockchain, referring to fig. 3b, fig. 3b is an optional structural schematic diagram of the blockchain in the blockchain network 31 provided in this embodiment of the present application. The header of each block may include the hash values of all transactions in the block as well as the hash values of all transactions in the previous block; records of newly generated transactions are filled into a block, and after consensus by the nodes in the blockchain network, the block is appended to the tail of the blockchain to form chain growth. The chained structure based on hash values between blocks ensures that the transactions in the blocks are tamper-proof and forgery-proof.
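The hash-linked structure described above can be sketched in a few lines: each block stores the hash of the previous block, so altering any earlier transaction invalidates every later link. The field names and helper functions are illustrative assumptions.

```python
import hashlib
import json

# Minimal sketch of a hash-linked chain of blocks: tampering with any
# earlier block breaks the prev_hash link of the block after it.

def block_hash(block):
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, transactions):
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev_hash, "transactions": transactions})

def chain_is_valid(chain):
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
append_block(chain, ["store target text and relation set"])
append_block(chain, ["store first updated scores"])
assert chain_is_valid(chain)
chain[0]["transactions"][0] = "tampered"  # modify an earlier block...
assert not chain_is_valid(chain)          # ...and the chain no longer verifies
```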
An exemplary functional architecture of the blockchain network provided in the embodiment of the present application is described below, referring to fig. 3c, fig. 3c is a schematic functional architecture diagram of the blockchain network 31 provided in the embodiment of the present application, which includes an application layer 301, a consensus layer 302, a network layer 303, a data layer 304, and a resource layer 305, and the following description is separately given below.
The resource layer 305 encapsulates the computing, storage, and communication resources of the nodes 310 in the blockchain network 31.
The data layer 304 encapsulates the various data structures that implement the ledger, including a blockchain implemented as files in a file system, a key-value state database, and existence proofs (e.g., hash trees of the transactions in blocks).
The network layer 303 encapsulates the functions of a Point-to-Point (P2P) network protocol, a data propagation mechanism and a data verification mechanism, an access authentication mechanism, and service agent identity management.
Wherein the P2P network protocol implements communication between nodes 310 in the blockchain network 31, the data propagation mechanism ensures propagation of transactions in the blockchain network 31, and the data verification mechanism implements reliability of data transmission between nodes 310 based on cryptography methods (e.g., digital certificates, digital signatures, public/private key pairs); the access authentication mechanism is used for authenticating the identity of the service subject added into the block chain network 31 according to an actual service scene, and endowing the service subject with the authority of accessing the block chain network 31 when the authentication is passed; the service agent identity management is used to store the identity of the service agents that are allowed to access the blockchain network 31, as well as the permissions (e.g., the types of transactions that can be initiated).
The consensus layer 302 encapsulates the functions of the mechanism by which the nodes 310 in the blockchain network 31 agree on a block (i.e., the consensus mechanism), transaction management, and ledger management. The consensus mechanism comprises consensus algorithms such as Proof of Stake (PoS), Proof of Work (PoW), and Delegated Proof of Stake (DPoS), and pluggable consensus algorithms are supported.
The transaction management is used for verifying the digital signature carried in the transaction received by the node 310, verifying the identity information of the service body, and judging and confirming whether the service body has the authority to perform the transaction (reading the relevant information from the identity management of the service body) according to the identity information; for the service agents authorized to access the blockchain network 31, the service agents all have digital certificates issued by the certificate authority, and the service agents sign the submitted transactions by using private keys in the digital certificates of the service agents, so that the legal identities of the service agents are declared.
The ledger management is used to maintain the blockchain and the state database. A block that has passed consensus is appended to the tail of the blockchain; the transactions in the consensus block are executed, the key-value pairs in the state database are updated when a transaction comprises an update operation, and the key-value pairs in the state database are queried and the query result returned to the client node of the business entity when a transaction comprises a query operation. Query operations on multiple dimensions of the state database are supported, including: querying a block according to the block sequence number; querying a block according to the block hash value; querying a block according to a transaction sequence number; querying a transaction according to the transaction sequence number; querying the account data of a business entity according to the account number of the business entity; and querying the blockchains in a channel according to the channel name.
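The update/query split over the key-value state database can be sketched as follows; the class name and transaction fields are assumptions for this example, not the patent's data model.

```python
# Illustrative key-value state database for ledger management: update
# transactions rewrite a key's value, query transactions read it back.

class StateDB:
    def __init__(self):
        self.kv = {}

    def execute(self, tx):
        if tx["op"] == "update":
            self.kv[tx["key"]] = tx["value"]
            return None
        if tx["op"] == "query":
            return self.kv.get(tx["key"])
        raise ValueError("unknown operation")

db = StateDB()
# Store the predicted semantic relation for a (head, tail) entity pair.
db.execute({"op": "update", "key": ("Alice", "Acme Corp"),
            "value": "founder_of"})
print(db.execute({"op": "query", "key": ("Alice", "Acme Corp")}))  # founder_of
```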
The application layer 301 encapsulates various services that the blockchain network can implement, including tracing, crediting, and verifying transactions.
By adopting the technical scheme provided by the embodiment of the application, the target text and the relation set of the target entity pair are obtained; an initial score corresponding to each relation in the relation set is then predicted through the trained relation extraction model; candidate relations are selected from the relation set according to the initial score corresponding to each relation; the candidate relations are input into the trained question-answering system model for processing to obtain a question-answer score corresponding to each candidate relation; the scores of all candidate relations are then updated according to the initial score and the question-answer score corresponding to each candidate relation, so as to obtain a first updated score corresponding to each candidate relation; and finally, the semantic relation of the target entity pair in the target text is predicted according to the first updated score. Since the output result of the relation extraction model is verified on the basis of the question-answering system model, the relation extraction performance of the model is effectively improved, and the accuracy of the predicted semantic relation is improved.
Meanwhile, the embodiment of the application can also store, on the chain, the target text and relation set of the target entity pair acquired by the terminal, the initial score corresponding to each relation in the relation set, the candidate relations with their question-answer scores and first updated scores, and the semantic relation of the target entity pair in the target text, thereby realizing a backup of the records. When the user uses the text relation extraction system again, these data can be obtained directly and quickly from the blockchain; the semantic relation of the target entity pair in the target text is obtained without the text relation extraction system performing a series of processing steps on the acquired target entity pair again, which improves data acquisition efficiency.
In order to better implement the text relationship extraction method according to the embodiment of the present application, an embodiment of the present application further provides a text relationship extraction device. Referring to fig. 4, fig. 4 is a schematic structural diagram of a text relation extracting device according to an embodiment of the present application. The text relation extracting apparatus 400 may include:
an obtaining unit 401, configured to obtain a target text and a relationship set of a target entity pair;
a first prediction unit 402, configured to predict, through a trained relationship extraction model, an initial score corresponding to each relationship in the relationship set;
a selecting unit 403, configured to select a candidate relationship from the relationship set according to an initial score corresponding to each relationship in the relationship set;
a calculating unit 404, configured to input the candidate relationships into a trained question-answering system model for processing, so as to obtain a question-answering score corresponding to each candidate relationship in the candidate relationships;
an updating unit 405, configured to update the scores of all candidate relationships in the candidate relationships according to the initial score and the question-answer score corresponding to each candidate relationship in the candidate relationships, so as to obtain a first updated score corresponding to each candidate relationship in the candidate relationships;
a second predicting unit 406, configured to predict, according to the first updated score, a semantic relationship of the target entity pair in the target text.
Optionally, the selecting unit 403 is configured to select, as the candidate relationship, a relationship corresponding to a front α% with a highest initial score and a rear β% with a lowest initial score from the relationship set, where α and β are both natural numbers between 0 and 100.
Optionally, the selecting unit 403 is configured to select, as candidate relations, the top k relations with the highest initial score from the relation set, where k is a positive integer greater than 0.
Optionally, the calculating unit 404 is configured to: constructing a question corresponding to each candidate relationship in the candidate relationships based on the head entity of the target entity pair and the candidate relationships; and predicting whether the tail entity of the target entity pair is an answer matched with the question or not through the trained question-answering system model based on the constructed question and the target text of the target entity pair so as to obtain a question-answering score corresponding to each candidate relation in the candidate relations.
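The question-construction step performed by the calculating unit 404 can be sketched as follows. The question templates and the scoring stub are illustrative assumptions: a real trained question-answering system model would return the probability that the tail entity is the answer span in the target text, rather than the exact-match stand-in used here.

```python
# Hedged sketch: turn (head entity, candidate relation) into a question,
# then score whether the tail entity answers it in the target text.

QUESTION_TEMPLATES = {
    "founder_of": "What did {head} found?",
    "born_in": "Where was {head} born?",
}

def build_question(head_entity, relation):
    if relation in QUESTION_TEMPLATES:
        return QUESTION_TEMPLATES[relation].format(head=head_entity)
    # Fallback template for relations without a handcrafted question.
    return "What is the '{}' of {}?".format(relation, head_entity)

def qa_score(question, target_text, tail_entity):
    """Stand-in for the trained QA model: a real system would return the
    probability that tail_entity is the answer span in target_text."""
    return 1.0 if tail_entity in target_text else 0.0

q = build_question("Alice", "founder_of")
print(q)  # What did Alice found?
print(qa_score(q, "Alice founded Acme Corp in 1999.", "Acme Corp"))  # 1.0
```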
Optionally, the updating unit 405 is configured to update the initial score and the question-answer score corresponding to each of the candidate relationships based on a first preset formula, so as to obtain a first updated score corresponding to each of the candidate relationships.
Optionally, the second predicting unit 406 is configured to perform score sorting on all candidate relationships according to the first updated scores; and selecting the relation with the score larger than a given threshold value from all the candidate relations with the sorted scores as the semantic relation of the target entity pair in the target text.
Optionally, the updating unit 405 is further configured to: obtaining initial scores corresponding to the rest unselected relations except the candidate relation in the relation set; and updating the initial scores corresponding to the rest of the unselected relationships based on a second preset formula to obtain second updated scores corresponding to the rest of the unselected relationships.
Optionally, the second predicting unit 406 is further configured to predict a semantic relationship of the target entity pair in the target text according to the first updated score and the second updated score.
Optionally, the second prediction unit 406 is further specifically configured to: according to the first updated score and the second updated score, performing score sorting on all the relationships in the relationship set; and selecting the relation with the score larger than a given threshold value from all the relations in the relation set after the scores are sorted as the semantic relation of the target entity pair in the target text.
Optionally, the apparatus 400 further comprises: the first training unit is used for learning and training the relation extraction model according to a first training sample set so as to obtain the trained relation extraction model; and
the second training unit is used for performing learning and training on the question-answering system model according to a second training sample set to obtain the trained question-answering system model.
Optionally, the first training unit is configured to: obtain a first training sample set comprising a plurality of first samples, wherein each first sample comprises a positive/negative label, a first sample entity pair, and the text and a candidate relation of the first sample entity pair; calculate the loss function of the relation extraction model according to the positive/negative label, the entity-pair text, and the candidate relation of each first sample in the first training sample set; and train the loss function of the relation extraction model based on a gradient descent method to obtain the trained relation extraction model.
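Given the variables described for this loss (a positive/negative label y_k per sample and a matching probability p(s_k, r_k) between 0 and 1), the natural reading is a binary cross-entropy over (text, candidate relation) pairs. The exact formula image is not reproduced in this excerpt, so BCE is an assumption consistent with the description; the callable `match_prob` stands in for the matching model.

```python
import math

def relation_extraction_loss(samples, match_prob):
    """Binary cross-entropy over (label, text, relation) triples:
    y_k = 1 for a positive sample, 0 for a negative one, and
    match_prob(s_k, r_k) in (0, 1) scores whether the relation is correct.
    Sketch consistent with the variables described above, not the patent's
    exact (unreproduced) formula."""
    loss = 0.0
    for y_k, s_k, r_k in samples:
        p = match_prob(s_k, r_k)
        loss -= y_k * math.log(p) + (1 - y_k) * math.log(1 - p)
    return loss
```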
Optionally, the second training unit is configured to: obtain a second training sample set comprising a plurality of second samples, wherein each second sample comprises a second sample entity pair, the context of the second sample entity pair, a score for the start position of the tail entity in the context, and a score for the end position of the tail entity in the context; calculate the loss function of the question-answering system model according to the second sample entity pair of each second sample in the second training sample set, the context of the second sample entity pair, and the scores for the start and end positions of the tail entity in the context; and train the loss function of the question-answering system model based on a gradient descent method to obtain the trained question-answering system model.
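The two components of the question-answering loss described here (whether the question is answerable, and the start/end positions of the tail entity in the context) can be sketched as follows. The batch layout and the rule of adding the span loss only for answerable samples are assumptions for illustration; the exact formulas are reproduced with the claims below.

```python
import math

def qa_span_loss(p_start, p_end, l_s, l_e):
    """Negative log-likelihood of the gold answer span: the tail entity's
    start position l_s under the start distribution and end position l_e
    under the end distribution."""
    return -(math.log(p_start[l_s]) + math.log(p_end[l_e]))

def answerability_loss(delta, p_ans):
    """Binary cross-entropy on whether the question is answerable:
    delta is 0 or 1, p_ans the predicted answerability score."""
    return -(delta * math.log(p_ans) + (1 - delta) * math.log(1 - p_ans))

def qa_model_loss(batch):
    """Total loss over a batch of samples (delta, p_ans, p_start, p_end, l_s, l_e),
    summing the answerability loss and, for answerable questions only,
    the answer-span loss. The only-if-answerable rule is an assumption."""
    total = 0.0
    for delta, p_ans, p_start, p_end, l_s, l_e in batch:
        total += answerability_loss(delta, p_ans)
        if delta == 1:
            total += qa_span_loss(p_start, p_end, l_s, l_e)
    return total
```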
All the above technical solutions can be combined arbitrarily to form the optional embodiments of the present application, and are not described herein again.
It is to be understood that apparatus embodiments and method embodiments may correspond to one another and that similar descriptions may refer to method embodiments. To avoid repetition, further description is omitted here. Specifically, the apparatus shown in fig. 4 may execute the text relation extraction method embodiment, and the foregoing and other operations and/or functions of each module in the apparatus implement corresponding processes of the method embodiment, which are not described herein again for brevity.
Correspondingly, an embodiment of the present application further provides a computer device, which may be a terminal or a server. The terminal may be a smartphone, a tablet computer, a notebook computer, a smart television, a smart speaker, a wearable smart device, a personal computer, or the like. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data, and artificial intelligence platforms. As shown in fig. 5, the computer device may include Radio Frequency (RF) circuitry 501, a memory 502 including one or more computer-readable storage media, an input unit 503, a display unit 504, a sensor 505, audio circuitry 506, a Wireless Fidelity (WiFi) module 507, a processor 508 including one or more processing cores, and a power supply 509, among other components. Those skilled in the art will appreciate that the computer device configuration illustrated in fig. 5 does not constitute a limitation of the computer device, which may include more or fewer components than those illustrated, combine some components, or arrange the components differently. Wherein:
the RF circuit 501 may be used for receiving and transmitting signals during information transmission and reception or during a call, and in particular, for receiving downlink information of a base station and then sending the received downlink information to the one or more processors 508 for processing; in addition, data relating to uplink is transmitted to the base station. In addition, the RF circuitry 501 may also communicate with networks and other devices via wireless communications.
The memory 502 may be used to store software programs and modules, and the processor 508 executes various functional applications and data processing by operating the software programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to use of the computer device, and the like.
The input unit 503 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
The display unit 504 may be used to display information input by or provided to a user as well as various graphical user interfaces of the computer device, which may be made up of graphics, text, icons, video, and any combination thereof. The display unit 504 may include a display panel.
The computer device may also include at least one sensor 505, such as light sensors, motion sensors, and other sensors.
The audio circuitry 506, a speaker, and a microphone may provide an audio interface between a user and the computer device. On one hand, the audio circuit 506 may transmit the electrical signal converted from received audio data to the speaker, which converts it into a sound signal for output; on the other hand, the microphone converts a collected sound signal into an electrical signal, which is received by the audio circuit 506 and converted into audio data; after being processed by the processor 508, the audio data is sent via the RF circuit 501 to, for example, another computer device, or output to the memory 502 for further processing. The audio circuit 506 may also include an earbud jack to allow communication between peripheral headphones and the computer device.
WiFi is a short-range wireless transmission technology. Through the WiFi module 507, the computer device can help the user send and receive e-mails, browse web pages, access streaming media, and the like, providing the user with wireless broadband Internet access. Although fig. 5 shows the WiFi module 507, it is understood that the module is not an essential part of the computer device and may be omitted as needed within a scope that does not change the essence of the invention.
The processor 508 is the control center of the computer device; it connects the various parts of the entire computer device using various interfaces and lines, and performs the various functions of the computer device and processes data by running or executing the software programs and/or modules stored in the memory 502 and calling the data stored in the memory 502, thereby monitoring the computer device as a whole.
The computer device also includes a power supply 509 (such as a battery) for powering the various components, which may preferably be logically connected to the processor 508 via a power management system that may be used to manage charging, discharging, and power consumption.
Although not shown, the computer device may further include a camera, a bluetooth module, etc., which will not be described herein. Specifically, in this embodiment, the processor 508 in the computer device loads the executable file corresponding to the process of one or more computer programs into the memory 502 according to the following instructions, and the processor 508 runs the computer programs stored in the memory 502, so as to implement various functions:
acquiring a target text and a relation set of a target entity pair; predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model; selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set; inputting the candidate relations into a trained question-answering system model for processing to obtain question-answering scores corresponding to each candidate relation in the candidate relations; updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations so as to obtain a first updated score corresponding to each candidate relation in the candidate relations; and predicting the semantic relation of the target entity pair in the target text according to the first updated score.
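The overall flow recited above can be sketched end to end. The model interfaces, the top-k candidate selection (one of the claimed selection strategies), and the convex-combination score fusion are illustrative assumptions, not the patent's exact formulas.

```python
def extract_relations(text, head, tail, relation_set, re_model, qa_model,
                      top_k=5, threshold=0.5, alpha=0.5):
    """Illustrative end-to-end flow: re_model(text, head, tail, rel) -> initial
    score; qa_model(question, text, tail) -> QA score. Interfaces, question
    template, and the score-fusion rule are assumptions for this sketch."""
    # 1. Initial score for every relation in the set.
    initial = {rel: re_model(text, head, tail, rel) for rel in relation_set}
    # 2. Keep the top-k relations as candidates.
    candidates = sorted(initial, key=initial.get, reverse=True)[:top_k]
    # 3. QA score each candidate via a constructed question.
    qa = {rel: qa_model(f"What is the {rel} of {head}?", text, tail)
          for rel in candidates}
    # 4. Fuse the two scores (illustrative convex combination).
    updated = {rel: alpha * initial[rel] + (1 - alpha) * qa[rel]
               for rel in candidates}
    # 5. Keep relations whose updated score clears the threshold.
    return [rel for rel in sorted(updated, key=updated.get, reverse=True)
            if updated[rel] > threshold]
```

Stub models are enough to exercise the pipeline shape.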
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, the present application provides a computer-readable storage medium, in which a plurality of computer programs are stored, where the computer programs can be loaded by a processor to execute the steps in any one of the text relation extraction methods provided by the embodiments of the present application.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the computer program stored in the storage medium can execute the steps in any text relation extraction method provided in the embodiments of the present application, beneficial effects that can be achieved by any text relation extraction method provided in the embodiments of the present application can be achieved, and detailed descriptions are omitted here for the foregoing embodiments.
The text relation extraction method and apparatus, storage medium, and computer device provided by the embodiments of the present application have been introduced in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the embodiments is only intended to help in understanding the method and its core idea. Meanwhile, those skilled in the art may, following the idea of the present application, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (7)

1. A method for extracting text relations, the method comprising:
learning and training a relation extraction model according to a first training sample set to obtain the trained relation extraction model, comprising: obtaining a first training sample set comprising a plurality of first samples, wherein each first sample comprises a positive/negative label, a first sample entity pair, and the text and a candidate relation of the first sample entity pair; calculating a loss function of the relation extraction model according to the positive/negative label of each first sample in the first training sample set, the text of the first sample entity pair, and the candidate relation, wherein the loss function formula of the relation extraction model is as follows:
$$\mathcal{L}_{\mathrm{RE}} = -\sum_{k}\Big[\,y_k \log p(s_k, r_k) + (1-y_k)\log\big(1 - p(s_k, r_k)\big)\Big]$$

where k denotes the k-th first sample in the first training sample set, y_k represents the positive/negative label of the k-th first sample, s_k represents the text of the entity pair in the k-th first sample, r_k represents the feature vector of the candidate relation in the k-th first sample, and p(s_k, r_k) represents the matching model between the text of the entity pair and the candidate relation; p(s_k, r_k) is used to determine whether the candidate relation is correct, and its value lies between 0 and 1; training the loss function of the relation extraction model based on a gradient descent method to obtain the trained relation extraction model;
performing learning and training on the question-answering system model according to a second training sample set to obtain the trained question-answering system model, including: obtaining a second training sample set comprising a plurality of second samples, wherein each second sample comprises a second sample entity pair, a context of the second sample entity pair, a score for the start position of the tail entity in the context, and a score for the end position of the tail entity in the context; calculating a loss function of the question-answering system model according to the second sample entity pair of each second sample in the second training sample set, the context of the second sample entity pair, and the scores for the start and end positions of the tail entity in the context, wherein the loss function formula of the question-answering system model is as follows:
$$\mathcal{L}_{\mathrm{QA}} = \sum_{q \in Q}\Big[\mathcal{L}_{\mathrm{ans}}(q) + \mathcal{L}_{\mathrm{tail}}(q)\Big]$$

wherein Q represents the second training sample set of the question-answering system model, q ∈ Q denotes that a second sample q belongs to the training sample set Q, L_ans(q) represents the loss function for whether the second sample q can be answered, and L_tail(q) represents the loss function of the question-answering system model with the tail entity as the correct answer, the two being given by the following formulas:

$$\mathcal{L}_{\mathrm{ans}}(q) = -\Big[\delta \log P_{\mathrm{ans}} + (1-\delta)\log\big(1 - P_{\mathrm{ans}}\big)\Big]$$

$$\mathcal{L}_{\mathrm{tail}}(q) = -\Big[\log P^{s}_{l_s} + \log P^{e}_{l_e}\Big]$$

wherein δ takes the value 0 or 1 and indicates whether the question in the second sample q is answerable, P_ans represents the answerability score of the question in the second sample q, l_s denotes the start position of the tail entity in the context, l_e denotes the end position of the tail entity in the context, P^s_{l_s} represents the score that the start position of the correct answer in the context is l_s, and P^e_{l_e} represents the score that the end position of the correct answer in the context is l_e; training the loss function of the question-answering system model based on a gradient descent method to obtain the trained question-answering system model;
acquiring a target text and a relationship set of a target entity pair, wherein the target entity pair is a target entity pair to be predicted and uploaded by a user through a client, a browser client or an instant messaging client installed in computer equipment, the target text is a text containing the target entity pair, and the relationship set is a plurality of preset relationships;
predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model;
selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set;
inputting the candidate relations into a trained question-answering system model for processing to obtain a question-answer score corresponding to each of the candidate relations, the process comprising: automatically constructing, for each of the candidate relations, a natural-language question based on the combination of the head entity of the target entity pair and that candidate relation; and predicting, through the trained question-answering system model and based on the constructed question and the target text of the target entity pair, whether the tail entity of the target entity pair is the answer matching the question, so as to obtain the question-answer score corresponding to each of the candidate relations;
updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations so as to obtain a first updated score corresponding to each candidate relation in the candidate relations;
obtaining initial scores corresponding to the rest unselected relations except the candidate relation in the relation set;
updating the initial scores corresponding to the rest of the unselected relationships based on a second preset formula to obtain second updated scores corresponding to the rest of the unselected relationships;
predicting the semantic relation of the target entity pair in the target text according to the first updated score and the second updated score, specifically: ranking all the relations in the relation set according to the first updated score and the second updated score, and selecting, from the ranked relations in the relation set, the relation whose score is greater than a given threshold as the semantic relation of the target entity pair in the target text.
2. The method for extracting textual relationships according to claim 1, wherein selecting candidate relationships from the set of relationships according to the initial score corresponding to each relationship in the set of relationships comprises:
selecting, from the relation set, the relations corresponding to the top α% with the highest initial scores and the bottom β% with the lowest initial scores as candidate relations, wherein α and β are both natural numbers between 0 and 100.
3. The method for extracting textual relationships according to claim 1, wherein selecting candidate relationships from the set of relationships according to the initial score corresponding to each relationship in the set of relationships comprises:
and selecting the first k relations with the highest initial scores from the relation set as candidate relations, wherein k is a positive integer larger than 0.
4. The method of extracting textual relationships according to claim 1, wherein updating the scores of all the candidate relationships in the candidate relationships according to the initial score and question-answer score corresponding to each of the candidate relationships to obtain a first updated score corresponding to each of the candidate relationships, comprises:
and updating the initial score and the question-answer score corresponding to each candidate relation in the candidate relations based on a first preset formula so as to obtain a first updated score corresponding to each candidate relation in the candidate relations.
5. A text relation extraction apparatus, characterized in that the apparatus comprises:
the first training unit is configured to perform learning and training on the relation extraction model according to a first training sample set to obtain the trained relation extraction model, including: obtaining a first training sample set comprising a plurality of first samples, wherein each first sample comprises a positive/negative label, a first sample entity pair, and the text and a candidate relation of the first sample entity pair; calculating a loss function of the relation extraction model according to the positive/negative label of each first sample in the first training sample set, the text of the first sample entity pair, and the candidate relation, wherein the loss function formula of the relation extraction model is as follows:
$$\mathcal{L}_{\mathrm{RE}} = -\sum_{k}\Big[\,y_k \log p(s_k, r_k) + (1-y_k)\log\big(1 - p(s_k, r_k)\big)\Big]$$

where k denotes the k-th first sample in the first training sample set, y_k represents the positive/negative label of the k-th first sample, s_k represents the text of the entity pair in the k-th first sample, r_k represents the feature vector of the candidate relation in the k-th first sample, and p(s_k, r_k) represents the matching model between the text of the entity pair and the candidate relation; p(s_k, r_k) is used to determine whether the candidate relation is correct, and its value lies between 0 and 1; training the loss function of the relation extraction model based on a gradient descent method to obtain the trained relation extraction model;
the second training unit is configured to perform learning and training on the question-answering system model according to a second training sample set to obtain the trained question-answering system model, including: obtaining a second training sample set comprising a plurality of second samples, wherein each second sample comprises a second sample entity pair, a context of the second sample entity pair, a score for the start position of the tail entity in the context, and a score for the end position of the tail entity in the context; calculating a loss function of the question-answering system model according to the second sample entity pair of each second sample in the second training sample set, the context of the second sample entity pair, and the scores for the start and end positions of the tail entity in the context, wherein the loss function formula of the question-answering system model is as follows:
$$\mathcal{L}_{\mathrm{QA}} = \sum_{q \in Q}\Big[\mathcal{L}_{\mathrm{ans}}(q) + \mathcal{L}_{\mathrm{tail}}(q)\Big]$$

wherein Q represents the second training sample set of the question-answering system model, q ∈ Q denotes that a second sample q belongs to the training sample set Q, L_ans(q) represents the loss function for whether the second sample q can be answered, and L_tail(q) represents the loss function of the question-answering system model with the tail entity as the correct answer, the two being given by the following formulas:

$$\mathcal{L}_{\mathrm{ans}}(q) = -\Big[\delta \log P_{\mathrm{ans}} + (1-\delta)\log\big(1 - P_{\mathrm{ans}}\big)\Big]$$

$$\mathcal{L}_{\mathrm{tail}}(q) = -\Big[\log P^{s}_{l_s} + \log P^{e}_{l_e}\Big]$$

wherein δ takes the value 0 or 1 and indicates whether the question in the second sample q is answerable, P_ans represents the answerability score of the question in the second sample q, l_s denotes the start position of the tail entity in the context, l_e denotes the end position of the tail entity in the context, P^s_{l_s} represents the score that the start position of the correct answer in the context is l_s, and P^e_{l_e} represents the score that the end position of the correct answer in the context is l_e; training the loss function of the question-answering system model based on a gradient descent method to obtain the trained question-answering system model;
the system comprises an acquisition unit, a prediction unit and a prediction unit, wherein the acquisition unit is used for acquiring a target text and a relationship set of a target entity pair, the target entity pair is a target entity pair to be predicted and uploaded by a user through a client, a browser client or an instant messaging client installed in computer equipment, the target text is a text containing the target entity pair, and the relationship set is a plurality of preset relationships;
the first prediction unit is used for predicting an initial score corresponding to each relation in the relation set through a trained relation extraction model;
the selecting unit is used for selecting a candidate relation from the relation set according to the initial score corresponding to each relation in the relation set;
the calculating unit is configured to input the candidate relations into a trained question-answering system model for processing to obtain a question-answer score corresponding to each of the candidate relations, including: automatically constructing, for each of the candidate relations, a natural-language question based on the combination of the head entity of the target entity pair and that candidate relation; and predicting, through the trained question-answering system model and based on the constructed question and the target text of the target entity pair, whether the tail entity of the target entity pair is the answer matching the question, so as to obtain the question-answer score corresponding to each of the candidate relations;
the updating unit is used for updating the scores of all the candidate relations in the candidate relations according to the initial score and the question-answer score corresponding to each candidate relation in the candidate relations so as to obtain a first updated score corresponding to each candidate relation in the candidate relations; obtaining initial scores corresponding to other unselected relations except the candidate relation in the relation set, and updating the initial scores corresponding to the other unselected relations based on a second preset formula to obtain second updated scores corresponding to the other unselected relations;
a second prediction unit, configured to predict, according to the first updated score and the second updated score, a semantic relationship of the target entity pair in the target text, specifically: and according to the first updated score and the second updated score, performing score sorting on all the relationships in the relationship set, and selecting the relationship with the score larger than a given threshold value from all the relationships in the relationship set after the score sorting as the semantic relationship of the target entity pair in the target text.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded by a processor for performing the steps of the text relation extraction method according to any one of claims 1-4.
7. A computer device, characterized in that the computer device comprises a processor and a memory, wherein the memory stores a computer program, and the processor is used for executing the steps in the text relation extraction method according to any one of claims 1-4 by calling the computer program stored in the memory.
CN202110569523.8A 2021-05-25 2021-05-25 Text relation extraction method and device, storage medium and computer equipment Active CN113033209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110569523.8A CN113033209B (en) 2021-05-25 2021-05-25 Text relation extraction method and device, storage medium and computer equipment


Publications (2)

Publication Number Publication Date
CN113033209A CN113033209A (en) 2021-06-25
CN113033209B true CN113033209B (en) 2021-09-17

Family

ID=76455847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110569523.8A Active CN113033209B (en) 2021-05-25 2021-05-25 Text relation extraction method and device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN113033209B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114880551B (en) * 2022-04-12 2023-05-02 北京三快在线科技有限公司 Method and device for acquiring upper and lower relationship, electronic equipment and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN110941962A (en) * 2019-11-26 2020-03-31 中国科学院自动化研究所 Answer sentence selection method and device based on graph network
CN111598118A (en) * 2019-12-10 2020-08-28 中山大学 Visual question-answering task implementation method and system
CN111651569A (en) * 2020-04-24 2020-09-11 中国电力科学研究院有限公司 Knowledge base question-answering method and system in electric power field

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN111159425B (en) * 2019-12-30 2023-02-10 浙江大学 Temporal knowledge graph representation method based on historical relationship and double-graph convolution network

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN110941962A (en) * 2019-11-26 2020-03-31 中国科学院自动化研究所 Answer sentence selection method and device based on graph network
CN111598118A (en) * 2019-12-10 2020-08-28 中山大学 Visual question-answering task implementation method and system
CN111651569A (en) * 2020-04-24 2020-09-11 中国电力科学研究院有限公司 Knowledge base question-answering method and system in electric power field

Non-Patent Citations (1)

Title
A Question-answering Based Framework for Relation Extraction Validation; Jiayang Cheng et al.; arXiv; 2021-04-07; pp. 1-10 *


Similar Documents

Publication Publication Date Title
CN110569361B (en) Text recognition method and equipment
US20220180882A1 (en) Training method and device for audio separation network, audio separation method and device, and medium
CN110569377B (en) Media file processing method and device
CN110162593B (en) Search result processing and similarity model training method and device
CN110597943B (en) Interest point processing method and device based on artificial intelligence and electronic equipment
CN111602147A (en) Machine learning model based on non-local neural network
CN111401558A (en) Data processing model training method, data processing device and electronic equipment
CN110597962B (en) Search result display method and device, medium and electronic equipment
CN112749749B (en) Classification decision tree model-based classification method and device and electronic equipment
CN110597963A (en) Expression question-answer library construction method, expression search method, device and storage medium
CN113420128B (en) Text matching method and device, storage medium and computer equipment
CN111026858A (en) Project information processing method and device based on project recommendation model
CN110929806B (en) Picture processing method and device based on artificial intelligence and electronic equipment
US11822895B1 (en) Passive user authentication
WO2021155691A1 (en) User portrait generating method and apparatus, storage medium, and device
CN112529101B (en) Classification model training method and device, electronic equipment and storage medium
CN113127652A (en) Abstract acquisition method, device and computer readable storage medium
CN115130711A (en) Data processing method and device, computer and readable storage medium
CN113569111B (en) Object attribute identification method and device, storage medium and computer equipment
CN113033209B (en) Text relation extraction method and device, storage medium and computer equipment
CN111008213A (en) Method and apparatus for generating language conversion model
CN115221294A (en) Dialogue processing method, dialogue processing device, electronic equipment and storage medium
CN114547658A (en) Data processing method, device, equipment and computer readable storage medium
CN115840796A (en) Event integration method, device, equipment and computer readable storage medium
CN112861009A (en) Artificial intelligence based media account recommendation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40047310

Country of ref document: HK