CN112417878A - Entity relationship extraction method, system, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112417878A
CN112417878A
Authority
CN
China
Prior art keywords
entity
vector
inputting
obtaining
node
Prior art date
Legal status
Granted
Application number
CN202011330157.2A
Other languages
Chinese (zh)
Other versions
CN112417878B (en)
Inventor
郑悦
蔡怡蕾
景艳山
Current Assignee
Beijing Minglue Zhaohui Technology Co Ltd
Original Assignee
Beijing Minglue Zhaohui Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Minglue Zhaohui Technology Co Ltd
Priority to CN202011330157.2A
Publication of CN112417878A
Application granted
Publication of CN112417878B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/253Grammatical analysis; Style critique
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The application relates to an entity relationship extraction method, system, electronic device, and storage medium. The method comprises the following steps: performing a word segmentation operation on an input original text and outputting the participles; inputting the participles into an Embedding layer to obtain the initial vector representation of each participle; inputting the initial vector representations into a Bi-LSTM layer to obtain two groups of entity vector representations, forward and backward; inputting the initial vector representations into a Tree-based-LSTM layer to obtain the node vectors of a syntax tree structure; inputting the two groups of entity vector representations into an Attention layer to obtain a first state vector, and inputting the first state vector into a Q-Learning network to obtain the probability that a relationship exists between the entities represented by the two entity vectors; inputting the node vectors into a softmax layer to obtain a second state vector, inputting the second state vector into the Q-Learning network, and outputting the entity relationship in combination with the probability that a relationship exists between the entities. By taking the dependency between named entity recognition and relation extraction into account, the method solves the problem of error propagation in the entity relationship extraction process and reduces the error rate of entity relationship extraction.

Description

Entity relationship extraction method, system, electronic equipment and storage medium
Technical Field
The present application relates to the field of natural language processing, and in particular, to a method, a system, an electronic device, and a storage medium for extracting entity relationships.
Background
In recent years, with the development of deep learning, many leading-edge techniques have been applied to natural language processing. The dialogue robot is one branch of natural language processing; vertical-domain dialogue robots in particular, such as the AliMe assistant in Taobao shopping, have been widely used in industry. Entity and Relationship Extraction (ERE) is an essential module in a dialogue robot, responsible for extracting important information from the user's input. ERE is a cascaded task that can be divided into two subtasks: named entity recognition and relationship extraction.
Named Entity Recognition (NER) is an important module in the dialogue robot, responsible for recognizing keywords in the user's input and extracting from them the information required by downstream tasks. Suppose the current dialogue robot is built for the digital-products domain and consider the sentence "I want to buy iPhone 11", where "I" is identified as a person and "iPhone 11" as a phone model. The system can thus infer that the user's intent is to buy a phone, and can then retrieve relevant information (e.g., price, official links) from the knowledge graph based on the identified entities.
Relationship Extraction (RE) extracts the relationships between entities on top of named entity recognition, such as the relationship "buy" between "I" and "iPhone 11" in the example sentence above. In compound sentences this helps the system understand how two entities relate.
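To make the two cascaded subtasks concrete, the example above can be sketched as a toy pipeline. The lexicon, tag names, and verb-based relation rule below are illustrative assumptions for the example sentence only, not part of the method proposed in this application.

```python
# Toy sketch of the two cascaded ERE subtasks on the example sentence.
# ENTITY_LEXICON and RELATION_VERBS are invented for illustration.

ENTITY_LEXICON = {"I": "PERSON", "iPhone 11": "PHONE_MODEL"}
RELATION_VERBS = {"buy": "buy"}

def recognize_entities(sentence):
    """Subtask 1: named entity recognition via a simple lexicon lookup."""
    entities = []
    for surface, label in ENTITY_LEXICON.items():
        pos = sentence.find(surface)
        if pos != -1:
            entities.append((surface, label, pos))
    return sorted(entities, key=lambda e: e[2])

def extract_relation(sentence, entities):
    """Subtask 2: relation extraction via a verb appearing between two entities."""
    if len(entities) < 2:
        return None
    (e1, _, p1), (e2, _, p2) = entities[0], entities[-1]
    between = sentence[p1 + len(e1):p2]
    for verb, relation in RELATION_VERBS.items():
        if verb in between:
            return (e1, relation, e2)
    return None

sentence = "I want to buy iPhone 11"
ents = recognize_entities(sentence)
print(ents)                              # entities with labels and offsets
print(extract_relation(sentence, ents))  # ('I', 'buy', 'iPhone 11')
```

The second stage only works if the first stage found the right entities, which is exactly the error-propagation problem discussed below.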
From the above description, the accuracy of the ERE module has a significant impact on subsequent operations.
At present, the prior art still performs named entity recognition first and relationship extraction afterwards. This approach is flexible and easy to implement, but it has drawbacks: when errors occur during named entity recognition, they propagate to the relationship extraction part and degrade its performance; because relationship extraction requires pairing all named entities pairwise, many useless entity pairs are predicted, which raises the error rate; template matching can only be implemented by manually adding templates, so it is inflexible and generalizes poorly; and separating named entity recognition from relationship extraction causes the overall system to ignore the inherent connections and dependencies between the two.
At present, no effective solution has been proposed to the above problems in the related art.
Disclosure of Invention
The embodiment of the application provides an entity relationship extraction method, an entity relationship extraction system, electronic equipment and a computer readable storage medium.
In a first aspect, an embodiment of the present application provides an entity relationship extraction method, comprising the following steps:
a word segmentation obtaining step, performing a word segmentation operation on the input original text and outputting the participles;
an initial vector representation obtaining step, inputting the participles into an Embedding layer to obtain the initial vector representation corresponding to each participle;
an entity vector representation obtaining step, inputting the initial vector representations into a Bi-LSTM layer to obtain two groups of entity vector representations, forward and backward;
a node vector obtaining step, inputting the initial vector representations into a Tree-based-LSTM layer to obtain the node vectors of a syntax tree structure;
an entity relation probability obtaining step, inputting the two groups of entity vector representations into an Attention layer to obtain a first state vector, and inputting the first state vector into a Q-Learning network to obtain the probability that a relationship exists between the entities represented by the two entity vectors;
and an entity relationship obtaining step, inputting the node vectors into a softmax layer to obtain a second state vector, inputting the second state vector into the Q-Learning network, and outputting the entity relationship in combination with the probability that a relationship exists between the entities.
In some embodiments, the step of obtaining the entity existence relationship probability specifically includes:
a step of outputting a weight matrix, which is to obtain the weight of each participle through linear transformation and output the weight matrix;
and a step of obtaining a first state vector, wherein the first state vector is obtained by multiplying the weight matrix by a matrix formed by the entity vector representation.
In some embodiments, the obtaining an entity vector representation step further comprises the following steps:
acquiring an operation result, namely multiplying the two groups of entity vector representations by corresponding weight matrixes, inputting the multiplied entity vector representations to a softmax layer, and outputting the operation result;
and a named entity obtaining step, obtaining the named entities from the operation result by using the Viterbi algorithm.
In some embodiments, the step of obtaining the node vector specifically includes:
a syntax tree structure obtaining step, receiving the initial vector representation and outputting the syntax tree structure of the original text by using a syntactic analysis tool;
and a step of obtaining Tree node vectors, wherein the syntax Tree structure is input to the Tree-based-LSTM layer to obtain the node vectors of each node in the syntax Tree structure.
In a second aspect, an embodiment of the present application provides an entity relationship extraction system, comprising:
a word segmentation obtaining module, which performs a word segmentation operation on the input original text and outputs the participles;
an initial vector representation obtaining module, which inputs the participles into an Embedding layer to obtain the initial vector representation corresponding to each participle;
an entity vector representation obtaining module, which inputs the initial vector representations into a Bi-LSTM layer to obtain two groups of entity vector representations, forward and backward;
a node vector obtaining module, which inputs the initial vector representations into a Tree-based-LSTM layer to obtain the node vectors of a syntax tree structure;
an entity relation probability obtaining module, which inputs the two groups of entity vector representations into an Attention layer to obtain a first state vector, and inputs the first state vector into a Q-Learning network to obtain the probability that a relationship exists between the entities represented by the two entity vectors;
and an entity relationship obtaining module, which inputs the node vectors into a softmax layer to obtain a second state vector, inputs the second state vector into the Q-Learning network, and outputs the entity relationship in combination with the probability that a relationship exists between the entities.
In some embodiments, the module for obtaining the entity existence relationship probability specifically includes:
the output weight matrix unit is used for obtaining the weight of each participle through linear transformation and outputting a weight matrix;
and a first state vector unit is obtained, and the first state vector is obtained by multiplying the weight matrix by a matrix formed by the entity vector representation.
In some embodiments, the entity vector representation obtaining module is connected to an operation result obtaining module and a named entity obtaining module, wherein:
the operation result obtaining module is used for multiplying the two groups of entity vector representations by the corresponding weight matrixes, inputting the multiplied entity vector representations to the softmax layer and outputting operation results;
and the acquisition named entity module receives and obtains a named entity by utilizing a Viterbi algorithm according to the operation result.
In some embodiments, the obtaining node vector module specifically includes:
a syntax tree structure obtaining unit, which receives the initial vector representations of the participles and outputs the syntax tree structure of the original text by using a syntactic analysis tool;
and a Tree node vector obtaining unit which inputs the syntax Tree structure to the Tree-based-LSTM layer to obtain a node vector of each node in the syntax Tree structure.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the entity relationship extraction method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the entity relationship extraction method according to the first aspect is implemented.
Compared with the related art, the entity relationship extraction method provided by the embodiments of the application takes the dependency between named entity recognition and relationship extraction into account, thereby solving the problem of error propagation in the entity relationship extraction process and reducing the error rate of entity relationship extraction.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of an entity relationship extraction method according to an embodiment of the present application;
FIG. 2 is a diagram showing a structure of a Bi-LSTM network in the embodiment of the present application;
FIG. 3 is a diagram of a Tree-based-LSTM network according to an embodiment of the present application;
FIG. 4 is a flowchart of the steps of obtaining entity relationship probability according to an embodiment of the present application;
FIG. 5 is a flowchart of a step of obtaining node vectors according to an embodiment of the present application;
FIG. 6 is a flow chart of an entity relationship extraction method in accordance with the preferred embodiment of the present application;
FIG. 7 is a block diagram of an entity relationship extraction system according to an embodiment of the present application;
fig. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Description of the drawings:
1. acquiring a word segmentation module; 2. acquiring an initial vector representation module;
3. obtaining an entity vector representation module; 4. obtaining a node vector module;
5. obtaining an entity relation probability module; 6. acquiring an entity relationship module;
51. an output weight matrix unit; 52. obtaining a first state vector unit;
7. an operation result obtaining module; 8. acquiring a named entity module;
41. acquiring a syntax tree structure unit; 42. obtaining a tree node vector unit;
91. a processor; 92. a memory; 93. a communication interface; 90. a bus.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments that can be derived by a person skilled in the art from the embodiments provided herein without making any creative effort belong to the protection scope of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words in this application do not denote a limitation of quantity and may refer to the singular or the plural. As used in this application, the terms "including," "comprising," "having," and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a list of steps or modules (units) is not limited to the listed steps or units, but may include other steps or units not expressly listed or inherent to such process, method, product, or device. References to "connected," "coupled," and the like in this application are not limited to physical or mechanical connections, and may include electrical connections, whether direct or indirect. The term "plurality" as used herein means two or more. "And/or" describes an association relationship between associated objects, meaning that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the preceding and following associated objects. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering.
In the prior art, named entity recognition is performed first, followed by relationship extraction. The techniques and specific steps used in the two parts are described below.
Named entity recognition belongs to the sequence tagging problems in natural language processing and is mainly divided into two parts.
The first part is a semantic encoder, responsible for converting the text input of a sentence into context-dependent vector representations; techniques such as Word2Vec, LSTM, GRU, and BERT may be used here. The specific steps are as follows:
(1) segmenting the sentences;
(2) converting each participle in the sentence into an original vector by using a trained model such as Word2Vec;
(3) inputting the vectors obtained in step (2) into an encoder (LSTM/GRU/BERT, etc.) to obtain the vector representation of each word.
the second part is a tag decoder. The decoder is responsible for converting the vector into a target output, named entity identification in the present proposal, and the techniques possibly used in this section are LSTM, GRU, CRF, pointer network, etc., and the steps are inputting the result of the encoder into the decoder to obtain the required tag sequence.
Relation extraction currently relies mainly on template matching. Template matching can solve the relation extraction problem with high precision within a very narrow scope, but it can handle neither the diversity of language in a dialogue robot nor the breadth of the domains involved. The specific steps are as follows:
(1) obtaining the entities produced by the named entity recognition step and their positions in the sentence;
(2) inputting the information from step (1) into templates to obtain the relationships between the entities.
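A minimal template matcher of the kind described, assuming a single hand-written regular-expression template; the pattern and relation name are illustrative. Note how the second sentence, a mere paraphrase, already falls outside the template.

```python
import re

# One hand-written template; pattern and relation name are illustrative.
TEMPLATES = [
    (re.compile(r"(?P<e1>\w+) (?:want to )?buy (?P<e2>[\w ]+)"), "buy"),
]

def match_relation(sentence):
    """Return (entity1, relation, entity2) if any template matches, else None."""
    for pattern, relation in TEMPLATES:
        m = pattern.search(sentence)
        if m:
            return (m.group("e1"), relation, m.group("e2"))
    return None

print(match_relation("I want to buy iPhone 11"))            # ('I', 'buy', 'iPhone 11')
print(match_relation("I am thinking of getting iPhone 11"))  # None: no template covers it
```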
However, the prior art still has the following problems:
(1) accumulation of prediction errors: if an entity-first, relation-second scheme is adopted, errors in named entity recognition propagate to the relation extraction part and harm its performance;
(2) relation extraction over redundant entities: all named entities must be paired pairwise, so many useless entity pairs are predicted, which raises the error rate;
(3) template matching can only be implemented by manually adding templates, so it is inflexible and generalizes poorly;
(4) separating named entity recognition from relation extraction causes the overall system to ignore the inherent connections and dependencies between the two.
The entity relationship extraction method of the present application is applied in the dialogue robot, after the preprocessing module and before the dialogue management module: its input is the preprocessed sentence, i.e., the original text, and its output is the recognized entities and the relationships between them.
Example 1
The embodiment provides an entity relationship extraction method. Fig. 1 is a flowchart of an entity relationship extraction method according to an embodiment of the present application, and as shown in fig. 1, the flowchart includes the following steps:
a word segmentation obtaining step S1, performing a word segmentation operation on the input original text and outputting the participles;
an initial vector representation obtaining step S2, inputting the participles into an Embedding layer to obtain the initial vector representation of each participle;
an entity vector representation obtaining step S3, inputting the initial vector representations into a Bi-LSTM layer to obtain two groups of entity vector representations, forward and backward;
a node vector obtaining step S4, inputting the initial vector representations into the Tree-based-LSTM layer to obtain the node vectors of a syntax tree structure;
an entity relation probability obtaining step S5, inputting the two groups of entity vector representations into the Attention layer to obtain a first state vector, and inputting the first state vector into the Q-Learning network to obtain the probability that a relationship exists between the entities represented by the two entity vectors;
and an entity relationship obtaining step S6, inputting the node vectors into the softmax layer to obtain a second state vector, inputting the second state vector into the Q-Learning network, and outputting the entity relationship in combination with the probability that a relationship exists between the entities.
Through the above steps, named entity recognition and relationship extraction are performed within one algorithm and influence each other. Compared with handling the two subtasks separately, this method takes the dependency between them into account, which is especially valuable for a target task with high precision requirements such as the dialogue robot. Meanwhile, the Tree-based LSTM is used for relationship extraction, which is more general and more effective than template-based methods. Reinforcement learning connects the two steps and continuously updates the model parameters of both: the named entity part is optimized with the relation extraction objective in mind, while relation extraction is optimized in view of the precision of the named entities.
Several neural network techniques are used in the above steps: Bi-directional Long Short-Term Memory (Bi-LSTM), Attention, Tree-based LSTM, and a Q-Learning network for deep reinforcement learning.
Fig. 2 is a structure diagram of the Bi-LSTM network in the embodiment of the present application. As shown in Fig. 2, the Bi-LSTM consists of two LSTM layers running in opposite directions and is responsible for analyzing the preprocessed user input: the input is first fed into the Embedding layer to obtain a representation of each participle, and then into the Bi-LSTM layer to obtain context-dependent vector representations.
Attention is a commonly used technique in natural language processing. Its essence is to obtain a weight for each word through a linear transformation; in the current task, the resulting weight matrix is multiplied by the matrix composed of the per-word vectors produced by the Bi-LSTM, yielding the first state vector for the subsequent reinforcement learning.
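The attention computation described above can be sketched as follows; the word vectors and the scoring vector are toy values, and the linear transformation is reduced to a single dot product for brevity.

```python
import math

def attention_state(word_vectors, weight_vector):
    """Score each word vector against a weight vector (fixed here for
    illustration), softmax the scores into attention weights, and return
    the weighted sum as the state vector."""
    scores = [sum(a * b for a, b in zip(v, weight_vector)) for v in word_vectors]
    exp = [math.exp(s) for s in scores]
    total = sum(exp)
    weights = [e / total for e in exp]
    dim = len(word_vectors[0])
    state = [sum(w * v[i] for w, v in zip(weights, word_vectors))
             for i in range(dim)]
    return weights, state

vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy Bi-LSTM outputs
weights, s1 = attention_state(vectors, [1.0, 0.0])
print(weights)  # attention weights, summing to 1
print(s1)       # the first state vector
```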
Fig. 3 is a structure diagram of the Tree-based LSTM network in the embodiment of the present application. As shown in Fig. 3, the Tree-based LSTM is a variant of LSTM whose essential idea is to build a tree from the results of dependency syntax analysis, semantic dependency analysis, and the like, and then feed the vectors corresponding to the different participles into this structure. Unlike an ordinary LSTM, whose input at each step is the output of the neighboring step, the input of a Tree-based LSTM node is the representations of all its children, i.e., the input of the current node is the vector representations of all child participles. Because the relationships need to be classified, the input also includes embeddings of the participle positions and embeddings of the entity types. In addition, the vector representation of the root node of the constructed tree is taken as the second state vector for the subsequent reinforcement learning.
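A much-simplified sketch of the child-sum composition: each node combines its own word vector with the vectors of all its children. The gating of a real Tree-LSTM, as well as the position and entity-type embeddings mentioned above, are omitted, and the toy tree and numbers are invented.

```python
import math

def node_vector(word_vec, children_vecs):
    """Compose a node's vector from its word vector plus all children's
    vectors, squashed with tanh (gates of a real Tree-LSTM omitted)."""
    dim = len(word_vec)
    summed = list(word_vec)
    for child in children_vecs:        # the node's input is all its children
        for i in range(dim):
            summed[i] += child[i]
    return [math.tanh(x) for x in summed]

# Toy dependency tree for "buy(I, iPhone)": the root "buy" has two children.
v_i      = node_vector([0.9, 0.1], [])
v_iphone = node_vector([0.1, 0.9], [])
v_root   = node_vector([0.2, 0.8], [v_i, v_iphone])  # source of the second state vector
print(v_root)
```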
Q-Learning is a technique in deep reinforcement learning. In essence, a neural network is fitted to the labels in the training data and used to score the states in a sequence. In this proposal the technique is used in two places: first to decide whether a relationship exists between two entities, and, if one does, second to classify that relationship into one of a predefined set of candidates.
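For orientation, here is the standard tabular Q-Learning update that the scoring idea builds on; the patent applies a neural Q network to the state vectors, whereas the states, actions, and reward below are invented for the sketch.

```python
# Minimal tabular Q-Learning update; ALPHA and GAMMA are arbitrary toy values.
ALPHA, GAMMA = 0.5, 0.9

def q_update(q, state, action, reward, next_state, actions):
    """Standard rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

q = {}
actions = ["has_relation", "no_relation"]
# reward +1 when the chosen action matches the training label
q_update(q, "s1", "has_relation", 1.0, "s2", actions)
print(q[("s1", "has_relation")])  # 0.5
```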
Among these techniques, the Bi-LSTM is mainly used for named entity recognition, the Tree-based-LSTM for relationship extraction, and Q-Learning for connecting the two steps. Once the named entities and the pairwise relationships between them have been obtained, they are input into the dialogue management module.
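This division of labor can be sketched as a skeleton pipeline, with trivial stubs standing in for the trained Bi-LSTM, Tree-based-LSTM, and Q-Learning components; the heuristics inside the stubs are placeholders only.

```python
# Skeleton of the overall pipeline with placeholder components.

def bi_lstm_ner(tokens):
    """Stub for Bi-LSTM + Viterbi: returns the recognized entities.
    The uppercase heuristic is a placeholder, not the real model."""
    return [t for t in tokens if any(c.isupper() for c in t)]

def tree_lstm_relation(entity_pair):
    """Stub for Tree-based-LSTM + Q-Learning: returns a relation label."""
    return "buy" if entity_pair == ("I", "iPhone") else "unknown"

def extract(tokens):
    """Pair all recognized entities and label each pair with a relation."""
    entities = bi_lstm_ner(tokens)
    pairs = [(a, b) for i, a in enumerate(entities)
                    for b in entities[i + 1:]]
    return [(a, tree_lstm_relation((a, b)), b) for a, b in pairs]

print(extract(["I", "want", "to", "buy", "iPhone"]))
# [('I', 'buy', 'iPhone')]
```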
Fig. 4 is a flowchart of the step of obtaining the entity presence relationship probability according to the embodiment of the present application, and as shown in fig. 4, in some embodiments, the step of obtaining the entity presence relationship probability S5 specifically includes:
a step S51 of outputting a weight matrix, wherein the weight of each participle is obtained through linear transformation and the weight matrix is output;
a first state vector obtaining step S52, multiplying the weight matrix by the matrix composed of the entity vector representations to obtain the first state vector.
In some embodiments, the obtaining the entity vector representation step S3 further includes the following steps:
an operation result obtaining step S7, wherein two groups of entity vector representations are multiplied by corresponding weight matrixes and then input to a softmax layer, and an operation result is output;
and a named entity obtaining step S8, obtaining the named entities from the operation result by using the Viterbi algorithm.
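A standard Viterbi decoder of the kind used in step S8, operating on toy emission and transition scores; the tag set and all numbers are illustrative.

```python
# Viterbi decoding over per-position tag scores (e.g., softmax outputs).

def viterbi(emissions, transitions, tags):
    """emissions: list of dicts tag->score, one per position;
    transitions: dict (prev_tag, cur_tag)->score.
    Returns the highest-scoring tag sequence."""
    paths = {t: ([t], emissions[0][t]) for t in tags}
    for emit in emissions[1:]:
        new_paths = {}
        for cur in tags:
            # best predecessor for the current tag
            prev_tag, (path, score) = max(
                ((p, paths[p]) for p in tags),
                key=lambda kv: kv[1][1] + transitions[(kv[0], cur)])
            new_paths[cur] = (path + [cur],
                              score + transitions[(prev_tag, cur)] + emit[cur])
        paths = new_paths
    return max(paths.values(), key=lambda v: v[1])[0]

tags = ["O", "B"]
emissions = [{"O": 0.9, "B": 0.1}, {"O": 0.2, "B": 0.8}]
transitions = {("O", "O"): 0.1, ("O", "B"): 0.2,
               ("B", "O"): 0.1, ("B", "B"): 0.0}
print(viterbi(emissions, transitions, tags))  # ['O', 'B']
```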
Fig. 5 is a flowchart of a step of obtaining a node vector according to an embodiment of the present application, and as shown in fig. 5, in some embodiments, the step of obtaining a node vector S4 specifically includes:
a syntax tree structure obtaining step S41, receiving the initial vector representations of the participles and outputting the syntax tree structure of the original text by using a syntactic analysis tool;
and a step S42 of obtaining Tree node vectors, which is to input the syntax Tree structure into the Tree-based-LSTM layer to obtain the node vectors of each node in the syntax Tree structure.
The embodiments of the present application are described and illustrated below by means of preferred embodiments.
Fig. 6 is a flowchart of an entity relationship extraction method according to the preferred embodiment of the present application. As shown in fig. 6, the flow of the method includes the following steps:
(1) segmenting input sentences;
(2) inputting the participles obtained in the last step into an Embedding layer to obtain an initial vector representation of each participle;
(3) inputting the vector of 2) into a Bi-LSTM layer to obtain two groups of vector representations in the forward direction and the backward direction;
(4) multiplying the two groups of vector representations by the weight matrix respectively and inputting the results into a softmax layer;
(5) obtaining the label of each participle, namely the identified entity, by using a Viterbi algorithm on the result of the step 4);
(6) obtaining a syntax tree structure of the whole sentence through a syntactic analysis tool;
(7) inputting the initial vector representation of 2) into a Tree-based-LSTM layer to obtain a node vector of each node;
(8) inputting the vector representations of the two entities in the result of 3) into the Attention layer to obtain a first state vector S1;
(9) inputting S1 into the Q-Learning network to obtain a1 and a2, where a1 and a2 respectively represent that the two entities have a relationship and that the two entities have no relationship;
(10) similarly, inputting the node vectors corresponding to the two entities obtained in step 8) into a softmax layer to obtain a second state vector S2;
(11) inputting the second state vector S2 into the Q-Learning network to obtain a3, where a3 is a set of candidate relationships;
(12) repeating steps 8) to 11) ultimately yields all identified entities and the relationships between them.
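The Q-Learning that connects the two steps can be illustrated with a toy tabular version of the Bellman update. In the patent, the states S1 and S2 are continuous vectors handled by a Q network, so the discrete states, actions, and rewards below are purely illustrative assumptions:

```python
import numpy as np

def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """One Q-Learning update: Q(s,a) += alpha * (r + gamma*max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# toy setup: state 0 ~ first state vector S1 (choose a1 "has relation" / a2 "no relation"),
# state 1 ~ second state vector S2 (pick one of 3 candidate relations, a3)
Q = np.zeros((2, 3))
for _ in range(200):
    Q = q_update(Q, s=0, a=0, reward=1.0, s_next=1)   # rewarded "has relation" decision
    Q = q_update(Q, s=1, a=2, reward=1.0, s_next=1)   # rewarded relation choice
print(Q[0, 0] > Q[0, 1])                              # True
```

After the rewarded updates, the "has relation" action dominates in state 0 and the correct candidate relation dominates in state 1, which is the mechanism by which the reward signal couples entity recognition and relation extraction.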
It should be noted that the steps illustrated in the above-described flowcharts may be performed in a computer system, for example by a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps may be performed in an order different from the order shown here.
The present embodiment further provides an entity relationship extraction system, which is used to implement the foregoing embodiments and preferred embodiments; details already described are not repeated here. As used hereinafter, the terms "module," "unit," "subunit," and the like may denote a combination of software and/or hardware implementing a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
Fig. 7 is a block diagram of an entity relationship extraction system according to an embodiment of the present application. As shown in fig. 7, the system includes:
a word segmentation obtaining module 1, which performs a word segmentation operation on the input original text and outputs the participles;
an initial vector representation obtaining module 2, which inputs the participles into an Embedding layer to obtain the initial vector representation of each participle;
an entity vector representation obtaining module 3, which inputs the initial vector representations into the Bi-LSTM layer to obtain two groups of entity vector representations, forward and backward;
a node vector obtaining module 4, which inputs the initial vector representations into the Tree-based-LSTM layer to obtain the node vectors of the syntax Tree structure;
an entity existence relation probability obtaining module 5, which inputs the two groups of entity vector representations into the Attention layer to obtain a first state vector, and inputs the first state vector into the Q-Learning network to obtain the probability that a relationship exists between the entities represented by the two entity vectors;
and an entity relation obtaining module 6, which inputs the node vectors into the softmax layer to obtain a second state vector, inputs the second state vector into the Q-Learning network, and outputs the entity relationship in combination with the probability that a relationship exists between the entities.
In some embodiments, the entity existence relation probability obtaining module 5 specifically includes:
a weight matrix output unit 51, which obtains the weight of each participle through a linear transformation and outputs the weight matrix;
and a first state vector obtaining unit 52, which multiplies the weight matrix by the matrix formed by the entity vector representations to obtain the first state vector.
In some embodiments, the entity vector representation obtaining module 3 is connected to an operation result obtaining module 7 and a named entity obtaining module 8, wherein:
the operation result obtaining module 7 multiplies the two groups of entity vector representations by the corresponding weight matrices, inputs the results to the softmax layer, and outputs the operation result;
and the named entity obtaining module 8 obtains the named entity by applying a Viterbi algorithm to the operation result.
In some embodiments, the node vector obtaining module 4 specifically includes:
a syntax tree structure obtaining unit 41, which obtains the syntax tree structure of the original text by using a syntactic analysis tool according to the initial vector representations of the participles;
and a Tree node vector obtaining unit 42, which inputs the syntax Tree structure into the Tree-based-LSTM layer to obtain the node vector of each node in the syntax Tree structure.
Named entity recognition and relation extraction are performed within a single algorithm and influence each other. Compared with handling the two subtasks separately, this approach models the dependency between them, which is particularly valuable in a target task with a high precision requirement, namely a dialogue robot. Meanwhile, the Tree-LSTM is used for relation extraction; compared with template-based methods, it generalizes more widely and performs better. Reinforcement learning connects the two steps and continuously updates the model parameters of both: named entity recognition is optimized with the objective of relation extraction in mind, while relation extraction is optimized taking the precision of named entity recognition into account.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For modules implemented by hardware, the modules may be located in the same processor, or may be distributed among different processors in any combination.
In addition, the entity relationship extraction method described in conjunction with fig. 1 in the embodiment of the present application may be implemented by an electronic device. Fig. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
The electronic device may comprise a processor 91 and a memory 92 storing computer program instructions.
Specifically, the processor 91 may include a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
Memory 92 may include mass storage for data or instructions. By way of example, and not limitation, memory 92 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 92 may include removable or non-removable (or fixed) media, where appropriate. The memory 92 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 92 is a non-volatile memory. In particular embodiments, memory 92 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate. The RAM may be Static Random-Access Memory (SRAM) or Dynamic Random-Access Memory (DRAM), where the DRAM may be Fast Page Mode DRAM (FPM DRAM), Extended Data Output DRAM (EDO DRAM), Synchronous DRAM (SDRAM), and the like.
The memory 92 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions executed by the processor 91.
The processor 91 realizes any one of the entity relationship extraction methods in the above embodiments by reading and executing computer program instructions stored in the memory 92.
In some of these embodiments, the electronic device may also include a communication interface 93 and a bus 90. As shown in fig. 8, the processor 91, the memory 92, and the communication interface 93 are connected to each other via a bus 90 to complete communication therebetween.
The communication interface 93 is used for implementing communication among the modules, systems, units and/or devices in the embodiments of the present application. The communication interface 93 may also carry out data communication with other components, such as external equipment, image/data acquisition equipment, databases, external storage, image/data processing workstations, and the like.
The bus 90 comprises hardware, software, or both, coupling the components of the electronic device to one another. Bus 90 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example, and not limitation, bus 90 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. Bus 90 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device may execute the entity relationship extraction method in the embodiment of the present application based on the obtained original text, thereby implementing the entity relationship extraction method described in conjunction with fig. 1.
In addition, in combination with the entity relationship extraction method in the foregoing embodiments, the embodiments of the present application may provide a computer-readable storage medium to implement. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the entity relationship extraction methods in the above embodiments.
Example 2
The present embodiment makes the following technical changes on the basis of embodiment 1: the LSTM-based neural networks may be replaced with GRU-based neural networks.
The word vectors may instead be extracted with BERT (Bidirectional Encoder Representations from Transformers), XLNet, or similar models.
The GRU (Gated Recurrent Unit) is a variant of the LSTM; many variants can be obtained by modifying the gate mechanism of the LSTM, and the GRU is a recurrent neural network simpler than the LSTM.
The GRU network improves on the LSTM network in two aspects:
1. The forget gate and the input gate are combined into a single gate, the update gate; the other gate is called the reset gate.
2. A direct linear dependency is introduced between the current state h_t and the historical state h_{t-1}, without an additional internal state c.
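A minimal NumPy sketch of the GRU update described above, showing the combined update gate z, the reset gate r, and the direct linear dependency of h_t on h_{t-1}; parameters and dimensions are illustrative, and biases are omitted for brevity:

```python
import numpy as np

def sigmoid(v):
    return 1 / (1 + np.exp(-v))

def gru_step(x, h_prev, P):
    """GRU update: the update gate z replaces the LSTM's input/forget pair,
    the reset gate r gates the history, and h_t mixes linearly with h_{t-1}
    with no separate internal cell state c."""
    z = sigmoid(P["Wz"] @ x + P["Uz"] @ h_prev)        # update gate
    r = sigmoid(P["Wr"] @ x + P["Ur"] @ h_prev)        # reset gate
    h_tilde = np.tanh(P["Wh"] @ x + P["Uh"] @ (r * h_prev))
    return (1 - z) * h_prev + z * h_tilde              # direct linear mix with h_{t-1}

rng = np.random.default_rng(3)
d_in, d_h = 6, 4
P = {k: rng.normal(size=(d_h, d_in)) * 0.1 for k in ("Wz", "Wr", "Wh")}
P.update({k: rng.normal(size=(d_h, d_h)) * 0.1 for k in ("Uz", "Ur", "Uh")})
h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):
    h = gru_step(x, h, P)
print(h.shape)                                         # (4,)
```

With this cell substituted for the LSTM cell, the rest of the pipeline of embodiment 1 remains unchanged.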
XLNet, as an upgraded model of BERT, is mainly optimized in the following three aspects: an autoregressive (AR) model replaces the autoencoding (AE) model, which removes the negative effect brought by the mask token; a two-stream attention mechanism is used; and Transformer-XL is introduced.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present application, and the description thereof is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. An entity relationship extraction method is characterized by comprising the following steps:
a word segmentation step is obtained, and word segmentation operation is carried out on the input original text and word segmentation is output;
acquiring an initial vector representation step, inputting the participle into an Embedding layer to obtain an initial vector representation corresponding to the participle;
acquiring entity vector representation, namely inputting the initial vector representation into a Bi-LSTM layer to obtain two groups of entity vector representations in a forward direction and a backward direction;
a step of obtaining node vectors, which is to input the initial vector representation to a Tree-based-LSTM layer to obtain the node vectors;
acquiring entity existence relation probability, namely inputting two groups of entity vector representations into an Attention layer to obtain a first state vector, and inputting the first state vector into a Q-Learning network to obtain the probability of existence relation between entities represented by the two entity vector representations;
and acquiring an entity relationship, namely inputting the node vector to a softmax layer to obtain a second state vector, inputting the second state vector to the Q-Learning network, and outputting the entity relationship by combining the probability of the relationship existing between entities.
2. The entity relationship extraction method according to claim 1, wherein the step of obtaining the entity existence relationship probability specifically includes:
a step of outputting a weight matrix, which is to obtain the weight of each participle through linear transformation and output the weight matrix;
and a step of obtaining a first state vector, wherein the first state vector is obtained by multiplying the weight matrix by a matrix formed by the entity vector representation.
3. The entity relationship extraction method according to claim 1, wherein the step of obtaining the entity vector representation further comprises the following steps:
acquiring an operation result, namely multiplying the two groups of entity vector representations by corresponding weight matrixes, inputting the multiplied entity vector representations to a softmax layer, and outputting the operation result;
and acquiring a named entity, namely obtaining the named entity by using a Viterbi algorithm according to the operation result.
4. The entity relationship extraction method according to claim 1, wherein the step of obtaining the node vector specifically comprises:
a step of obtaining a syntax tree structure, which is to output the syntax tree structure of the original text by using a syntax analysis tool according to the initial vector representation;
and a step of obtaining Tree node vectors, wherein the syntax Tree structure is input to the Tree-based-LSTM layer to obtain the node vectors of each node in the syntax Tree structure.
5. An entity relationship extraction system, comprising:
the method comprises the steps of obtaining a word segmentation module, carrying out word segmentation operation on an input original text and outputting a word segmentation;
an initial vector representation obtaining module, which inputs the participles into an Embedding layer to obtain initial vector representations corresponding to the participles;
an entity vector representation obtaining module inputs the initial vector representation to a Bi-LSTM layer to obtain two groups of entity vector representations in the forward direction and the backward direction;
the node vector obtaining module is used for inputting the initial vector representation to the Tree-based-LSTM layer to obtain a node vector of a syntax Tree structure;
the entity existence relation probability obtaining module is used for inputting the two groups of entity vector representations into an Attention layer to obtain a first state vector, and inputting the first state vector into a Q-Learning network to obtain the probability of existence relation between entities correspondingly represented by the two entity vectors;
and the entity relation obtaining module is used for inputting the node vector to a softmax layer to obtain a second state vector, inputting the second state vector to the Q-Learning network, and outputting an entity relation by combining the probability of relation existing among the entities.
6. The entity relationship extraction system according to claim 5, wherein the module for obtaining the entity existence relationship probability specifically comprises:
the output weight matrix unit is used for obtaining the weight of each participle through linear transformation and outputting a weight matrix;
and a first state vector unit is obtained, and the first state vector is obtained by multiplying the weight matrix by a matrix formed by the entity vector representation.
7. The entity relationship extraction system according to claim 5, wherein the entity vector representation obtaining module is connected to an operation result obtaining module and a named entity obtaining module, wherein:
the operation result obtaining module multiplies the two groups of entity vector representations by the corresponding weight matrices, inputs the results to the softmax layer, and outputs the operation result;
and the named entity obtaining module obtains a named entity by using a Viterbi algorithm according to the operation result.
8. The entity relationship extraction system of claim 7, wherein the module for obtaining node vectors specifically comprises:
a syntax tree structure unit is obtained, and a syntax tree structure of the original text is output by using a syntax analysis tool according to the initial vector representation;
and a Tree node vector obtaining unit, which inputs the syntax Tree structure to the Tree-based-LSTM layer to obtain the node vector of each node in the syntax Tree structure.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the entity relationship extraction method as claimed in any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the entity relationship extraction method as recited in any one of claims 1 to 4.
CN202011330157.2A 2020-11-24 2020-11-24 Entity relation extraction method, system, electronic equipment and storage medium Active CN112417878B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011330157.2A CN112417878B (en) 2020-11-24 2020-11-24 Entity relation extraction method, system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112417878A true CN112417878A (en) 2021-02-26
CN112417878B CN112417878B (en) 2024-06-14

Family

ID=74778176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011330157.2A Active CN112417878B (en) 2020-11-24 2020-11-24 Entity relation extraction method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112417878B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180157643A1 (en) * 2016-12-06 2018-06-07 Siemens Aktiengesellschaft Device and method for natural language processing
CN110196978A (en) * 2019-06-04 2019-09-03 重庆大学 A kind of entity relation extraction method for paying close attention to conjunctive word
CN110210019A (en) * 2019-05-21 2019-09-06 四川大学 A kind of event argument abstracting method based on recurrent neural network
CN111126067A (en) * 2019-12-23 2020-05-08 北大方正集团有限公司 Entity relationship extraction method and device
CN111339774A (en) * 2020-02-07 2020-06-26 腾讯科技(深圳)有限公司 Text entity relation extraction method and model training method
CN111353306A (en) * 2020-02-22 2020-06-30 杭州电子科技大学 Entity relationship and dependency Tree-LSTM-based combined event extraction method


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BING LIU et al.: "Incorporating Human Knowledge in Neural Relation Extraction with Reinforcement Learning", 2019 International Joint Conference on Neural Networks (IJCNN), 30 September 2019 (2019-09-30) *
XIN ZHOU et al.: "Joint Entity and Relation Extraction Based on Reinforcement Learning", IEEE, 2 September 2019 (2019-09-02) *
WANG Chuandong; XU Jiao; ZHANG Yong: "A Survey of Entity Relation Extraction", Computer Engineering and Applications, no. 12, 31 May 2020 (2020-05-31) *
AI Xin: "Research on Joint Extraction of Entities and Relations Based on Deep Learning", Modern Computer, no. 06, 25 February 2020 (2020-02-25) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966517A (en) * 2021-04-30 2021-06-15 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN112966517B (en) * 2021-04-30 2022-02-18 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN113158679A (en) * 2021-05-20 2021-07-23 广东工业大学 Marine industry entity identification method and device based on multi-feature superposition capsule network
CN113408289A (en) * 2021-06-29 2021-09-17 广东工业大学 Multi-feature fusion supply chain management entity knowledge extraction method and system
CN113408289B (en) * 2021-06-29 2024-04-16 广东工业大学 Multi-feature fusion supply chain management entity knowledge extraction method and system

Also Published As

Publication number Publication date
CN112417878B (en) 2024-06-14

Similar Documents

Publication Publication Date Title
CN109992782B (en) Legal document named entity identification method and device and computer equipment
JP5128629B2 (en) Part-of-speech tagging system, part-of-speech tagging model training apparatus and method
CN111460820B (en) Network space security domain named entity recognition method and device based on pre-training model BERT
JP2020520492A (en) Document abstract automatic extraction method, device, computer device and storage medium
CN112906392B (en) Text enhancement method, text classification method and related device
CN112417878A (en) Entity relationship extraction method, system, electronic equipment and storage medium
CN111967264B (en) Named entity identification method
CN111274797A (en) Intention recognition method, device and equipment for terminal and storage medium
CN112101031B (en) Entity identification method, terminal equipment and storage medium
CN112560506B (en) Text semantic analysis method, device, terminal equipment and storage medium
CN112232070A (en) Natural language processing model construction method, system, electronic device and storage medium
CN112668333A (en) Named entity recognition method and device, and computer-readable storage medium
CN113486178A (en) Text recognition model training method, text recognition device and medium
CN113986950A (en) SQL statement processing method, device, equipment and storage medium
TWI752406B (en) Speech recognition method, speech recognition device, electronic equipment, computer-readable storage medium and computer program product
CN113886601A (en) Electronic text event extraction method, device, equipment and storage medium
CN113326702A (en) Semantic recognition method and device, electronic equipment and storage medium
CN114722832A (en) Abstract extraction method, device, equipment and storage medium
CN112528653A (en) Short text entity identification method and system
CN112528657A (en) Text intention recognition method and device based on bidirectional LSTM, server and medium
CN110705258A (en) Text entity identification method and device
CN115130475A (en) Extensible universal end-to-end named entity identification method
CN117235205A (en) Named entity recognition method, named entity recognition device and computer readable storage medium
CN114896404A (en) Document classification method and device
Yasin et al. Transformer-Based Neural Machine Translation for Post-OCR Error Correction in Cursive Text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant