CN113449084A - Relationship extraction method based on graph convolution - Google Patents

Relationship extraction method based on graph convolution

Info

Publication number
CN113449084A
Authority
CN
China
Prior art keywords
word
entity
graph convolution
representation
original sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111021201.6A
Other languages
Chinese (zh)
Inventor
陶建华
张华�
张大伟
杨国花
刘通
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN202111021201.6A
Publication of CN113449084A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention provides a relationship extraction method based on graph convolution, comprising the following steps. Language analysis preprocessing: performing word segmentation and dependency syntax analysis on an original sentence in a data set by means of a natural language analysis tool to obtain a word segmentation result of the original sentence, constructing a dependency syntax tree that represents the semantic dependency relationships between the words of the original sentence, and generating an adjacency matrix according to the topological relationships between the nodes of the dependency syntax tree. Word vector lookup: converting each word of the original sentence into a corresponding word vector by querying a word vector table, thereby obtaining a vectorized representation of the original sentence. Feature extraction by a graph convolutional neural network: inputting the adjacency matrix and the vectorized representation of each word into a graph convolution network, and learning feature representations. Relation classification: concatenating the feature representations, feeding the concatenated representation into a learning neural network to obtain a final representation, and obtaining from it the probability distribution of the entity pair over the relations, wherein the relation with the maximum predicted probability is the relation type that the model predicts for the subject entity and the object entity in the sentence.

Description

Relationship extraction method based on graph convolution
Technical Field
The invention relates to the field of text data relation extraction, in particular to a relation extraction method based on graph convolution.
Background
In the era of information explosion, a great amount of text data, such as news reports, blogs, research documents and social media comments, appears on the internet every day, and quickly and effectively mining valuable information from this massive text data has become a pressing challenge. Relationship extraction identifies the semantic relationship between named entities, given a text sentence and the named entities marked in it.
Existing relationship extraction techniques generally take the features of the sentence and of the words near the entities as the input features of a model, obtain an overall representation after a series of processing steps, and finally obtain the relation classification probabilities from a trained classifier.
Disadvantages of the prior art
Traditional feature-based methods must explicitly convert relation instances into feature vectors that a classifier can accept. Research on them focuses on how to extract discriminative features, and typically combines lexical, syntactic and semantic features to produce local and global features that describe a relation instance. Kernel-based methods take the structure tree directly as the object of processing and use a kernel function to compute the distance between relations. Deep-learning-based methods generally convert an input sentence into word vectors through a word vector matrix and use them as the model input, then further extract and fuse local lexical features and global sentence features to obtain the representation used for relation classification. Statistical learning methods based on feature engineering and kernel functions scale poorly. In addition, the extraction of manually designed features depends on natural language processing tools, and feature extraction is itself a pipeline in which the output of one natural language processing step is the input of the next, so errors from these tools tend to accumulate and propagate, and the extracted features become inaccurate. Moreover, for languages that lack the relevant natural language processing tools, these methods are greatly limited.
Disclosure of Invention
In view of the above, the present invention provides a relationship extraction method based on graph convolution, including:
language analysis preprocessing: performing word segmentation and dependency syntax analysis on an original sentence in a data set by means of a natural language analysis tool to obtain a word segmentation result of the original sentence, constructing a dependency syntax tree that represents the semantic dependency relationships between the words of the original sentence, and generating an adjacency matrix according to the topological relationships among the nodes of the dependency syntax tree;
word vector lookup: converting each word of the original sentence into a corresponding word vector by querying a word vector table, thereby obtaining a vectorized representation of the original sentence;
feature extraction by a graph convolutional neural network: inputting the adjacency matrix and the vectorized representation of each word in the sentence into a graph convolution network, and learning feature representations;
relation classification: concatenating the feature representations, feeding the concatenated representation into a learning neural network to obtain a final representation, and obtaining from it the probability distribution of the entity pair over the relations, wherein the relation with the maximum predicted probability is the relation type that the model predicts for the subject entity and the object entity in the sentence.
In some embodiments, the method further comprises:
performing entity recognition on the original sentence with an entity recognition tool, and designating the recognized entities as the subject entity and the object entity according to the order in which they appear in the original sentence.
In some embodiments, the method further comprises:
dependency syntax tree pruning: pruning the dependency syntax tree to the subtree formed by the subject entity, the object entity and their lowest common ancestor in the sentence, and generating a pruned adjacency matrix according to the topological relationships among the nodes of the pruned dependency syntax tree.
In some embodiments, the feature representation comprises:
the overall sentence representation, the subject entity representation and the object entity representation, each of which fuses context and semantic dependency relationships.
In some embodiments, the learning neural network is a feed-forward neural network.
In some embodiments, the probability distribution of the entity pair over the relations is obtained from the feature representation as follows:
the final representation is input into a linear layer, and the probability distribution of the entity pair over the relations is then obtained through a Softmax operation.
In some embodiments, inputting the adjacency matrix and the vectorized representation of each word in the sentence into the graph convolution network yields the overall sentence representation, which fuses context and semantic dependency relationships, as follows:
$$h_{sent} = f\left(h^{(L)}\right) = f\left(\mathrm{GCN}\left(h^{(0)}\right)\right)$$
wherein,
$h_{sent}$ denotes the overall sentence representation;
$h^{(L)}$ denotes the overall hidden representation output by the L-th layer of the graph convolutional neural network;
$\mathrm{GCN}(\cdot)$ denotes the L-layer graph convolutional neural network;
$h^{(0)}$ denotes the input layer of the graph convolutional neural network;
$f(\cdot)$ denotes the max pooling function.
In some embodiments, the subject entity representation $h_s$ is obtained as follows:
$$h_s = f\left(h^{(L)}_{s_1:s_2}\right)$$
wherein $s_1:s_2$ denotes the index interval, within the word sequence obtained by segmenting the original sentence, of the words that form the subject entity; $h^{(L)}_{s_1:s_2}$ denotes the hidden representation of the subject entity output by the L-th layer of the graph convolutional neural network for that index interval; and $f(\cdot)$ denotes the max pooling function.
In some embodiments, the object entity representation $h_o$ is obtained as follows:
$$h_o = f\left(h^{(L)}_{o_1:o_2}\right)$$
wherein $o_1:o_2$ denotes the index interval, within the word sequence obtained by segmenting the original sentence, of the words that form the object entity; $h^{(L)}_{o_1:o_2}$ denotes the hidden representation of the object entity output by the L-th layer of the graph convolutional neural network for that index interval; and $f(\cdot)$ denotes the max pooling function.
In some embodiments, each node in the dependency syntax tree is influenced only by neighbors that are no more than L edges away in the dependency tree.
Compared with the prior art, the technical solution provided by the embodiments of the application has the following advantages:
the dependency syntax tree helps the relationship extraction model capture long-distance semantic relationships between entities. Compared with traditional statistical learning methods and sequence-based deep learning models, relationship extraction based on the graph convolution network and dependency tree pruning can learn better entity representations and sentence representations in context at a higher semantic level, and the resulting model has an advantage when extracting semantic relationships between entities that are far apart in a sentence.
Drawings
Fig. 1 is a flowchart of a relationship extraction method based on graph convolution according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
As shown in Fig. 1, the relationship extraction method based on graph convolution according to the embodiment of the present application includes:
language analysis preprocessing: segmenting the original sentence in the data set by means of a natural language analysis tool to obtain its word segmentation result, i.e. the segmented sentence $X = [X_1, \ldots, X_n]$; performing entity recognition on the original sentence with an entity recognition tool, and designating the recognized entities as the subject entity and the object entity according to the order in which they appear in the original sentence; performing dependency syntax analysis on the original sentence by means of a natural language analysis tool, representing each word as a node and each semantic dependency relationship between words as an edge between the corresponding nodes, thereby constructing a dependency syntax tree that represents the semantic dependency relationships between the words of the original sentence; and generating an adjacency matrix according to the topological relationships among the nodes of the dependency syntax tree, specifically:
assuming the original sentence is segmented into n words, an adjacency matrix $A$ with n rows and n columns is constructed, corresponding to the n nodes of the dependency syntax tree, wherein $A_{ij} = 1$ indicates that the dependency syntax tree has an edge from node i to node j, i.e. a semantic dependency relationship exists between the i-th word and the j-th word of the original sentence; otherwise $A_{ij} = 0$, indicating that the dependency syntax tree has no edge between node i and node j, i.e. no semantic dependency relationship exists between the i-th word and the j-th word of the original sentence.
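The construction of the adjacency matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parse is given as a hypothetical `heads` list (the 0-based head index of each word, `-1` for the root), and edges are stored in both directions, which the text does not prescribe.

```python
def build_adjacency(heads):
    """Build an n x n 0/1 adjacency matrix from a dependency parse.

    heads[i] is the 0-based index of the head of word i, or -1 for the root.
    This head-list encoding is an assumption made for the sketch.
    """
    n = len(heads)
    adj = [[0] * n for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            # Store the edge in both directions (an assumption: the tree is
            # treated as an undirected graph).
            adj[head][child] = 1
            adj[child][head] = 1
    return adj


# Toy sentence "He founded Acme": "founded" is the root and both other
# words depend on it.
heads = [1, -1, 1]
A = build_adjacency(heads)
```

Here `A[0][1]` and `A[1][2]` are 1 because the first and third words each depend on the second, while `A[0][2]` stays 0.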
word vector lookup: each word of the original sentence is converted into a corresponding word vector by querying a word vector table, yielding a vectorized representation of the original sentence.
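A minimal sketch of the lookup step, assuming the word vector table is a plain dictionary. The name `lookup_vectors`, the toy vectors, and the choice of a zero vector for words missing from the table are all assumptions for illustration; the text does not say how unknown words are handled.

```python
def lookup_vectors(words, table, dim):
    """Map each segmented word to its vector from the word vector table.

    Words missing from the table get a zero vector (an assumption).
    """
    zero = [0.0] * dim
    return [list(table.get(word, zero)) for word in words]


# Hypothetical 2-dimensional word vector table.
table = {"He": [0.1, 0.2], "founded": [0.3, 0.4]}
H0 = lookup_vectors(["He", "founded", "Acme"], table, dim=2)
```

The rows of `H0` play the role of the input layer of the graph convolution network, one row per word.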
dependency syntax tree pruning: the dependency syntax tree is pruned to the subtree formed by the subject entity, the object entity and their lowest common ancestor in the sentence, and a pruned adjacency matrix is generated according to the topological relationships among the nodes of the pruned dependency syntax tree.
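The pruning step can be read in more than one way. The sketch below takes one possible reading, which is an assumption rather than the patent's stated procedure: keep every node in the subtree rooted at the lowest common ancestor (LCA) of the entity words, and zero out all edges touching other nodes. The `heads` encoding and the function names are hypothetical.

```python
def ancestors(node, heads):
    """Return the path from node up to the root, node first."""
    path = []
    while node >= 0:
        path.append(node)
        node = heads[node]
    return path


def lowest_common_ancestor(nodes, heads):
    """Deepest node that lies on every node-to-root path."""
    paths = [ancestors(n, heads) for n in nodes]
    common = set(paths[0]).intersection(*map(set, paths[1:]))
    # Walking up from the first node, the first common node is the deepest.
    return next(n for n in paths[0] if n in common)


def prune_to_lca_subtree(adj, heads, subj, obj):
    """Zero out every edge that touches a node outside the LCA subtree."""
    lca = lowest_common_ancestor(list(subj) + list(obj), heads)
    keep = {i for i in range(len(heads)) if lca in ancestors(i, heads)}
    size = len(adj)
    return [[adj[i][j] if i in keep and j in keep else 0
             for j in range(size)] for i in range(size)]


# "Yesterday he said Alice founded Acme": word 2 ("said") is the root,
# word 4 ("founded") heads "Alice" (3) and "Acme" (5).
heads = [2, 2, -1, 4, 2, 4]
n = len(heads)
adj = [[0] * n for _ in range(n)]
for child, head in enumerate(heads):
    if head >= 0:
        adj[head][child] = adj[child][head] = 1
pruned = prune_to_lca_subtree(adj, heads, subj=[3], obj=[5])
```

In this toy parse the LCA of "Alice" and "Acme" is "founded", so only the edges between words 3, 4 and 5 survive in `pruned`.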
feature extraction by a graph convolutional neural network: most of the information needed to judge the semantic relationship of the entity pair is usually contained in the subtree rooted at the lowest common ancestor of the subject entity and the object entity, and the subject entity representation and the object entity representation can be learned by using a graph convolution network to aggregate neighborhood information; the adjacency matrix and the vectorized representation of each word in the sentence are input into an $L$-layer graph convolution network, which learns the overall sentence representation, the subject entity representation and the object entity representation, each fusing context and semantic dependency relationships; each node in the dependency syntax tree is influenced only by neighbors no more than L edges away, and L takes a value of 2 or 3;
the L-layer graph convolution network is expressed as:
$$h_i^{(l)} = \sigma\left(\sum_{j=1}^{n} \tilde{A}_{ij} W^{(l)} h_j^{(l-1)} / d_i + b^{(l)}\right)$$
wherein $h_i^{(l-1)}$ and $h_i^{(l)}$ respectively denote the input vector and the output vector of the i-th node in the l-th layer of the graph convolution network;
$\tilde{A} = A + I$ is the matrix obtained by adding the adjacency matrix $A$ and the identity matrix $I$;
$d_i = \sum_{j=1}^{n} \tilde{A}_{ij}$ is the degree of node i in the graph; $W^{(l)}$ and $b^{(l)}$ are the model parameters learned by the graph convolution network, namely the weight matrix and the bias term of the l-th layer; σ is a nonlinear activation function.
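The layer described above adds self-loops to the adjacency matrix, averages the transformed neighbor vectors by node degree, adds a bias, and applies a nonlinearity. A minimal pure-Python sketch follows; the function name `gcn_layer`, the use of ReLU as the activation, and the toy weights are assumptions, since the text only says that the activation is nonlinear.

```python
def gcn_layer(adj, h, weight, bias):
    """One graph convolution layer with self-loops and degree normalisation.

    adj:    n x n 0/1 adjacency matrix
    h:      n x d_in node vectors (inputs of this layer)
    weight: d_out x d_in weight matrix, bias: d_out vector
    """
    n = len(adj)
    # A-tilde = A + I: every node also aggregates its own vector.
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
               for i in range(n)]
    out = []
    for i in range(n):
        degree = sum(a_tilde[i])
        row = []
        for k in range(len(weight)):
            total = 0.0
            for j in range(n):
                if a_tilde[i][j]:
                    total += a_tilde[i][j] * sum(
                        w * x for w, x in zip(weight[k], h[j]))
            # ReLU is assumed as the nonlinear activation.
            row.append(max(0.0, total / degree + bias[k]))
        out.append(row)
    return out


adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
h0 = [[1.0], [2.0], [3.0]]
# Two layers (L = 2), each with its own hypothetical weight and bias.
layers = [([[1.0]], [0.0]), ([[1.0]], [0.0])]
h1 = gcn_layer(adj, h0, *layers[0])
h2 = gcn_layer(adj, h1, *layers[1])
```

With identity weights each layer simply averages a node with its neighbors, so after two layers each node has mixed in information from nodes up to two edges away, which matches the statement that a node is influenced by neighbors within L edges.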
inputting the adjacency matrix and the vectorized representation of each word in the sentence into the graph convolution network yields the overall sentence representation $h_{sent}$, which fuses context and semantic dependency relationships, as follows:
$$h_{sent} = f\left(h^{(L)}\right) = f\left(\mathrm{GCN}\left(h^{(0)}\right)\right)$$
wherein,
$h^{(L)}$ denotes the overall hidden representation output by the L-th layer of the graph convolutional neural network;
$\mathrm{GCN}(\cdot)$ denotes the L-layer graph convolutional neural network;
$h^{(0)}$ denotes the input layer of the graph convolutional neural network;
$f(\cdot)$ denotes the max pooling function;
the subject entity representation $h_s$ is obtained as follows:
$$h_s = f\left(h^{(L)}_{s_1:s_2}\right)$$
wherein $s_1:s_2$ denotes the index interval, within the word sequence obtained by segmenting the original sentence, of the words that form the subject entity; $h^{(L)}_{s_1:s_2}$ denotes the hidden representation of the subject entity output by the L-th layer of the graph convolutional neural network for that index interval;
the object entity representation $h_o$ is obtained as follows:
$$h_o = f\left(h^{(L)}_{o_1:o_2}\right)$$
wherein $o_1:o_2$ denotes the index interval, within the word sequence obtained by segmenting the original sentence, of the words that form the object entity; $h^{(L)}_{o_1:o_2}$ denotes the hidden representation of the object entity output by the L-th layer of the graph convolutional neural network for that index interval.
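The sentence, subject entity and object entity representations are all obtained by max pooling over rows of the last-layer output. The sketch below illustrates this under the assumption that entity spans are given as inclusive `(start, end)` word indices; the function names and the toy vectors are hypothetical.

```python
def max_pool(vectors):
    """Element-wise maximum over a non-empty list of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def pooled_representations(h_last, subj_span, obj_span):
    """Return (h_sent, h_s, h_o) from the last-layer node vectors.

    Spans are (start, end) word indices, both inclusive (an assumption).
    """
    s1, s2 = subj_span
    o1, o2 = obj_span
    h_sent = max_pool(h_last)                 # pool over all words
    h_s = max_pool(h_last[s1:s2 + 1])         # pool over subject words only
    h_o = max_pool(h_last[o1:o2 + 1])         # pool over object words only
    return h_sent, h_s, h_o


# Toy last-layer output for a four-word sentence with 2-dimensional vectors.
h_last = [[0.2, 0.9], [0.7, 0.1], [0.4, 0.5], [0.3, 0.8]]
h_sent, h_s, h_o = pooled_representations(h_last, subj_span=(0, 1), obj_span=(3, 3))
```

Because pooling is element-wise, each representation has the same dimension as a single node vector regardless of how many words the sentence or the entity contains.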
relation classification: the overall sentence representation, the subject entity representation and the object entity representation, each fusing context and semantic dependency relationships, are concatenated and fed into a feed-forward neural network to obtain the final representation, specifically:
$$h_{final} = \mathrm{FFNN}\left(\left[h_{sent}; h_s; h_o\right]\right)$$
wherein $[h_{sent}; h_s; h_o]$ denotes the concatenation of the overall sentence representation, the subject entity representation and the object entity representation, and $\mathrm{FFNN}(\cdot)$ denotes the feed-forward neural network;
the final representation is input into a linear layer, and the probability distribution of the entity pair over the relations is then obtained through a Softmax operation, where Softmax is given by:
$$\mathrm{Softmax}(z)_k = \frac{\exp(z_k)}{\sum_{j} \exp(z_j)}$$
wherein $z$ denotes the output of the linear layer and $k$ indexes the relations;
the relation with the highest predicted probability is the relation type that the model predicts for the subject entity and the object entity in the sentence.
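The classification step can be sketched end to end as concatenation, a feed-forward layer, a linear layer, Softmax, and an argmax over relations. Everything specific here is an assumption for illustration: a single ReLU layer stands in for the feed-forward network, and the weights and relation labels are made up.

```python
import math


def linear(x, weight, bias):
    """Affine map: one output per row of weight."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def softmax(scores):
    """Numerically stable Softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(h_sent, h_s, h_o, ffnn, out_layer, labels):
    """Concatenate, apply FFNN and linear layer, return (label, probs)."""
    h = h_sent + h_s + h_o                      # list concatenation
    w1, b1 = ffnn
    h_final = [max(0.0, z) for z in linear(h, w1, b1)]  # one ReLU layer assumed
    w2, b2 = out_layer
    probs = softmax(linear(h_final, w2, b2))
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Hypothetical parameters and relation labels.
ffnn = ([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0])
out_layer = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
labels = ["no_relation", "founded_by"]
label, probs = classify([1.0], [0.0], [2.0], ffnn, out_layer, labels)
```

In this toy setting the linear layer outputs the scores 1.0 and 2.0, so the second relation receives the larger probability and is returned as the prediction.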
Examples
The input data set: the ACE 2005 data set is downloaded from the official website of the Linguistic Data Consortium. The data in the ACE 2005 corpus folder covers three languages, Arabic, English and Chinese, and each language has multiple data sources.
The sentences of the data set are then processed by the steps of the embodiment described in the Detailed Description above: language analysis preprocessing, word vector lookup, dependency syntax tree pruning, feature extraction by the L-layer graph convolution network, and relation classification.
The application also discloses a readable storage medium, which stores one or more programs, and the one or more programs can be executed by one or more processors to implement the graph convolution-based relationship extraction method described in the above embodiment.
The application also discloses computer equipment, which comprises a processor and a memory, wherein the memory is used for storing computer programs; the processor is configured to implement the steps of the graph convolution-based relationship extraction method when executing the computer program stored in the memory.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present invention. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
Embodiments of the subject matter and the functional operations described in this specification can be implemented in: digital electronic circuitry, tangibly embodied computer software or firmware, computer hardware including the structures disclosed in this specification and their structural equivalents, or a combination of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to suitable receiver apparatus for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Computers suitable for executing computer programs include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., an internal hard disk or a removable disk), magneto-optical disks, and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. In other instances, features described in connection with one embodiment may be implemented as discrete components or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Further, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for extracting relationships based on graph convolution, the method comprising:
language analysis preprocessing: performing word segmentation and dependency syntax analysis on an original sentence in a data set by means of a natural language analysis tool to obtain a word segmentation result of the original sentence, constructing a dependency syntax tree that represents the semantic dependency relationships between words in the original sentence, and generating an adjacency matrix according to the topological relationships among the nodes in the dependency syntax tree;
word vector lookup: converting each word of the original sentence into a corresponding word vector by querying a word vector table, thereby obtaining a vectorized representation of the original sentence;
feature extraction by a graph convolutional neural network: inputting the adjacency matrix and the vectorized representation of each word in the sentence into a graph convolution network, and learning feature representations;
relation classification: concatenating the feature representations, feeding the concatenated feature representations into a learning neural network to obtain a final representation, and obtaining a probability distribution of the entity pair over the relations according to the feature representations, wherein the relation with the highest predicted probability is the relation type that the model predicts for the subject entity and the object entity in the sentence.
2. The graph convolution-based relationship extraction method according to claim 1, further comprising:
performing entity recognition on the original sentence with an entity recognition tool, and designating the recognized entities as a subject entity and an object entity according to the order in which they appear in the original sentence.
3. The graph convolution-based relationship extraction method according to claim 2, further comprising:
dependency syntax tree pruning: pruning the dependency syntax tree to the subtree formed by the subject entity, the object entity, and their nearest common ancestor in the sentence, and generating a pruned adjacency matrix according to the topological relationships among the nodes in the pruned dependency syntax tree.
4. The graph convolution-based relationship extraction method according to claim 2, wherein the feature representation includes:
an overall sentence representation, a subject entity representation, and an object entity representation, each of which fuses context and semantic dependency relationships.
5. The graph convolution-based relationship extraction method of claim 1, wherein the learning neural network is a feed-forward neural network.
6. The graph convolution-based relationship extraction method according to claim 5, wherein the specific method for obtaining the probability distribution of the entity pairs on each relationship according to the feature representation includes:
inputting the obtained final representation into a linear layer, and then obtaining the probability distribution of the entity pair over the relations through a Softmax operation.
7. The graph convolution-based relationship extraction method according to claim 4, wherein the specific method for inputting the adjacency matrix and the vectorized representation of each word in the sentence into the graph convolution network and learning the overall sentence representation that fuses context and semantic dependency relationships comprises:
$$h_{\mathrm{sent}} = f\left(h^{(L)}\right) = f\left(\mathrm{GCN}\left(h^{(0)}\right)\right)$$
wherein:
$h^{(L)}$ denotes the overall hidden representation output by the L-th layer of the graph convolutional neural network;
$\mathrm{GCN}(\cdot)$ denotes the L-layer graph convolutional neural network;
$h^{(0)}$ denotes the input layer of the graph convolutional neural network;
$f(\cdot)$ denotes the max pooling function.
8. The graph convolution-based relationship extraction method according to claim 7, wherein the specific method for obtaining the subject entity representation is:
$$h_{s} = f\left(h^{(L)}_{s_1:s_2}\right)$$
wherein $s_1:s_2$ denotes the index interval of the words that form the subject entity in the word-segmented original sentence, and $h^{(L)}_{s_1:s_2}$ denotes the hidden representation of the subject entity output by the L-th layer of the graph convolutional neural network for that index interval.
9. The graph convolution-based relationship extraction method according to claim 8, wherein the specific method for obtaining the object entity representation is:
$$h_{o} = f\left(h^{(L)}_{o_1:o_2}\right)$$
wherein $o_1:o_2$ denotes the index interval of the words that form the object entity in the word-segmented original sentence, and $h^{(L)}_{o_1:o_2}$ denotes the hidden representation of the object entity output by the L-th layer of the graph convolutional neural network for that index interval.
10. The graph convolution-based relationship extraction method of claim 9, wherein each node in the dependency syntax tree is influenced only by neighboring nodes that are at most L edges away from it in the dependency tree.
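The pipeline recited in claims 1 and 4 to 9 can be illustrated with a minimal pure-Python sketch. It is not the applicant's implementation: the function names, the self-loops and symmetric edges in the adjacency matrix, the degree normalization, the ReLU activations, and the randomly initialized (untrained) weights are all assumptions made for illustration. The pruning step of claim 3 is omitted. The sketch only shows how an adjacency matrix built from dependency heads, stacked graph convolution layers, max pooling over the sentence and entity spans, concatenation, a feed-forward layer, a linear layer, and Softmax fit together.

```python
import math
import random

def adjacency_from_heads(heads):
    """Adjacency matrix from dependency head indices (-1 marks the root).
    Assumption: edges are treated as undirected and self-loops are added."""
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = adj[head][child] = 1.0
    return adj

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, hidden, weight):
    """One graph convolution layer: ReLU((A h / degree) W).
    The degree normalization and ReLU are assumptions of this sketch."""
    aggregated = matmul(adj, hidden)
    normalised = [[v / sum(adj_row) for v in row]
                  for adj_row, row in zip(adj, aggregated)]
    return [[max(0.0, v) for v in row] for row in matmul(normalised, weight)]

def max_pool(hidden, start, end):
    """Max pooling f: element-wise maximum over token rows start..end (inclusive)."""
    return [max(col) for col in zip(*hidden[start:end + 1])]

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Stand-in for trained parameters (hypothetical, untrained weights)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def predict_relation(heads, embeddings, subj, obj, num_layers, num_relations, seed=0):
    """Return (probabilities, index of the most probable relation)."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    adj = adjacency_from_heads(heads)
    hidden = embeddings  # h^(0): word vectors of the segmented sentence
    for _ in range(num_layers):  # L stacked graph convolution layers
        hidden = gcn_layer(adj, hidden, random_matrix(dim, dim, rng))
    h_sent = max_pool(hidden, 0, len(hidden) - 1)  # whole sentence
    h_subj = max_pool(hidden, *subj)               # subject span s1:s2
    h_obj = max_pool(hidden, *obj)                 # object span o1:o2
    features = [h_sent + h_subj + h_obj]           # concatenation, shape 1 x 3*dim
    ffn = matmul(features, random_matrix(3 * dim, dim, rng))
    final = [[max(0.0, v) for v in ffn[0]]]        # feed-forward layer with ReLU
    logits = matmul(final, random_matrix(dim, num_relations, rng))[0]
    probs = softmax(logits)
    return probs, max(range(num_relations), key=probs.__getitem__)

# Toy input: 4 tokens, token 1 is the root; subject is token 0, object is token 3.
toy_heads = [1, -1, 1, 2]
toy_embeddings = random_matrix(4, 8, random.Random(1))
probs, best = predict_relation(toy_heads, toy_embeddings, (0, 0), (3, 3),
                               num_layers=2, num_relations=5)
print(best, [round(p, 3) for p in probs])
```

Because the weights are random, the predicted relation index carries no meaning here; in a real system the word vector table and all weight matrices would be learned from labeled data. With `num_layers` set to L, information reaches a node only from nodes at most L edges away, which is the property stated in claim 10.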
CN202111021201.6A 2021-09-01 2021-09-01 Relationship extraction method based on graph convolution Pending CN113449084A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111021201.6A CN113449084A (en) 2021-09-01 2021-09-01 Relationship extraction method based on graph convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111021201.6A CN113449084A (en) 2021-09-01 2021-09-01 Relationship extraction method based on graph convolution

Publications (1)

Publication Number Publication Date
CN113449084A (en) 2021-09-28

Family

ID=77819362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111021201.6A Pending CN113449084A (en) 2021-09-01 2021-09-01 Relationship extraction method based on graph convolution

Country Status (1)

Country Link
CN (1) CN113449084A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107480125A (en) * 2017-07-05 2017-12-15 重庆邮电大学 A kind of relational links method of knowledge based collection of illustrative plates
CN110532328A (en) * 2019-08-26 2019-12-03 哈尔滨工程大学 A kind of text concept figure building method
CN113239186A (en) * 2021-02-26 2021-08-10 中国科学院电子学研究所苏州研究院 Graph convolution network relation extraction method based on multi-dependency relation representation mechanism
CN113297838A (en) * 2021-05-21 2021-08-24 华中科技大学鄂州工业技术研究院 Relationship extraction method based on graph neural network

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114091450A (en) * 2021-11-19 2022-02-25 南京通达海科技股份有限公司 Judicial domain relation extraction method and system based on graph convolution network
CN114860886A (en) * 2022-05-25 2022-08-05 北京百度网讯科技有限公司 Method for generating relation graph and method and device for determining matching relation
CN114860886B (en) * 2022-05-25 2023-07-18 北京百度网讯科技有限公司 Method for generating relationship graph and method and device for determining matching relationship
CN115688776A (en) * 2022-09-27 2023-02-03 北京邮电大学 Relation extraction method for Chinese financial text
CN116258504A (en) * 2023-03-16 2023-06-13 广州信瑞泰信息科技有限公司 Bank customer relationship management system and method thereof

Similar Documents

Publication Publication Date Title
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
CN108304468B (en) Text classification method and text classification device
CN107180023B (en) Text classification method and system
CN108446271B (en) Text emotion analysis method of convolutional neural network based on Chinese character component characteristics
CN113449084A (en) Relationship extraction method based on graph convolution
CN107273913B (en) Short text similarity calculation method based on multi-feature fusion
CN111611807B (en) Keyword extraction method and device based on neural network and electronic equipment
CN109086265B (en) Semantic training method and multi-semantic word disambiguation method in short text
JP6738769B2 (en) Sentence pair classification device, sentence pair classification learning device, method, and program
CN111191442A (en) Similar problem generation method, device, equipment and medium
CN113627447A (en) Label identification method, label identification device, computer equipment, storage medium and program product
CN111159409A (en) Text classification method, device, equipment and medium based on artificial intelligence
CN111191031A (en) Entity relation classification method of unstructured text based on WordNet and IDF
CN113268560A (en) Method and device for text matching
CN112818121A (en) Text classification method and device, computer equipment and storage medium
CN113761868A (en) Text processing method and device, electronic equipment and readable storage medium
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN114564563A (en) End-to-end entity relationship joint extraction method and system based on relationship decomposition
CN112215629B (en) Multi-target advertisement generating system and method based on construction countermeasure sample
CN114925702A (en) Text similarity recognition method and device, electronic equipment and storage medium
CN112906368B (en) Industry text increment method, related device and computer program product
CN113934848A (en) Data classification method and device and electronic equipment
CN116522905B (en) Text error correction method, apparatus, device, readable storage medium, and program product
CN113254637A (en) Grammar-fused aspect-level text emotion classification method and system
CN113065350A (en) Biomedical text word sense disambiguation method based on attention neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210928