CN111898364A - Neural network relation extraction method, computer device and readable storage medium - Google Patents

Neural network relation extraction method, computer device and readable storage medium

Info

Publication number
CN111898364A
CN111898364A
Authority
CN
China
Prior art keywords
extraction
neural network
channel
sentence
clauses
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010752459.2A
Other languages
Chinese (zh)
Other versions
CN111898364B (en)
Inventor
回艳菲
王健宗
程宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010752459.2A priority Critical patent/CN111898364B/en
Priority to PCT/CN2020/111513 priority patent/WO2021174774A1/en
Publication of CN111898364A publication Critical patent/CN111898364A/en
Application granted granted Critical
Publication of CN111898364B publication Critical patent/CN111898364B/en
Legal status: Active (current)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a neural network relation extraction method, which comprises the following steps: constructing a two-channel neural network model; obtaining a sentence to be processed; performing dependency syntax analysis on the sentence to obtain two clauses of the sentence; inputting the two clauses into a first channel and performing feature extraction through a CNN model to obtain first extraction information; inputting the sentence into a second channel and performing feature extraction through an LSTM model to obtain second extraction information; and weighting and aggregating the first extraction information and the second extraction information through an attention mechanism to obtain the final extracted feature of the sentence, and inputting the final extracted feature into a softmax layer to complete classification of the relation categories among the target entities. The invention also provides a computer device and a computer-readable storage medium. The neural network relation extraction method provided by the invention can perform relation extraction with high quality.

Description

Neural network relation extraction method, computer device and readable storage medium
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to a neural network relation extraction method, a computer device and a readable storage medium.
Background
Relation extraction is a very important research topic and an important subtask in the field of natural language processing. It aims to extract a predefined semantic relation between two entities from text; the extracted relations and entities can be organized into triples and stored in a graph database, and applied to a medical knowledge graph based on related knowledge-graph technology. A high-quality medical knowledge graph cannot be constructed without high-quality relation extraction. Therefore, relation extraction occupies a particularly important position for the medical knowledge graph.
In a conventional relationship extraction task, sentences are generally vectorized and represented by a single model such as a Convolutional Neural Network (CNN) or a Recurrent Neural Network (RNN), but the quality of relationship extraction of the single model is not high.
Disclosure of Invention
An object of an embodiment of the present invention is to provide a neural network relationship extraction method that can perform relationship extraction with high quality.
In order to solve the above technical problem, an embodiment of the present invention provides a neural network relation extraction method, including: constructing a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel; obtaining a sentence to be processed; performing dependency syntax analysis on the sentence to generate a dependency syntax analysis tree, and finding the two shortest dependency paths between the target entities from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentence; inputting the two clauses into the first channel and performing feature extraction through a convolutional neural network model to obtain first extraction information; inputting the sentence into the second channel and performing feature extraction through a long short-term memory network model to obtain second extraction information; and weighting and aggregating the first extraction information and the second extraction information through an attention mechanism to obtain the final extracted feature of the sentence, and inputting the final extracted feature into a softmax layer to complete classification of the relation categories among the target entities.
Preferably, the method further comprises the following steps: and training the constructed two-channel neural network model.
Preferably, the training of the constructed two-channel neural network model includes: acquiring a training set; inputting the training set to the two-channel neural network model to output a predicted relationship class for the training set; calculating a loss function cross entropy according to the prediction relation category output by the two-channel neural network model and the actual relation category of the training set; and minimizing the loss function through an optimization algorithm to train the two-channel neural network model.
Preferably, inputting the two clauses into the first channel, and performing feature extraction through a convolutional neural network model to obtain first extraction information, including: carrying out vector representation on words of the two clauses; processing the vector representations of the two clauses through a convolutional layer, a pooling layer, and a non-linear layer; and fusing the vector representations of the two processed clauses through a hidden layer to obtain first extraction information.
Preferably, the inputting the sentence into the second channel and performing feature extraction through a long short-term memory network model to obtain second extraction information includes: performing a word segmentation operation on the sentence to obtain L segmented words; performing word vector mapping on the L segmented words respectively to obtain an L×d-dimensional word-vector matrix, wherein each of the L segmented words is mapped to a d-dimensional word vector; and inputting the d-dimensional word vectors of the L segmented words into the long short-term memory network model in sequence for feature extraction to obtain the second extraction information.
Preferably, the performing dependency parsing on the sentence to obtain two clauses of the sentence includes: performing dependency syntax analysis on the sentence through a syntax analyzer to generate a dependency syntax analysis tree; and finding two shortest dependency paths between target entities from the dependency parsing tree, wherein the two shortest paths represent two clauses of the sentence.
The embodiment of the present invention further provides a neural network relation extraction system, including: a construction module, used for constructing a two-channel neural network model which comprises a first channel and a second channel; an obtaining module, used for obtaining a sentence to be processed; a shortest path generating module, used for performing dependency syntax analysis on the sentence to obtain two clauses of the sentence; a first extraction module, used for inputting the two clauses into the first channel and performing feature extraction through a convolutional neural network model to obtain first extraction information; a second extraction module, used for inputting the sentence into the second channel and performing feature extraction through a long short-term memory network model to obtain second extraction information; and a classification module, used for weighting and aggregating the first extraction information and the second extraction information through an attention mechanism to obtain the final extracted feature of the sentence, and inputting the final extracted feature into a softmax layer of the two-channel neural network model to complete classification of the relation categories among the target entities.
Preferably, the first extraction module is further configured to: carrying out vector representation on words of the two clauses; processing the vector representations of the two clauses through a convolutional layer, a pooling layer, and a non-linear layer; and fusing the vector representations of the two processed clauses through a hidden layer to obtain first extraction information.
An embodiment of the present invention also provides a computer device, including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the steps of the neural network relationship extraction method.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the neural network relation extraction method described above. Compared with the prior art, the two-channel neural network relation extraction model provided by the embodiment of the invention integrates the key information of the shortest dependency path and uses the original sentence to retain information that the dependency path cannot capture. Local information is extracted through the CNN, and the most useful information is aggregated through the pooling layer, so that excellent local information is extracted and the key information for classifying the relation is retained. Information extraction is performed on the whole sentence by the LSTM, and excellent representations can be extracted even for long sentences. The information extracted by the two models is weighted and aggregated through an attention mechanism to obtain the final representation of the current sentence, which contains the information that contributes most to the relation classification, and is finally classified through a softmax layer to achieve the effect of extracting the predetermined relations.
Drawings
One or more embodiments are illustrated by the corresponding figures in the drawings, which are not meant to be limiting.
FIG. 1 is a schematic flow chart diagram of a neural network relationship extraction method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a two-channel neural network model in a first embodiment of the present invention;
FIG. 3 is a schematic diagram of feature extraction performed by a convolutional neural network model according to a first embodiment of the present invention;
FIG. 4 is a block diagram of a program for a neural network relationship extraction system according to a second embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a computer apparatus according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It will be appreciated by those of ordinary skill in the art that numerous technical details are set forth in the various embodiments in order to provide a better understanding of the present application; however, the technical solution claimed in the present application can be implemented without these technical details, and various changes and modifications can be made based on the following embodiments.
The invention can be applied to intelligent government affairs/intelligent city management/intelligent community/intelligent security/intelligent logistics/intelligent medical treatment/intelligent education/intelligent environmental protection/intelligent traffic scenes, thereby promoting the construction of intelligent cities.
The first embodiment of the invention relates to a neural network relation extraction method. The core of this embodiment lies in providing a two-channel neural network relation extraction model: a Convolutional Neural Network (CNN) model is adopted to extract the key information of the shortest dependency path, with a pooling layer aggregating the most useful information, extracting excellent local information and retaining the key information for classifying the relation; a Long Short-Term Memory (LSTM) model is used to extract information from the whole sentence, so that excellent representations can be extracted even from long sentences. The features extracted by the two models are weighted and aggregated through an attention mechanism to obtain the final vector representation of the current sentence, which is finally classified through a softmax layer, thereby achieving the effect of extracting the predetermined relations. The implementation details of the neural network relation extraction method of this embodiment are described in detail below; the following is provided only for ease of understanding and is not necessary for implementing this embodiment.
Fig. 1 is a schematic flow chart of a neural network relationship extraction method in this embodiment, and the method is applied to a computer device.
In this embodiment, the execution order of the steps in the flowchart shown in fig. 1 may be changed and some steps may be omitted according to different requirements.
Step S101: and constructing a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel.
Relation extraction aims at extracting a predefined semantic relation between two entities from text. The extracted relations and entities can be organized into triples and stored in a graph database, and applied to a medical knowledge graph based on related knowledge-graph technology. A high-quality medical knowledge graph cannot be constructed without high-quality relation extraction. Therefore, relation extraction occupies a particularly important position for the medical knowledge graph.
In the prior art, the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) are the two main architecture types of the Deep Neural Network (DNN). In a traditional relation extraction task, sentences are generally vectorized by a single CNN or RNN model, but a single model may miss the key information. Especially in the medical field, sentence lengths vary widely, so no single model fits all sentences; moreover, not all words in a sentence contribute to the entity relation, and some sentences are too long, so the relation extraction quality of a single model is not high. Therefore, in this embodiment, in order to improve the quality of relation extraction, relation extraction is performed by constructing a two-channel neural network model.
In this embodiment, after the two-channel neural network model is constructed, the constructed two-channel neural network model needs to be trained. Specifically, training the constructed two-channel neural network model includes: obtaining a training set, inputting the training set into the two-channel neural network model to output a prediction relation category of the training set, calculating a loss function cross entropy according to the prediction relation category output by the two-channel neural network model and an actual relation category of the training set, and minimizing the loss function cross entropy through an optimization algorithm to train the two-channel neural network model.
In this embodiment, the training set is a set of training data for a set of known actual relationship categories.
In this embodiment, the loss function is the cross entropy:

J(θ) = − Σ_{s∈S} Σ_{i=1}^{t} r_i · log p(r_i | s, θ)

where r_i indicates whether the relation between the entities is the i-th category, p(r_i | s, θ) is the predicted probability of the i-th category, s is a single sentence, S is the sentence set, and t is the number of entity-relation categories. When the entity relation is the i-th category, r_i is 1; otherwise it is 0.
For example, assuming there are 10 relation categories in total, and the relation between entity 1 and entity 2 is category 3, then r_3 = 1.
In this embodiment, the two-channel neural network model is trained by minimizing this loss, which is the cross entropy between the predicted relation category and the actual relation category.
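For illustration only (not part of the claimed method), the cross-entropy loss described above can be sketched in a few lines of Python; the toy probabilities, labels and the function name `cross_entropy_loss` are assumptions introduced here:

```python
import numpy as np

def cross_entropy_loss(pred_probs, actual_labels):
    """Cross entropy over a set of sentences.

    pred_probs: (num_sentences, t) predicted probability per relation category
    actual_labels: (num_sentences,) index of the actual relation category
    """
    eps = 1e-12  # numerical safety for log(0)
    total = 0.0
    for probs, label in zip(pred_probs, actual_labels):
        # r_i is 1 only for the actual category, so only that term survives
        total -= np.log(probs[label] + eps)
    return total

# Toy training batch: 2 sentences, 3 relation categories
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
loss = cross_entropy_loss(probs, labels)
```

An optimizer (e.g. gradient descent) would then adjust the model parameters to minimize this value.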
Step S102: and acquiring a sentence to be processed.
In this embodiment, the sentence to be processed is a sentence for which relationship extraction is required.
Step S103: and performing dependency syntax analysis on the sentence to obtain two clauses of the sentence.
Specifically, performing dependency syntax analysis on the sentence through a syntax analyzer to generate a dependency syntax analysis tree; and finding two shortest dependency paths between target entities from the dependency parsing tree, wherein the two shortest paths represent two clauses of the sentence.
Currently, the Stanford Parser and the Berkeley Parser are representative open-source Chinese syntax parsers. The Stanford Parser is based on a factored model, and the Berkeley Parser is based on an unlexicalized analysis model. In this embodiment, the sentence is subjected to dependency syntax analysis by a syntax parser (the Stanford Parser). Of course, in other embodiments, other syntax parsers may be used to perform the dependency syntax analysis, and this embodiment is not limited in this regard.
In this embodiment, dependency syntax analysis is performed on the sentence to be processed to obtain the two shortest dependency paths between the target entities. Since the shortest dependency path excludes unimportant modifier chunks and includes the stem portion expressing the relation pattern, the two shortest paths are in fact two clauses of the sentence. Moreover, by acquiring the shortest dependency paths of a sentence, the information that contributes most to the relation classification can be captured, and excellent local information in the sentence can be extracted.
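For illustration only, finding a shortest dependency path between two target entities can be sketched as a breadth-first search over the dependency tree produced by a parser; the hand-written toy edges and the example sentence below are assumptions standing in for real Stanford Parser output:

```python
from collections import deque

def shortest_dependency_path(edges, start, goal):
    """BFS over an undirected view of the dependency tree."""
    adj = {}
    for head, dep in edges:
        adj.setdefault(head, []).append(dep)
        adj.setdefault(dep, []).append(head)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Toy (head, dependent) edges for:
# "Aspirin effectively relieves mild headache"
edges = [("relieves", "Aspirin"), ("relieves", "headache"),
         ("relieves", "effectively"), ("headache", "mild")]
path = shortest_dependency_path(edges, "Aspirin", "headache")
# The path keeps the stem and drops the modifiers "effectively" and "mild"
```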
Step S104: inputting the two clauses into the first channel, and performing feature extraction through a CNN (convolutional neural network) model to obtain first extraction information.
In this embodiment, fig. 2 is a schematic diagram of the two-channel model in a preferred embodiment of the present invention. As shown in fig. 2, in the first channel (the left channel), the two clauses (the two shortest dependency paths) are input to the CNN model for feature extraction to obtain the first extraction information. Specifically, the step of inputting the two clauses into the CNN model in the first channel for feature extraction to obtain the first extraction information is as follows.
Fig. 3 is a schematic diagram of feature extraction performed by the convolutional neural network model in the first embodiment of the present invention. As shown in fig. 3, the two clauses are vector-represented, and the vector representations of the two clauses are processed through a convolutional layer, a pooling layer and a nonlinear layer. Specifically, a convolution operation is performed in the convolutional layer, and max pooling is used in the pooling layer, where the value after the max-pooling operation on a certain row vector is the maximum value in that row. Further, as shown in fig. 2, the vector representations of the two processed clauses are fused through a hidden layer to obtain the first extraction information s₁.
In this embodiment, the sentence to be processed is subjected to dependency syntax analysis by the Stanford Parser to obtain the two shortest dependency paths (which are also the two clauses of the sentence to be processed), and the two clauses are then vector-represented. Specifically, the vector of word i on the shortest dependency paths is defined as

V_i = [W_e^i ; P_e^i]

where W_e^i is the word embedding: using pre-trained word vectors, the word vector of the corresponding word can be looked up directly in an open-source word-vector file. P_e^i is the position embedding, which refers to the relative distances of the current word on the shortest dependency path from the two entity words of the clause:

P_e^i = [d_1^i, d_2^i]

where d_j^i is the relative distance between the current word and the j-th entity. Through this vector representation, the clause x is converted into the sequence [V_1, V_2, …, V_n].
Given a clause x with n words, the clause x is vector-represented by formula one:

Z_n = [V_1, V_2, …, V_n]

where n denotes the number of words contained in each clause, V_i is the vector of the i-th word in clause x, and Z_n is the vector representation of the clause. The word vectors and position vectors are then used as the input of the convolutional neural network and are processed through the convolutional layer, the pooling layer and the nonlinear layer to obtain the processed vector representation of the clause. Specifically, the vector representations of the two clauses are processed through the convolutional layer, the pooling layer and the nonlinear layer according to formula two:

[r_x]_j = max[ f(W₁ z_n + b₁) ]_j

where [r_x]_j denotes the j-th component of the vector r_x, r_x is the value obtained after the max-pooling operation on the corresponding row vectors, W₁ is the weight matrix of the convolutional layer, f is the nonlinear tanh transformation, z_n is a window of the matrixed vector Z_n, and b₁ is a constant bias value. In this embodiment, the convolutional layer performs the convolution operation on the matrixed vector Z_n within virtual windows of size k.
Further, as shown in fig. 2, after the two clauses have been processed by the convolutional layer, the pooling layer and the nonlinear layer, the vector representations of the two clauses are fused through a hidden layer to obtain the first extraction information, which is also the final sentence representation s₁ of the two clauses in the CNN model. In other words, the sentence representation s₁ is a feature vector able to represent the sentence to be processed; the information in the sentence to be processed is contained in this feature vector.
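For illustration only, formula two (sliding convolution window, tanh nonlinearity, max pooling over window positions) can be sketched in NumPy; all dimensions, the random weights and the variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 6, 4          # clause length, word-vector size (word + position parts)
d_c = 8              # number of convolution filters
k = 3                # convolution window size

Z = rng.standard_normal((n, d))          # Z_n: clause as stacked word vectors
W1 = rng.standard_normal((d_c, k * d))   # convolutional-layer weight matrix
b1 = rng.standard_normal(d_c)            # bias value

# Slide a window of k word vectors, apply tanh(W1 z + b1) per window,
# then max-pool each filter over all window positions (formula two).
windows = [Z[i:i + k].reshape(-1) for i in range(n - k + 1)]
feature_map = np.tanh(np.stack(windows) @ W1.T + b1)  # (n-k+1, d_c)
r_x = feature_map.max(axis=0)                         # max pooling per filter
```

The fixed-size vector `r_x` is what the hidden layer would fuse across the two clauses to form s₁.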
Step S105: and inputting the sentence into the second channel, and performing feature extraction through an LSTM model to obtain second extraction information.
Specifically, a word segmentation operation is performed on the sentence to obtain L segmented words; word vector mapping is performed on the L segmented words respectively to obtain an L×d-dimensional word-vector matrix, wherein each of the L segmented words is mapped to a d-dimensional word vector; and the d-dimensional word vectors of the L segmented words are input into the long short-term memory network model in sequence for feature extraction to obtain the second extraction information.
In this embodiment, as shown in the right part of fig. 2, the vector representation of the complete sentence to be processed is input to the LSTM model for feature extraction to obtain the second extraction information, that is, the final sentence representation s₂ of the sentence to be processed in the LSTM model. When the vector representation of the complete sentence is input to the LSTM model, the sentence is vector-represented in the same manner as shown in fig. 3; the specific calculation method of this vector representation is the same as that in step S104 and will not be described in detail here.
In this embodiment, the vector representation of the complete sentence to be processed is a word-embedded representation of the complete sentence to be processed.
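For illustration only, feeding the L×d word-vector matrix through a single LSTM layer to obtain the second extraction information s₂ can be sketched as follows; the minimal NumPy cell below (gate order, dimensions, random weights) is an assumption, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(X, W, U, b, h_dim):
    """Run a single-layer LSTM over the L x d word-vector matrix X.

    W: (4h, d) input weights, U: (4h, h) recurrent weights, b: (4h,).
    Gate order in the stacked weights: input, forget, output, candidate.
    Returns the final hidden state, used here as s2.
    """
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    for x in X:  # feed the L word vectors in sequence
        z = W @ x + U @ h + b
        i = sigmoid(z[0 * h_dim:1 * h_dim])   # input gate
        f = sigmoid(z[1 * h_dim:2 * h_dim])   # forget gate
        o = sigmoid(z[2 * h_dim:3 * h_dim])   # output gate
        g = np.tanh(z[3 * h_dim:4 * h_dim])   # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

L, d, h_dim = 5, 6, 4                    # L segmented words, d-dim vectors
X = rng.standard_normal((L, d))          # the L x d word-vector matrix
W = rng.standard_normal((4 * h_dim, d))
U = rng.standard_normal((4 * h_dim, h_dim))
b = np.zeros(4 * h_dim)
s2 = lstm_forward(X, W, U, b, h_dim)     # second extraction information
```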
Step S106: weighting and summarizing the first extraction information and the second extraction information through an attention mechanism to obtain the final extraction features of the sentence, and inputting the final extraction features into a softmax layer of the dual-channel neural network model to complete classification of the relationship categories among the target entities.
In the prior art, the CNN model has great advantages for short-sentence processing, while the LSTM model readily learns long-distance information and performs excellently at extracting features of long sentences. In this embodiment, in the first channel, the two clauses of the sentence to be processed are input to the CNN model for feature extraction to obtain the final sentence representation s₁ of the two clauses in the CNN model; in the second channel, the sentence is input to the LSTM model for feature extraction to obtain the final sentence representation s₂ of the sentence to be processed in the LSTM model. Then, in order to handle long and short sentences simultaneously, and considering that the shortest dependency path may omit information, this embodiment adopts an attention mechanism to perform a weighted aggregation of the first extraction information and the second extraction information to obtain the final extracted feature of the sentence, that is, the final vector representation s of the sentence. Specifically, the first extraction information and the second extraction information are weighted and aggregated through formula three and formula four.
the third formula is:
Figure BDA0002610466460000091
wherein alpha isiWeight, s, for the final vector representation of each sentenceiThe vector representation after feature extraction for the sentence, e.g. s, as described above1,s2
The fourth formula is:
Figure BDA0002610466460000092
wherein t isiIs a query-based method consisting of a sentence siMatching with the prediction relation r;
the fifth formula is:

ti = si A r, wherein si is the feature representation of the sentence, e.g. the first extraction information s1 or the second extraction information s2, A is a weighted diagonal matrix, and r is the query vector associated with the relation r, i.e. the vector representation of the relation r.
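Formulas three through five can be sketched in NumPy as follows (a minimal illustration only; the dimensions, the diagonal matrix A, and the query vector r below are hypothetical stand-ins, not trained values):

```python
import numpy as np

def attention_fuse(reps, A, r):
    """Fuse channel representations by attention.

    reps : (k, d) array, one row per channel representation
           (e.g. s1 from the CNN channel, s2 from the LSTM channel);
    A    : (d, d) weighted diagonal matrix;
    r    : (d,) query vector for the relation.
    Returns the fused representation s and the weights alpha.
    """
    t = reps @ A @ r                 # formula five: ti = si A r
    alpha = np.exp(t - t.max())      # formula four (numerically stable softmax)
    alpha /= alpha.sum()
    s = alpha @ reps                 # formula three: s = sum_i alpha_i si
    return s, alpha

# Toy example with hypothetical 4-dimensional representations.
s1 = np.array([1.0, 0.0, 0.5, 0.2])          # stand-in for the CNN output
s2 = np.array([0.3, 1.0, 0.1, 0.4])          # stand-in for the LSTM output
A = np.diag([1.0, 0.5, 1.0, 0.5])            # hypothetical diagonal matrix
r = np.array([0.2, 0.8, 0.1, 0.3])           # hypothetical relation query vector
s, alpha = attention_fuse(np.stack([s1, s2]), A, r)
```

The weights alpha sum to one, so s is a convex combination of the two channel representations.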
Further, in this embodiment, a conditional probability is defined by the softmax layer, where the calculation formula of the conditional probability is:

p(r|s) = exp(or) / Σk=1..nr exp(ok)

wherein nr denotes the predefined number of relation classes and ok is the score of the k-th relation class.
In this embodiment, after classification is performed by the softmax layer, the scores of all output relationship categories are obtained through formula six, where formula six is:

o = Ms + d, where o is the vector of scores over all relationship classes, M is the relation matrix representation, and d is a bias vector.

In this embodiment, the output o is essentially a one-dimensional column vector; each number in the column vector corresponds to one relationship category and, after softmax normalization, indicates the probability that the target entities stand in that relationship.
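Formula six together with the softmax normalization can be sketched as follows (a minimal illustration; the relation matrix M, bias vector d, and sizes are hypothetical stand-ins, not trained parameters):

```python
import numpy as np

def classify(s, M, d_vec):
    """Formula six plus the softmax layer: score, then normalize."""
    o = M @ s + d_vec                # formula six: one score per relation class
    p = np.exp(o - o.max())          # numerically stable softmax
    return p / p.sum()               # conditional probability p(r | s)

n_r, dim = 3, 4                      # hypothetical class count and feature size
M = np.eye(n_r, dim)                 # stand-in relation matrix representation
d_vec = np.zeros(n_r)                # stand-in bias ("deviation") vector
s = np.array([2.0, 0.1, 0.5, 0.3])   # fused sentence representation
p = classify(s, M, d_vec)
predicted = int(p.argmax())          # index of the most probable relation class
```

Each entry of p corresponds to one relationship category, matching the column-vector interpretation described above.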
In this embodiment, an attention mechanism is used to fuse the representation output by the CNN model with the representation output by the LSTM model, and a strong representation of the current sentence is extracted, so that the finally trained relation extraction model is suitable for both long and short sentences.
Compared with the prior art, the dual-channel neural network model provided by the embodiment of the invention integrates the key information of the shortest dependency path while using the original sentence to retain information that the dependency path cannot capture. The CNN extracts local information, and its pooling layer gathers the most useful features, so that strong local information is extracted and the key information for relation classification is retained. The LSTM performs information extraction over the whole sentence and can extract a strong representation even for long sentences. The information extracted by the two models is weighted and aggregated through an attention mechanism to obtain the final representation of the current sentence, which contains the information contributing most to relation classification; finally, this representation is classified through the softmax layer of the dual-channel neural network model, achieving the effect of extracting the preset relations.
The steps of the above methods are divided for clarity, and the order of execution of the steps is not limited, and when implemented, the steps may be combined into one step or some steps may be split into multiple steps, and the steps are all within the scope of the patent as long as the steps include the same logical relationship; it is within the scope of the patent to add insignificant modifications to the algorithms or processes or to introduce insignificant design changes to the core design without changing the algorithms or processes.
In an exemplary embodiment, data related to network relationships may be uploaded to a blockchain. Corresponding digest information is obtained from the related data of the network relationship; specifically, the digest information is obtained by hashing the related data, for example with the SHA-256 algorithm. Uploading the digest information to the blockchain ensures security, fairness, and transparency for the user. The user equipment can download the digest information from the blockchain to verify whether the related data of the network relationship has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each data block containing information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
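The digest computation described above can be sketched with Python's standard hashlib (the record contents below are hypothetical; any deterministic serialization of the network-relationship data would do):

```python
import hashlib
import json

# Hypothetical relation-extraction result to be anchored on chain.
record = {"entities": ["EntityA", "EntityB"], "relation": "employer-of"}

# Serialize deterministically, then hash with SHA-256 to obtain the digest.
payload = json.dumps(record, sort_keys=True).encode("utf-8")
digest = hashlib.sha256(payload).hexdigest()

# Any later download of the record can recompute the digest; a mismatch
# indicates the data was tampered with.
recheck = hashlib.sha256(payload).hexdigest()
```

Because SHA-256 is deterministic, recomputing the digest over the same serialized record always reproduces the stored value.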
A second embodiment of the present invention is directed to a neural network relationship extraction system of a computer device, which may be partitioned into one or more program modules, the one or more program modules being stored in a storage medium and executed by one or more processors to implement the embodiments of the present application. The program modules referred to in the embodiments of the present application are a series of computer program instruction segments that can perform specific functions; the following description specifically describes the functions of the program modules in this embodiment.
As shown in fig. 4, the neural network relationship extraction system 400 may include an establishing module 410, an obtaining module 420, a shortest path generating module 430, a first module 440, a second module 450, a classification module 460, and a training module 470, wherein:
the establishing module 410 is configured to construct a two-channel neural network model, where the two-channel neural network model includes a first channel and a second channel.
An obtaining module 420, configured to obtain a sentence to be processed.
The shortest path generating module 430 is configured to perform dependency parsing on the sentence to obtain two clauses of the sentence.
Specifically, the sentence is subjected to dependency syntax analysis through a syntax analyzer, a dependency syntax analysis tree is generated, and two shortest dependency paths between target entities are found from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentence.
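The shortest-path search over the dependency parse tree can be sketched as a breadth-first search (a minimal, self-contained illustration; the toy parse edges and token names below are hypothetical and not produced by any particular parser):

```python
from collections import deque

def shortest_dependency_path(edges, start, end):
    """BFS over an undirected view of a dependency tree, returning the
    node sequence from start to end (the shortest dependency path)."""
    adj = {}
    for head, dep in edges:
        adj.setdefault(head, []).append(dep)
        adj.setdefault(dep, []).append(head)
    prev, queue, seen = {start: None}, deque([start]), {start}
    while queue:
        node = queue.popleft()
        if node == end:
            path = []
            while node is not None:        # walk back through predecessors
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                prev[nxt] = node
                queue.append(nxt)
    return None                            # entities not connected

# Hypothetical (head, dependent) edges for
# "The company founded by Smith acquired Jones".
edges = [("acquired", "company"), ("company", "The"), ("company", "founded"),
         ("founded", "by"), ("by", "Smith"), ("acquired", "Jones")]
path = shortest_dependency_path(edges, "Smith", "Jones")
```

In a tree the path between two nodes is unique, so BFS recovers exactly the shortest dependency path between the target entities.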
The first extraction module 440 is configured to input the two clauses into the first channel, and perform feature extraction through a Convolutional Neural Network (CNN) model to obtain first extraction information.
The second extraction module 450 is configured to input the sentence into the second channel, and perform feature extraction through a Long Short-Term Memory network (LSTM) model to obtain second extraction information.
A classification module 460, configured to perform a weighted summarization on the first extraction information and the second extraction information through an attention mechanism to obtain a final extraction feature of the sentence, and input the final extraction feature to a softmax layer of the two-channel neural network model to complete classification of relationship categories between the target entities.
Further, the neural network relationship extraction system 400 further includes:
and a training module 470, configured to train the constructed two-channel neural network model.
The training module 470 is further configured to: acquiring a training set; inputting the training set to the two-channel neural network model to output a predicted relationship class for the training set; calculating a loss function cross entropy according to the prediction relation category output by the two-channel neural network model and the actual relation category of the training set; and minimizing the loss function through an optimization algorithm to train the two-channel neural network model.
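The loss computation and one optimization step described for the training module can be sketched as follows (a minimal NumPy illustration of cross-entropy and a plain gradient-descent update; the batch contents and learning rate are hypothetical):

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy between predicted relation scores and true classes."""
    z = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(labels)), labels].mean()

# Hypothetical model outputs for a 2-sentence batch over 3 relation classes.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3]])
labels = np.array([0, 1])                 # actual relation classes of the batch
loss = cross_entropy(logits, labels)

# One plain gradient-descent step on the logits sketches "minimizing the loss
# function through an optimization algorithm".
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)
grad = probs.copy()
grad[np.arange(len(labels)), labels] -= 1.0                 # dL/dlogits
new_logits = logits - 0.5 * grad                            # learning rate 0.5
new_loss = cross_entropy(new_logits, labels)
```

One such step strictly reduces the loss on this batch, which is the behavior the optimizer repeats until convergence.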
The first extraction module 440 is further configured to: carry out vector representation of the words of the two clauses; process the vector representations of the two clauses through a convolutional layer, a pooling layer, and a non-linear layer; and fuse the processed vector representations of the two clauses through a hidden layer to obtain the first extraction information.
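The convolution, pooling, non-linear, and hidden-layer fusion steps of the first extraction module can be sketched as follows (a minimal NumPy stand-in using random, untrained weights; all dimensions are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_channel(clause_vecs, W_conv, win=3):
    """Convolutional layer over a clause's word vectors, then max pooling
    and a tanh non-linearity (a toy stand-in for the CNN channel)."""
    L, d = clause_vecs.shape
    windows = np.stack([clause_vecs[i:i + win].ravel()
                        for i in range(L - win + 1)])
    conv = windows @ W_conv          # convolutional layer
    pooled = conv.max(axis=0)        # pooling layer gathers strongest features
    return np.tanh(pooled)           # non-linear layer

d, win, n_filters = 4, 3, 5          # hypothetical sizes
W_conv = rng.standard_normal((win * d, n_filters))
clause1 = rng.standard_normal((6, d))   # stand-in word vectors, first clause
clause2 = rng.standard_normal((5, d))   # stand-in word vectors, second clause

# Hidden layer fuses the two clause representations into the first
# extraction information.
W_hidden = rng.standard_normal((2 * n_filters, n_filters))
first_extraction = np.tanh(
    np.concatenate([cnn_channel(clause1, W_conv),
                    cnn_channel(clause2, W_conv)]) @ W_hidden)
```

The tanh keeps every component of the fused representation in [-1, 1], giving a fixed-size clause representation regardless of clause length.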
A third embodiment of the present invention relates to a computer device; fig. 5 is a schematic diagram of a hardware architecture of a computer device suitable for the neural network relationship extraction method according to the present invention.
In the present embodiment, the computer device 500 is a device capable of automatically performing numerical calculation and/or information processing in accordance with instructions that are preset or stored in advance. For example, it may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of a plurality of servers). As shown in fig. 5, the computer device 500 includes at least, but is not limited to: a memory 510, a processor 520, and a network interface 530, which may be communicatively linked to each other by a system bus. Wherein:
The memory 510 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 510 may be an internal storage module of the computer device 500, such as a hard disk or internal memory of the computer device 500. In other embodiments, the memory 510 may also be an external storage device of the computer device 500, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 500. Of course, the memory 510 may also include both internal and external storage modules of the computer device 500. In this embodiment, the memory 510 is generally used for storing the operating system and the various application software installed on the computer device 500, such as the program codes of the neural network relationship extraction method. In addition, the memory 510 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 520 may be, in some embodiments, a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 520 is generally configured to control the overall operation of the computer device 500, such as performing control and processing related to data interaction or communication with the computer device 500. In this embodiment, the processor 520 is configured to execute program codes stored in the memory 510 or to process data.
The network interface 530 may include a wireless network interface or a wired network interface, and is typically used to establish communication links between the computer device 500 and other computer devices. For example, the network interface 530 is used to connect the computer device 500 to an external terminal via a network, establishing a data transmission channel and a communication link between the computer device 500 and the external terminal. The network may be a wireless or wired network such as an Intranet, the Internet, a Global System for Mobile Communications (GSM) network, Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, etc.
It should be noted that fig. 5 only shows a computer device having components 510 through 530, but it should be understood that not all of the shown components are required; more or fewer components may be implemented instead.
In this embodiment, the neural network relationship extraction method stored in the memory 510 may be further divided into one or more program modules and executed by one or more processors (the processor 520 in this embodiment) to complete the present application.
The memory 510 stores instructions executable by the at least one processor 520 to enable the at least one processor 520 to perform the steps of the neural network relationship extraction method described above.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program, which when executed by a processor implements the steps of the neural network relationship extraction method described above.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware, where the program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program codes, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.

Claims (10)

1. A neural network relationship extraction method, comprising:
constructing a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel;
obtaining sentences to be processed;
performing dependency syntax analysis on the sentence to obtain two clauses of the sentence;
inputting the two clauses into the first channel, and performing feature extraction through a convolutional neural network model to obtain first extraction information;
inputting the sentence into the second channel, and performing feature extraction through a long short-term memory network model to obtain second extraction information;
weighting and summarizing the first extraction information and the second extraction information through an attention mechanism to obtain final extraction features of the sentence, and inputting the final extraction features to a softmax layer of the two-channel neural network model to finish classifying the relationship categories among the target entities.
2. The neural network relationship extraction method of claim 1, further comprising:
and training the constructed two-channel neural network model.
3. The neural network relationship extraction method of claim 2, wherein the training of the constructed two-channel neural network model comprises:
acquiring a training set;
inputting the training set to the two-channel neural network model to output a predicted relationship class for the training set;
calculating a loss function cross entropy according to the prediction relation category output by the two-channel neural network model and the actual relation category of the training set;
and minimizing the loss function through an optimization algorithm to train the two-channel neural network model.
4. The neural network relationship extraction method of claim 1, wherein the inputting the two clauses into the first channel, and performing feature extraction through a convolutional neural network model to obtain first extraction information comprises:
carrying out vector representation on words of the two clauses;
processing the vector representations of the two clauses through a convolutional layer, a pooling layer, and a non-linear layer;
and fusing the vector representations of the two processed clauses through a hidden layer to obtain first extraction information.
5. The neural network relationship extraction method of claim 1, wherein the inputting the sentence into the second channel, and performing feature extraction through a long-short term memory network model to obtain second extraction information comprises:
performing word segmentation operation on the sentence to obtain L word segments;
performing word vector mapping on the L participles respectively to obtain an L×d-dimensional word vector matrix, wherein each of the L participles is mapped to a d-dimensional word vector;
and inputting the d-dimensional word vectors of the L participles into the long-short term memory network model in sequence for feature extraction to obtain second extraction information.
6. The neural network relationship extraction method of claim 1, wherein the performing dependency parsing on the sentence to obtain two clauses of the sentence comprises:
performing dependency syntax analysis on the sentence through a syntax analyzer to generate a dependency syntax analysis tree;
and finding two shortest dependency paths between target entities from the dependency parsing tree, wherein the two shortest paths represent two clauses of the sentence.
7. A neural network relationship extraction system, comprising:
the establishing module is used for constructing a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel;
the obtaining module is used for obtaining sentences to be processed;
the shortest path generating module is used for carrying out dependency syntax analysis on the sentence to obtain two clauses of the sentence;
the first extraction module is used for inputting the two clauses into the first channel and performing feature extraction through a convolutional neural network model to obtain first extraction information;
the second extraction module is used for inputting the sentence into the second channel and performing feature extraction through a long short-term memory network model to obtain second extraction information;
and the classification module is used for weighting and summarizing the first extraction information and the second extraction information through an attention mechanism to obtain the final extraction features of the sentence, and inputting the final extraction features into a softmax layer of the dual-channel neural network model to finish classification of the relationship categories among the target entities.
8. The neural network relationship extraction system of claim 7, wherein the first extraction module is further configured to:
carrying out vector representation on words of the two clauses;
processing the vector representations of the two clauses through a convolutional layer, a pooling layer, and a non-linear layer;
and fusing the vector representations of the two processed clauses through a hidden layer to obtain first extraction information.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, is adapted to implement the steps of the neural network relationship extraction method of any one of claims 1 to 6.
10. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the neural network relationship extraction method of any one of claims 1 to 6.
CN202010752459.2A 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium Active CN111898364B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010752459.2A CN111898364B (en) 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium
PCT/CN2020/111513 WO2021174774A1 (en) 2020-07-30 2020-08-26 Neural network relationship extraction method, computer device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010752459.2A CN111898364B (en) 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN111898364A true CN111898364A (en) 2020-11-06
CN111898364B CN111898364B (en) 2023-09-26

Family

ID=73182595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010752459.2A Active CN111898364B (en) 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium

Country Status (2)

Country Link
CN (1) CN111898364B (en)
WO (1) WO2021174774A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528326A (en) * 2020-12-09 2021-03-19 维沃移动通信有限公司 Information processing method and device and electronic equipment
CN112560481A (en) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 Statement processing method, device and storage medium

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113901758A (en) * 2021-09-27 2022-01-07 南京邮电大学 Relation extraction method for knowledge graph automatic construction system
CN114065702B (en) * 2021-09-28 2024-07-12 南京邮电大学 Event detection method integrating entity relation and event element
CN113626608B (en) * 2021-10-12 2022-02-15 深圳前海环融联易信息科技服务有限公司 Semantic-enhancement relationship extraction method and device, computer equipment and storage medium
CN113990473B (en) * 2021-10-28 2022-09-30 上海昆亚医疗器械股份有限公司 Medical equipment operation and maintenance information collecting and analyzing system and using method thereof
CN114417846B (en) * 2021-11-25 2023-12-19 湘潭大学 Entity relation extraction method based on attention contribution degree
CN114385817A (en) * 2022-01-14 2022-04-22 平安科技(深圳)有限公司 Entity relationship identification method and device and readable storage medium
CN114417824B (en) * 2022-01-14 2024-09-10 大连海事大学 Chapter-level relation extraction method and system based on dependency syntax pre-training model
CN114861630B (en) * 2022-05-10 2024-07-19 马上消费金融股份有限公司 Training method and device for information acquisition and related model, electronic equipment and medium
CN116386895B (en) * 2023-04-06 2023-11-28 之江实验室 Epidemic public opinion entity identification method and device based on heterogeneous graph neural network
CN116108206B (en) * 2023-04-13 2023-06-27 中南大学 Combined extraction method of financial data entity relationship and related equipment
CN117054396B (en) * 2023-10-11 2024-01-05 天津大学 Raman spectrum detection method and device based on double-path multiplicative neural network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710932A (en) * 2018-12-22 2019-05-03 北京工业大学 A kind of medical bodies Relation extraction method based on Fusion Features
CN109783618A (en) * 2018-12-11 2019-05-21 北京大学 Pharmaceutical entities Relation extraction method and system based on attention mechanism neural network
CN110020671A (en) * 2019-03-08 2019-07-16 西北大学 The building of drug relationship disaggregated model and classification method based on binary channels CNN-LSTM network
CN110598001A (en) * 2019-08-05 2019-12-20 平安科技(深圳)有限公司 Method, device and storage medium for extracting association entity relationship
US20200065374A1 (en) * 2018-08-23 2020-02-27 Shenzhen Keya Medical Technology Corporation Method and system for joint named entity recognition and relation extraction using convolutional neural network
CN111428481A (en) * 2020-03-26 2020-07-17 南京搜文信息技术有限公司 Entity relation extraction method based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108563653B (en) * 2017-12-21 2020-07-31 清华大学 Method and system for constructing knowledge acquisition model in knowledge graph
WO2019220128A1 (en) * 2018-05-18 2019-11-21 Benevolentai Technology Limited Graph neutral networks with attention

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200065374A1 (en) * 2018-08-23 2020-02-27 Shenzhen Keya Medical Technology Corporation Method and system for joint named entity recognition and relation extraction using convolutional neural network
CN109783618A (en) * 2018-12-11 2019-05-21 北京大学 Pharmaceutical entities Relation extraction method and system based on attention mechanism neural network
CN109710932A (en) * 2018-12-22 2019-05-03 北京工业大学 A kind of medical bodies Relation extraction method based on Fusion Features
CN110020671A (en) * 2019-03-08 2019-07-16 西北大学 The building of drug relationship disaggregated model and classification method based on binary channels CNN-LSTM network
CN110598001A (en) * 2019-08-05 2019-12-20 平安科技(深圳)有限公司 Method, device and storage medium for extracting association entity relationship
CN111428481A (en) * 2020-03-26 2020-07-17 南京搜文信息技术有限公司 Entity relation extraction method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHANG Xiaobin et al.: "Entity relation extraction based on the fusion of CNN and bidirectional LSTM", Chinese Journal of Network and Information Security, vol. 14, no. 9, pages 44-51 *
ZHENG Yuting: "Research and implementation of entity relation extraction for academic literature", China Master's Theses Full-text Database, Information Science and Technology (monthly), no. 06, 15 June 2020 (2020-06-15), pages 138-1278 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528326A (en) * 2020-12-09 2021-03-19 维沃移动通信有限公司 Information processing method and device and electronic equipment
CN112528326B (en) * 2020-12-09 2024-01-02 维沃移动通信有限公司 Information processing method and device and electronic equipment
CN112560481A (en) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 Statement processing method, device and storage medium
CN112560481B (en) * 2020-12-25 2024-05-31 北京百度网讯科技有限公司 Statement processing method, device and storage medium

Also Published As

Publication number Publication date
CN111898364B (en) 2023-09-26
WO2021174774A1 (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CN111898364B (en) Neural network relation extraction method, computer equipment and readable storage medium
US11501182B2 (en) Method and apparatus for generating model
WO2020140386A1 (en) Textcnn-based knowledge extraction method and apparatus, and computer device and storage medium
CN109918560B (en) Question and answer method and device based on search engine
CN113011189A (en) Method, device and equipment for extracting open entity relationship and storage medium
US20220171936A1 (en) Analysis of natural language text in document
WO2022048363A1 (en) Website classification method and apparatus, computer device, and storage medium
CN113761893B (en) Relation extraction method based on mode pre-training
CN109710921B (en) Word similarity calculation method, device, computer equipment and storage medium
CN113254649B (en) Training method of sensitive content recognition model, text recognition method and related device
CN111401065A (en) Entity identification method, device, equipment and storage medium
CN111523420A (en) Header classification and header list semantic identification method based on multitask deep neural network
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
CN113821635A (en) Text abstract generation method and system for financial field
CN110852066B (en) Multi-language entity relation extraction method and system based on confrontation training mechanism
CN115470232A (en) Model training and data query method and device, electronic equipment and storage medium
CN112395425A (en) Data processing method and device, computer equipment and readable storage medium
CN114529903A (en) Text refinement network
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN115017879A (en) Text comparison method, computer device and computer storage medium
CN110232328A (en) A kind of reference report analytic method, device and computer readable storage medium
CN114065769A (en) Method, device, equipment and medium for training emotion reason pair extraction model
CN114372454B (en) Text information extraction method, model training method, device and storage medium
CN117332064A (en) Instruction generation and database operation method, electronic device and computer storage medium
US11481389B2 (en) Generating an executable code based on a document

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant