CN111898364B - Neural network relation extraction method, computer equipment and readable storage medium

Neural network relation extraction method, computer equipment and readable storage medium

Info

Publication number
CN111898364B
CN111898364B (application number CN202010752459.2A)
Authority
CN
China
Prior art keywords
neural network
extraction
channel
sentence
clauses
Prior art date
Legal status
Active
Application number
CN202010752459.2A
Other languages
Chinese (zh)
Other versions
CN111898364A (en)
Inventor
回艳菲
王健宗
程宁
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010752459.2A priority Critical patent/CN111898364B/en
Priority to PCT/CN2020/111513 priority patent/WO2021174774A1/en
Publication of CN111898364A publication Critical patent/CN111898364A/en
Application granted granted Critical
Publication of CN111898364B publication Critical patent/CN111898364B/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a neural network relation extraction method, which comprises the following steps: constructing a two-channel neural network model; acquiring a sentence to be processed; performing dependency syntactic analysis on the sentence to obtain two clauses of the sentence; inputting the two clauses into a first channel, and extracting features through a CNN model to obtain first extraction information; inputting the sentence into a second channel, and extracting features through an LSTM model to obtain second extraction information; and carrying out weighted summarization on the first extraction information and the second extraction information through an attention mechanism to obtain the final extraction features of the sentence, and inputting the final extraction features into a softmax layer to finish classifying the relation categories among the target entities. The application also provides a computer device and a computer readable storage medium. The neural network relation extraction method provided by the application can perform relation extraction with high quality.

Description

Neural network relation extraction method, computer equipment and readable storage medium
Technical Field
The embodiment of the application relates to the technical field of artificial intelligence, in particular to a neural network relation extraction method, computer equipment and a readable storage medium.
Background
Relation extraction is an important research topic in the field of natural language processing. As an important subtask, relation extraction aims to extract a predefined semantic relation between two entities in a text; the extracted relation and entities can be organized into triplet form and stored in a graph database, and, based on related knowledge graph technology, relation extraction is applied to medical knowledge graphs. Constructing a high-quality medical knowledge graph depends on extracting high-quality relations. Therefore, relation extraction is particularly important for medical knowledge graphs.
Conventional relation extraction tasks generally use a single convolutional neural network (Convolutional Neural Network, CNN) or recurrent neural network (Recurrent Neural Network, RNN) to produce a vectorized representation of sentences, but the relation extraction quality of such a single model is not high.
Disclosure of Invention
The purpose of the present application is to provide a neural network relationship extraction method that can perform relationship extraction with high quality.
In order to solve the above technical problems, an embodiment of the present application provides a neural network relationship extraction method, including: constructing a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel; acquiring sentences to be processed; performing dependency syntax analysis on the sentence, generating a dependency syntax analysis tree, and finding out two shortest dependency paths between target entities from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentence; inputting the two clauses into the first channel, and extracting features through a convolutional neural network model to obtain first extraction information; inputting the sentence into the second channel, and extracting features through a long-short-term memory network model to obtain second extraction information; and carrying out weighted summarization on the first extraction information and the second extraction information through an attention mechanism to obtain final extraction features of the sentences, and inputting the final extraction features into a softmax layer to finish classifying the relationship categories among the target entities.
Preferably, the method further comprises: and training the constructed two-channel neural network model.
Preferably, the training the constructed dual-channel neural network model includes: acquiring a training set; inputting the training set to the two-channel neural network model to output a predicted relationship class of the training set; calculating the cross entropy of the loss function according to the predicted relationship category output by the two-channel neural network model and the actual relationship category of the training set; and minimizing the loss function through an optimization algorithm to train the two-channel neural network model.
Preferably, the two clauses are input into the first channel, feature extraction is performed through a convolutional neural network model, and first extraction information is obtained, including: carrying out vector representation on the words of the two clauses; processing the vector representations of the two clauses through a convolution layer, a pooling layer and a nonlinear layer; and fusing the vector representations of the two clauses after processing through the hidden layer to obtain first extraction information.
Preferably, the inputting the sentence into the second channel, performing feature extraction through a long-short term memory network model, to obtain second extraction information, including: performing word segmentation operation on the sentence to obtain L word segments; respectively carrying out word vector mapping on the L word segments to obtain an L x d-dimensional word vector matrix, wherein the L word segments are mapped into a d-dimensional word vector; and sequentially inputting the d-dimensional word vectors of the L segmented words into the long-short-term memory network model to perform feature extraction, so as to obtain the second extraction information.
Preferably, the performing dependency syntax analysis on the sentence to obtain two clauses of the sentence includes: performing dependency syntax analysis on the sentence through a syntax analyzer to generate a dependency syntax analysis tree; and finding out two shortest dependency paths between target entities from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentence.
The embodiment of the application also provides a neural network relation extraction system, which comprises: the building module is used for building a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel; the acquisition module is used for acquiring sentences to be processed; the shortest path generation module is used for carrying out dependency syntactic analysis on the sentence to obtain two clauses of the sentence; the first extraction module is used for inputting the two clauses into the first channel and extracting features through a convolutional neural network model to obtain first extraction information; the second extraction module is used for inputting the sentence into the second channel and extracting features through a long-short-term memory network model to obtain second extraction information; and the classification module is used for carrying out weighted summarization on the first extraction information and the second extraction information through an attention mechanism to obtain the final extraction features of the sentence, and inputting the final extraction features into a softmax layer of the dual-channel neural network model to finish classifying the relation categories among the target entities.
Preferably, the first extraction module is further configured to: carrying out vector representation on the words of the two clauses; processing the vector representations of the two clauses through a convolution layer, a pooling layer and a nonlinear layer; and fusing the vector representations of the two clauses after processing through the hidden layer to obtain first extraction information.
The embodiment of the application also provides computer equipment, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the neural network relationship extraction method described above.
The embodiment of the application also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the neural network relation extraction method. Compared with the prior art, the two-channel neural network relation extraction model provided by the embodiment of the application fuses the key information of the shortest dependency path and uses the original sentence to retain the information that the dependency path cannot capture. Local information is extracted through CNN, and the most useful information is gathered through a pooling layer, so that excellent local information is extracted and the key information for classifying the relation is retained. The LSTM is used to extract information from the whole sentence and can extract excellent representations from long sentences. The information extracted by the two models is weighted and summarized through an attention mechanism to obtain the final representation of the current sentence, which contains the information contributing most to the relation classification; finally, classification is performed through a softmax layer, achieving the effect of extracting the predefined relations.
Drawings
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings.
Fig. 1 is a flowchart of a neural network relationship extraction method according to a first embodiment of the present application;
FIG. 2 is a schematic diagram of a two-channel neural network model in a first embodiment of the present application;
FIG. 3 is a schematic diagram of feature extraction of a convolutional neural network model in a first embodiment of the present application;
FIG. 4 is a program block diagram of a neural network relationship extraction system according to a second embodiment of the present application;
fig. 5 is a schematic structural view of a computer device according to a third embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that numerous technical details are set forth in the various embodiments in order to provide a better understanding of the present application; the claimed application may nevertheless be practiced without these specific details, and with various changes and modifications based on the following embodiments.
The method of the present application can be applied to scenarios such as smart government affairs, smart city management, smart communities, smart security, smart logistics, smart medical treatment, smart education, smart environmental protection and smart traffic, thereby promoting the construction of smart cities.
The first embodiment of the present application relates to a neural network relation extraction method. The core of this embodiment is a two-channel neural network relation extraction model: a convolutional neural network (Convolutional Neural Networks, CNN) model is adopted to extract the key information of the shortest dependency path, and a Long Short-Term Memory (LSTM) model is used to extract the information of the whole sentence, from which excellent representations can be extracted even for long sentences; a pooling layer gathers the most useful information, so that excellent local information is extracted and the key information for classifying the relation is retained. The features extracted by the two models are weighted and summarized through an attention mechanism to obtain the final vector representation of the current sentence, which is finally classified through a softmax layer, achieving the effect of extracting the predefined relations. The implementation details of the neural network relation extraction method of this embodiment are described below; the description is provided only for ease of understanding and is not necessary for implementing this embodiment.
A schematic flow chart of a neural network relation extraction method in this embodiment is shown in fig. 1, and the method is applied to a computer device.
In this embodiment, the execution sequence of the steps in the flowchart shown in fig. 1 may be changed, and some steps may be omitted according to different requirements.
Step S101: a two-channel neural network model is constructed, the two-channel neural network model comprising a first channel and a second channel.
Relation extraction aims at extracting a predefined semantic relation between two entities from a text; the extracted relation and entities can be organized into triplet form and stored in a graph database, and, based on related knowledge graph technology, relation extraction is applied to medical knowledge graphs. Constructing a high-quality medical knowledge graph depends on extracting high-quality relations. Therefore, relation extraction is particularly important for medical knowledge graphs.
In the prior art, convolutional neural networks (Convolutional Neural Network, CNN) and recurrent neural networks (Recurrent Neural Network, RNN) are the two main architecture types of deep neural networks (Deep Neural Networks, DNN). In a conventional relation extraction task, sentences are generally vectorized through a single model such as a CNN or an RNN, but a single model may fail to grasp the emphasis. Especially in the medical field, sentence lengths vary widely and no single model adapts to all of them; moreover, not all words in a sentence contribute to the entity relation, since some sentences are too lengthy. The relation extraction quality of a single model is therefore not high. In this embodiment, in order to achieve higher-quality relation extraction, the present application performs relation extraction by establishing a two-channel neural network model.
In this embodiment, after the two-channel neural network model is built, the built two-channel neural network model is further trained. Specifically, training the constructed two-channel neural network model includes: obtaining a training set, inputting the training set into the two-channel neural network model to output a predicted relation type of the training set, calculating a loss function cross entropy according to the predicted relation type output by the two-channel neural network model and an actual relation type of the training set, and minimizing the loss function cross entropy through an optimization algorithm to train the two-channel neural network model.
In this embodiment, the training set is a set of training data of a set of known actual relationship categories.
In this embodiment, the loss function is the cross entropy:

$J(\theta) = -\sum_{s \in S} \sum_{i=1}^{t} r_i \log p(r_i \mid s, \theta)$

wherein $r_i$ denotes the probability value that the relation between the entities is the i-th category, s is a single sentence, S is the sentence set, and t is the number of entity relation categories. When the entity relation is of the i-th class, $r_i$ is 1; otherwise it is 0.
For example, assuming that there are 10 relation categories among entities and the entity relation between entity 1 and entity 2 is the 3rd category, then $r_3 = 1$.
In this embodiment, the two-channel neural network model is trained by minimizing the cross entropy of the loss function, which in effect minimizes the gap between the predicted relation category and the actual relation category.
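For illustration, a minimal training sketch follows, assuming PyTorch; the Adam optimizer, the batch format and the function names are assumptions added here for illustration, since the embodiment only specifies cross-entropy minimization through an optimization algorithm.

```python
# A minimal training sketch (assumptions: PyTorch, Adam as the optimization
# algorithm, and a two-channel model that returns one logit per relation class).
import torch
import torch.nn as nn

def train_model(model, train_loader, num_epochs=10, lr=1e-3):
    criterion = nn.CrossEntropyLoss()  # cross entropy between predicted and actual categories
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(num_epochs):
        for clauses, sentence, labels in train_loader:  # labels: actual relation categories
            logits = model(clauses, sentence)   # clauses -> CNN channel, sentence -> LSTM channel
            loss = criterion(logits, labels)    # loss between predicted and actual category
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                    # minimize the loss to train the model
```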
Step S102: and acquiring sentences to be processed.
In this embodiment, the sentence to be processed refers to a sentence that needs to be extracted from a relationship.
Step S103: and carrying out dependency syntactic analysis on the sentence to obtain two clauses of the sentence.
Specifically, performing dependency syntax analysis on the sentence through a syntax analyzer to generate a dependency syntax analysis tree; and finding out two shortest dependency paths between target entities from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentence.
Representative examples of open-source Chinese syntactic analyzers are the Stanford Parser and the Berkeley Parser: the Stanford Parser is based on a factored model, and the Berkeley Parser is based on an unlexicalized analysis model. In this embodiment, the sentence is subjected to dependency syntax analysis by a syntax analyzer (Stanford Parser). Of course, in other embodiments, other syntactic analyzers may be used for dependency syntactic analysis, which is not limited in this embodiment.
In this embodiment, dependency syntax analysis is performed on the sentence to be processed to obtain the two shortest dependency paths between the target entities. Since the shortest dependency path contains the trunk of the expressed relation pattern while excluding insignificant modifier chunks, the two shortest paths are in fact the two clauses of the sentence. Furthermore, by acquiring the shortest dependency path of the sentence, the information that contributes most to the relation classification can be captured, and excellent local information in the sentence can be extracted.
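As an illustrative sketch (not part of the embodiment), the shortest dependency path between two target entities can be extracted roughly as follows, assuming spaCy as the syntactic analyzer (the embodiment uses the Stanford Parser), networkx for path search, and single-token entities; the pipeline name is an assumption.

```python
# Sketch of shortest-dependency-path extraction (assumptions: spaCy with a
# Chinese pipeline instead of the Stanford Parser, networkx for path search,
# and single-token target entities for brevity).
import spacy
import networkx as nx

nlp = spacy.load("zh_core_web_sm")  # any dependency parser would do here

def shortest_dependency_path(text: str, entity1: str, entity2: str):
    doc = nlp(text)
    # Build an undirected graph over the dependency parse tree.
    graph = nx.Graph((token.i, child.i) for token in doc for child in token.children)
    e1 = next(t.i for t in doc if t.text == entity1)
    e2 = next(t.i for t in doc if t.text == entity2)
    path = nx.shortest_path(graph, source=e1, target=e2)
    # Splitting this path at the common head word yields the two clauses
    # (two sub-paths) described in the embodiment.
    return [doc[i].text for i in path]
```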
Step S104: and inputting the two clauses into the first channel, and extracting features through a CNN model to obtain first extraction information.
In this embodiment, fig. 2 is a schematic diagram of a dual-channel model in a preferred embodiment of the application. As shown in fig. 2, in the first channel (the left channel), two clauses (two shortest dependency paths) are respectively input to a CNN model to perform feature extraction, so as to obtain first extraction information. Specifically, in the first channel, inputting the two clauses to a CNN model for feature extraction, to obtain first extraction information, including:
Fig. 3 is a schematic diagram of feature extraction by the convolutional neural network model according to the first embodiment of the present application. As shown in fig. 3, the two clauses are represented as vectors, and the vector representations of the two clauses are processed through a convolution layer, a pooling layer and a nonlinear layer. Specifically, a convolution operation is performed at the convolution layer, and max pooling (maxpooling) is used at the pooling layer, where the value taken after the max-pooling operation along a given row direction is the maximum value in that row. Further, as shown in fig. 2, the processed vector representations of the two clauses are fused through a hidden layer to obtain the first extraction information $s_1$.
In this embodiment, the sentence to be processed is subjected to dependency syntax analysis by the Stanford Parser to obtain the two shortest dependency paths (the two clauses of the sentence to be processed), and the two clauses are then represented as vectors. Specifically, for word i on the two shortest dependency paths, a vector $V_e^i = [W_e^i, P_e^i]$ is defined, wherein $W_e^i$ is a word vector (word embedding); by using pre-trained word vectors, the word vector of the corresponding word can be looked up directly in an open-source word vector file. $P_e^i$ is a position vector (position embedding). Position embedding refers to the relative distances of the current word to the two entity words of the clause on the shortest dependency path, $P_e^i = [d_1^i, d_2^i]$, wherein $d_j^i$ is the relative distance of the current word to the j-th entity. Through this vector representation, a clause x with n words is converted into the sequence $\{V_e^1, V_e^2, \ldots, V_e^n\}$. Given a clause x, its vector representation is obtained by formula one: $Z_n = [V_e^1, V_e^2, \ldots, V_e^n]$, wherein n denotes the number of words contained in the clause, $V_e^i$ denotes the vector of the i-th word in clause x, and $Z_n$ denotes the vector representation of the clause. The word vectors and the position vectors are then used as the input of the convolutional neural network, and the processed vector representation of the clause is obtained through the convolution layer, the pooling layer and the nonlinear layer. Specifically, the vector representations of the two clauses are processed through the convolution layer, the pooling layer and the nonlinear layer according to formula two: $[r_x^i]_j = \max[f(W_1 Z_n + b_1)]_j$, wherein $[r_x^i]_j$ denotes the j-th component of the vector $r_x^i$, i.e. the value after the max-pooling operation along a given row direction, $W_1$ is the weight matrix of the convolution layer, f is the nonlinear tanh transformation, $Z_n$ is the vector representation of the clause, and $b_1$ is a bias value (a constant). In this embodiment, the convolution layer performs the convolution operation on $Z_n$ with sliding windows of window size k.
Further, as shown in fig. 2, after the two clauses have passed through the convolution layer, the pooling layer and the nonlinear layer, the vector representations of the two clauses are fused through a hidden layer (Hidden Layer) to obtain the first extraction information, which is also the final sentence representation $s_1$ of the two clauses in the CNN model. In other words, the sentence representation $s_1$ is the final feature vector representing the sentence to be processed; the information in the sentence to be processed is contained in this feature vector.
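A minimal sketch of this CNN channel follows, assuming PyTorch; the dimensions (input dimension of the concatenated word and position embeddings, number of filters, window size k) and the class name are illustrative assumptions, not values from the embodiment.

```python
# Sketch of the CNN channel: convolution + tanh (formula two), max pooling,
# and a hidden layer that fuses the two clause vectors into s1.
import torch
import torch.nn as nn

class CNNChannel(nn.Module):
    def __init__(self, in_dim=60, n_filters=230, window=3, out_dim=100):
        super().__init__()
        # Convolution over the clause matrix Z_n with window size k (formula two).
        self.conv = nn.Conv1d(in_dim, n_filters, kernel_size=window, padding=1)
        # Hidden layer fusing the two clause vectors into the first extraction info s1.
        self.hidden = nn.Linear(2 * n_filters, out_dim)

    def encode_clause(self, z):                        # z: (batch, seq_len, in_dim)
        h = torch.tanh(self.conv(z.transpose(1, 2)))   # nonlinear layer: tanh(W1*Zn + b1)
        return h.max(dim=2).values                     # max pooling along each row

    def forward(self, clause1, clause2):
        r1, r2 = self.encode_clause(clause1), self.encode_clause(clause2)
        return torch.tanh(self.hidden(torch.cat([r1, r2], dim=1)))  # s1
```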
Step S105: and inputting the sentence into the second channel, and extracting the characteristics through an LSTM model to obtain second extraction information.
Specifically, a word segmentation operation is performed on the sentence to obtain L segmented words, and word vector mapping is performed on the L segmented words respectively to obtain an L x d-dimensional word vector matrix, wherein each of the L segmented words is mapped into a d-dimensional word vector; the d-dimensional word vectors of the L segmented words are then input sequentially into the long-short-term memory network model for feature extraction, so as to obtain the second extraction information.
In this embodiment, as shown in the right part of fig. 2, the vector representation of the complete sentence to be processed is input into the LSTM model for feature extraction to obtain the second extraction information, that is, the final sentence representation $s_2$ of the sentence to be processed in the LSTM model. The vector representation of the complete sentence is obtained in the same way as illustrated in fig. 3: the sentence to be processed is represented as vectors, the vector representation is processed through the convolution layer, the pooling layer and the nonlinear layer, and the result is then input into the LSTM model to obtain the representation of the sentence to be processed; the specific calculation of the convolution layer, the pooling layer and the nonlinear layer is the same as in step S104 and is not repeated here.
In this embodiment, the vector representation of the complete sentence to be processed is the word embedding representation of the complete sentence.
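A minimal sketch of the LSTM channel follows, again assuming PyTorch; the vocabulary size, dimensions and class name are illustrative assumptions.

```python
# Sketch of the LSTM channel: the L segmented words are mapped to an L x d
# word vector matrix and fed to the LSTM in order; the final hidden state is s2.
import torch
import torch.nn as nn

class LSTMChannel(nn.Module):
    def __init__(self, vocab_size=50000, d=100, out_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)        # word vector mapping: L -> L x d
        self.lstm = nn.LSTM(d, out_dim, batch_first=True)

    def forward(self, token_ids):                       # token_ids: (batch, L)
        vectors = self.embed(token_ids)                 # (batch, L, d)
        _, (h_n, _) = self.lstm(vectors)                # feed the L word vectors in order
        return h_n[-1]                                  # s2: final sentence representation
```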
Step S106: and carrying out weighted summarization on the first extraction information and the second extraction information through an intent mechanism to obtain final extraction features of the sentences, and inputting the final extraction features into a softmax layer of the dual-channel neural network model to finish classifying the relationship types among the target entities.
In the prior art, the CNN model has great advantages in processing short sentences, while the LSTM model learns long-distance information more easily and performs excellently in extracting features of long sentences. In this embodiment, in the first channel, the two clauses of the sentence to be processed are input into the CNN model for feature extraction to obtain the final sentence representation $s_1$ of the two clauses in the CNN model; in the second channel, the sentence is input into the LSTM model for feature extraction to obtain the final sentence representation $s_2$ of the sentence to be processed in the LSTM model. Then, in order to handle long and short sentences simultaneously, and considering that the shortest dependency path sometimes omits information, this embodiment adopts an attention mechanism to perform weighted summarization of the first extraction information and the second extraction information to obtain the final extraction features of the sentence, that is, the final vector representation s of the sentence. Specifically, the first extraction information and the second extraction information are weighted and summarized through formulas three to five.
the formula III is:
wherein alpha is i Weights represented for each sentence final vector s i Vector representation after feature extraction for sentences, e.g. s as described above 1 ,s 2
The fourth formula is:
wherein t is i Is a query-based method, which consists of sentences s i Matching with a prediction relation r;
the fifth formula is:
t i =s i ar, wherein s i Vector representation after feature extraction for sentences, e.g. first extraction information s 1 Or second extraction information s 2 A is a weighted diagonal matrix, r is a query vector associated with the relationship r, and is a vector representation of the relationship r.
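A sketch of this attention-based weighted summarization (formulas three to five) follows, assuming PyTorch; storing A by its diagonal and learning r as a parameter are implementation assumptions.

```python
# Sketch of the attention mechanism: t_i = s_i A r (formula five), softmax
# weights alpha_i (formula four), and s = sum_i alpha_i s_i (formula three).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, dim=100):
        super().__init__()
        self.A = nn.Parameter(torch.ones(dim))   # diagonal of the weighted diagonal matrix A
        self.r = nn.Parameter(torch.randn(dim))  # query vector representing relation r

    def forward(self, s1, s2):                   # s1, s2: (batch, dim)
        s = torch.stack([s1, s2], dim=1)         # (batch, 2, dim)
        t = (s * self.A) @ self.r                # formula five: t_i = s_i A r
        alpha = torch.softmax(t, dim=1)          # formula four: attention weights
        return (alpha.unsqueeze(-1) * s).sum(1)  # formula three: s = sum_i alpha_i s_i
```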
Further, in this embodiment, the conditional probability is defined by the softmax layer, and is calculated as:

$p(r \mid S, \theta) = \frac{\exp(o_r)}{\sum_{k=1}^{n_r} \exp(o_k)}$

wherein $n_r$ denotes the number of predefined relations, and o is the output vector of formula six below.
In this embodiment, after classification by the softmax layer, the probability values of all the output relation categories are obtained by formula six:

$o = M s + d$

wherein o holds the probability scores of all relation categories, M is the relation matrix representation, and d is a bias vector.
In this embodiment, the output o is essentially a one-dimensional column vector; each number in the column vector represents the probability value of one relation category, i.e. the probability that the target entities have that relation.
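A sketch of this softmax layer (formula six) follows, assuming PyTorch; the number of predefined relations $n_r$ and the class name are illustrative assumptions.

```python
# Sketch of the softmax classification layer: o = M s + d (formula six),
# followed by the conditional probability over the n_r predefined relations.
import torch
import torch.nn as nn

class RelationClassifier(nn.Module):
    def __init__(self, dim=100, n_relations=10):
        super().__init__()
        self.M = nn.Linear(dim, n_relations)  # weight = relation matrix M, bias = vector d

    def forward(self, s):                     # s: final extraction feature of the sentence
        o = self.M(s)                         # scores of all relation categories
        return torch.softmax(o, dim=-1)       # conditional probability p(r | S)
```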
In this embodiment, the representation output by the CNN model and the representation output by the LSTM model are fused by the attention mechanism, and an excellent representation of the current sentence is extracted, so that the finally trained relation extraction model is suitable for both long and short sentences.
Compared with the prior art, the dual-channel neural network model provided by the embodiment of the application fuses the key information of the shortest dependency path and uses the original sentence to retain the information that the dependency path cannot capture. CNN is adopted to extract local information, and the pooling layer gathers the most useful information, so that excellent local information is extracted and the key information for classifying the relation is retained. The LSTM extracts information from the whole sentence and can extract excellent representations from long sentences. The information extracted by the two models is weighted and summarized through the attention mechanism to obtain the final representation of the current sentence, which contains the information contributing most to the relation classification; finally, classification is performed through the softmax layer of the two-channel neural network model, achieving the effect of extracting the predefined relations.
The above method steps are divided for clarity of description; when implemented, they may be combined into one step or split into multiple steps, and the execution sequence of the steps is not limited, as long as the same logical relationship is included, all of which fall within the protection scope of this patent. Adding insignificant modifications to the algorithm or flow, or introducing insignificant designs, without altering the core design of the algorithm and flow, also falls within the protection scope of this patent.
In an exemplary embodiment, the relevant data of the network relation may be uploaded into a blockchain. The corresponding summary information is obtained from the relevant data of the network relation; specifically, the summary information is obtained by hashing the relevant data of the network relation, for example using the SHA-256 algorithm. Uploading the summary information to the blockchain ensures its security and its fair transparency to the user. The user device may download the summary information from the blockchain to verify whether the relevant data of the network relation has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association by cryptographic means, each data block containing a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
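As an illustrative sketch, the summary information could be produced with Python's standard hashlib; the JSON serialization of the relation data and the field names are assumptions for illustration.

```python
# Sketch of producing the SHA-256 summary information before uploading the
# relation data to the blockchain (hashlib and json are standard library).
import hashlib
import json

def summarize(relation_data: dict) -> str:
    # Deterministic serialization so the same data always yields the same digest.
    payload = json.dumps(relation_data, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# e.g. summarize({"entity1": "entity A", "entity2": "entity B", "relation": "treats"})
```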
The second embodiment of the present application relates to a neural network relation extraction system of a computer apparatus. The system may be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to accomplish the embodiments of the present application. A program module in the embodiments of the present application refers to a series of computer program instruction segments capable of performing specified functions; each program module is described in detail below.
As shown in fig. 4, the neural network relation extraction system 400 may include an establishment module 410, an acquisition module 420, a shortest path generation module 430, a first extraction module 440, a second extraction module 450, a classification module 460 and a training module 470, wherein:
the building module 410 is configured to build a two-channel neural network model, where the two-channel neural network model includes a first channel and a second channel.
An obtaining module 420, configured to obtain a sentence to be processed.
And the shortest path generation module 430 is configured to perform dependency syntax analysis on the sentence to obtain two clauses of the sentence.
Specifically, the dependency syntax analysis is carried out on the sentences through the syntax analyzer, a dependency syntax analysis tree is generated, and two shortest dependency paths between target entities are found out from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentences.
The first extraction module 440 is configured to input the two clauses into the first channel, and perform feature extraction through a convolutional neural network (Convolutional Neural Networks, CNN) model to obtain first extraction information.
And a second extraction module 450, configured to input the sentence into the second channel, and perform feature extraction through a Long Short-Term Memory (LSTM) model, so as to obtain second extraction information.
The classification module 460 is configured to perform weighted summarization on the first extraction information and the second extraction information through an attention mechanism to obtain final extraction features of the sentence, and input the final extraction features to a softmax layer of the dual-channel neural network model to complete classification of the relationship category between the target entities.
Further, the neural network relation extraction system 400 further includes:
and the training module 470 is used for training the constructed two-channel neural network model.
The training module 470 is further configured to: acquiring a training set; inputting the training set to the two-channel neural network model to output a predicted relationship class of the training set; calculating the cross entropy of the loss function according to the predicted relationship category output by the two-channel neural network model and the actual relationship category of the training set; and minimizing the loss function through an optimization algorithm to train the two-channel neural network model.
The first extraction module 440 is further configured to: represent the words of the two clauses as vectors; process the vector representations of the two clauses through a convolution layer, a pooling layer and a nonlinear layer; and fuse the processed vector representations of the two clauses through the hidden layer to obtain the first extraction information.
A third embodiment of the present application relates to a computer device. Referring to fig. 5, a hardware architecture diagram of a computer device suitable for the neural network relation extraction method of the present application is shown.
In the present embodiment, the computer device 500 is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or stored instructions. For example, it may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including a stand-alone server or a server cluster composed of a plurality of servers), etc. As shown in fig. 5, the computer device 500 includes at least, but is not limited to: the memory 510, the processor 520, and the network interface 530, which may be communicatively linked to each other by a system bus. Wherein:
the memory 510 includes at least one type of readable storage medium including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), random Access Memory (RAM), static Random Access Memory (SRAM), read Only Memory (ROM), electrically Erasable Programmable Read Only Memory (EEPROM), programmable Read Only Memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 510 may be an internal storage module of the computer device 500, such as a hard disk or memory of the computer device 400. In other embodiments, the memory 510 may also be an external storage device of the computer device 500, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like, which are provided on the computer device 500. Of course, the memory 510 may also include both internal memory modules of the computer device 500 and external memory devices. In this embodiment, the memory 510 is typically used to store an operating system and various types of application software installed on the computer device 500, such as program code for a blockchain secure transaction method. In addition, the memory 510 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 520 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip in some embodiments. The processor 520 is generally used to control the overall operation of the computer device 500, such as performing control and processing related to data interaction or communication with the computer device 500. In this embodiment, the processor 520 is configured to execute the program code or process data stored in the memory 510.
The network interface 530 may include a wireless network interface or a wired network interface, and is typically used to establish a communication link between the computer device 500 and other computer devices. For example, the network interface 530 is used to connect the computer device 500 to an external terminal through a network, establishing a data transmission channel and a communication link between the computer device 500 and the external terminal. The network may be a wireless or wired network such as an Intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, etc.
It should be noted that fig. 5 only shows a computer device having components 510-530, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead.
In this embodiment, the neural network relation extraction method stored in the memory 510 may also be divided into one or more program modules and executed by one or more processors (the processor 520 in this embodiment) to complete the present application.
The memory 510 stores instructions executable by the at least one processor 520 for execution by the at least one processor 520 to enable the at least one processor 520 to perform the steps of the neural network relationship extraction method described above.
The embodiment of the application also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the neural network relation extraction method.
That is, it will be understood by those skilled in the art that all or part of the steps in implementing the methods of the embodiments described above may be implemented by a program stored in a storage medium, where the program includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods of the embodiments of the application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples of carrying out the application and that various changes in form and details may be made therein without departing from the spirit and scope of the application.

Claims (7)

1. A neural network relationship extraction method, comprising:
constructing a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel;
acquiring sentences to be processed;
performing dependency syntactic analysis on the sentence to obtain two clauses of the sentence, wherein the clauses are the shortest dependency paths; the step of performing dependency syntax analysis on the sentence to obtain two clauses of the sentence includes: performing dependency syntax analysis on the sentence through a syntax analyzer to generate a dependency syntax analysis tree; finding out two shortest dependency paths between target entities from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentence;
inputting the two clauses into the first channel, and extracting features through a convolutional neural network model to obtain first extraction information;
inputting the sentence into the second channel, and extracting features through a long-short-term memory network model to obtain second extraction information;
the first extraction information and the second extraction information are weighted and summarized through an attention mechanism to obtain final extraction features of the sentences, and the final extraction features are input into a softmax layer of the dual-channel neural network model to finish classifying relationship categories among the target entities;
inputting the two clauses into the first channel, and performing feature extraction through a convolutional neural network model to obtain first extraction information, wherein the method comprises the following steps of:
carrying out vector representation on the words of the two clauses;
processing the vector representations of the two clauses through a convolution layer, a pooling layer and a nonlinear layer;
and fusing the vector representations of the two clauses after processing through the hidden layer to obtain first extraction information.
2. The neural network relationship extraction method of claim 1, further comprising:
and training the constructed two-channel neural network model.
3. The neural network relationship extraction method according to claim 2, wherein the training the constructed two-channel neural network model includes:
acquiring a training set;
inputting the training set to the two-channel neural network model to output a predicted relationship class of the training set;
calculating the cross entropy of the loss function according to the predicted relationship category output by the two-channel neural network model and the actual relationship category of the training set;
and minimizing the loss function through an optimization algorithm to train the two-channel neural network model.
4. The neural network relation extraction method according to claim 1, wherein the inputting the sentence into the second channel, performing feature extraction through a long-short-term memory network model, and obtaining second extraction information includes:
performing word segmentation operation on the sentence to obtain L word segments;
respectively carrying out word vector mapping on the L word segments to obtain an L x d-dimensional word vector matrix, wherein the L word segments are mapped into a d-dimensional word vector;
and sequentially inputting the d-dimensional word vectors of the L segmented words into the long-short-term memory network model to perform feature extraction, so as to obtain the second extraction information.
5. A neural network relationship extraction system, comprising:
the building module is used for building a two-channel neural network model, wherein the two-channel neural network model comprises a first channel and a second channel;
the acquisition module is used for acquiring sentences to be processed;
the shortest path generation module is used for carrying out dependency syntax analysis on the sentence to obtain two clauses of the sentence, wherein the clauses are the shortest dependency paths; the shortest path generation module is further configured to: performing dependency syntax analysis on the sentence through a syntax analyzer to generate a dependency syntax analysis tree; finding out two shortest dependency paths between target entities from the dependency syntax analysis tree, wherein the two shortest paths represent two clauses of the sentence;
the first extraction module is used for inputting the two clauses into the first channel and extracting features through a convolutional neural network model to obtain first extraction information;
the second extraction module is used for inputting the sentence into the second channel and extracting features through the long-short-term memory network model to obtain second extraction information;
the classification module is used for carrying out weighted summarization on the first extraction information and the second extraction information through an attention mechanism to obtain final extraction features of the sentences, and inputting the final extraction features into a softmax layer of the dual-channel neural network model to finish classifying the relationship types among the target entities;
the first extraction module is further used for carrying out vector representation on the words of the two clauses; processing the vector representations of the two clauses through a convolution layer, a pooling layer and a nonlinear layer; and fusing the vector representations of the two clauses after processing through the hidden layer to obtain first extraction information.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor is adapted to implement the steps of the neural network relation extraction method of any one of claims 1 to 4 when the computer program is executed by the processor.
7. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the neural network relation extraction method of any one of claims 1 to 4.
CN202010752459.2A 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium Active CN111898364B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010752459.2A CN111898364B (en) 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium
PCT/CN2020/111513 WO2021174774A1 (en) 2020-07-30 2020-08-26 Neural network relationship extraction method, computer device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010752459.2A CN111898364B (en) 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN111898364A CN111898364A (en) 2020-11-06
CN111898364B (en) 2023-09-26

Family

ID=73182595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010752459.2A Active CN111898364B (en) 2020-07-30 2020-07-30 Neural network relation extraction method, computer equipment and readable storage medium

Country Status (2)

Country Link
CN (1) CN111898364B (en)
WO (1) WO2021174774A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528326B (en) * 2020-12-09 2024-01-02 维沃移动通信有限公司 Information processing method and device and electronic equipment
CN113626608B (en) * 2021-10-12 2022-02-15 深圳前海环融联易信息科技服务有限公司 Semantic-enhancement relationship extraction method and device, computer equipment and storage medium
CN113990473B (en) * 2021-10-28 2022-09-30 上海昆亚医疗器械股份有限公司 Medical equipment operation and maintenance information collecting and analyzing system and using method thereof
CN114417846B (en) * 2021-11-25 2023-12-19 湘潭大学 Entity relation extraction method based on attention contribution degree
CN114385817A (en) * 2022-01-14 2022-04-22 平安科技(深圳)有限公司 Entity relationship identification method and device and readable storage medium
CN116386895B (en) * 2023-04-06 2023-11-28 之江实验室 Epidemic public opinion entity identification method and device based on heterogeneous graph neural network
CN116108206B (en) * 2023-04-13 2023-06-27 中南大学 Combined extraction method of financial data entity relationship and related equipment
CN117054396B (en) * 2023-10-11 2024-01-05 天津大学 Raman spectrum detection method and device based on double-path multiplicative neural network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710932A (en) * 2018-12-22 2019-05-03 北京工业大学 A kind of medical bodies Relation extraction method based on Fusion Features
CN109783618A (en) * 2018-12-11 2019-05-21 北京大学 Pharmaceutical entities Relation extraction method and system based on attention mechanism neural network
CN110020671A (en) * 2019-03-08 2019-07-16 西北大学 The building of drug relationship disaggregated model and classification method based on binary channels CNN-LSTM network
CN110598001A (en) * 2019-08-05 2019-12-20 平安科技(深圳)有限公司 Method, device and storage medium for extracting association entity relationship
CN111428481A (en) * 2020-03-26 2020-07-17 南京搜文信息技术有限公司 Entity relation extraction method based on deep learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108563653B (en) * 2017-12-21 2020-07-31 清华大学 Method and system for constructing knowledge acquisition model in knowledge graph
EP3794511A1 (en) * 2018-05-18 2021-03-24 BenevolentAI Technology Limited Graph neutral networks with attention
US11574122B2 (en) * 2018-08-23 2023-02-07 Shenzhen Keya Medical Technology Corporation Method and system for joint named entity recognition and relation extraction using convolutional neural network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783618A (en) * 2018-12-11 2019-05-21 北京大学 Pharmaceutical entities Relation extraction method and system based on attention mechanism neural network
CN109710932A (en) * 2018-12-22 2019-05-03 北京工业大学 A kind of medical bodies Relation extraction method based on Fusion Features
CN110020671A (en) * 2019-03-08 2019-07-16 西北大学 The building of drug relationship disaggregated model and classification method based on binary channels CNN-LSTM network
CN110598001A (en) * 2019-08-05 2019-12-20 平安科技(深圳)有限公司 Method, device and storage medium for extracting association entity relationship
CN111428481A (en) * 2020-03-26 2020-07-17 南京搜文信息技术有限公司 Entity relation extraction method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Entity relation extraction based on the fusion of CNN and bidirectional LSTM; Zhang Xiaobin et al.; Chinese Journal of Network and Information Security; Vol. 14, No. 9; pp. 44-51 *
Research and implementation of entity relation extraction from academic literature; Zheng Yuting; China Master's Theses Full-text Database, Information Science and Technology (monthly); 2020-06-15 (No. 06); pp. I138-1278 *

Also Published As

Publication number Publication date
WO2021174774A1 (en) 2021-09-10
CN111898364A (en) 2020-11-06

Similar Documents

Publication Publication Date Title
CN111898364B (en) Neural network relation extraction method, computer equipment and readable storage medium
Zhou et al. A comprehensive survey on pretrained foundation models: A history from bert to chatgpt
CN110263324B (en) Text processing method, model training method and device
CN111444340A (en) Text classification and recommendation method, device, equipment and storage medium
US20210018332A1 (en) Poi name matching method, apparatus, device and storage medium
CN111753189A (en) Common characterization learning method for few-sample cross-modal Hash retrieval
CN107766555A (en) Image search method based on the unsupervised type cross-module state Hash of soft-constraint
CN113761893B (en) Relation extraction method based on mode pre-training
CN113127632B (en) Text summarization method and device based on heterogeneous graph, storage medium and terminal
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
CN111523420A (en) Header classification and header list semantic identification method based on multitask deep neural network
CN114528898A (en) Scene graph modification based on natural language commands
CN112395425A (en) Data processing method and device, computer equipment and readable storage medium
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN116821373A (en) Map-based prompt recommendation method, device, equipment and medium
CN116662488A (en) Service document retrieval method, device, equipment and storage medium
CN113254649B (en) Training method of sensitive content recognition model, text recognition method and related device
CN112598039A (en) Method for acquiring positive sample in NLP classification field and related equipment
CN116226404A (en) Knowledge graph construction method and knowledge graph system for intestinal-brain axis
CN115470232A (en) Model training and data query method and device, electronic equipment and storage medium
WO2022127124A1 (en) Meta learning-based entity category recognition method and apparatus, device and storage medium
CN115759254A (en) Question-answering method, system and medium based on knowledge-enhanced generative language model
WO2022141855A1 (en) Text regularization method and apparatus, and electronic device and storage medium
CN114398980A (en) Cross-modal Hash model training method, encoding method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant