WO2018218707A1

WO2018218707A1 - Neural network and attention mechanism-based information relation extraction method

Info

Publication number: WO2018218707A1
Application number: PCT/CN2017/089137
Authority: WO
Inventors: 刘兵; 周勇; 张润岩; 王重秋
Original assignee: 中国矿业大学
Priority date: 2017-05-27
Filing date: 2017-06-20
Publication date: 2018-12-06
Also published as: CN107239446B; CN107239446A

Abstract

The present invention relates to the fields of recurrent neural networks, natural language processing and information analysis combined with attention mechanisms, and provided thereby are a neural network and an attention mechanism-based information relation extraction method, which are used for solving the problems of large workload and low generalization in existing information analysis systems that are mostly based on artificially constructed knowledge bases. The specific implementation of the method comprises a training phase and an application phase. In the training phase, first a user dictionary is constructed and word vectors are trained, a training set is constructed from within a historical information database, a corpus is pre-processed, and then neural network model training is conducted; and in the application phase, information is obtained, the information is pre-processed, an information relation extraction task may be automatically completed while supporting user dictionary expansion and error correction determination, the result of which is added to a training neural network model having an incremental training set. The information relation extraction method can find relationship between pieces of information, provide a basis for event context integration and decision making, and has a wide range of application value.

Description

An Information Relationship Extraction Method Based on Neural Network and Attention Mechanism

Technical field

The invention relates to the field of cyclic neural network, natural language processing and intelligence analysis combined with attention mechanism, in particular to a method for extracting intelligence relations by using a bidirectional cyclic neural network combining attention mechanism.

Background technique

With the development of various technologies in the information age, the amount of information data has exploded. Nowadays, the information acquisition and storage technology is relatively mature, and many technical improvements are still needed in the fields of intelligence analysis and key information extraction of massive intelligence data. Intelligence data has the characteristics of strong theme, high timeliness and rich information. Conducting relationship analysis on intelligence under the same theme, integrating intelligence according to time and space, causality, etc., can complete tasks such as description of subject events and multi-angle analysis, and provide a basis for final decision-making. Therefore, it is of great practical significance to find the relationship between intelligence and integrate the context of events.

At present, the classification of intelligence is based on the standard knowledge framework or model paradigm, that is, the domain experts extract key features of intelligence, organize the expression forms of various relationship categories, and build a knowledge base to complete the relationship classification. The information analysis system of patent CN201410487829.9, based on the standard knowledge framework, uses computer to accumulate knowledge, integrate scattered information, comprehensive historical information to complete the identification of intelligence relations, and finally provide the thinking brain map of command decision-making and assist decision-making. The information association processing method of patent CN201610015796, based on the domain knowledge model, extracts the feature vocabulary by means of naming body recognition and domain dictionary, and trains the topic relevance degree of the feature words with the topic map model, thereby establishing the topic template of the event, and completing the template by using the template. Relevance judgment of intelligence.

In addition, some studies have used neural network methods of machine learning for relationship extraction. The patent CN201610532802.6, the patent CN201610393749.6 and the patent CN201610685532.2 respectively use a multi-layer convolutional neural network, a convolutional neural network combined with distance supervision, and a convolutional neural network combined with attention to perform relationship extraction.

Based on the above research status, the following methods are mainly used for the relationship extraction method of intelligence: First, the intelligence analysis based on the knowledge framework or model requires a large number of historical cases with wide coverage, and requires domain experts with professional knowledge to build the knowledge base. That is, the workload is large and the completed framework may have weak generalization ability. Secondly, the neural network-based method mostly stays on the theoretical method research, and needs some adjustment in practical applications, and now more convolutional nerves are used. The network has a poor effect on the grasp of the whole sentence context. The accuracy of the non-special processing is not as good as that of the bi-directional RNN.

Summary of the invention

OBJECT OF THE INVENTION In order to overcome the deficiencies in the prior art, the present invention provides an intelligent, accurate, and effective presentation of an intelligence relationship extraction method.

Technical Solution: In order to achieve the above object, the technical solution adopted by the present invention is:

An information relationship extraction method based on neural network and attention mechanism includes the following steps:

Step 1) Construct a user dictionary, and the neural network system has an initial user dictionary.

Step 2) training the word vector, extracting the text data from the database related to the field, and using the user dictionary training word vector library obtained in step 1), mapping the text words in the text data into numerical vector data;

Step 3) construct a training set, extract an intelligence pair from the historical intelligence database, and convert each pair of information into an intelligence relationship triplet training data <intelligence 1, intelligence 2, relationship> using the word vector library obtained in step 2);

Step 4) corpus preprocessing, first using the user dictionary obtained in step 1) to perform corpus preprocessing on the training data obtained in step 3), namely word segmentation and naming body recognition; word segmentation and naming body recognition are implemented using existing automation tools, The final result of processing is to transform each piece of information into a vector of words of behavioral words, a matrix of information words listed as the length of the sentence, and mark the position of the named body, and the information is in groups of two;

Step 5) training the neural network model, adding the matrix obtained in step 4) to the neural network for training, and obtaining a relation extraction neural network model; wherein the neural network training method comprises the following steps:

Step 5-1) Input the information word matrix into the bidirectional length and time memory network Bi-LSTM unit to extract the information of the integrated context, and input the positive sequence statement and the reverse order statement into the two long and short time memory network LSTM units respectively; when calculating the current time, Iteratively considers the role of the last moment; the combined expression of the hidden layer calculation and feature extraction of the LSTM unit is as follows:

i _t =σ(W _xi x _t +W _hi h _t-1 +W _ci c _t-1 +b _i )

f _t =σ(W _xf x _t +W _hf h _t-1 +W _cf c _t-1 +b _f )

g _t =tanh(W _xc x _t +W _hc h _t-1 +W _cc c _t-1 +b _c )

c _t =i _t g _t +f _t c _t-1

o _t =σ(W _xo x _t +W _ho h _t-1 +W _co c _t +b _o )

h _t =o _t ·tanh(c _t )

Where: x _t represents the information word matrix obtained in step 4) at time t, and is also the input matrix of the neural network;

i _t represents the output result of the input gate at time t;

f _t represents the output result of the forgetting gate at time t;

g _t represents the output of the integrated input at time t;

c _t and c _t-1 respectively represent memory flow states at time t and time t-1;

o _t represents the output of the output gate at time t;

h _t and h _t-1 respectively represent hidden layer information at time t and t-1, that is, feature output extracted by neural network;

σ() represents the sigmoid activation function, and tanh() represents the hyperbolic tangent activation function;

W _xi , W _hi , W _{ci ,} etc. represent the weight parameters to be trained, the former of which indicates the multiplied input quantity, and the latter represents the calculation part of the subordinate;

b _i , b _{f ,} etc. represent the offset parameters to be trained, and the corner marks indicate the associated calculation part;

The parameters W _xi , W _hi , W _ci , b _i , b _f to be trained here are randomly initialized first, and then automatically corrected during the training process, and finally the final value is obtained with the training of the neural network;

Step 5-2) weighting the two long and short time memory network LSTM unit outputs of the positive sequence statement and the reverse order statement as the final output of the neural network;

o _final =W _fw h _fw +W _bw h _bw

Where h _fw represents the output of the LSTM network processing the positive sequence statement, and W _fw represents its corresponding weight to be trained;

h _bw represents the output of the LSTM network that processes the reverse statement, and W _bw represents the corresponding weight to be trained;

o _final represents the final output of the neural network;

The weights W _fw and W _bw to be trained here are also randomly initialized first, and then automatically corrected during the training process, and finally the final value is obtained with the training of the neural network;

Step 5-3) Calculate the attention distribution of the whole sentence according to the neural network output of the corresponding position of the named body, and output according to the whole sentence of the allocation combined neural network, and the formula is as follows:

α=softmax(tanh(E)·W _a ·O _final )

r=α·O _final

Where α is the attention distribution matrix, r is the output of the targeted integration of the intelligence statement; E is the output of the circulatory neural network at the naming position, using the fixed window mode, selecting the pre-K important naming splicing into the naming Body matrix

O _final is the output of the cyclic neural network, which is shaped like [o ₁ , o ₂ , o ₃ ... o _n ], where o ₁ , o ₂ , o ₃ ... o _n are the outputs of the corresponding nodes of the neural network, and n is the word of intelligence. Quantity

W _a is the weight matrix to be trained, softmax() is the softmax classifier function, and tanh() is the hyperbolic tangent activation function; the weight W _a to be trained here is also randomly initialized first, and then automatically corrected during the training, and finally Will get the final value with the training of the neural network;

Step 5-4) For the feature information r of the two intelligences, input the fully connected layer after splicing, and finally use the softmax classifier to classify the relationship, and use the gradient descent method to train the obtained prediction result;

Step 6) Information acquisition, input two sets of text information, a batch can have multiple groups, wherein the text information is a clear text in the center, if it is new information, you can choose to expand the user dictionary obtained in step 1). ;

Step 7) text preprocessing, through the trained word segmentation tool in step 4), the word vector library and step obtained in step 2) The character recognition tool used in step 4) converts the text information of the original whole sentence in step 6) into an information value matrix; wherein each line is a vector representation of each word, and a matrix represents an intelligence, and is marked with The location of the named body;

Step 8) Relationship extraction, extracting the two or two sets of intelligence matrix processed in step 7), extracting the neural network model from the training step of the input step 5), performing automatic relationship extraction, and finally obtaining the relationship category of each group of intelligence; Each group of intelligence relationship categories;

Step 9) incrementally updating, determining whether the relationship type of each group of information obtained in step 8) is correct or not, and if the judgment is correct, performing visual display in combination with the information acquired in step 6) and the corresponding relationship category, if the judgment is wrong, Select to add the correctly determined intelligence relationship triplet training data to the training set in step 3), repeat steps 4) and 5), and retrain the modified neural network model.

Further, the optional solution in step 1) is to construct a professional domain user dictionary, which refers to a proper noun in a specific field and is difficult to recognize from the field; other common words can be automatically recognized; The proprietary vocabulary can be selected from the historical intelligence database. If the vocabulary extracted from the historical intelligence database is a proprietary vocabulary, the user only needs to add the known proprietary vocabulary to the user dictionary of the neural network system.

Preferably, the training set is constructed by extracting sufficient intelligence from the historical intelligence database and constructing the intelligence relationship triplet training data, requiring more than 5,000; specifically determining the relationship category first, the relationship category including the antecedents and consequences, the theme and the details Description, location contact, time contact, according to different relationships, the information is divided into three groups such as <intelligence 1, intelligence 2, relationship>.

Preferably, the text data is extracted from the database related to the domain, combined with the text corpus of the network encyclopedia and the news broadcast, and the word vocabulary is trained by the Google toolkit word2vector to map the text vocabulary into numerical vector data, and the vector data includes the original Semantic information to complete the conversion of natural language to numerical representation.

Preferably: Chinese is semantically a word unit. For the input of the whole sentence, word segmentation processing is required first; in the process of word segmentation, the professional domain user dictionary is added.

Preferably, the information in the step of acquiring information should be a center-defined text within a short period of 100 words; the relationship extraction is for a binary relationship, that is, the processing object is a pair of information, so the input of the LSTM unit of the long and short memory network should be two A group of textual intelligence.

Preferred: word segmentation and naming recognition are implemented using existing automated tools such as nlpir and stanford-ner.

Preferably: a user dictionary of the professional domain is used when the automated tool identifies the word segmentation and the naming body.

Compared with the prior art, the invention has the following beneficial effects:

The invention uses a bidirectional cyclic neural network, combines the naming entity to pay attention to each word in the intelligence, extracts the feature information in the word vector representation of the intelligence, and further classifies the extracted feature information by using the softmax classifier, thereby completing the relationship of the intelligence. Extract the task. Two-way cyclic neural network has powerful feature extraction ability on text data, which can overcome the traditional The problem of large amount of artificial feature extraction in the knowledge base method and the weak generalization ability caused by subjectivity; the use of two-way long-term memory network can effectively consider the complete context information, and the attention weight of the named entity can be based on these narrative centers. The word automatically assigns the importance of each word in the intelligence, which makes the relationship extraction method of the present invention have higher accuracy than other neural network methods.

DRAWINGS

The specific embodiments of the present invention are further described in detail below with reference to the accompanying drawings, wherein:

1 is a flow chart of a method for extracting an intelligence relationship based on a neural network and an attention mechanism according to the present invention.

2 is a schematic diagram of a bidirectional cyclic neural network used in an information relation extraction method based on a neural network and an attention mechanism according to the present invention.

FIG. 3 is a schematic diagram of an attention mechanism used in an information relationship extraction method based on a neural network and an attention mechanism according to the present invention.

detailed description

The invention will be further clarified with reference to the accompanying drawings and specific embodiments, which are to be construed as illustrative only and not to limit the scope of the invention. Modifications in the form of the price are all within the scope defined by the claims appended hereto.

As shown in Figure 1, an information relationship extraction method based on neural network and attention mechanism is divided into two phases: training phase and application phase.

(1) Training stage:

As shown in Figure 1, in the training phase, the system needs to first construct a user dictionary (optional), a training word vector, and then construct a training set from the historical intelligence database, perform corpus preprocessing, and finally perform training on the relationship extraction neural network model.

a. Building a user dictionary: The neural network system has an initial user dictionary, and the vocabulary is extracted from the historical intelligence database. If the vocabulary extracted from the historical intelligence database is a proprietary vocabulary, the user only needs to add the known proprietary vocabulary to the neural network. A user dictionary of the network system can build a dictionary of proprietary vocabulary users. Professional domain user lexicon refers to proper nouns in a particular field and is difficult to identify from the field; other common vocabulary can be automatically identified;

b. Training word vector: extract text data from the database related to the domain, combine the text corpus with network encyclopedia, news broadcast, etc., and use the user dictionary obtained in step (a) a) to train the word vector library through the Google toolkit word2vector, and the text The vocabulary is mapped into numerical vector data, and the vector data contains the original semantic information, thereby completing the transformation from natural language to numerical representation.

c. Build a training set: extract more than 5,000 intelligence pairs from the historical intelligence database, and construct the intelligence relationship triplet training data using the word vector library obtained in step (a) b). Specifically, you need to first determine the relationship category, such as the cause and effect, The theme is related to the detailed description, location, and time. According to different relationships, the information is divided into three groups: <intelligence 1, intelligence 2, relationship>.

d. Corpus preprocessing: firstly use the user dictionary obtained in step a) to perform corpus preprocessing on the triplet training data obtained in step (1) c), namely segmentation and naming recognition, word segmentation and naming recognition using existing Automated tool implementations such as nlpir and stanford-ner. In this process, the user dictionary of the professional field will be used, and finally the accuracy rate of 95% or more will be achieved. The final result of the preprocessing is to convert each piece of information in the triplet training data into an action matrix of the behavior word vector dimension, which is listed as the length of the sentence, and mark the position of the named body, and the information is in groups of two.

e, neural network model training: step (a) d) pre-processed two or two sets of intelligence matrix are subjected to the following neural network training process: the step (a) d) pre-processed information matrix input relationship extraction neural The network is trained. First, the information word matrix is input into the bidirectional long-term memory network Bi-LSTM to extract the information of the comprehensive context. The formula of the LSTM network is as follows:

i _t =σ(W _xi x _t +W _hi h _t-1 +W _ci c _t-1 +b _i )

f _t =σ(W _xf x _t +W _hf h _t-1 +W _cf c _t-1 +b _f )

g _t =tanh(W _xc x _t +W _hc h _t-1 +W _cc c _t-1 +b _c )

c _t =i _t g _t +f _t c _t-1

o _t =σ(W _xo x _t +W _ho h _t-1 +W _co c _t +b _o )

h _t =o _t ·tanh(c _t )

Where: x _t represents the matrix obtained in step 4) at time t (corresponding to the input of the t-th word vector), and is also the input matrix of the neural network;

i _t represents the output result of the input gate at time t (corresponding to the input of the t-th word vector), which determines the proportion of the information recorded by the memory stream;

f _t represents the output result of the forgetting gate at time t (corresponding to the input of the t-th word vector), which determines the proportion of the memory stream based on the information, forgetting the memory data;

g _t represents the output of the input integration at time t (corresponding to the t-th word vector input), which integrates the information input this time;

c _t and c _t-1 respectively represent the memory flow state at time t (corresponding to the t-th word vector input) and t-1 time (corresponding to the t-1th word vector input);

o _t represents the output result of the output gate at time t (corresponding to the t-th word vector input), which determines the proportion of the output data from the memory stream;

h _t and h _t-1 respectively represent the t-time (corresponding to the t-th word vector input) and the t-1 time (corresponding to the t-1th word vector input) hidden layer information, that is, the feature output extracted by the neural network;

W _xi , W _hi , W _{ci ,} etc. represent the weight parameters to be trained, the former of which indicates the multiplied input quantity, and the latter represents the calculation part to which it belongs;

b _i , b _{f ,} etc. represent the offset parameters to be trained, and the corner marks indicate the associated calculation part.

As shown in Figure 2, the specific implementation of the bidirectional cyclic neural network is to train two cyclic neural networks. The inputs are positive sequence statements and reverse order statements. In the figure, w1, w2, w3... are a series of words (statements). Two neural networks are input in positive and negative order, respectively. After that, the output of the two is spliced as the final output of the neural network, that is, the corresponding formulas of o1, o2, and o3 in the figure are as follows:

o _final =W _fw h _fw +W _bw h _bw

Where h _fw represents the output of the neural network processing the positive sequence statement, and W _fw represents its corresponding weight to be trained;

h _bw represents the output of the neural network processing the reverse sentence, and W _bw represents the corresponding weight to be trained;

o _final represents the final output of the neural network.

As shown in FIG. 3, the attention distribution of the whole sentence is calculated according to the neural network output of the corresponding position of the naming body, and the whole sentence output according to the allocation combination neural network is as follows:

α=softmax(tanh(E)·W _a ·O _final )

r=α·O _final

W _a is the weight matrix to be trained, softmax() is the softmax classifier function, and tanh() is the hyperbolic tangent activation function;

The weight W _a to be trained here is also randomly initialized first, and then automatically corrected during the training process, and finally the final value is obtained with the training of the neural network;

For the characteristic information r of the two intelligences, the splicing is input to the fully connected layer, and finally the softmax classifier is used for the relationship classification, and the obtained prediction result is trained by the gradient descent method;

(2) Application stage:

As shown in FIG. 1, the information relationship extraction method of the present invention includes four steps of intelligence acquisition, text preprocessing, relationship extraction, and incremental update in the application phase:

a. Information acquisition, the information should be a clear text in the center of a short paragraph of 100 words. Relationship extraction is for binary relations, that is, the processing object is a pair of intelligence, so the input of the system should be two sets of text information, and one batch can have multiple groups. As shown in Figure 1, if it is new information, you can choose to expand the steps (a) a) user dictionary to adapt to the new vocabulary in the new information.

b, text preprocessing, through the step (a) d) trained word segmentation tool, step (a) b) word vector library and step (a) d) used in the name recognition tool, the steps (two The text information of the original whole sentence of two groups in a) is converted into a numerical matrix, where each line is a vector representation of each word, and a matrix represents an intelligence, and the position of the named body is marked.

c. Relationship extraction, step (2) b) processed two pairs of information matrix to the input step (a) e) trained relationship extract neural network model, automatic relationship extraction, and finally get each group of information Relationship category.

d, incremental update, as shown in Figure 1, the system supports correcting the error judgment, judging the relationship type of each group of information obtained in step (2) c), if the judgment is correct, then combine with step (2) a) The information and the corresponding relationship categories are visualized. If the judgment is wrong, you can choose to add the correctly judged intelligence relationship triplet training data to the training set in step (a)c), repeating steps (a) d) and steps. (a) e), retraining the modified neural network model.

The above description is only a preferred embodiment of the present invention, and it should be noted that those skilled in the art can also make several improvements and retouchings without departing from the principles of the present invention. It should be considered as the scope of protection of the present invention.

Claims

An information relationship extraction method based on a neural network and a attention mechanism, characterized in that it comprises the following steps:

Step 1) Construct a user dictionary, and the neural network system has an initial user dictionary.

Step 2) training the word vector, extracting the text data from the database related to the field, and using the user dictionary training word vector library obtained in step 1), mapping the text words in the text data into numerical vector data;

Step 3) construct a training set, extract an intelligence pair from the historical intelligence database, and convert each pair of information into an intelligence relationship triplet training data <intelligence 1, intelligence 2, relationship> using the word vector library obtained in step 2);

Step 4) corpus preprocessing, first using the user dictionary obtained in step 1) to perform corpus preprocessing on the training data obtained in step 3), namely word segmentation and naming body recognition; word segmentation and naming body recognition are implemented using existing automation tools, The final result of processing is to transform each piece of information into a vector of words of behavioral words, a matrix of information words listed as the length of the sentence, and mark the position of the named body, and the information is in groups of two;

Step 5) training the neural network model, adding the matrix obtained in step 4) to the neural network for training, and obtaining a relation extraction neural network model; wherein the neural network training method comprises the following steps:

Step 5-1) Input the information word matrix into the bidirectional length and time memory network Bi-LSTM unit to extract the information of the integrated context, and input the positive sequence statement and the reverse order statement into the two long and short time memory network LSTM units respectively; when calculating the current time, Iteratively considers the role of the last moment; the combined expression of the hidden layer calculation and feature extraction of the LSTM unit is as follows:

i t =σ(W xi x t +W hi h t-1 +W ci c t-1 +b i )

f t =σ(W xf x t +W hf h t-1 +W cf c t-1 +b f )

g t =tanh(W xc x t +W hc h t-1 +W cc c t-1 +b c )

c t =i t g t +f t c t-1

o t =σ(W xo x t +W ho h t-1 +W co c t +b o )

h t =o t ·tanh(c t )

Where: x t represents the information word matrix obtained in step 4) at time t, and is also the input matrix of the neural network;

i t represents the output result of the input gate at time t;

f t represents the output result of the forgetting gate at time t;

g t represents the output of the integrated input at time t;

c t and c t-1 respectively represent memory flow states at time t and time t-1;

o t represents the output of the output gate at time t;

h t and h t-1 respectively represent hidden layer information at time t and t-1, that is, feature output extracted by neural network;

σ() represents the sigmoid activation function, and tanh() represents the hyperbolic tangent activation function;

W xi , W hi , W ci , etc. represent the weight parameters to be trained, the former of which indicates the multiplied input quantity, and the latter represents the calculation part of the subordinate;

b i , b f , etc. represent the offset parameters to be trained, and the corner marks indicate the associated calculation part;

The parameters W xi , W hi , W ci , b i , b f to be trained here are randomly initialized first, and then automatically corrected during the training process, and finally the final value is obtained with the training of the neural network;

Step 5-2) weighting the two long and short time memory network LSTM unit outputs of the positive sequence statement and the reverse order statement as the final output of the neural network;

o final =W fw h fw +W bw h bw

Where h fw represents the output of the LSTM network processing the positive sequence statement, and W fw represents its corresponding weight to be trained;

h bw represents the output of the LSTM network that processes the reverse statement, and W bw represents the corresponding weight to be trained;

o final represents the final output of the neural network;

The weights W fw and W bw to be trained here are also randomly initialized first, and then automatically corrected during the training process, and finally the final value is obtained with the training of the neural network;

Step 5-3) Calculate the attention distribution of the whole sentence according to the neural network output of the corresponding position of the named body, and output according to the whole sentence of the allocation combined neural network, and the formula is as follows:

α=softmax(tanh(E)·W a ·O final )

r=α·O final

Where α is the attention distribution matrix, r is the output of the targeted integration of the intelligence statement; E is the output of the circulatory neural network at the naming position, using the fixed window mode, selecting the pre-K important naming splicing into the naming Body matrix; O final is the output of the cyclic neural network, such as [o 1 , o 2 , o 3 ... o n ], where o 1 , o 2 , o 3 ... o n are the outputs of the corresponding nodes of the neural network, n is The number of words in intelligence;

W a is the weight matrix to be trained, softmax() is the softmax classifier function, and tanh() is the hyperbolic tangent activation function; the weight W a to be trained here is also randomly initialized first, and then automatically corrected during the training, and finally Will get the final value with the training of the neural network;

Step 5-4) For the feature information r of the two intelligences, input the fully connected layer after splicing, and finally use the softmax classifier to classify the relationship, and use the gradient descent method to train the obtained prediction result;

Step 6) Information acquisition, input two sets of text information, a batch can have multiple groups, wherein the text information is a clear text in the center, if it is new information, you can choose to expand the user dictionary obtained in step 1). ;

Step 7) text preprocessing, through the trained word segmentation tool in step 4), the word vector library and step obtained in step 2) The character recognition tool used in step 4) converts the text information of the original whole sentence in step 6) into an information value matrix; wherein each line is a vector representation of each word, and a matrix represents an intelligence, and is marked with The location of the named body;

Step 8) Relationship extraction, extracting the two or two sets of intelligence matrix processed in step 7), extracting the neural network model from the training step of the input step 5), performing automatic relationship extraction, and finally obtaining the relationship category of each group of intelligence;

Step 9) incrementally updating, determining whether the relationship type of each group of information obtained in step 8) is correct or not, and if the judgment is correct, performing visual display in combination with the information acquired in step 6) and the corresponding relationship category, if the judgment is wrong, Select to add the correctly determined intelligence relationship triplet training data to the training set in step 3), repeat steps 4) and 5), and retrain the modified neural network model.
The method for extracting an intelligence relationship based on a neural network and an attention mechanism according to claim 1, wherein: the optional solution in step 1) is to construct a professional domain user dictionary, and the professional domain user dictionary refers to a specific field. Words with nouns that are more difficult to identify in the field; other common vocabulary can be automatically identified; the proprietary vocabulary can be selected from the historical intelligence database. If the vocabulary extracted from the historical intelligence database is a proprietary vocabulary, the user only A known proprietary vocabulary needs to be added to the user dictionary of the neural network system.
The method for extracting an intelligence relationship based on a neural network and an attention mechanism according to claim 1, wherein the training set is constructed by extracting sufficient information from the historical intelligence database to construct an intelligence relationship triplet training data. More than 5,000 are required; specifically, the relationship category is first determined. The relationship categories include antecedents and consequences, topics and details, location linkages, and time linkages. According to different relationships, intelligence dichotomy is formed as <intelligence 1, intelligence 2, relationship> The triplet.
The method for extracting intelligence relationship based on neural network and attention mechanism according to claim 1, wherein: extracting text data from a database related to the domain, combining text corpus of network encyclopedia and news broadcast, and using Google tools The word2vector training word vector library maps the text vocabulary into numerical vector data, and the vector data contains the original semantic information, thereby completing the transformation from natural language to numerical representation.
The method for extracting an intelligence relationship based on a neural network and an attention mechanism according to claim 1, wherein: Chinese is semantically a word unit, and for the input of the entire sentence, word segmentation processing is required first; In, join the professional domain user dictionary.
The method for extracting an intelligence relationship based on a neural network and an attention mechanism according to claim 1, wherein the information in the step of acquiring information should be a center-defined text within a short period of 100 words; the relationship extraction is for two The meta-relationship, that is, the processing object is a pair of intelligence, so the input of the LSTM unit of the long and short memory network should be two sets of text information.
The method for extracting intelligence relationship based on neural network and attention mechanism according to claim 1, wherein word segmentation and nominal recognition are implemented using existing automated tools, such as nlpir and stanford-ner.
The method for extracting an intelligence relationship based on a neural network and an attention mechanism according to claim 7, wherein a user dictionary of a professional domain is used when the automated tool recognizes the word segmentation and the naming body.