CN110334354A - A Chinese relation extraction method - Google Patents
A Chinese relation extraction method - Download PDF / Info
- Publication number: CN110334354A (application CN201910626307.5A)
- Authority: CN (China)
- Prior art keywords: word, vector, hidden state, word sense, level
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/2414 - Pattern recognition; classification techniques based on distances to training or reference patterns; smoothing the distance, e.g. radial basis function networks [RBFN]
- G06F40/211 - Handling natural language data; natural language analysis; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/295 - Handling natural language data; recognition of textual entities; named entity recognition
- G06N3/049 - Neural networks; architecture; temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08 - Neural networks; learning methods
Abstract
The present invention provides a Chinese relation extraction method comprising the following steps. S1: data preprocessing: apply multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense. S2: feature encoding: with a bidirectional long short-term memory network as the basic framework, obtain the character hidden state vectors and the word hidden state vectors from the three-level distributed vectors, and from these obtain the final character-level hidden state vectors. S3: relation classification: learn the final character-level hidden state vectors, and fuse them into one sentence-level hidden state vector using a character-level attention mechanism. The method effectively solves the problems of word-segmentation ambiguity and polysemy, greatly improves the performance of the model on the relation extraction task, and increases the accuracy and robustness of Chinese relation extraction.
Description
Technical field
The present invention relates to the field of computer application technology, and more particularly to a Chinese relation extraction method.
Background art
Natural language processing is a subfield of artificial intelligence and an interdisciplinary area between computer science and computational linguistics. Relation extraction is one of the basic tasks of natural language processing: given a sentence and a pair of annotated entities (usually nouns), its purpose is to accurately identify the relationship between the entities. Relation extraction technology can be used to construct large-scale knowledge graphs, i.e. semantic networks composed of concepts, entities, entity attributes, and entity relations that form a structured representation of the real world. The construction of large-scale knowledge graphs can provide comprehensive, structured external knowledge to artificial intelligence systems and thereby enable more powerful applications.
Traditional relation extraction methods suffer from certain problems: they typically rely on manually engineered features, so the models only run effectively on small, specific datasets, and this practice limits the development of the relation extraction field. At the same time, because of the dependence on manual features, traditional relation extraction techniques have poor robustness and extensibility, so the models cannot generalize across different datasets and corpora.
In recent years, relation extraction based on deep learning has made great progress and has many advantages over traditional relation extraction methods. First, thanks to the application of neural networks, these models can automatically learn the semantic features of the text, avoiding features hand-designed for specific data, reducing labor costs, and achieving better results. Such neural network models provide an end-to-end solution that minimizes human involvement. At the same time, models based on neural networks also possess higher robustness and can learn the mapping from the ever-changing features of natural language to the output.
However, even deep learning models face some unresolved problems. For a language without natural delimiters, such as Chinese, current mainstream approaches operate at either the character level or the word level. The former feeds the input sequence into the model with the character as the basic unit; this makes it hard for the model to learn word-level features in the semantic space, causes information loss, and reduces the accuracy of the relation extraction task. The latter first segments the input sequence with a word-segmentation tool and then feeds it into the model; although this method considers word-level information, segmentation ambiguities arise easily because it relies on an external tool, so the errors of the external tool propagate through the whole model, limiting the development of the relation extraction task. Moreover, neither character-level nor word-level models account for the polysemy of words: representing a word's features with only a single word vector cannot handle word-sense ambiguity, which lowers the model's ceiling.
Summary of the invention
To solve the problems of word-segmentation ambiguity and polysemy in prior-art Chinese relation extraction, the present invention provides a Chinese relation extraction method.
To solve the above problems, the technical solution adopted by the present invention is as follows:
A Chinese relation extraction method includes the following steps. S1: data preprocessing: apply multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense. S2: feature encoding: with a bidirectional long short-term memory network as the basic framework, obtain the character hidden state vectors and the word hidden state vectors from the three-level distributed vectors, and from these obtain the final character-level hidden state vectors. S3: relation classification: learn the final character-level hidden state vectors, and fuse them into one sentence-level hidden state vector using a character-level attention mechanism.
Preferably, extracting the character-level distributed vectors includes extracting character vectors and position vectors. The character vectors: for the character-level sequence s = {c_1, ..., c_M} of the text of the given input data, with M characters in total, the word2vec method maps each character c_i to a character vector v_i^c ∈ R^{d_c}, where c_i denotes the i-th character, v_i^c is the character vector of the i-th character, R is the real space, and d_c is the dimension of the character vector. The position vectors express the relative positions p_i^1 and p_i^2 of character c_i with respect to the two entities P_1 and P_2, where p_i^1 is computed as follows:

p_i^1 = i - b_1, if i < b_1;  0, if b_1 ≤ i ≤ e_1;  i - e_1, if i > e_1

where b_1 and e_1 are the start and end positions of the first entity P_1, and p_i^2 is computed in the same way as p_i^1. p_i^1 and p_i^2 are converted into the corresponding position vectors v_i^{p_1} and v_i^{p_2}, which express the position features of the character-level sequence; d_p denotes the dimension of the position vectors.

The final character-level distributed vector is the character vector concatenated with the two position vectors, namely x_i = [v_i^c ; v_i^{p_1} ; v_i^{p_2}], where x_i ∈ R^d and d = d_c + 2·d_p is the total dimension after the character vector and the position vectors are concatenated.

The character-level sequence of the text of the input data then becomes x = {x_1, ..., x_M}.
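The character-plus-position input representation described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the character vectors and the position-embedding table are random stand-ins for embeddings that word2vec pre-training would supply, and the entity spans are assumed to be given as (begin, end) index pairs.

```python
import numpy as np

def relative_position(i, b, e):
    """Relative position of character index i to an entity spanning [b, e]."""
    if i < b:
        return i - b
    if i > e:
        return i - e
    return 0  # inside the entity span

def char_representation(M, ent1, ent2, d_c=4, d_p=2, seed=0):
    """Build x_i = [char_vec ; pos_vec_to_e1 ; pos_vec_to_e2] for each character."""
    rng = np.random.default_rng(seed)
    char_vecs = rng.normal(size=(M, d_c))        # stand-ins for word2vec vectors
    # position-embedding table indexed by (offset + M), so offsets in [-M, M] fit
    pos_table = rng.normal(size=(2 * M + 1, d_p))
    reps = []
    for i in range(M):
        p1 = relative_position(i, *ent1)
        p2 = relative_position(i, *ent2)
        reps.append(np.concatenate([char_vecs[i], pos_table[p1 + M], pos_table[p2 + M]]))
    return np.stack(reps)                        # shape (M, d_c + 2*d_p)
```

Each row of the result has dimension d = d_c + 2·d_p, matching the concatenation described above.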
Preferably, extracting the word-level distributed vectors includes: for the text of the given input data, with character-level sequence s = {c_1, ..., c_M} and word-level sequence s = {w_1, ..., w_M}, a word is denoted by its start position b and end position e, i.e. w_{b,e}; the word2vec method converts the word w_{b,e} into a word-level distributed vector v_{b,e}^w.
Preferably, the sense set Sense(w_{b,e}) of each word w_{b,e} is obtained from the external semantic knowledge base HowNet, and each sense sen_k in the sense set is converted into a word-sense-level distributed vector v_{b,e}^{sen_k}, i.e. Sense(w_{b,e}) = {v_{b,e}^{sen_1}, ..., v_{b,e}^{sen_K}}, where K is the number of senses of the word w_{b,e}.
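The sense lookup can be sketched as below. The sense inventory is a hypothetical stand-in dictionary, not the real HowNet resource, and the vectors are random placeholders for pre-trained sense embeddings.

```python
import numpy as np

# Hypothetical stand-in for HowNet: maps a word to its list of sense labels.
SENSE_INVENTORY = {
    "杜鹃": ["azalea (flower)", "cuckoo (bird)"],  # a classic polysemous word
    "研究": ["to study"],
}

def sense_vectors(word, dim=4, seed=0):
    """Return one distributed vector per sense of `word` (random stand-ins
    for vectors that would be pre-trained over the sense inventory)."""
    senses = SENSE_INVENTORY.get(word, [word])    # fallback: treat as one sense
    rng = np.random.default_rng(seed)
    return {s: rng.normal(size=dim) for s in senses}
```

A word with K senses thus maps to K distinct vectors, as in the Sense(w_{b,e}) set above.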
Preferably, step S2 includes: S21: taking the character as the basic unit, the character-level sequence of the text of the input data is fed directly into the bidirectional long short-term memory network to obtain the character hidden state vectors; S22: for each word in the character-level sequence of the text of the input data that ends at the current character, all of the word's sense vectors are obtained from the external semantic knowledge base HowNet and input into the bidirectional long short-term memory network to compute sense-level hidden state vectors, and all the sense-level hidden state vectors are fused into the word's hidden state vector by the method of weighted summation; S23: a gate unit computes the weights of the character and the word, and the character hidden state vector and the word hidden state vector are fused by the method of weighted summation into the character's final hidden state vector.
Preferably, step S21 includes: for the j-th character of the character-level sequence of the text, the computation performed when it is input into the bidirectional long short-term memory network is:

i_j = σ(W_i x_j + U_i h_{j-1} + b_i)
f_j = σ(W_f x_j + U_f h_{j-1} + b_f)
o_j = σ(W_o x_j + U_o h_{j-1} + b_o)
c̃_j = tanh(W_c x_j + U_c h_{j-1} + b_c)
c_j = f_j ⊙ c_{j-1} + i_j ⊙ c̃_j
h_j = o_j ⊙ tanh(c_j)

where i is the input gate, controlling which information is stored; f is the forget gate, controlling which information is forgotten; o is the output gate, controlling which information is output; c is the cell unit; W, U, and b are the parameters to be learned in the bidirectional long short-term memory network; and the hidden state vector h is jointly determined by the hidden state of the previous time step and the data input at the current time step.
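A single step of a (unidirectional) character LSTM cell with these three gates might look like the sketch below. The packing of all gate parameters into one (4H, D) matrix is a common convention assumed here, not something the patent specifies.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a character-level LSTM cell: input, forget, output gates."""
    W, U, b = params                  # W: (4H, D), U: (4H, H), b: (4H,)
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])               # input gate: what to store
    f = sigmoid(z[H:2*H])             # forget gate: what to forget
    o = sigmoid(z[2*H:3*H])           # output gate: what to output
    g = np.tanh(z[3*H:4*H])           # candidate cell content
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c
```

Running the step over a sequence in both directions and concatenating the hidden states would give the bidirectional variant used by the method.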
Preferably, in step S22 a word w_{b,e} starting with subscript b and ending with subscript e, whose word vector is v_{b,e}^w, is input into the bidirectional long short-term memory network, and the word's cell unit c_{b,e}^w is computed as follows:

[i_{b,e}^w ; f_{b,e}^w ; c̃_{b,e}^w] = [σ ; σ ; tanh](W^w v_{b,e}^w + U^w h_b + b^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b + i_{b,e}^w ⊙ c̃_{b,e}^w

For the k-th sense of the word w_{b,e}, whose representation vector is v_{b,e}^{sen_k}, the computation of a sense-level cell unit c_{b,e}^{sen_k} is analogous:

[i_{b,e}^{sen_k} ; f_{b,e}^{sen_k} ; c̃_{b,e}^{sen_k}] = [σ ; σ ; tanh](W^s v_{b,e}^{sen_k} + U^s h_b + b^s)
c_{b,e}^{sen_k} = f_{b,e}^{sen_k} ⊙ c_b + i_{b,e}^{sen_k} ⊙ c̃_{b,e}^{sen_k}

An additional gate mechanism is introduced to control the contribution of each sense:

g_{b,e}^{sen_k} = σ(W^g v_{b,e}^{sen_k} + U^g c_{b,e}^{sen_k} + b^g)

The word cell state that fuses the multiple senses is computed as the gate-weighted sum

c_{b,e}^w = Σ_{k=1}^{K} α_{b,e}^{sen_k} ⊙ c_{b,e}^{sen_k}

so that all sense cell units are fused into one word cell state. For the character c_e, the computation merges the cell states of all words ending at e with the character's own candidate cell:

c_e = Σ_b α_{b,e} ⊙ c_{b,e}^w + α_e ⊙ c̃_e

where α_{b,e} and α_e are the normalized representations of the gates:

α_{b,e} = exp(g_{b,e}) / (exp(g_e) + Σ_{b'} exp(g_{b',e}))
α_e = exp(g_e) / (exp(g_e) + Σ_{b'} exp(g_{b',e}))

The cell unit corresponding to each character thus fuses word- and sense-level information, from which the character's final hidden state vector is obtained:

h_e = o_e ⊙ tanh(c_e)

The final hidden state vectors of the characters are fed into the classifier and synthesized into the feature representation of the corresponding sentence level.
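The weighted-sum fusion used to merge sense-level cell states (and, analogously, to merge candidate states at other levels) can be sketched as a softmax-normalized gate over candidate states. The scalar gate scores here are stand-ins for the gate activations a trained model would produce.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def fuse_states(candidates, gate_scores):
    """Fuse several cell states (e.g. all sense-level cells of a word)
    with softmax-normalized gate weights."""
    weights = softmax(np.asarray(gate_scores))
    return sum(w * c for w, c in zip(weights, candidates))
```

With equal gate scores the candidates are averaged; a dominant gate score makes its candidate dominate the fused state, which is the mechanism that lets the model favor the most suitable sense.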
Preferably, the sentence-level hidden state vector h* ∈ R^{d_h} is computed as follows:

H = tanh(h)
α = softmax(w^T H)
h* = h α^T

h* is then fed into a softmax classification layer to compute the probability distribution over the classes:

o = W h* + b
p(y|s) = softmax(o)

For T training examples, the whole training process is optimized with the following cross-entropy loss function:

J(θ) = - Σ_{t=1}^{T} log p(y_t | s_t, θ)

where d_h is the dimension of the hidden state variable, M is the length of the input sequence, R is the real space, the superscript T denotes transposition, w is a parameter to be learned, α is the weight vector of h, W ∈ R^{Y×d_h} is the transfer matrix, b ∈ R^Y is the bias vector, Y is the total number of classes, p(y) is the predicted probability of a class, and θ denotes all the parameters that need to be trained in the whole model.
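The attention pooling and softmax classification layer defined by these formulas can be sketched directly; the parameters below are random stand-ins for learned weights.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                   # numerical stability
    e = np.exp(z)
    return e / e.sum()

def attention_classify(h, w, W_cls, b_cls):
    """Attention pooling followed by a softmax relation classifier.
    h: (d_h, M) hidden states; w: (d_h,) attention query; W_cls: (Y, d_h)."""
    H = np.tanh(h)                    # H = tanh(h)
    alpha = softmax(w @ H)            # attention weights over the M positions
    h_star = h @ alpha                # sentence vector h* = h alpha^T
    logits = W_cls @ h_star + b_cls   # o = W h* + b
    return softmax(logits), alpha     # class probabilities, attention weights
```

Both returned vectors sum to one: `alpha` is a distribution over positions, and the class probabilities are a distribution over the Y relation classes.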
Preferably, a dropout mechanism is used during training: each neuron of the bidirectional long short-term memory network has a 50% probability of being deactivated while the network is trained.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the methods described above.
The invention has the following benefits: it provides a Chinese relation extraction method in which multi-granularity pre-training of the text of the input data extracts distributed vectors at the three levels of character, word, and word sense, so that semantic features can be learned automatically and manual involvement is greatly reduced; the problems of word-segmentation ambiguity and polysemy are effectively solved, the performance of the model on the relation extraction task is greatly improved, and the accuracy and robustness of Chinese relation extraction are increased.
Description of the drawings
Fig. 1 is a schematic diagram of the Chinese relation extraction method in an embodiment of the present invention.
Fig. 2 is a schematic flowchart of the Chinese relation extraction method in an embodiment of the present invention.
Specific embodiment
In order to make the technical problems to be solved, the technical solutions, and the beneficial effects of the embodiments of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
It should be noted that when an element is described as being "fixed to" or "disposed on" another element, it may be directly or indirectly on that other element. When an element is described as being "connected to" another element, it may be directly or indirectly connected to it. In addition, the connection may serve a fixing function or a circuit-communication function.
It should be understood that terms such as "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" indicate orientations or positional relationships based on the drawings; they are used only to facilitate and simplify the description of the embodiments of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore are not to be construed as limiting the invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present invention, "plurality" means two or more unless specifically defined otherwise.
Embodiment 1
As shown in Figure 1, the present invention provides a Chinese relation extraction method comprising the following steps:

S1: data preprocessing: apply multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense;

S2: feature encoding: with a bidirectional long short-term memory network as the basic framework, obtain hidden state vectors from the character-, word-, and sense-level distributed vectors, and from these obtain the final character-level hidden state vectors;

S3: relation classification: learn the final character-level hidden state vectors, and fuse them into one sentence-level hidden state vector using a character-level attention mechanism.
The data preprocessing step applies multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense. Traditional pre-training usually represents a word with only a single word vector; in the present invention, a distributed sense vector is generated for every sense of every word.

The feature encoding step implements a multi-granularity lattice long short-term memory network that makes efficient use of semantic information at multiple levels; the learned character-level hidden state variables can be regarded as features automatically extracted from the data. The hidden variables learned in the feature encoding step are then input into the relation classification step, where a gated attention mechanism automatically assigns weights to the hidden state sequence and fuses it, filtering out noise during the weighted fusion and retaining significant feature information, so that the final classification outputs a more accurate relation class.
The present invention comprises two stages: a training stage and a prediction stage. The training stage defines an initial model whose parameters are randomly initialized; during training, data carrying relation class labels are continuously fed to the model, which keeps learning from the training data and updating its parameters. At the same time, the cross entropy between the model's predictions and the correct answers serves as the loss function measuring the model's prediction quality; when the value of the loss function stabilizes, the model has converged, training is over, and a trained relation extractor is obtained. In the prediction stage, the data to be predicted are input directly into the trained relation extractor to obtain the corresponding predicted entity relations.
Data preprocessing:
In this step, the main purpose is to convert the text of the input data into distributed vectors that a computer can read and operate on; these vectors carry implicit semantic information. At the same time, so that subsequent modules can exploit the multi-granularity information of characters, words, and word senses in the text, vector representations are learned for all three linguistic granularities.
For the character vectors, this technique trains the common word2vec algorithm on a large-scale corpus to obtain a latent feature representation of each character. This representation exploits the contextual information of the character in the large-scale corpus and can fully capture the character's syntactic and semantic information.
For the word vectors, the training method is in general the same word2vec algorithm as for character vectors; the difference is that character vectors are trained with the character as the basic unit, whereas for word vectors the text is first segmented automatically by a word-segmentation tool and training then proceeds with the word as the basic unit. In this way, however, each word corresponds to only one fixed word vector, which ignores polysemy; the present invention therefore chooses to build vector representations of word senses rather than of words.
For the sense vectors, it is impossible to tell from the surface form alone whether a word is polysemous or which senses it has, so the senses are modeled by means of the external semantic knowledge base HowNet. In HowNet, the multiple senses of each word and their sememes (the minimal units of meaning) are all explicitly annotated by hand; from it the senses of each word can be obtained and sense vectors trained. A word may thus be represented by multiple sense vectors, which can be input into the subsequent modules so that, during training, the model dynamically selects the most suitable meaning of the current word in the current sentence, helping the model capture deeper semantic information and features in the sentence.
Feature encoding:
This step implements a neural network structure that can make efficient use of multi-granularity features. Its basic framework is the bidirectional long short-term memory network (LSTM). Compared with the traditional recurrent neural network (RNN) model, the LSTM handles contextual information more flexibly and effectively, preserving the important information in the input and forgetting the invalid information, and it avoids the vanishing- and exploding-gradient problems that deep neural networks may encounter. However, the traditional LSTM model cannot solve the word-segmentation ambiguity and polysemy problems of Chinese relation extraction, so the present invention makes a series of improvements.
First, to avoid the error propagation of word-segmentation tools, the present invention takes the character as the basic unit and regards each sentence of text as a character-level sequence that is input directly into bidirectional LSTM units to obtain the character hidden state vectors. Then, so that word-level information can also be considered during encoding, for each character in the sentence the present invention adds every word in the sentence that ends with that character into the LSTM computation. For example, in the sentence "Darwin studies all cuckoos", for the final character of "cuckoo", the word "cuckoo" is a word of the sentence ending with that character. All words ending with the current character are then fed into another bidirectional LSTM unit, and the word-level hidden states are computed. Finally, a gate unit computes the weights of the character and the words, and the character and word hidden states are merged by the method of weighted summation into the current character's final hidden state vector, which thus contains character- and word-level information simultaneously.
Although the above method combines character and word information and can effectively avoid the influence of segmentation errors on the model, it does not take into account the polysemous words in the sentence. In the example above, "cuckoo" is just such a polysemous word, with two completely different senses, "azalea" and "cuckoo bird". The present invention therefore further incorporates the senses of each word into the computation of the hidden states. Specifically, for each word ending with the current character, HowNet is first queried to obtain all the sense vectors of that word; these are then input as word vectors into a bidirectional LSTM unit to compute sense-level hidden states, and finally the states of all the senses are fused into the word's hidden state by the method of weighted summation. Compared with deriving the word hidden state directly from a single word vector as before, this method fuses dynamically and selects the most suitable sense to constitute the word hidden state. After the word hidden states are obtained, the character and word hidden states are merged as in the preceding method, yielding the current character's final hidden state vector.
Relation classification:
This step inputs the learned sentence feature representation into the classifier to obtain the predicted relation label. In the previous module, the encoder has learned a feature representation (hidden state vector) for each character, but since relations are extracted with the sentence as the unit, the character-level feature representations must be fused into the corresponding sentence feature representation. The present invention introduces a gated attention mechanism that automatically assigns a weight to the feature representation of each character and then computes the weighted sum over all characters based on these weights, obtaining the final sentence feature representation. The intuition of this method is that the characters of a sentence differ in importance: noise or very commonly used words should be assigned smaller weights, while the characters in key words such as entity words and verbs should receive higher attention; the sentence representation obtained by such fusion is then more accurate.

After fusion yields the feature representation vector of the sentence, it is input into a fully connected layer that maps it to a new vector whose dimension is the total number of relation classes. The new vector is then normalized by a softmax function so that every dimension of the vector holds a probability value in the interval 0 to 1, indicating the probability that the sentence is classified into the relation class corresponding to that dimension. For the training stage, the loss function of the model is defined as the cross entropy between the normalized vector and the one-hot indicator vector of the current correct relation class, and the parameters of the model are updated by the method of gradient descent. For the prediction stage, the currently predicted relation class is the relation class corresponding to the dimension with the largest probability value in the normalized vector.
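The normalization, prediction, and training criteria described here (softmax normalization, argmax prediction, cross entropy against the correct class) can be sketched as:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, gold):
    """Loss between a predicted distribution and a one-hot gold label,
    i.e. the negative log-probability of the correct class."""
    return -np.log(probs[gold])

def predict(logits):
    """Predicted relation = argmax of the normalized probabilities."""
    return int(np.argmax(softmax(logits)))
```

In training, gradient descent would lower `cross_entropy` for the gold class; in prediction, `predict` returns the index of the most probable relation class.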
Embodiment 2
As shown in Fig. 2, this embodiment uses the Chinese relation extraction method provided by the present invention. The task is defined as: given a sentence s and two designated entities in the sentence, judge what relation the two entities have in s. For example, given the sentence "Darwin studies all cuckoos" and the designated entities "Darwin" and "cuckoo", the goal is to judge what relation "Darwin" and "cuckoo" have in this sentence.
Step 1. Data preprocessing:
1.1 Character-level representation
For the given input sequence s = {c_1, ..., c_M}, with M characters in total, the present invention uses the word2vec method to map each character c_i (the i-th character) to a character vector v_i^c ∈ R^{d_c}, where v_i^c is the character vector of the i-th character, R is the real space, and d_c is the dimension of the character vector. Besides character vectors, this technique also uses position vectors to express the relative position of a character to the two entities. Specifically, for the i-th character c_i, its relative positions to the two designated entities can be expressed as p_i^1 and p_i^2, where p_i^1 is computed as follows:

p_i^1 = i - b_1, if i < b_1;  0, if b_1 ≤ i ≤ e_1;  i - e_1, if i > e_1

Here b_1 and e_1 are the start and end positions of the first entity, and the computation of p_i^2 is almost identical. In this way, p_i^1 and p_i^2 are converted into the corresponding position vectors v_i^{p_1} and v_i^{p_2}, which express the position features of the character-level sequence; d_p denotes the dimension of the position vectors.

In an embodiment of the present invention, the input is defined as one sentence and two designated entities within it. In practice, if a sentence contains multiple entities, relation extraction is carried out for every pair of entities, and the result for each input is the relation of the two currently designated entities in this sentence.

Therefore, for the i-th input character c_i, its final representation concatenates the character vector with the two position vectors, i.e. x_i = [v_i^c ; v_i^{p_1} ; v_i^{p_2}], where x_i ∈ R^d and d = d_c + 2·d_p is the total dimension after the character vector and the position vectors are concatenated. The character representation of the input sequence then becomes x = {x_1, ..., x_M}, which is fed into the subsequent encoding step.
1.2 Word-level representation:
Although the input of the model is a character-level sequence, in order to obtain word-level features the present invention performs word-level representation learning for all candidate words that may occur in the sentence. An input sequence s can be expressed not only as the character-level sequence s = {c_1, ..., c_M} but also as the word-level sequence s = {w_1, ..., w_M}. In this section the present invention denotes a word by its start position b and end position e, i.e. w_{b,e}. Again through the word2vec method, the word w_{b,e} can be converted into a word vector v_{b,e}^w.

For each word, the present invention obtains its sense set Sense(w_{b,e}) from HowNet and then, by the skip-gram method, converts each sense sen_k in the set into a sense vector v_{b,e}^{sen_k} that individually represents one sense. A word may thus be represented by multiple sense vectors (supposing it has K senses), i.e. Sense(w_{b,e}) = {v_{b,e}^{sen_1}, ..., v_{b,e}^{sen_K}}.

The sense-vector representation will be used during the training of the encoder module, so that the model uses sense information dynamically.
Step 2. Feature encoding:
A common character-level LSTM unit mainly consists of the following three gates: an input gate i controls which information is stored; a forget gate f controls which information is forgotten; an output gate o controls which information is output. For the j-th character, the computation process of the LSTM unit is as follows:
Here c denotes the cell unit, which stores the information of the sequence from its beginning to the current position. h denotes the hidden state vector, which is jointly determined by the hidden state at the previous time step and the input at the current time step. U and b are the parameters to be learned in the LSTM.
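The gate computations above follow the standard LSTM equations; a scalar toy version (with the learned parameters U and b replaced by arbitrary stand-in values) can be sketched as:

```python
# Scalar toy version of the standard LSTM step described above.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, U, b):
    """i = sigma(.), f = sigma(.), o = sigma(.), g = tanh(.);
    c = f * c_prev + i * g;  h = o * tanh(c)."""
    i = sigmoid(U["xi"] * x + U["hi"] * h_prev + b["i"])    # input gate
    f = sigmoid(U["xf"] * x + U["hf"] * h_prev + b["f"])    # forget gate
    o = sigmoid(U["xo"] * x + U["ho"] * h_prev + b["o"])    # output gate
    g = math.tanh(U["xg"] * x + U["hg"] * h_prev + b["g"])  # candidate cell
    c = f * c_prev + i * g   # history kept by f, new input admitted by i
    h = o * math.tanh(c)     # hidden state: the exposed part of the cell
    return h, c

U = {k: 0.5 for k in ("xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg")}
b = {k: 0.0 for k in ("i", "f", "o", "g")}
h, c = lstm_step(1.0, 0.0, 0.0, U, b)
```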
For a word w_{b,e} beginning at subscript b and ending at subscript e, its word representation is obtained as described above. In Lattice LSTM, the cell unit of a word is computed as follows:
That is, for a character c_b, the present invention first looks up all the words that begin at this character and match the external dictionary, and then computes the cell units c^w of these words. On this basis, the present invention extends the computation to the sense level: for each sense of each word, an additional LSTM unit is allocated and computed. As mentioned in the representation learning module, the k-th sense of a word w_{b,e} has its own representation vector; therefore the computation process of the cell unit of one sense level is as follows:
All the sense cell units can then be fused into one word cell unit, so that the model takes the ambiguity of the word into account. In order to compute the word cell unit fused from multiple senses, an additional gate mechanism needs to be introduced to control the contribution of each piece of sense information:
The word cell state fused from multiple pieces of sense information is computed as follows:
Through the above computation, for each word w_{b,e}, this model can compute its cell state fused with multi-sense information.
Then, for a character c_e, the present invention fuses the information of every word ending at c_e, obtaining a completely new character-level cell state. The computation method is as follows:
Here the gate values are the normalized representations of the gates, and their computation method is as follows:
Through the above computation, the cell unit corresponding to each character fuses the word-level and sense-level information, and the final hidden state vector of the character is then computed:
The final character hidden state vectors will be fed into the classifier, synthesized into a sentence-level representation, and the probability distribution over the answers is then computed.
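The sense-to-word fusion described above can be illustrated, in a simplified scalar form, as a gate-normalized weighted sum of sense cell states (the gate parameterization here is an assumption, since the patent's fusion formulas appear only as images):

```python
# Sketch of sense fusion: each sense-level cell state receives a gate score,
# the scores are normalized with softmax, and the weighted sum gives one
# word-level cell state. Gate scores here are arbitrary illustrative values.
import math

def softmax(xs):
    m = max(xs)                       # shift for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_sense_cells(sense_cells, gate_scores):
    """Normalized-gate weighted sum of sense cells into one word cell."""
    weights = softmax(gate_scores)
    return sum(w * c for w, c in zip(weights, sense_cells))

c_word = fuse_sense_cells([0.2, -0.5, 0.9], [1.0, 0.1, 2.0])
```

The same normalized-gate pattern is then reused one level up, fusing the word cells ending at character c_e into that character's cell state.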
Step 3. Relation classification:
After the hidden state vector h of each character has been learned, the present invention employs a character-level attention mechanism to fuse the character-level hidden states into one sentence-level hidden state h*.
Here d_h is the dimension of the hidden state variable and M is the length of the input sequence. h* is computed as a weighted sum with automatically assigned weights:
H = tanh(h)
α = softmax(w^T H)
h* = h α^T
Here T denotes transposition, w is a parameter to be learned, α is the weight vector of h, and H is the value of h after transformation by the tanh function; tanh maps every dimension of h into the range [-1, 1], which effectively alleviates problems such as gradient explosion during training.
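A toy version of this attention pooling, with one-dimensional hidden states so that w is a scalar, might look like:

```python
# Word-level attention pooling: h* = h * softmax(w^T tanh(h))^T,
# shown here with scalar (1-D) hidden states for readability.
import math

def attention_pool(h_states, w):
    """Sentence representation as the attention-weighted sum of h_states."""
    H = [math.tanh(h) for h in h_states]      # H = tanh(h), each in [-1, 1]
    scores = [w * x for x in H]               # w^T H
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]         # alpha = softmax(w^T H)
    return sum(a * h for a, h in zip(alpha, h_states))  # h* = h alpha^T

h_star = attention_pool([0.3, -1.2, 0.8], w=0.7)
```

Since alpha is a probability distribution, h* is always a convex combination of the character hidden states.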
Then h* is fed into a softmax classification layer to compute the probability distribution over the classes:
o = W h* + b
p(y|s) = softmax(o)
Here W is the transfer matrix and b ∈ R^Y is the bias vector. Y denotes the total number of classes, and p(y) denotes the probability of predicting a certain class.
For T training examples, the whole training process is optimized by the following cross-entropy loss function:
Here θ denotes all the parameters in the entire model that need training. Meanwhile, in order to prevent over-fitting, the present invention also uses a dropout mechanism during training: each neuron has a 50% probability of being switched off in each training step (a random half of the hidden-layer nodes do not participate in the computation in every training step); in the test phase, all trained neurons participate in the computation.
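A minimal sketch of the classification layer and the cross-entropy objective follows; the dimensions, the values of `W` and `bias`, and the two training examples are made up for illustration:

```python
# o = W h* + b, p(y|s) = softmax(o), and mean cross-entropy over T examples.
import math

def softmax(o):
    m = max(o)
    e = [math.exp(x - m) for x in o]
    s = sum(e)
    return [x / s for x in e]

def classify(h_star, W, b):
    """Probability distribution over the Y relation classes."""
    o = [sum(wi * hi for wi, hi in zip(row, h_star)) + bi
         for row, bi in zip(W, b)]
    return softmax(o)

def cross_entropy(batch, W, b):
    """Mean negative log-likelihood of the gold labels over T examples."""
    return -sum(math.log(classify(h, W, b)[y]) for h, y in batch) / len(batch)

W = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.2]]   # toy: Y = 3 classes, d_h = 2
bias = [0.0, 0.1, -0.1]
loss = cross_entropy([([1.0, 0.5], 2), ([-0.3, 0.8], 0)], W, bias)
```

At training time the loss would be minimized over θ (here W and bias) by gradient descent, with dropout applied to the hidden layers as described above.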
The present invention may implement all or part of the processes of the above embodiment methods by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of each of the above method embodiments may be realized. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The above content is a further detailed description of the present invention in conjunction with specific preferred embodiments, and it cannot be concluded that the specific implementation of the present invention is limited to these descriptions. For those skilled in the art to which the present invention belongs, several equivalent substitutions or obvious modifications with identical performance or use may also be made without departing from the concept of the present invention, and all of these should be considered as belonging to the protection scope of the present invention.
Claims (10)
1. A Chinese relation extraction method, characterized by comprising the following steps:
S1: data preprocessing: carrying out multi-granularity pre-training on the text of the input data, so as to extract distributed vectors at three levels of the text: character, word and word sense;
S2: feature encoding: using a bidirectional long short-term memory network as the basic framework, obtaining the hidden state vectors of characters and words from the distributed vectors at the character, word and sense levels, and thereby obtaining the final character-level hidden state vector;
S3: relation classification: learning the final character-level hidden state vector, and fusing the character-level hidden state vectors into one sentence-level hidden state vector using a character-level attention mechanism.
2. The Chinese relation extraction method according to claim 1, characterized in that extracting the character-level distributed vector comprises extracting a character vector and position vectors;
the character vector: for the character sequence s = {c_1, ..., c_M} of the text of the given input data, containing M characters in total, each character c_i is mapped to a character vector by the word2vec method, wherein c_i denotes the i-th character, R is the real number space, and d_c is the dimension of the character vector;
the position vector represents the relative position between the character c_i and the two entities P1 and P2, and its computation method is as follows:
wherein b_1 and e_1 are the start and end positions of the first entity P1, the relative position to the second entity is computed in the same way, and the relative positions are converted into the corresponding position vectors, which are used to represent the position features of the character sequence; d_p denotes the dimension of the position vector;
the final representation of the character-level distributed vector is the concatenation of the character vector and the two position vectors;
at this point d = d_c + 2*d_p, where d is the total dimension after the character vector and the position vectors are concatenated;
at this point the character sequence of the text of the input data is represented by the concatenated vectors.
3. The Chinese relation extraction method according to claim 1, characterized in that extracting the word-level distributed vector comprises:
for the character sequence s = {c_1, ..., c_M} and the word sequence s = {w_1, ..., w_M} of the text of the given input data, a word is denoted by its start position b and end position e, i.e. w_{b,e}; the word w_{b,e} is converted into a word-level distributed vector by the word2vec method.
4. The Chinese relation extraction method according to claim 3, characterized in that the sense set Sense(w_{b,e}) of each word w_{b,e} is obtained from the external semantic knowledge base HowNet, and each sense in the sense set is converted into a sense-level distributed vector, wherein K is the number of senses of the word w_{b,e}.
5. The Chinese relation extraction method according to claim 1, characterized in that step S2 comprises:
S21: taking the character as the basic unit, directly inputting the character sequence of the text of the input data into the bidirectional long short-term memory network to obtain the hidden state vectors of the characters;
S22: for each word of the text of the input data that ends at a given character, obtaining all sense vectors of said word through the external semantic knowledge base HowNet, inputting the sense vectors into the bidirectional long short-term memory network to compute sense-level hidden state vectors, and fusing all the sense-level hidden state vectors into the hidden state vector of said word by the method of weighted summation;
S23: computing the weights of the character and said word using a gate unit, and fusing the hidden state vector of the character and the hidden state vector of said word into the final hidden state vector of the character by the method of weighted summation.
6. The Chinese relation extraction method according to claim 5, characterized in that step S21 comprises: the computation process of inputting the j-th character of the character sequence of the text into the bidirectional long short-term memory network is:
wherein i is the input gate, used to control which information is stored; f is the forget gate, used to control which information is forgotten; o is the output gate, used to control which information is output; c is the cell unit; U and b are the parameters to be learned in the bidirectional long short-term memory network; h denotes the hidden state vector, jointly determined by the hidden state at the previous time step and the data input at the current time step.
7. The Chinese relation extraction method according to claim 6, characterized in that in step S22, for a word w_{b,e} beginning at subscript b and ending at subscript e, its word representation is input into the bidirectional long short-term memory network, and the cell unit of said word is computed as follows:
for the k-th sense of said word w_{b,e}, whose representation vector was obtained as above, the computation process of the cell unit of one sense level is as follows:
an additional gate mechanism is introduced to control the contribution of each piece of sense information:
the word cell state fused from multiple pieces of sense information is computed as follows:
all the sense cell units can then be fused into one word cell state;
for a character c_e, the computation method is as follows:
wherein the gate values are the normalized representations of the gates, and their computation method is as follows:
the cell unit corresponding to each character fuses the word-level and sense-level information, and the final hidden state vector of the character is then obtained:
the final hidden state vector of the character is fed into the classifier and synthesized into the feature representation of the corresponding sentence level.
8. The Chinese relation extraction method according to claim 7, characterized in that the sentence-level hidden state vector h* is computed as follows:
H = tanh(h)
α = softmax(w^T H)
h* = h α^T
then h* is fed into a softmax classification layer to compute the probability distribution over the classes:
o = W h* + b
p(y|s) = softmax(o)
for T training examples, the whole training process is optimized by the following cross-entropy loss function:
wherein d_h is the dimension of the hidden state variable, M is the length of the input sequence, R is the real number space, T denotes transposition, w is a parameter to be learned, α is the weight vector of h, W is the transfer matrix, b ∈ R^Y is the bias vector, Y denotes the total number of classes, p(y) denotes the probability of predicting a certain class, and θ denotes all the parameters in the entire model that need training.
9. The Chinese relation extraction method according to claim 8, characterized in that a dropout mechanism is used during training, wherein each neuron of the bidirectional long short-term memory network has a 50% probability of being switched off during training.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1-9 are realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910626307.5A CN110334354B (en) | 2019-07-11 | 2019-07-11 | Chinese relation extraction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110334354A true CN110334354A (en) | 2019-10-15 |
CN110334354B CN110334354B (en) | 2022-12-09 |
Family
ID=68146526
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910626307.5A Active CN110334354B (en) | 2019-07-11 | 2019-07-11 | Chinese relation extraction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110334354B (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111061843A (en) * | 2019-12-26 | 2020-04-24 | 武汉大学 | Knowledge graph guided false news detection method |
CN111160017A (en) * | 2019-12-12 | 2020-05-15 | 北京文思海辉金信软件有限公司 | Keyword extraction method, phonetics scoring method and phonetics recommendation method |
CN111274794A (en) * | 2020-01-19 | 2020-06-12 | 浙江大学 | Synonym expansion method based on transmission |
CN111274394A (en) * | 2020-01-16 | 2020-06-12 | 重庆邮电大学 | Method, device and equipment for extracting entity relationship and storage medium |
CN111291556A (en) * | 2019-12-17 | 2020-06-16 | 东华大学 | Chinese entity relation extraction method based on character and word feature fusion of entity meaning item |
CN111428505A (en) * | 2020-01-17 | 2020-07-17 | 北京理工大学 | Entity relation extraction method fusing trigger word recognition features |
CN111680510A (en) * | 2020-07-07 | 2020-09-18 | 腾讯科技(深圳)有限公司 | Text processing method and device, computer equipment and storage medium |
CN111783418A (en) * | 2020-06-09 | 2020-10-16 | 北京北大软件工程股份有限公司 | Chinese meaning representation learning method and device |
CN111859978A (en) * | 2020-06-11 | 2020-10-30 | 南京邮电大学 | Emotion text generation method based on deep learning |
CN112015891A (en) * | 2020-07-17 | 2020-12-01 | 山东师范大学 | Method and system for classifying messages of network inquiry platform based on deep neural network |
CN112380872A (en) * | 2020-11-27 | 2021-02-19 | 深圳市慧择时代科技有限公司 | Target entity emotional tendency determination method and device |
CN112560487A (en) * | 2020-12-04 | 2021-03-26 | 中国电子科技集团公司第十五研究所 | Entity relationship extraction method and system based on domestic equipment |
CN112883194A (en) * | 2021-04-06 | 2021-06-01 | 安徽科大讯飞医疗信息技术有限公司 | Symptom information extraction method, device, equipment and storage medium |
CN112883153A (en) * | 2021-01-28 | 2021-06-01 | 北京联合大学 | Information-enhanced BERT-based relationship classification method and device |
CN112948535A (en) * | 2019-12-10 | 2021-06-11 | 复旦大学 | Method and device for extracting knowledge triples of text and storage medium |
CN113051371A (en) * | 2021-04-12 | 2021-06-29 | 平安国际智慧城市科技股份有限公司 | Chinese machine reading understanding method and device, electronic equipment and storage medium |
CN113239663A (en) * | 2021-03-23 | 2021-08-10 | 国家计算机网络与信息安全管理中心 | Multi-meaning word Chinese entity relation identification method based on Hopkinson |
CN113326676A (en) * | 2021-04-19 | 2021-08-31 | 上海快确信息科技有限公司 | Deep learning model device for structuring financial text into form |
CN113392648A (en) * | 2021-06-02 | 2021-09-14 | 北京三快在线科技有限公司 | Entity relationship acquisition method and device |
CN114372125A (en) * | 2021-12-03 | 2022-04-19 | 北京北明数科信息技术有限公司 | Government affair knowledge base construction method, system, equipment and medium based on knowledge graph |
CN115034302A (en) * | 2022-06-07 | 2022-09-09 | 四川大学 | Relation extraction method, device, equipment and medium for optimizing information fusion strategy |
CN115169326A (en) * | 2022-04-15 | 2022-10-11 | 山西长河科技股份有限公司 | Chinese relation extraction method, device, terminal and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080275694A1 (en) * | 2007-05-04 | 2008-11-06 | Expert System S.P.A. | Method and system for automatically extracting relations between concepts included in text |
CN108733792A (en) * | 2018-05-14 | 2018-11-02 | 北京大学深圳研究生院 | A kind of entity relation extraction method |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080275694A1 (en) * | 2007-05-04 | 2008-11-06 | Expert System S.P.A. | Method and system for automatically extracting relations between concepts included in text |
CN108733792A (en) * | 2018-05-14 | 2018-11-02 | 北京大学深圳研究生院 | A kind of entity relation extraction method |
Non-Patent Citations (2)
Title |
---|
PENG ZHOU ET AL.: "Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification", 《PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS》 * |
SUN ZIYANG ET AL.: "CHINESE ENTITY RELATION EXTRACTION METHOD BASED ON DEEP LEARNING", 《COMPUTER ENGINEERING》 * |
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112948535A (en) * | 2019-12-10 | 2021-06-11 | 复旦大学 | Method and device for extracting knowledge triples of text and storage medium |
CN112948535B (en) * | 2019-12-10 | 2022-06-14 | 复旦大学 | Method and device for extracting knowledge triples of text and storage medium |
CN111160017A (en) * | 2019-12-12 | 2020-05-15 | 北京文思海辉金信软件有限公司 | Keyword extraction method, phonetics scoring method and phonetics recommendation method |
CN111291556A (en) * | 2019-12-17 | 2020-06-16 | 东华大学 | Chinese entity relation extraction method based on character and word feature fusion of entity meaning item |
CN111291556B (en) * | 2019-12-17 | 2021-10-26 | 东华大学 | Chinese entity relation extraction method based on character and word feature fusion of entity meaning item |
CN111061843B (en) * | 2019-12-26 | 2023-08-25 | 武汉大学 | Knowledge-graph-guided false news detection method |
CN111061843A (en) * | 2019-12-26 | 2020-04-24 | 武汉大学 | Knowledge graph guided false news detection method |
CN111274394A (en) * | 2020-01-16 | 2020-06-12 | 重庆邮电大学 | Method, device and equipment for extracting entity relationship and storage medium |
CN111428505A (en) * | 2020-01-17 | 2020-07-17 | 北京理工大学 | Entity relation extraction method fusing trigger word recognition features |
CN111274794A (en) * | 2020-01-19 | 2020-06-12 | 浙江大学 | Synonym expansion method based on transmission |
CN111274794B (en) * | 2020-01-19 | 2022-03-18 | 浙江大学 | Synonym expansion method based on transmission |
CN111783418B (en) * | 2020-06-09 | 2024-04-05 | 北京北大软件工程股份有限公司 | Chinese word meaning representation learning method and device |
CN111783418A (en) * | 2020-06-09 | 2020-10-16 | 北京北大软件工程股份有限公司 | Chinese meaning representation learning method and device |
CN111859978B (en) * | 2020-06-11 | 2023-06-20 | 南京邮电大学 | Deep learning-based emotion text generation method |
CN111859978A (en) * | 2020-06-11 | 2020-10-30 | 南京邮电大学 | Emotion text generation method based on deep learning |
CN111680510A (en) * | 2020-07-07 | 2020-09-18 | 腾讯科技(深圳)有限公司 | Text processing method and device, computer equipment and storage medium |
CN112015891A (en) * | 2020-07-17 | 2020-12-01 | 山东师范大学 | Method and system for classifying messages of network inquiry platform based on deep neural network |
CN112380872A (en) * | 2020-11-27 | 2021-02-19 | 深圳市慧择时代科技有限公司 | Target entity emotional tendency determination method and device |
CN112380872B (en) * | 2020-11-27 | 2023-11-24 | 深圳市慧择时代科技有限公司 | Method and device for determining emotion tendencies of target entity |
CN112560487A (en) * | 2020-12-04 | 2021-03-26 | 中国电子科技集团公司第十五研究所 | Entity relationship extraction method and system based on domestic equipment |
CN112883153B (en) * | 2021-01-28 | 2023-06-23 | 北京联合大学 | Relationship classification method and device based on information enhancement BERT |
CN112883153A (en) * | 2021-01-28 | 2021-06-01 | 北京联合大学 | Information-enhanced BERT-based relationship classification method and device |
CN113239663A (en) * | 2021-03-23 | 2021-08-10 | 国家计算机网络与信息安全管理中心 | Multi-meaning word Chinese entity relation identification method based on Hopkinson |
CN113239663B (en) * | 2021-03-23 | 2022-07-12 | 国家计算机网络与信息安全管理中心 | Multi-meaning word Chinese entity relation identification method based on Hopkinson |
CN112883194B (en) * | 2021-04-06 | 2024-02-20 | 讯飞医疗科技股份有限公司 | Symptom information extraction method, device, equipment and storage medium |
CN112883194A (en) * | 2021-04-06 | 2021-06-01 | 安徽科大讯飞医疗信息技术有限公司 | Symptom information extraction method, device, equipment and storage medium |
CN113051371A (en) * | 2021-04-12 | 2021-06-29 | 平安国际智慧城市科技股份有限公司 | Chinese machine reading understanding method and device, electronic equipment and storage medium |
CN113326676A (en) * | 2021-04-19 | 2021-08-31 | 上海快确信息科技有限公司 | Deep learning model device for structuring financial text into form |
CN113392648A (en) * | 2021-06-02 | 2021-09-14 | 北京三快在线科技有限公司 | Entity relationship acquisition method and device |
CN114372125A (en) * | 2021-12-03 | 2022-04-19 | 北京北明数科信息技术有限公司 | Government affair knowledge base construction method, system, equipment and medium based on knowledge graph |
CN115169326A (en) * | 2022-04-15 | 2022-10-11 | 山西长河科技股份有限公司 | Chinese relation extraction method, device, terminal and storage medium |
CN115034302A (en) * | 2022-06-07 | 2022-09-09 | 四川大学 | Relation extraction method, device, equipment and medium for optimizing information fusion strategy |
CN115034302B (en) * | 2022-06-07 | 2023-04-11 | 四川大学 | Relation extraction method, device, equipment and medium for optimizing information fusion strategy |
Also Published As
Publication number | Publication date |
---|---|
CN110334354B (en) | 2022-12-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110334354A (en) | A kind of Chinese Relation abstracting method | |
CN107992597B (en) | Text structuring method for power grid fault case | |
CN108733792B (en) | Entity relation extraction method | |
CN108536754A (en) | Electronic health record entity relation extraction method based on BLSTM and attention mechanism | |
CN110532557B (en) | Unsupervised text similarity calculation method | |
CN110083831A (en) | A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF | |
CN110209817A (en) | Training method and device of text processing model and text processing method | |
CN109635124A (en) | A kind of remote supervisory Relation extraction method of combination background knowledge | |
CN109858041A (en) | A kind of name entity recognition method of semi-supervised learning combination Custom Dictionaries | |
CN113255320A (en) | Entity relation extraction method and device based on syntax tree and graph attention machine mechanism | |
CN110555084A (en) | remote supervision relation classification method based on PCNN and multi-layer attention | |
CN113743099B (en) | System, method, medium and terminal for extracting terms based on self-attention mechanism | |
CN109933792A (en) | Viewpoint type problem based on multi-layer biaxially oriented LSTM and verifying model reads understanding method | |
CN114548099B (en) | Method for extracting and detecting aspect words and aspect categories jointly based on multitasking framework | |
CN113255366B (en) | Aspect-level text emotion analysis method based on heterogeneous graph neural network | |
CN115861995B (en) | Visual question-answering method and device, electronic equipment and storage medium | |
CN113806494A (en) | Named entity recognition method based on pre-training language model | |
CN111177402A (en) | Evaluation method and device based on word segmentation processing, computer equipment and storage medium | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
CN114841151A (en) | Medical text entity relation joint extraction method based on decomposition-recombination strategy | |
CN111783464A (en) | Electric power-oriented domain entity identification method, system and storage medium | |
CN116757195B (en) | Implicit emotion recognition method based on prompt learning | |
CN114239584A (en) | Named entity identification method based on self-supervision learning | |
CN117216617A (en) | Text classification model training method, device, computer equipment and storage medium | |
CN116362242A (en) | Small sample slot value extraction method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |