CN110334354A - A Chinese relation extraction method - Google Patents
A Chinese relation extraction method - Download PDF / Info
- Publication number: CN110334354A (application CN201910626307.5A)
- Authority: CN (China)
- Prior art keywords: word, vector, hidden state, word sense, level
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/2414 - Pattern recognition; classification techniques based on distances to training or reference patterns; smoothing the distance, e.g. radial basis function networks [RBFN]
- G06F40/211 - Handling natural language data; natural language analysis; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/295 - Handling natural language data; recognition of textual entities; named entity recognition
- G06N3/049 - Neural networks; architecture; temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08 - Neural networks; learning methods
Abstract
The present invention provides a Chinese relation extraction method comprising the following steps. S1: data preprocessing: apply multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense. S2: feature encoding: with a bidirectional long short-term memory network as the basic framework, obtain the character hidden state vectors and the word hidden state vectors from the three-level distributed vectors, and from these obtain the final character-level hidden state vectors. S3: relation classification: learn the final character-level hidden state vectors, and fuse them into one sentence-level hidden state vector using a character-level attention mechanism. The method effectively solves the problems of word-segmentation ambiguity and polysemy, greatly improves the performance of the model on the relation extraction task, and increases the accuracy and robustness of Chinese relation extraction.
Description
Technical field
The present invention relates to the field of computer application technology, and more particularly to a Chinese relation extraction method.
Background art
Natural language processing is a subfield of artificial intelligence and an interdisciplinary area between computer science and computational linguistics. Relation extraction is one of the basic tasks of natural language processing: given a sentence and a pair of annotated entities (usually nouns), its purpose is to accurately identify the relationship between the entities. Relation extraction technology can be used to construct large-scale knowledge graphs, i.e. semantic networks composed of concepts, entities, entity attributes, and entity relations that form a structured representation of the real world. The construction of large-scale knowledge graphs can provide comprehensive, structured external knowledge to artificial intelligence systems and thereby enable more powerful applications.
Traditional relation extraction methods suffer from certain problems: they typically rely on manually engineered features, so the models only run effectively on small, specific datasets, and this practice limits the development of the relation extraction field. At the same time, because of the dependence on manual features, traditional relation extraction techniques have poor robustness and extensibility, so the models cannot generalize across different datasets and corpora.
In recent years, relation extraction based on deep learning has made great progress and has many advantages over traditional relation extraction methods. First, thanks to the application of neural networks, these models can automatically learn the semantic features of the text, avoiding features hand-designed for specific data, reducing labor costs, and achieving better results. Such neural network models provide an end-to-end solution that minimizes human involvement. At the same time, models based on neural networks also possess higher robustness and can learn the mapping from the ever-changing features of natural language to the output.
However, even deep learning models face some unresolved problems. For a language without natural delimiters, such as Chinese, current mainstream approaches operate at either the character level or the word level. The former feeds the input sequence into the model with the character as the basic unit; this makes it hard for the model to learn word-level features in the semantic space, causes information loss, and reduces the accuracy of the relation extraction task. The latter first segments the input sequence with a word-segmentation tool and then feeds it into the model; although this method considers word-level information, segmentation ambiguities arise easily because it relies on an external tool, so the errors of the external tool propagate through the whole model, limiting the development of the relation extraction task. Moreover, neither character-level nor word-level models account for the polysemy of words: representing a word's features with only a single word vector cannot handle word-sense ambiguity, which lowers the model's ceiling.
Summary of the invention
To solve the problems of word-segmentation ambiguity and polysemy in prior-art Chinese relation extraction, the present invention provides a Chinese relation extraction method.
To solve the above problems, the technical solution adopted by the present invention is as follows:
A Chinese relation extraction method includes the following steps. S1: data preprocessing: apply multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense. S2: feature encoding: with a bidirectional long short-term memory network as the basic framework, obtain the character hidden state vectors and the word hidden state vectors from the three-level distributed vectors, and from these obtain the final character-level hidden state vectors. S3: relation classification: learn the final character-level hidden state vectors, and fuse them into one sentence-level hidden state vector using a character-level attention mechanism.
Preferably, extracting the character-level distributed vectors includes extracting character vectors and position vectors. The character vectors: for the character-level sequence s = {c_1, ..., c_M} of the text of the given input data, with M characters in total, the word2vec method maps each character c_i to a character vector v_i^c ∈ R^{d_c}, where c_i denotes the i-th character, v_i^c is the character vector of the i-th character, R is the real space, and d_c is the dimension of the character vector. The position vectors express the relative positions p_i^1 and p_i^2 of character c_i with respect to the two entities P_1 and P_2, where p_i^1 is computed as follows:

p_i^1 = i - b_1, if i < b_1;  0, if b_1 ≤ i ≤ e_1;  i - e_1, if i > e_1

where b_1 and e_1 are the start and end positions of the first entity P_1, and p_i^2 is computed in the same way as p_i^1. p_i^1 and p_i^2 are converted into the corresponding position vectors v_i^{p_1} and v_i^{p_2}, which express the position features of the character-level sequence; d_p denotes the dimension of the position vectors.

The final character-level distributed vector is the character vector concatenated with the two position vectors, namely x_i = [v_i^c ; v_i^{p_1} ; v_i^{p_2}], where x_i ∈ R^d and d = d_c + 2·d_p is the total dimension after the character vector and the position vectors are concatenated.

The character-level sequence of the text of the input data then becomes x = {x_1, ..., x_M}.
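The character-plus-position input representation described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the character vectors and the position-embedding table are random stand-ins for embeddings that word2vec pre-training would supply, and the entity spans are assumed to be given as (begin, end) index pairs.

```python
import numpy as np

def relative_position(i, b, e):
    """Relative position of character index i to an entity spanning [b, e]."""
    if i < b:
        return i - b
    if i > e:
        return i - e
    return 0  # inside the entity span

def char_representation(M, ent1, ent2, d_c=4, d_p=2, seed=0):
    """Build x_i = [char_vec ; pos_vec_to_e1 ; pos_vec_to_e2] for each character."""
    rng = np.random.default_rng(seed)
    char_vecs = rng.normal(size=(M, d_c))        # stand-ins for word2vec vectors
    # position-embedding table indexed by (offset + M), so offsets in [-M, M] fit
    pos_table = rng.normal(size=(2 * M + 1, d_p))
    reps = []
    for i in range(M):
        p1 = relative_position(i, *ent1)
        p2 = relative_position(i, *ent2)
        reps.append(np.concatenate([char_vecs[i], pos_table[p1 + M], pos_table[p2 + M]]))
    return np.stack(reps)                        # shape (M, d_c + 2*d_p)
```

Each row of the result has dimension d = d_c + 2·d_p, matching the concatenation described above.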
Preferably, extracting the word-level distributed vectors includes: for the text of the given input data, with character-level sequence s = {c_1, ..., c_M} and word-level sequence s = {w_1, ..., w_M}, a word is denoted by its start position b and end position e, i.e. w_{b,e}; the word2vec method converts the word w_{b,e} into a word-level distributed vector v_{b,e}^w.
Preferably, the sense set Sense(w_{b,e}) of each word w_{b,e} is obtained from the external semantic knowledge base HowNet, and each sense sen_k in the sense set is converted into a word-sense-level distributed vector v_{b,e}^{sen_k}, i.e. Sense(w_{b,e}) = {v_{b,e}^{sen_1}, ..., v_{b,e}^{sen_K}}, where K is the number of senses of the word w_{b,e}.
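The sense lookup can be sketched as below. The sense inventory is a hypothetical stand-in dictionary, not the real HowNet resource, and the vectors are random placeholders for pre-trained sense embeddings.

```python
import numpy as np

# Hypothetical stand-in for HowNet: maps a word to its list of sense labels.
SENSE_INVENTORY = {
    "杜鹃": ["azalea (flower)", "cuckoo (bird)"],  # a classic polysemous word
    "研究": ["to study"],
}

def sense_vectors(word, dim=4, seed=0):
    """Return one distributed vector per sense of `word` (random stand-ins
    for vectors that would be pre-trained over the sense inventory)."""
    senses = SENSE_INVENTORY.get(word, [word])    # fallback: treat as one sense
    rng = np.random.default_rng(seed)
    return {s: rng.normal(size=dim) for s in senses}
```

A word with K senses thus maps to K distinct vectors, as in the Sense(w_{b,e}) set above.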
Preferably, step S2 includes: S21: taking the character as the basic unit, the character-level sequence of the text of the input data is fed directly into the bidirectional long short-term memory network to obtain the character hidden state vectors; S22: for each word in the character-level sequence of the text of the input data that ends at the current character, all of the word's sense vectors are obtained from the external semantic knowledge base HowNet and input into the bidirectional long short-term memory network to compute sense-level hidden state vectors, and all the sense-level hidden state vectors are fused into the word's hidden state vector by the method of weighted summation; S23: a gate unit computes the weights of the character and the word, and the character hidden state vector and the word hidden state vector are fused by the method of weighted summation into the character's final hidden state vector.
Preferably, step S21 includes: for the j-th character of the character-level sequence of the text, the computation performed when it is input into the bidirectional long short-term memory network is:

i_j = σ(W_i x_j + U_i h_{j-1} + b_i)
f_j = σ(W_f x_j + U_f h_{j-1} + b_f)
o_j = σ(W_o x_j + U_o h_{j-1} + b_o)
c̃_j = tanh(W_c x_j + U_c h_{j-1} + b_c)
c_j = f_j ⊙ c_{j-1} + i_j ⊙ c̃_j
h_j = o_j ⊙ tanh(c_j)

where i is the input gate, controlling which information is stored; f is the forget gate, controlling which information is forgotten; o is the output gate, controlling which information is output; c is the cell unit; W, U, and b are the parameters to be learned in the bidirectional long short-term memory network; and the hidden state vector h is jointly determined by the hidden state of the previous time step and the data input at the current time step.
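A single step of a (unidirectional) character LSTM cell with these three gates might look like the sketch below. The packing of all gate parameters into one (4H, D) matrix is a common convention assumed here, not something the patent specifies.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a character-level LSTM cell: input, forget, output gates."""
    W, U, b = params                  # W: (4H, D), U: (4H, H), b: (4H,)
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])               # input gate: what to store
    f = sigmoid(z[H:2*H])             # forget gate: what to forget
    o = sigmoid(z[2*H:3*H])           # output gate: what to output
    g = np.tanh(z[3*H:4*H])           # candidate cell content
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c
```

Running the step over a sequence in both directions and concatenating the hidden states would give the bidirectional variant used by the method.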
Preferably, in step S22 a word w_{b,e} starting with subscript b and ending with subscript e, whose word vector is v_{b,e}^w, is input into the bidirectional long short-term memory network, and the word's cell unit c_{b,e}^w is computed as follows:

[i_{b,e}^w ; f_{b,e}^w ; c̃_{b,e}^w] = [σ ; σ ; tanh](W^w v_{b,e}^w + U^w h_b + b^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b + i_{b,e}^w ⊙ c̃_{b,e}^w

For the k-th sense of the word w_{b,e}, whose representation vector is v_{b,e}^{sen_k}, the computation of a sense-level cell unit c_{b,e}^{sen_k} is analogous:

[i_{b,e}^{sen_k} ; f_{b,e}^{sen_k} ; c̃_{b,e}^{sen_k}] = [σ ; σ ; tanh](W^s v_{b,e}^{sen_k} + U^s h_b + b^s)
c_{b,e}^{sen_k} = f_{b,e}^{sen_k} ⊙ c_b + i_{b,e}^{sen_k} ⊙ c̃_{b,e}^{sen_k}

An additional gate mechanism is introduced to control the contribution of each sense:

g_{b,e}^{sen_k} = σ(W^g v_{b,e}^{sen_k} + U^g c_{b,e}^{sen_k} + b^g)

The word cell state that fuses the multiple senses is computed as the gate-weighted sum

c_{b,e}^w = Σ_{k=1}^{K} α_{b,e}^{sen_k} ⊙ c_{b,e}^{sen_k}

so that all sense cell units are fused into one word cell state. For the character c_e, the computation merges the cell states of all words ending at e with the character's own candidate cell:

c_e = Σ_b α_{b,e} ⊙ c_{b,e}^w + α_e ⊙ c̃_e

where α_{b,e} and α_e are the normalized representations of the gates:

α_{b,e} = exp(g_{b,e}) / (exp(g_e) + Σ_{b'} exp(g_{b',e}))
α_e = exp(g_e) / (exp(g_e) + Σ_{b'} exp(g_{b',e}))

The cell unit corresponding to each character thus fuses word- and sense-level information, from which the character's final hidden state vector is obtained:

h_e = o_e ⊙ tanh(c_e)

The final hidden state vectors of the characters are fed into the classifier and synthesized into the feature representation of the corresponding sentence level.
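The weighted-sum fusion used to merge sense-level cell states (and, analogously, to merge candidate states at other levels) can be sketched as a softmax-normalized gate over candidate states. The scalar gate scores here are stand-ins for the gate activations a trained model would produce.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def fuse_states(candidates, gate_scores):
    """Fuse several cell states (e.g. all sense-level cells of a word)
    with softmax-normalized gate weights."""
    weights = softmax(np.asarray(gate_scores))
    return sum(w * c for w, c in zip(weights, candidates))
```

With equal gate scores the candidates are averaged; a dominant gate score makes its candidate dominate the fused state, which is the mechanism that lets the model favor the most suitable sense.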
Preferably, the sentence-level hidden state vector h* ∈ R^{d_h} is computed as follows:

H = tanh(h)
α = softmax(w^T H)
h* = h α^T

h* is then fed into a softmax classification layer to compute the probability distribution over the classes:

o = W h* + b
p(y|s) = softmax(o)

For T training examples, the whole training process is optimized with the following cross-entropy loss function:

J(θ) = - Σ_{t=1}^{T} log p(y_t | s_t, θ)

where d_h is the dimension of the hidden state variable, M is the length of the input sequence, R is the real space, the superscript T denotes transposition, w is a parameter to be learned, α is the weight vector of h, W ∈ R^{Y×d_h} is the transfer matrix, b ∈ R^Y is the bias vector, Y is the total number of classes, p(y) is the predicted probability of a class, and θ denotes all the parameters that need to be trained in the whole model.
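The attention pooling and softmax classification layer defined by these formulas can be sketched directly; the parameters below are random stand-ins for learned weights.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                   # numerical stability
    e = np.exp(z)
    return e / e.sum()

def attention_classify(h, w, W_cls, b_cls):
    """Attention pooling followed by a softmax relation classifier.
    h: (d_h, M) hidden states; w: (d_h,) attention query; W_cls: (Y, d_h)."""
    H = np.tanh(h)                    # H = tanh(h)
    alpha = softmax(w @ H)            # attention weights over the M positions
    h_star = h @ alpha                # sentence vector h* = h alpha^T
    logits = W_cls @ h_star + b_cls   # o = W h* + b
    return softmax(logits), alpha     # class probabilities, attention weights
```

Both returned vectors sum to one: `alpha` is a distribution over positions, and the class probabilities are a distribution over the Y relation classes.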
Preferably, a dropout mechanism is used during training: each neuron of the bidirectional long short-term memory network has a 50% probability of being deactivated while the network is trained.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the methods described above.
The invention has the following benefits: it provides a Chinese relation extraction method in which multi-granularity pre-training of the text of the input data extracts distributed vectors at the three levels of character, word, and word sense, so that semantic features can be learned automatically and manual involvement is greatly reduced; the problems of word-segmentation ambiguity and polysemy are effectively solved, the performance of the model on the relation extraction task is greatly improved, and the accuracy and robustness of Chinese relation extraction are increased.
Description of the drawings
Fig. 1 is a schematic diagram of the Chinese relation extraction method in an embodiment of the present invention.
Fig. 2 is a schematic flowchart of the Chinese relation extraction method in an embodiment of the present invention.
Specific embodiment
In order to make the technical problems to be solved, the technical solutions, and the beneficial effects of the embodiments of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
It should be noted that when an element is described as being "fixed to" or "disposed on" another element, it may be directly or indirectly on that other element. When an element is described as being "connected to" another element, it may be directly or indirectly connected to it. In addition, the connection may serve a fixing function or a circuit-communication function.
It should be understood that terms such as "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" indicate orientations or positional relationships based on the drawings; they are used only to facilitate and simplify the description of the embodiments of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore are not to be construed as limiting the invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present invention, "plurality" means two or more unless specifically defined otherwise.
Embodiment 1
As shown in Figure 1, the present invention provides a Chinese relation extraction method comprising the following steps:

S1: data preprocessing: apply multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense;

S2: feature encoding: with a bidirectional long short-term memory network as the basic framework, obtain hidden state vectors from the character-, word-, and sense-level distributed vectors, and from these obtain the final character-level hidden state vectors;

S3: relation classification: learn the final character-level hidden state vectors, and fuse them into one sentence-level hidden state vector using a character-level attention mechanism.
The data preprocessing step applies multi-granularity pre-training to the text of the input data to extract distributed vectors at the three levels of character, word, and word sense. Traditional pre-training usually represents a word with only a single word vector; in the present invention, a distributed sense vector is generated for every sense of every word.

The feature encoding step implements a multi-granularity lattice long short-term memory network that makes efficient use of semantic information at multiple levels; the learned character-level hidden state variables can be regarded as features automatically extracted from the data. The hidden variables learned in the feature encoding step are then input into the relation classification step, where a gated attention mechanism automatically assigns weights to the hidden state sequence and fuses it, filtering out noise during the weighted fusion and retaining significant feature information, so that the final classification outputs a more accurate relation class.
The present invention comprises two stages: a training stage and a prediction stage. The training stage defines an initial model whose parameters are randomly initialized; during training, data carrying relation class labels are continuously fed to the model, which keeps learning from the training data and updating its parameters. At the same time, the cross entropy between the model's predictions and the correct answers serves as the loss function measuring the model's prediction quality; when the value of the loss function stabilizes, the model has converged, training is over, and a trained relation extractor is obtained. In the prediction stage, the data to be predicted are input directly into the trained relation extractor to obtain the corresponding predicted entity relations.
Data preprocessing:
In this step, the main purpose is to convert the text of the input data into distributed vectors that a computer can read and operate on; these vectors carry implicit semantic information. At the same time, so that subsequent modules can exploit the multi-granularity information of characters, words, and word senses in the text, vector representations are learned for all three linguistic granularities.
For the character vectors, this technique trains the common word2vec algorithm on a large-scale corpus to obtain a latent feature representation of each character. This representation exploits the contextual information of the character in the large-scale corpus and can fully capture the character's syntactic and semantic information.
For the word vectors, the training method is in general the same word2vec algorithm as for character vectors; the difference is that character vectors are trained with the character as the basic unit, whereas for word vectors the text is first segmented automatically by a word-segmentation tool and training then proceeds with the word as the basic unit. In this way, however, each word corresponds to only one fixed word vector, which ignores polysemy; the present invention therefore chooses to build vector representations of word senses rather than of words.
For the sense vectors, it is impossible to tell from the surface form alone whether a word is polysemous or which senses it has, so the senses are modeled by means of the external semantic knowledge base HowNet. In HowNet, the multiple senses of each word and their sememes (the minimal units of meaning) are all explicitly annotated by hand; from it the senses of each word can be obtained and sense vectors trained. A word may thus be represented by multiple sense vectors, which can be input into the subsequent modules so that, during training, the model dynamically selects the most suitable meaning of the current word in the current sentence, helping the model capture deeper semantic information and features in the sentence.
Feature encoding:
This step implements a neural network structure that can make efficient use of multi-granularity features. Its basic framework is the bidirectional long short-term memory network (LSTM). Compared with the traditional recurrent neural network (RNN) model, the LSTM handles contextual information more flexibly and effectively, preserving the important information in the input and forgetting the invalid information, and it avoids the vanishing- and exploding-gradient problems that deep neural networks may encounter. However, the traditional LSTM model cannot solve the word-segmentation ambiguity and polysemy problems of Chinese relation extraction, so the present invention makes a series of improvements.
First, to avoid the error propagation of word-segmentation tools, the present invention takes the character as the basic unit and regards each sentence of text as a character-level sequence that is input directly into bidirectional LSTM units to obtain the character hidden state vectors. Then, so that word-level information can also be considered during encoding, for each character in the sentence the present invention adds every word in the sentence that ends with that character into the LSTM computation. For example, in the sentence "Darwin studies all cuckoos", for the final character of "cuckoo", the word "cuckoo" is a word of the sentence ending with that character. All words ending with the current character are then fed into another bidirectional LSTM unit, and the word-level hidden states are computed. Finally, a gate unit computes the weights of the character and the words, and the character and word hidden states are merged by the method of weighted summation into the current character's final hidden state vector, which thus contains character- and word-level information simultaneously.
Although the above method combines character and word information and can effectively avoid the influence of segmentation errors on the model, it does not take into account the polysemous words in the sentence. In the example above, "cuckoo" is just such a polysemous word, with two completely different senses, "azalea" and "cuckoo bird". The present invention therefore further incorporates the senses of each word into the computation of the hidden states. Specifically, for each word ending with the current character, HowNet is first queried to obtain all the sense vectors of that word; these are then input as word vectors into a bidirectional LSTM unit to compute sense-level hidden states, and finally the states of all the senses are fused into the word's hidden state by the method of weighted summation. Compared with deriving the word hidden state directly from a single word vector as before, this method fuses dynamically and selects the most suitable sense to constitute the word hidden state. After the word hidden states are obtained, the character and word hidden states are merged as in the preceding method, yielding the current character's final hidden state vector.
Relation classification:
This step inputs the learned sentence feature representation into the classifier to obtain the predicted relation label. In the previous module, the encoder has learned a feature representation (hidden state vector) for each character, but since relations are extracted with the sentence as the unit, the character-level feature representations must be fused into the corresponding sentence feature representation. The present invention introduces a gated attention mechanism that automatically assigns a weight to the feature representation of each character and then computes the weighted sum over all characters based on these weights, obtaining the final sentence feature representation. The intuition of this method is that the characters of a sentence differ in importance: noise or very commonly used words should be assigned smaller weights, while the characters in key words such as entity words and verbs should receive higher attention; the sentence representation obtained by such fusion is then more accurate.

After fusion yields the feature representation vector of the sentence, it is input into a fully connected layer that maps it to a new vector whose dimension is the total number of relation classes. The new vector is then normalized by a softmax function so that every dimension of the vector holds a probability value in the interval 0 to 1, indicating the probability that the sentence is classified into the relation class corresponding to that dimension. For the training stage, the loss function of the model is defined as the cross entropy between the normalized vector and the one-hot indicator vector of the current correct relation class, and the parameters of the model are updated by the method of gradient descent. For the prediction stage, the currently predicted relation class is the relation class corresponding to the dimension with the largest probability value in the normalized vector.
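The normalization, prediction, and training criteria described here (softmax normalization, argmax prediction, cross entropy against the correct class) can be sketched as:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, gold):
    """Loss between a predicted distribution and a one-hot gold label,
    i.e. the negative log-probability of the correct class."""
    return -np.log(probs[gold])

def predict(logits):
    """Predicted relation = argmax of the normalized probabilities."""
    return int(np.argmax(softmax(logits)))
```

In training, gradient descent would lower `cross_entropy` for the gold class; in prediction, `predict` returns the index of the most probable relation class.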
Embodiment 2
As shown in Fig. 2, this embodiment uses the Chinese relation extraction method provided by the present invention. The task is defined as: given a sentence s and two designated entities in the sentence, judge what relation the two entities have in s. For example, given the sentence "Darwin studies all cuckoos" and the designated entities "Darwin" and "cuckoo", the goal is to judge what relation "Darwin" and "cuckoo" have in this sentence.
Step 1. Data preprocessing:
1.1 Character-level representation
For the given input sequence s = {c_1, ..., c_M}, with M characters in total, the present invention uses the word2vec method to map each character c_i (the i-th character) to a character vector v_i^c ∈ R^{d_c}, where v_i^c is the character vector of the i-th character, R is the real space, and d_c is the dimension of the character vector. Besides character vectors, this technique also uses position vectors to express the relative position of a character to the two entities. Specifically, for the i-th character c_i, its relative positions to the two designated entities can be expressed as p_i^1 and p_i^2, where p_i^1 is computed as follows:

p_i^1 = i - b_1, if i < b_1;  0, if b_1 ≤ i ≤ e_1;  i - e_1, if i > e_1

Here b_1 and e_1 are the start and end positions of the first entity, and the computation of p_i^2 is almost identical. In this way, p_i^1 and p_i^2 are converted into the corresponding position vectors v_i^{p_1} and v_i^{p_2}, which express the position features of the character-level sequence; d_p denotes the dimension of the position vectors.

In an embodiment of the present invention, the input is defined as one sentence and two designated entities within it. In practice, if a sentence contains multiple entities, relation extraction is carried out for every pair of entities, and the result for each input is the relation of the two currently designated entities in this sentence.

Therefore, for the i-th input character c_i, its final representation concatenates the character vector with the two position vectors, i.e. x_i = [v_i^c ; v_i^{p_1} ; v_i^{p_2}], where x_i ∈ R^d and d = d_c + 2·d_p is the total dimension after the character vector and the position vectors are concatenated. The character representation of the input sequence then becomes x = {x_1, ..., x_M}, which is fed into the subsequent encoding step.
1.2 Word-level representation:
Although the input of the model is a character-level sequence, in order to obtain word-level features the present invention performs word-level representation learning for all candidate words that may occur in the sentence. An input sequence s can be expressed not only as the character-level sequence s = {c_1, ..., c_M} but also as the word-level sequence s = {w_1, ..., w_M}. In this section the present invention denotes a word by its start position b and end position e, i.e. w_{b,e}. Again through the word2vec method, the word w_{b,e} can be converted into a word vector v_{b,e}^w.

For each word, the present invention obtains its sense set Sense(w_{b,e}) from HowNet and then, by the skip-gram method, converts each sense sen_k in the set into a sense vector v_{b,e}^{sen_k} that individually represents one sense. A word may thus be represented by multiple sense vectors (supposing it has K senses), i.e. Sense(w_{b,e}) = {v_{b,e}^{sen_1}, ..., v_{b,e}^{sen_K}}.

The sense-vector representation will be used during the training of the encoder module, so that the model uses sense information dynamically.
Step 2. Feature encoding:
A common character-level LSTM unit mainly consists of the following three gates: an input gate i controls which information is stored; a forget gate f controls which information is forgotten; an output gate o controls which information is output. For the j-th character, the computation process of the LSTM unit is as follows:
Here c denotes the cell unit, which stores the information of the sequence from its beginning to the current position. h denotes the hidden state vector, which is jointly determined by the hidden state at the previous time step and the input at the current time step. U and b are the parameters to be learned in the LSTM.
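The gate computations above follow the standard LSTM equations; a scalar toy version (with the learned parameters U and b replaced by arbitrary stand-in values) can be sketched as:

```python
# Scalar toy version of the standard LSTM step described above.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, U, b):
    """i = sigma(.), f = sigma(.), o = sigma(.), g = tanh(.);
    c = f * c_prev + i * g;  h = o * tanh(c)."""
    i = sigmoid(U["xi"] * x + U["hi"] * h_prev + b["i"])    # input gate
    f = sigmoid(U["xf"] * x + U["hf"] * h_prev + b["f"])    # forget gate
    o = sigmoid(U["xo"] * x + U["ho"] * h_prev + b["o"])    # output gate
    g = math.tanh(U["xg"] * x + U["hg"] * h_prev + b["g"])  # candidate cell
    c = f * c_prev + i * g   # history kept by f, new input admitted by i
    h = o * math.tanh(c)     # hidden state: the exposed part of the cell
    return h, c

U = {k: 0.5 for k in ("xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg")}
b = {k: 0.0 for k in ("i", "f", "o", "g")}
h, c = lstm_step(1.0, 0.0, 0.0, U, b)
```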
For a word w_{b,e} beginning at subscript b and ending at subscript e, its word representation is obtained as described above. In Lattice LSTM, the cell unit of a word is computed as follows:
That is, for a character c_b, the present invention first looks up all the words that begin at this character and match the external dictionary, and then computes the cell units c^w of these words. On this basis, the present invention extends the computation to the sense level: for each sense of each word, an additional LSTM unit is allocated and computed. As mentioned in the representation learning module, the k-th sense of a word w_{b,e} has its own representation vector; therefore the computation process of the cell unit of one sense level is as follows:
All the sense cell units can then be fused into one word cell unit, so that the model takes the ambiguity of the word into account. In order to compute the word cell unit fused from multiple senses, an additional gate mechanism needs to be introduced to control the contribution of each piece of sense information:
The word cell state fused from multiple pieces of sense information is computed as follows:
Through the above computation, for each word w_{b,e}, this model can compute its cell state fused with multi-sense information.
Then, for a character c_e, the present invention fuses the information of every word ending at c_e, obtaining a completely new character-level cell state. The computation method is as follows:
Here the gate values are the normalized representations of the gates, and their computation method is as follows:
Through the above computation, the cell unit corresponding to each character fuses the word-level and sense-level information, and the final hidden state vector of the character is then computed:
The final character hidden state vectors will be fed into the classifier, synthesized into a sentence-level representation, and the probability distribution over the answers is then computed.
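The sense-to-word fusion described above can be illustrated, in a simplified scalar form, as a gate-normalized weighted sum of sense cell states (the gate parameterization here is an assumption, since the patent's fusion formulas appear only as images):

```python
# Sketch of sense fusion: each sense-level cell state receives a gate score,
# the scores are normalized with softmax, and the weighted sum gives one
# word-level cell state. Gate scores here are arbitrary illustrative values.
import math

def softmax(xs):
    m = max(xs)                       # shift for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_sense_cells(sense_cells, gate_scores):
    """Normalized-gate weighted sum of sense cells into one word cell."""
    weights = softmax(gate_scores)
    return sum(w * c for w, c in zip(weights, sense_cells))

c_word = fuse_sense_cells([0.2, -0.5, 0.9], [1.0, 0.1, 2.0])
```

The same normalized-gate pattern is then reused one level up, fusing the word cells ending at character c_e into that character's cell state.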
Step 3. Relation classification:
After the hidden state vector h of each character has been learned, the present invention employs a character-level attention mechanism to fuse the character-level hidden states into one sentence-level hidden state h*.
Here d_h is the dimension of the hidden state variable and M is the length of the input sequence. h* is computed as a weighted sum with automatically assigned weights:
H = tanh(h)
α = softmax(w^T H)
h* = h α^T
Here T denotes transposition, w is a parameter to be learned, α is the weight vector of h, and H is the value of h after transformation by the tanh function; tanh maps every dimension of h into the range [-1, 1], which effectively alleviates problems such as gradient explosion during training.
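A toy version of this attention pooling, with one-dimensional hidden states so that w is a scalar, might look like:

```python
# Word-level attention pooling: h* = h * softmax(w^T tanh(h))^T,
# shown here with scalar (1-D) hidden states for readability.
import math

def attention_pool(h_states, w):
    """Sentence representation as the attention-weighted sum of h_states."""
    H = [math.tanh(h) for h in h_states]      # H = tanh(h), each in [-1, 1]
    scores = [w * x for x in H]               # w^T H
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]         # alpha = softmax(w^T H)
    return sum(a * h for a, h in zip(alpha, h_states))  # h* = h alpha^T

h_star = attention_pool([0.3, -1.2, 0.8], w=0.7)
```

Since alpha is a probability distribution, h* is always a convex combination of the character hidden states.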
Then h* is fed into a softmax classification layer to compute the probability distribution over the classes:
o = W h* + b
p(y|s) = softmax(o)
Here W is the transfer matrix and b ∈ R^Y is the bias vector. Y denotes the total number of classes, and p(y) denotes the probability of predicting a certain class.
For T training examples, the whole training process is optimized by the following cross-entropy loss function:
Here θ denotes all the parameters in the entire model that need training. Meanwhile, in order to prevent over-fitting, the present invention also uses a dropout mechanism during training: each neuron has a 50% probability of being switched off in each training step (a random half of the hidden-layer nodes do not participate in the computation in every training step); in the test phase, all trained neurons participate in the computation.
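A minimal sketch of the classification layer and the cross-entropy objective follows; the dimensions, the values of `W` and `bias`, and the two training examples are made up for illustration:

```python
# o = W h* + b, p(y|s) = softmax(o), and mean cross-entropy over T examples.
import math

def softmax(o):
    m = max(o)
    e = [math.exp(x - m) for x in o]
    s = sum(e)
    return [x / s for x in e]

def classify(h_star, W, b):
    """Probability distribution over the Y relation classes."""
    o = [sum(wi * hi for wi, hi in zip(row, h_star)) + bi
         for row, bi in zip(W, b)]
    return softmax(o)

def cross_entropy(batch, W, b):
    """Mean negative log-likelihood of the gold labels over T examples."""
    return -sum(math.log(classify(h, W, b)[y]) for h, y in batch) / len(batch)

W = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.2]]   # toy: Y = 3 classes, d_h = 2
bias = [0.0, 0.1, -0.1]
loss = cross_entropy([([1.0, 0.5], 2), ([-0.3, 0.8], 0)], W, bias)
```

At training time the loss would be minimized over θ (here W and bias) by gradient descent, with dropout applied to the hidden layers as described above.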
The present invention may implement all or part of the processes of the above embodiment methods by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of each of the above method embodiments may be realized. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The above content is a further detailed description of the present invention in conjunction with specific preferred embodiments, and it cannot be concluded that the specific implementation of the present invention is limited to these descriptions. For those skilled in the art to which the present invention belongs, several equivalent substitutions or obvious modifications with identical performance or use may also be made without departing from the concept of the present invention, and all of these should be considered as belonging to the protection scope of the present invention.
Claims (10)
1. A Chinese relation extraction method, characterized by comprising the following steps:
S1: data preprocessing: carrying out multi-granularity pre-training on the text of the input data, so as to extract distributed vectors at three levels of the text: character, word and word sense;
S2: feature encoding: using a bidirectional long short-term memory network as the basic framework, obtaining the hidden state vectors of characters and words from the distributed vectors at the character, word and sense levels, and thereby obtaining the final character-level hidden state vector;
S3: relation classification: learning the final character-level hidden state vector, and fusing the character-level hidden state vectors into one sentence-level hidden state vector using a character-level attention mechanism.
2. The Chinese relation extraction method according to claim 1, characterized in that extracting the character-level distributed vector comprises extracting a character vector and position vectors;
the character vector: for the character sequence s = {c_1, ..., c_M} of the text of the given input data, containing M characters in total, each character c_i is mapped to a character vector by the word2vec method, wherein c_i denotes the i-th character, R is the real number space, and d_c is the dimension of the character vector;
the position vector represents the relative position between the character c_i and the two entities P1 and P2, and its computation method is as follows:
wherein b_1 and e_1 are the start and end positions of the first entity P1, the relative position to the second entity is computed in the same way, and the relative positions are converted into the corresponding position vectors, which are used to represent the position features of the character sequence; d_p denotes the dimension of the position vector;
the final representation of the character-level distributed vector is the concatenation of the character vector and the two position vectors;
at this point d = d_c + 2*d_p, where d is the total dimension after the character vector and the position vectors are concatenated;
at this point the character sequence of the text of the input data is represented by the concatenated vectors.
3. The Chinese relation extraction method according to claim 1, characterized in that extracting the word-level distributed vector comprises:
for the character sequence s = {c_1, ..., c_M} and the word sequence s = {w_1, ..., w_M} of the text of the given input data, a word is denoted by its start position b and end position e, i.e. w_{b,e}; the word w_{b,e} is converted into a word-level distributed vector by the word2vec method.
4. The Chinese relation extraction method according to claim 3, characterized in that the sense set Sense(w_{b,e}) of each word w_{b,e} is obtained from the external semantic knowledge base HowNet, and each sense in the sense set is converted into a sense-level distributed vector, wherein K is the number of senses of the word w_{b,e}.
5. The Chinese relation extraction method according to claim 1, characterized in that step S2 comprises:
S21: taking the character as the basic unit, directly inputting the character sequence of the text of the input data into the bidirectional long short-term memory network to obtain the hidden state vectors of the characters;
S22: for each word of the text of the input data that ends at a given character, obtaining all sense vectors of said word through the external semantic knowledge base HowNet, inputting the sense vectors into the bidirectional long short-term memory network to compute sense-level hidden state vectors, and fusing all the sense-level hidden state vectors into the hidden state vector of said word by the method of weighted summation;
S23: computing the weights of the character and said word using a gate unit, and fusing the hidden state vector of the character and the hidden state vector of said word into the final hidden state vector of the character by the method of weighted summation.
6. The Chinese relation extraction method according to claim 5, characterized in that step S21 comprises: the computation process of inputting the j-th character of the character sequence of the text into the bidirectional long short-term memory network is:
wherein i is the input gate, used to control which information is stored; f is the forget gate, used to control which information is forgotten; o is the output gate, used to control which information is output; c is the cell unit; U and b are the parameters to be learned in the bidirectional long short-term memory network; h denotes the hidden state vector, jointly determined by the hidden state at the previous time step and the data input at the current time step.
7. The Chinese relation extraction method according to claim 6, characterized in that in step S22, for a word w_{b,e} beginning at subscript b and ending at subscript e, its word representation is input into the bidirectional long short-term memory network, and the cell unit of said word is computed as follows:
for the k-th sense of said word w_{b,e}, whose representation vector was obtained as above, the computation process of the cell unit of one sense level is as follows:
an additional gate mechanism is introduced to control the contribution of each piece of sense information:
the word cell state fused from multiple pieces of sense information is computed as follows:
all the sense cell units can then be fused into one word cell state;
for a character c_e, the computation method is as follows:
wherein the gate values are the normalized representations of the gates, and their computation method is as follows:
the cell unit corresponding to each character fuses the word-level and sense-level information, and the final hidden state vector of the character is then obtained:
the final hidden state vector of the character is fed into the classifier and synthesized into the feature representation of the corresponding sentence level.
8. The Chinese relation extraction method according to claim 7, characterized in that the sentence-level hidden state vector h* is computed as follows:
H = tanh(h)
α = softmax(w^T H)
h* = h α^T
then h* is fed into a softmax classification layer to compute the probability distribution over the classes:
o = W h* + b
p(y|s) = softmax(o)
for T training examples, the whole training process is optimized by the following cross-entropy loss function:
wherein d_h is the dimension of the hidden state variable, M is the length of the input sequence, R is the real number space, T denotes transposition, w is a parameter to be learned, α is the weight vector of h, W is the transfer matrix, b ∈ R^Y is the bias vector, Y denotes the total number of classes, p(y) denotes the probability of predicting a certain class, and θ denotes all the parameters in the entire model that need training.
9. The Chinese relation extraction method according to claim 8, characterized in that a dropout mechanism is used during training, wherein each neuron of the bidirectional long short-term memory network has a 50% probability of being switched off during training.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1-9 are realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910626307.5A CN110334354B (en) | 2019-07-11 | 2019-07-11 | Chinese relation extraction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110334354A true CN110334354A (en) | 2019-10-15 |
CN110334354B CN110334354B (en) | 2022-12-09 |
Family
ID=68146526
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910626307.5A Active CN110334354B (en) | 2019-07-11 | 2019-07-11 | Chinese relation extraction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110334354B (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111061843A (en) * | 2019-12-26 | 2020-04-24 | 武汉大学 | Knowledge graph guided false news detection method |
CN111160017A (en) * | 2019-12-12 | 2020-05-15 | 北京文思海辉金信软件有限公司 | Keyword extraction method, phonetics scoring method and phonetics recommendation method |
CN111274794A (en) * | 2020-01-19 | 2020-06-12 | 浙江大学 | Synonym expansion method based on transmission |
CN111274394A (en) * | 2020-01-16 | 2020-06-12 | 重庆邮电大学 | Method, device and equipment for extracting entity relationship and storage medium |
CN111291556A (en) * | 2019-12-17 | 2020-06-16 | 东华大学 | Chinese entity relation extraction method based on character and word feature fusion of entity meaning item |
CN111428505A (en) * | 2020-01-17 | 2020-07-17 | 北京理工大学 | Entity relation extraction method fusing trigger word recognition features |
CN111680510A (en) * | 2020-07-07 | 2020-09-18 | 腾讯科技(深圳)有限公司 | Text processing method and device, computer equipment and storage medium |
CN111783418A (en) * | 2020-06-09 | 2020-10-16 | 北京北大软件工程股份有限公司 | Chinese meaning representation learning method and device |
CN111859978A (en) * | 2020-06-11 | 2020-10-30 | 南京邮电大学 | Emotion text generation method based on deep learning |
CN112015891A (en) * | 2020-07-17 | 2020-12-01 | 山东师范大学 | Method and system for classifying messages of network inquiry platform based on deep neural network |
CN112380872A (en) * | 2020-11-27 | 2021-02-19 | 深圳市慧择时代科技有限公司 | Target entity emotional tendency determination method and device |
CN112560487A (en) * | 2020-12-04 | 2021-03-26 | 中国电子科技集团公司第十五研究所 | Entity relationship extraction method and system based on domestic equipment |
CN112883194A (en) * | 2021-04-06 | 2021-06-01 | 安徽科大讯飞医疗信息技术有限公司 | Symptom information extraction method, device, equipment and storage medium |
CN112883153A (en) * | 2021-01-28 | 2021-06-01 | 北京联合大学 | Information-enhanced BERT-based relationship classification method and device |
CN112948535A (en) * | 2019-12-10 | 2021-06-11 | 复旦大学 | Method and device for extracting knowledge triples of text and storage medium |
CN113051371A (en) * | 2021-04-12 | 2021-06-29 | 平安国际智慧城市科技股份有限公司 | Chinese machine reading understanding method and device, electronic equipment and storage medium |
CN113239663A (en) * | 2021-03-23 | 2021-08-10 | 国家计算机网络与信息安全管理中心 | Multi-meaning word Chinese entity relation identification method based on Hopkinson |
CN113326676A (en) * | 2021-04-19 | 2021-08-31 | 上海快确信息科技有限公司 | Deep learning model device for structuring financial text into form |
CN113392648A (en) * | 2021-06-02 | 2021-09-14 | 北京三快在线科技有限公司 | Entity relationship acquisition method and device |
CN114372125A (en) * | 2021-12-03 | 2022-04-19 | 北京北明数科信息技术有限公司 | Government affair knowledge base construction method, system, equipment and medium based on knowledge graph |
CN115034302A (en) * | 2022-06-07 | 2022-09-09 | 四川大学 | Relation extraction method, device, equipment and medium for optimizing information fusion strategy |
CN115169326A (en) * | 2022-04-15 | 2022-10-11 | 山西长河科技股份有限公司 | Chinese relation extraction method, device, terminal and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080275694A1 (en) * | 2007-05-04 | 2008-11-06 | Expert System S.P.A. | Method and system for automatically extracting relations between concepts included in text |
CN108733792A (en) * | 2018-05-14 | 2018-11-02 | 北京大学深圳研究生院 | A kind of entity relation extraction method |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080275694A1 (en) * | 2007-05-04 | 2008-11-06 | Expert System S.P.A. | Method and system for automatically extracting relations between concepts included in text |
CN108733792A (en) * | 2018-05-14 | 2018-11-02 | 北京大学深圳研究生院 | A kind of entity relation extraction method |
Non-Patent Citations (2)
Title |
---|
PENG ZHOU ET AL.: "Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification", 《PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS》 * |
SUN ZIYANG ET AL.: "CHINESE ENTITY RELATION EXTRACTION METHOD BASED ON DEEP LEARNING", 《COMPUTER ENGINEERING》 * |
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112948535A (en) * | 2019-12-10 | 2021-06-11 | 复旦大学 | Method and device for extracting knowledge triples of text and storage medium |
CN112948535B (en) * | 2019-12-10 | 2022-06-14 | 复旦大学 | Method and device for extracting knowledge triples of text and storage medium |
CN111160017A (en) * | 2019-12-12 | 2020-05-15 | 北京文思海辉金信软件有限公司 | Keyword extraction method, phonetics scoring method and phonetics recommendation method |
CN111291556A (en) * | 2019-12-17 | 2020-06-16 | 东华大学 | Chinese entity relation extraction method based on character and word feature fusion of entity meaning item |
CN111291556B (en) * | 2019-12-17 | 2021-10-26 | 东华大学 | Chinese entity relation extraction method based on character and word feature fusion of entity meaning item |
CN111061843B (en) * | 2019-12-26 | 2023-08-25 | 武汉大学 | Knowledge-graph-guided false news detection method |
CN111061843A (en) * | 2019-12-26 | 2020-04-24 | 武汉大学 | Knowledge graph guided false news detection method |
CN111274394A (en) * | 2020-01-16 | 2020-06-12 | 重庆邮电大学 | Method, device and equipment for extracting entity relationship and storage medium |
CN111428505A (en) * | 2020-01-17 | 2020-07-17 | 北京理工大学 | Entity relation extraction method fusing trigger word recognition features |
CN111274794A (en) * | 2020-01-19 | 2020-06-12 | 浙江大学 | Synonym expansion method based on transmission |
CN111274794B (en) * | 2020-01-19 | 2022-03-18 | 浙江大学 | Synonym expansion method based on transmission |
CN111783418B (en) * | 2020-06-09 | 2024-04-05 | 北京北大软件工程股份有限公司 | Chinese word meaning representation learning method and device |
CN111783418A (en) * | 2020-06-09 | 2020-10-16 | 北京北大软件工程股份有限公司 | Chinese meaning representation learning method and device |
CN111859978B (en) * | 2020-06-11 | 2023-06-20 | 南京邮电大学 | Deep learning-based emotion text generation method |
CN111859978A (en) * | 2020-06-11 | 2020-10-30 | 南京邮电大学 | Emotion text generation method based on deep learning |
CN111680510A (en) * | 2020-07-07 | 2020-09-18 | 腾讯科技(深圳)有限公司 | Text processing method and device, computer equipment and storage medium |
CN112015891A (en) * | 2020-07-17 | 2020-12-01 | 山东师范大学 | Method and system for classifying messages of network inquiry platform based on deep neural network |
CN112380872A (en) * | 2020-11-27 | 2021-02-19 | 深圳市慧择时代科技有限公司 | Target entity emotional tendency determination method and device |
CN112380872B (en) * | 2020-11-27 | 2023-11-24 | 深圳市慧择时代科技有限公司 | Method and device for determining emotion tendencies of target entity |
CN112560487A (en) * | 2020-12-04 | 2021-03-26 | 中国电子科技集团公司第十五研究所 | Entity relationship extraction method and system based on domestic equipment |
CN112883153B (en) * | 2021-01-28 | 2023-06-23 | 北京联合大学 | Relationship classification method and device based on information enhancement BERT |
CN112883153A (en) * | 2021-01-28 | 2021-06-01 | 北京联合大学 | Information-enhanced BERT-based relationship classification method and device |
CN113239663A (en) * | 2021-03-23 | 2021-08-10 | 国家计算机网络与信息安全管理中心 | Multi-meaning word Chinese entity relation identification method based on Hopkinson |
CN113239663B (en) * | 2021-03-23 | 2022-07-12 | 国家计算机网络与信息安全管理中心 | Multi-meaning word Chinese entity relation identification method based on Hopkinson |
CN112883194B (en) * | 2021-04-06 | 2024-02-20 | 讯飞医疗科技股份有限公司 | Symptom information extraction method, device, equipment and storage medium |
CN112883194A (en) * | 2021-04-06 | 2021-06-01 | 安徽科大讯飞医疗信息技术有限公司 | Symptom information extraction method, device, equipment and storage medium |
CN113051371A (en) * | 2021-04-12 | 2021-06-29 | 平安国际智慧城市科技股份有限公司 | Chinese machine reading understanding method and device, electronic equipment and storage medium |
CN113326676A (en) * | 2021-04-19 | 2021-08-31 | 上海快确信息科技有限公司 | Deep learning model device for structuring financial text into form |
CN113392648A (en) * | 2021-06-02 | 2021-09-14 | 北京三快在线科技有限公司 | Entity relationship acquisition method and device |
CN114372125A (en) * | 2021-12-03 | 2022-04-19 | 北京北明数科信息技术有限公司 | Government affair knowledge base construction method, system, equipment and medium based on knowledge graph |
CN115169326A (en) * | 2022-04-15 | 2022-10-11 | 山西长河科技股份有限公司 | Chinese relation extraction method, device, terminal and storage medium |
CN115034302A (en) * | 2022-06-07 | 2022-09-09 | 四川大学 | Relation extraction method, device, equipment and medium for optimizing information fusion strategy |
CN115034302B (en) * | 2022-06-07 | 2023-04-11 | 四川大学 | Relation extraction method, device, equipment and medium for optimizing information fusion strategy |
Also Published As
Publication number | Publication date |
---|---|
CN110334354B (en) | 2022-12-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110334354A (en) | A kind of Chinese Relation abstracting method | |
CN107992597B (en) | Text structuring method for power grid fault case | |
CN108733792B (en) | Entity relation extraction method | |
CN108536754A (en) | Electronic health record entity relation extraction method based on BLSTM and attention mechanism | |
CN110532557B (en) | Unsupervised text similarity calculation method | |
CN110083831A (en) | A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF | |
CN110209817A (en) | Training method and device of text processing model and text processing method | |
CN109635124A (en) | A kind of remote supervisory Relation extraction method of combination background knowledge | |
CN109858041A (en) | A kind of name entity recognition method of semi-supervised learning combination Custom Dictionaries | |
CN113255320A (en) | Entity relation extraction method and device based on syntax tree and graph attention machine mechanism | |
CN110555084A (en) | remote supervision relation classification method based on PCNN and multi-layer attention | |
CN113743099B (en) | System, method, medium and terminal for extracting terms based on self-attention mechanism | |
CN109933792A (en) | Viewpoint type problem based on multi-layer biaxially oriented LSTM and verifying model reads understanding method | |
CN114548099B (en) | Method for extracting and detecting aspect words and aspect categories jointly based on multitasking framework | |
CN113255366B (en) | Aspect-level text emotion analysis method based on heterogeneous graph neural network | |
CN115861995B (en) | Visual question-answering method and device, electronic equipment and storage medium | |
CN113806494A (en) | Named entity recognition method based on pre-training language model | |
CN111177402A (en) | Evaluation method and device based on word segmentation processing, computer equipment and storage medium | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
CN114841151A (en) | Medical text entity relation joint extraction method based on decomposition-recombination strategy | |
CN111783464A (en) | Electric power-oriented domain entity identification method, system and storage medium | |
CN116757195B (en) | Implicit emotion recognition method based on prompt learning | |
CN114239584A (en) | Named entity identification method based on self-supervision learning | |
CN117216617A (en) | Text classification model training method, device, computer equipment and storage medium | |
CN116362242A (en) | Small sample slot value extraction method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |