CN107194422A - A convolutional neural network relation classification method combining forward and reverse examples - Google Patents
A convolutional neural network relation classification method combining forward and reverse examples

- Publication number: CN107194422A (application CN201710354990.2A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/2415 — Pattern recognition; classification techniques relating to the classification model based on parametric or probabilistic models
- G06N3/02 — Computing arrangements based on biological models; neural networks
- G06N3/08 — Learning methods
Abstract
The invention discloses a relation classification method based on convolutional neural networks that combines forward and reverse examples, in the field of relation extraction and classification. The method comprises the following steps: S1. for the sentence text and entity pair to be classified, construct a forward example and a reverse example according to the linear order of the words in the sentence; S2. encode the forward example and the reverse example separately with a CNN sentence encoder, obtaining the encoded feature vector z⁺ of the forward example and z⁻ of the reverse example; S3. from the encoded feature vectors z⁺ and z⁻, perform relation classification with a softmax layer to obtain the classification result ri. Compared with other methods, the invention combines the forward and reverse information of the same entity pair on top of a CNN, improving the final classification performance.
Description
Technical field
The present invention relates to the field of relation extraction and classification, and in particular to a relation classification method based on convolutional neural networks that combines forward and reverse examples.
Background technology
Existing relation extraction techniques worldwide fall largely into three kinds: methods based on pattern matching, relation extraction methods based on machine learning, and open-domain information extraction methods. Pattern-matching methods match relations with manually constructed patterns; the patterns must be engineered by hand and transfer poorly. Open-domain information extraction takes the predicate of a sentence as the relation string between subject and object and then clusters the relation strings into relations; its extraction accuracy is poor and the extracted relations are difficult to map onto the relations a database needs. Machine-learning-based relation extraction converts the extraction problem into a relation classification problem over known, predefined relation types, which guarantees extraction accuracy with only a little manual intervention.
Machine-learning-based relation extraction is in turn broadly divided into three kinds: feature-based relation classification, tree-kernel-based relation classification, and neural-network relation classification. Feature-based methods extract a large number of linguistic (lexical and syntactic) features, combine them into feature vectors, and classify with various classifiers (such as maximum entropy models and support vector machines) to obtain the target relation; they require expert-designed features and are difficult to transfer across domains. Tree-kernel-based methods represent the text by its syntax tree and design a kernel function whose inner product of two sentences in a high-dimensional sparse space serves as a structural feature; the kernel function strongly limits the features that can be extracted, which hurts classification.
In addition, representative work on neural-network relation classification at home and abroad mainly includes the convolutional neural network method (CNN) [1], the ranking convolutional neural network method (CR-CNN) [2], the recursive neural network method (RNN) [3], and the attention-based bidirectional long short-term memory method (ATT-BLSTM) [4]. All of these methods feed the sentence and entities of the relation to be extracted into a neural network, obtain features with the network, and then classify into predefined relation types to obtain the target relation.
Closest to the present invention is the method based on convolutional neural networks [1] (see Fig. 1). It combines externally trained word embeddings with position vectors representing the distance to each entity to obtain a vector for each word; all word vectors in the sentence together form the sentence vector, which is then fed into the convolutional neural network. Local features are obtained with a convolutional layer, salient features with a pooling layer, and the relation class is finally obtained through a softmax layer.
Prior-art relation classification methods mainly have the following deficiencies. Feature-based methods need hand-designed features and transfer poorly; tree-kernel methods can only obtain features by defining a kernel function, so the features are limited. Among neural methods, the CNN is easy to implement, efficient to train, and classifies well, while more complex methods train less efficiently and struggle to reach a classification performance comparable to the CNN. But the CNN method still has a problem in the following respect: when the entities of the same sentence are fed to the neural network in different orders, the classification results may differ. For example, in "Financial stress is one of the main causes of divorce.", taking entity "stress" as e1 and entity "divorce" as e2 yields the result Cause-Effect; taking "stress" as e2 and "divorce" as e1 ought to yield the result Effect-Cause; but in actual classification the two results sometimes fail to correspond.
[1] Zeng D, Liu K, Lai S, Zhou G, Zhao J. Relation Classification via Convolutional Deep Neural Network [C] // COLING. 2014: 2335-2344.
[2] Santos C N, Xiang B, Zhou B. Classifying Relations by Ranking with Convolutional Neural Networks [C] // ACL (1). 2015: 626-634.
[3] Hashimoto K, Miwa M, Tsuruoka Y, Chikayama T. Simple Customization of Recursive Neural Networks for Semantic Relation Classification [C] // EMNLP. 2013: 1372-1376.
[4] Zhou P, Shi W, Tian J, Qi Z, Li B, Hao H, Xu B. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification [C] // ACL (2). 2016: 207.
Content of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by proposing a relation classification method based on convolutional neural networks that combines forward and reverse examples.
To achieve this object, the technical solution of the present invention is as follows:
A relation classification method based on convolutional neural networks combining a forward example and a reverse example, comprising the following steps:
S1. for the sentence text and entity pair to be classified, construct a forward example and a reverse example according to the linear order of the words in the sentence;
S2. encode the forward example and the reverse example separately with a CNN sentence encoder, obtaining the encoded feature vector z⁺ of the forward example and z⁻ of the reverse example;
S3. from the encoded feature vectors z⁺ and z⁻, perform relation classification with a softmax layer to obtain the classification result ri.
As a further improvement of the technical solution of the present invention: given a sentence with two marked entities, according to the linear order of the words in the sentence, the example that takes the entity appearing earlier as e1 and the entity appearing later as e2 is called the forward example; the example that takes the entity appearing later as e1 and the entity appearing earlier as e2 is called the reverse example.
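The forward/reverse split can be sketched as follows. This is an illustrative Python sketch, not the patent's code; the dictionary layout and function name are assumptions.

```python
# Sketch: build the forward and reverse instances for one annotated sentence.
def make_instances(tokens, first_idx, second_idx):
    """first_idx < second_idx in linear order. The forward instance takes
    the earlier entity as e1; the reverse instance takes the later one."""
    forward = {"tokens": tokens, "e1": first_idx, "e2": second_idx}
    reverse = {"tokens": tokens, "e1": second_idx, "e2": first_idx}
    return forward, reverse

tokens = "Financial stress is one of the main causes of divorce .".split()
fwd, rev = make_instances(tokens, tokens.index("stress"), tokens.index("divorce"))
print(fwd["tokens"][fwd["e1"]], rev["tokens"][rev["e1"]])  # stress divorce
```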
As a further improvement of the technical solution of the present invention: for any sentence, define the encoded feature vector of the forward example as z⁺ and that of the reverse example as z⁻, the forward relation as r⁺ and the reverse relation as r⁻.
Assume that with probability h the forward example is correct and with probability 1−h the reverse example is correct; the cross-entropy objective function is defined as:

$$J(\theta)=\sum_{i=1}^{n}\left[h\log p\left(r_i^{+}\mid z_i^{+},\theta\right)+(1-h)\log p\left(r_i^{-}\mid z_i^{-},\theta'\right)\right]$$

where n is the number of sentences, and θ and θ′ are the mapping and bias parameters of the convolutional and hidden layers of the sentence-encoder neural network in the forward-example and reverse-example models, respectively.
Further, the objective function is optimised with stochastic gradient descent: mini-batches of samples are randomly selected from the training set and trained until convergence. The forward-example class-probability vector is C⁺ = [c1, c2, …, cr] and the reverse-example class-probability vector is C⁻ = [c1, c2, …, cr], where ci denotes the probability that relation ri holds between entities e1 and e2 in the sentence.
The classification result is obtained as:
C = ωC⁺ + (1−ω)C⁻
Finally, the classification result ri is obtained from the index of the maximum component, i = argmax(C).
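The weighted combination in step S3 can be sketched as follows; the probability values and the weight are illustrative, not from the patent.

```python
import numpy as np

# Sketch: combine the class-probability vectors of the forward instance (C+)
# and the reverse instance (C-) with weight w, then take argmax.
def combine(c_plus, c_minus, w=0.5):
    c = w * np.asarray(c_plus) + (1 - w) * np.asarray(c_minus)
    return c, int(np.argmax(c))

c_plus = [0.7, 0.2, 0.1]   # forward-instance softmax output (illustrative)
c_minus = [0.5, 0.4, 0.1]  # reverse-instance softmax output (illustrative)
c, r = combine(c_plus, c_minus, w=0.6)
print(r)  # 0
```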
As a further improvement of the technical solution of the present invention, the CNN sentence encoder comprises a three-layer structure, in order: an encoding layer, a convolutional layer, and a pooling and nonlinear layer.
The encoding layer is used to convert the words in the sentence into low-dimensional real-valued vectors; the convolutional layer is used to obtain high-level features for each word; the pooling and nonlinear layer is used to construct the encoded representation of the sentence.
As a further improvement of the technical solution of the present invention, the encoded representation of a word in the encoding layer comprises a word encoding, a position encoding, and a dependency encoding.
The word encoding specifically comprises: a known sentence x contains n words and is written x = [x1, x2, …, xn], where xi denotes the i-th word in the sequence and n is a preset padded/truncated length. Each word xi obtains its vector representation ei by looking up the word-embedding table W, i.e. ei = Wxi.
The position encoding specifically comprises: generating a position-feature vector for each word from its distance to each entity. For each word xi, the distances i−i1 and i−i2 to the two entities in the sentence index vectors in the position-feature table D, which serve as the position encodings, denoted d¹ᵢ and d²ᵢ; the position-feature table is initialised with random values.
The dependency encoding specifically comprises: generating a dependency-direction vector from the distance between a word and its parent node, and a dependency-feature vector from the dependency label between words.
The word encoding, position encoding, and dependency encoding of each word are concatenated as the encoded representation of that word.
As a further improvement of the technical solution of the present invention, the convolutional layer is used to merge all local features and extracts local features through a sliding window of size w.
Specifically, a convolution kernel is a matrix f = [f1, f2, …, fw]; after convolution the feature sequence s = [s1, s2, …, sn] is obtained, where

$$s_i = g\left(\sum_{j=1}^{w} f_j \cdot X_{i+j-1} + b\right)$$

b is a bias term and g is a nonlinear function. Different features can be obtained by using different convolution kernels and window sizes.
As a further improvement of the technical solution of the present invention, the pooling and nonlinear layer obtains the most important feature with the max function; for each convolution kernel its pooled score is

p_f = max{s_i}

The pooled scores of all kernels are concatenated into the feature vector z = [p1, p2, …, pm] representing the sentence, where m is the number of convolution kernels.
Finally, a nonlinear function is applied to the feature vector as the output, which is the encoded representation of the input sentence.
Compared with the prior art, the present invention has the following advantages:
1. The invention proposes a relation classification method based on convolutional neural networks combining a forward example and a reverse example; by considering both the forward and the reverse case, the relation classification is robust and the classification performance is better.
2. The method is easy to implement and fast to train. It first encodes the forward example and the reverse example with a convolutional neural network to obtain their different encoded representations, considers the same sentence's entities fed into the neural network in different orders, and judges the relation between the entity pair comprehensively by combining the two cases. Compared with other methods, the invention combines the forward and reverse information of the same entity pair on top of a CNN, improving the final classification performance.
Brief description of the drawings
Fig. 1 is the structure of relation classification based on convolutional neural networks in the background art.
Fig. 2 is the flow chart of the relation classification method of the present invention.
Fig. 3 is the model of relation classification combining forward and reverse examples in the embodiment of the present invention.
Fig. 4 is the dependency-parse tree structure in the embodiment of the present invention.
Embodiment
The present invention relates to relation extraction in information extraction, and in particular to machine-learning-based relation classification; mainstream relation extraction is realised through relation classification. The invention uses existing word-embedding training techniques and syntactic analysis tools to represent the text, and on this basis we perform neural-network relation classification. The invention mainly comprises a neural-network feature extraction and representation module and a classification module combining multiple representations; it realises relation extraction through relation classification. Relation extraction recognises and generates the semantic relations between entities from unformatted text. For example, given the input text "Financial stress is one of the main causes of divorce." with marked entities e1 = "stress" and e2 = "divorce", the relation classification task automatically identifies that the Cause-Effect relation holds between e1 and e2 and represents it as Cause-Effect(e1, e2).
A specific embodiment of the relation classification method based on convolutional neural networks combining a forward example and a reverse example is described in further detail below with reference to the accompanying drawings. Clearly, the described embodiment is only a part of the embodiments of the invention rather than all of them; all other embodiments obtained by one of ordinary skill in the art from the embodiments of the invention without creative work fall within the scope of protection of this application.
The flow of the method of this embodiment is shown in Fig. 2 and comprises the following steps:
S1. for the sentence text and entity pair to be classified, construct a forward example and a reverse example according to the linear order of the words in the sentence;
S2. encode the forward example and the reverse example separately with a CNN sentence encoder, obtaining the encoded feature vector z⁺ of the forward example and z⁻ of the reverse example;
S3. from the encoded feature vectors z⁺ and z⁻, perform relation classification with a softmax layer to obtain the classification result ri.
The structure of the relation classification framework is shown in Fig. 3, comprising the forward example, the reverse example, the CNN sentence encoder, the softmax layer, and the causal relation.
Given a sentence with two marked entities, according to the linear order of the words in the sentence, the example that takes the entity appearing earlier as e1 and the entity appearing later as e2 is called the forward example; the example that takes the entity appearing later as e1 and the entity appearing earlier as e2 is called the reverse example.
For example, in the sentence "Financial stress is one of the main causes of divorce.", taking "stress" as e1 and "divorce" as e2 gives the forward example, which has the Cause-Effect relation; taking "divorce" as e1 and "stress" as e2 gives the reverse example, which has the Effect-Cause relation. Research finds that the semantic relation of the forward example and that of the reverse example correspond to each other, and an excellent classification system should ensure that the classification results of the forward and reverse examples also correspond.
For any sentence, define the encoded feature vector of its forward example as z⁺ and that of the reverse example as z⁻, the forward relation as r⁺ and the reverse relation as r⁻. Because the forward and reverse examples sometimes fail to correspond, assume that with probability h the forward example is correct and with probability 1−h the reverse example is correct. The cross-entropy objective function is

$$J(\theta)=\sum_{i=1}^{n}\left[h\log p\left(r_i^{+}\mid z_i^{+},\theta\right)+(1-h)\log p\left(r_i^{-}\mid z_i^{-},\theta'\right)\right]$$

where n is the number of sentences, and θ and θ′ are all the parameters of the neural network in the forward-example and reverse-example models, respectively.
To solve the optimisation problem of the objective J(θ), stochastic gradient descent is used: mini-batches of samples are randomly selected from the training set and trained until convergence. At test time, the forward-example class-probability vector is C⁺ = [c1, c2, …, cr] and the reverse-example class-probability vector is C⁻ = [c1, c2, …, cr], where ci denotes the probability that relation ri holds between entities e1 and e2 in the sentence. The classification result is therefore:
C = ωC⁺ + (1−ω)C⁻
Finally, the classification result ri is obtained from the index of the maximum component, i = argmax(C).
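The mixed objective can be sketched per sentence as follows. This is an assumed reading of the formula, not the patent's reference code; h, the probability vectors, and the labels are illustrative.

```python
import numpy as np

# Sketch: with probability h the forward instance is taken as correct, with
# 1-h the reverse one; the per-sentence loss mixes the two negative
# log-likelihoods accordingly (the negative of the objective's summand).
def mixed_nll(p_fwd, p_rev, label_fwd, label_rev, h=0.9):
    return -(h * np.log(p_fwd[label_fwd]) + (1 - h) * np.log(p_rev[label_rev]))

loss = mixed_nll(np.array([0.8, 0.1, 0.1]),  # forward softmax output
                 np.array([0.1, 0.7, 0.2]),  # reverse softmax output
                 label_fwd=0, label_rev=1, h=0.9)
print(round(loss, 4))  # 0.2365
```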
Further, the CNN sentence encoder in this embodiment specifically comprises an encoding layer, a convolutional layer, and a pooling and nonlinear layer. First, the encoding layer converts the words in the sentence into low-dimensional real-valued vectors, and the convolutional layer on top of the encoding layer obtains high-level features for each word; then the sentence vector representation is constructed by the pooling and nonlinear layer, and the encoded vector is denoted s.
Each layer of the CNN sentence encoder structure is described in detail below:
Encoding layer
The input of the CNN sentence encoder is the original sentence text. Because a CNN can only handle fixed-length input, the original sentence is padded to a word sequence of uniform length before input; in this embodiment the target length n is set to the length of the longest sentence in the dataset, and the padding word is "NaN". In the encoding layer, each word is converted to a low-dimensional vector through a word-embedding matrix; to mark the entity positions, a position-feature vector is added to each word in this embodiment. In addition, to improve the system's understanding of the sentence's dependency structure, a dependency-direction vector and a dependency-feature vector are added to each word in this embodiment.
In the encoding layer, the encoded representation of a word comprises the word encoding, the position encoding, and the dependency encoding.
The word encoding: a known sentence x contains n words and is written x = [x1, x2, …, xn], where xi denotes the i-th word in the sequence and n is the preset padded/truncated length. Each word xi obtains its vector representation ei by looking up the word-embedding table W, i.e. ei = Wxi. In this embodiment, the word-embedding table is obtained by training on offline Wikipedia data with the open-source embedding training tool Google word2vec.
The position encoding: the position of an entity in the sentence influences the relation between entities; without position-feature vectors, the CNN cannot identify which words in the sentence are the entities, leading to poor classification. To address this technical deficiency of the CNN, this embodiment generates a position-feature vector for each word from its distance to each entity. For example, in the sentence "Financial stress is one of the main causes of divorce.", the distance of the word "main" to the entity "stress" is 5, and to the entity "divorce" is −3. Specifically, for each word xi, the distances i−i1 and i−i2 to the two entities in the sentence index vectors in the position-feature table D, which serve as the position encodings, denoted d¹ᵢ and d²ᵢ; the position-feature table is initialised with random values.
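The position-feature lookup can be sketched as follows; table sizes, the clamping-free indexing scheme, and all names are illustrative assumptions, not from the patent.

```python
import numpy as np

# Sketch: each word maps to its signed distances from the two entities,
# which index rows of a randomly initialised position-embedding table D.
rng = np.random.default_rng(0)
MAX_LEN, POS_DIM = 50, 5
D = rng.standard_normal((2 * MAX_LEN + 1, POS_DIM))  # rows for -MAX_LEN..MAX_LEN

def position_vectors(n_words, i1, i2):
    """Return position embeddings d1_i, d2_i for each word index i."""
    d1 = np.stack([D[(i - i1) + MAX_LEN] for i in range(n_words)])
    d2 = np.stack([D[(i - i2) + MAX_LEN] for i in range(n_words)])
    return d1, d2

tokens = "Financial stress is one of the main causes of divorce .".split()
d1, d2 = position_vectors(len(tokens), tokens.index("stress"), tokens.index("divorce"))
# distances of "main" match the example in the text:
print(tokens.index("main") - tokens.index("stress"),
      tokens.index("main") - tokens.index("divorce"))  # 5 -3
```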
The dependency encoding: the dependency encoding based on the dependency-parse tree comprises the dependency-direction vector and the dependency-feature vector. The dependency-parse tree is the tree formed by analysing the sentence structure according to the dependency relations between words, and is a basic tool for language understanding. As shown in Fig. 4 (the analysis result of the Stanford parser), each node except the root has a dependency relation with its parent node; a dependency relation comprises not only the parent node but also a dependency label. In this embodiment, the dependency-direction vector is generated from the distance between a word and its parent node, and the dependency-feature vector from the label of the dependency relation between words.
For example, "city" has distance 3 to its parent "go" with feature label "nmod"; "go" has distance 2 to its parent "intends" with feature label "xcomp". Borrowing the idea of the position encoding, the distance from each word to its parent word indexes a real-valued vector in the dependency-direction table P, taken as pi, and the dependency label indexes a vector in the dependency-feature table F, taken as fi; the dependency-direction table and the dependency-feature table are initialised with random values.
So far, the word encoding, position encoding, and dependency encoding of each word are concatenated as the encoded representation of that word, and a unique vector is set to identify padding words. Specifically, for each word, the word vector ei, the two entity-position vectors d¹ᵢ and d²ᵢ, the dependency-direction vector pi, and the dependency-feature vector fi are concatenated to obtain the representation vector Xi of each word, i.e. Xi = [ei; d¹ᵢ; d²ᵢ; pi; fi], and the encoded representation of the sentence is then:
X = [X1, X2, …, Xn].
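The per-word concatenation can be sketched minimally as follows; every dimension here is an illustrative assumption, not a value from the patent.

```python
import numpy as np

# Sketch: X_i = [e_i; d1_i; d2_i; p_i; f_i] - concatenating word, the two
# position, dependency-direction, and dependency-label vectors.
WORD_DIM, POS_DIM, DEP_DIM, LBL_DIM = 8, 3, 2, 2

def encode_word(e, d1, d2, p, f):
    return np.concatenate([e, d1, d2, p, f])

x = encode_word(np.zeros(WORD_DIM), np.ones(POS_DIM), np.ones(POS_DIM),
                np.zeros(DEP_DIM), np.zeros(LBL_DIM))
print(x.shape)  # (18,)
```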
Convolutional layer
The greatest challenge of relation classification comes from the diversity of semantic expression: the position of the important information in the sentence is not fixed. Therefore, a convolutional layer is used in this embodiment to merge all local features; it extracts local features through a sliding window of size w. Where the window would cross the sentence boundary, zero vectors are padded on both sides of the sentence so that the dimension after convolution is unchanged.
Specifically, a convolution kernel is a matrix f = [f1, f2, …, fw]; after convolution the feature sequence s = [s1, s2, …, sn] is obtained, where

$$s_i = g\left(\sum_{j=1}^{w} f_j \cdot X_{i+j-1} + b\right)$$

b is a bias term and g is a nonlinear function. Different features can be obtained by using different convolution kernels and window sizes.
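The sliding-window convolution can be sketched as follows. The choice of tanh for g, the one-sided padding, and all sizes are illustrative assumptions; the patent only says g is "a nonlinear function".

```python
import numpy as np

# Sketch: a window of width w slides over the word encodings; the sentence
# is zero-padded so the output keeps length n.
def conv1d(X, f, b, w):
    n = len(X)
    pad = np.zeros((w - 1, X.shape[1]))
    Xp = np.vstack([X, pad])  # zero-pad so every window is full
    return np.array([np.tanh(np.sum(f * Xp[i:i + w]) + b) for i in range(n)])

X = np.ones((4, 3))        # 4 words, 3-dim encodings (illustrative)
f = np.full((2, 3), 0.1)   # one convolution kernel, window w = 2
s = conv1d(X, f, b=0.0, w=2)
print(s.shape)  # (4,)
```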
Pooling and nonlinear layer
In the pooling layer, the most important feature is obtained with the max function; for each convolution kernel its pooled score is
p_f = max{s_i}
The pooled scores of all kernels are concatenated into the feature vector z = [p1, p2, …, pm] representing the sentence, where m is the number of convolution kernels.
Finally, a nonlinear function is applied to the feature vector as the output, which is the encoded representation of the input sentence.
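The pooling step can be sketched as follows; tanh as the final nonlinearity and the sample feature sequences are illustrative assumptions.

```python
import numpy as np

# Sketch: max over each kernel's feature sequence (p_f = max{s_i}),
# concatenated into the sentence vector z, with a final nonlinearity.
def pool(feature_sequences):
    z = np.array([np.max(s) for s in feature_sequences])
    return np.tanh(z)

z = pool([np.array([0.1, 0.9, 0.3]),    # kernel 1 feature sequence
          np.array([-0.2, 0.4, 0.0])])  # kernel 2 feature sequence
print(z.shape)  # (2,)
```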
For the relation classification problem, the present invention provides a more efficient and comprehensive feature extraction and encoding scheme and considers the classification relation of the entity pair under different orders, improving the performance of relation classification.
It will be clear to those skilled in the art that the scope of the present invention is not restricted to the examples discussed above, and that some changes and modifications may be made to them without departing from the scope of the invention as defined by the appended claims. Although the invention has been illustrated and described in detail in the drawings and the description, such illustration and description are only explanatory or schematic and not restrictive. The invention is not limited to the disclosed embodiments.
Claims (7)
1. A relation classification method based on convolutional neural networks combining a forward example and a reverse example, characterised by:
S1. for the sentence text and entity pair to be classified, constructing a forward example and a reverse example according to the linear order of the words in the sentence;
S2. encoding the forward example and the reverse example separately with a CNN sentence encoder, obtaining the encoded feature vector z⁺ of the forward example and z⁻ of the reverse example;
S3. from the encoded feature vectors z⁺ and z⁻, performing relation classification with a softmax layer to obtain the classification result ri.
2. The relation classification method based on convolutional neural networks combining a forward example and a reverse example according to claim 1, characterised in that:
given a sentence with two marked entities, according to the linear order of the words in the sentence, the example that takes the entity appearing earlier as e1 and the entity appearing later as e2 is called the forward example; the example that takes the entity appearing later as e1 and the entity appearing earlier as e2 is called the reverse example.
3. The relation classification method based on convolutional neural networks combining a forward example and a reverse example according to claim 2, characterised in that:
for any sentence, the encoded feature vector of the forward example is defined as z⁺ and that of the reverse example as z⁻; the forward relation is r⁺ and the reverse relation is r⁻;
assuming that with probability h the forward example is correct and with probability 1−h the reverse example is correct, the cross-entropy objective function is defined as:
$$J(\theta)=\sum_{i=1}^{n}\left[h\log p\left(r_i^{+}\mid z_i^{+},\theta\right)+(1-h)\log p\left(r_i^{-}\mid z_i^{-},\theta'\right)\right]$$
Wherein, n is sentence quantity, and θ and θ ' are respectively in positive example and reverse instance model in sentence encoder neutral net
The mapping of convolutional layer and hidden layer and bigoted parameter;
Further, the objective function is optimized by stochastic gradient descent: a mini-batch of samples is randomly selected from the training set and trained until convergence. The forward-example classification probability vector is C+ = [c1, c2, …, cr] and the reverse-example classification probability vector is C− = [c1, c2, …, cr], where ci denotes the probability that relation ri holds between entities e1 and e2 in the sentence.
The classification result is obtained as:
C = ωC+ + (1 − ω)C−
Finally, the corresponding classification result ri is obtained via the argmax function: i = argmax(C).
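The combination and argmax step can be sketched as follows (a minimal numpy illustration; the probability vectors and the weight ω are hypothetical values, not from the patent):

```python
import numpy as np

def combine_predictions(c_forward, c_reverse, omega):
    """C = omega * C+ + (1 - omega) * C-, then pick i = argmax(C)."""
    c = omega * np.asarray(c_forward) + (1 - omega) * np.asarray(c_reverse)
    return c, int(np.argmax(c))

# Hypothetical class-probability vectors over three relation types.
c_plus = [0.7, 0.2, 0.1]
c_minus = [0.1, 0.6, 0.3]
combined, best = combine_predictions(c_plus, c_minus, omega=0.6)
print(combined)  # [0.46 0.36 0.18]
print(best)      # 0
```

With ω = 0.6 the forward example dominates, so the combined vector still selects the forward example's top relation (index 0).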
4. The convolutional neural network relation classification method combining forward and reverse examples according to claim 1, characterized in that the CNN sentence encoder comprises four layers, which are, in order: an encoding layer, a convolutional layer, a selective attention layer, and a pooling and non-linear layer;
wherein the encoding layer converts the words in a sentence into low-dimensional real-valued vectors; the convolutional layer obtains the high-level features of each word; the selective attention layer finds, via the shortest dependency path, the words most closely related to the semantics of the two entities, represented by a weight matrix; and the pooling and non-linear layer constructs the encoded representation of the sentence.
5. The convolutional neural network relation classification method combining forward and reverse examples according to claim 4, characterized in that the encoded representation of each word produced by the encoding layer comprises word encoding, position encoding, and dependency encoding;
wherein the word encoding specifically comprises: a known sentence x containing n words is expressed as x = [x1, x2, …, xn], where xi denotes the i-th word in the sequence and n is a preset padding/truncation length; the vector representation ei of each word xi is obtained by looking it up in the word-vector table W, i.e., ei = Wxi;
wherein the position encoding specifically comprises: a position feature vector is generated for each word from its distance to each entity; for each word xi, the distances i − i1 and i − i2 to the two entities in the sentence are mapped to the corresponding vectors in the position feature encoding table D as the position encoding; the position feature encoding table is initialized with random values;
wherein the dependency encoding specifically comprises: a dependency direction vector is generated from the distance between a word and its parent node, and a dependency feature vector is generated from the label of the dependency relation between words;
the word encoding, position encoding, and dependency encoding of each word are concatenated as the encoded representation of that word.
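The three encodings described above can be sketched as follows (a minimal numpy illustration; all table sizes, dimensions, distance ranges, and indices are hypothetical, not specified by the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical table sizes: 100-word vocabulary, distances -100..100,
# 20 dependency labels; 50-dim word vectors, 5-dim feature vectors.
W = rng.normal(size=(100, 50))    # word-vector table
D = rng.normal(size=(201, 5))     # position feature encoding table (random init)
DEP = rng.normal(size=(20, 5))    # dependency-label feature table

def encode_word(word_id, dist_e1, dist_e2, dep_label):
    """Concatenate word, position (to both entities), and dependency encodings."""
    e = W[word_id]                # e_i = W x_i (table lookup)
    d1 = D[dist_e1 + 100]         # shift distance i - i1 into table index range
    d2 = D[dist_e2 + 100]         # shift distance i - i2 into table index range
    dep = DEP[dep_label]
    return np.concatenate([e, d1, d2, dep])

vec = encode_word(word_id=7, dist_e1=-2, dist_e2=3, dep_label=4)
print(vec.shape)  # (65,)
```

The per-word vector is the concatenation of all four lookups, so its length is the sum of the individual embedding dimensions (50 + 5 + 5 + 5 = 65 here).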
6. The convolutional neural network relation classification method combining forward and reverse examples according to claim 4, characterized in that the convolutional layer is used to merge all local features, the convolutional layer extracting local features with a sliding window of size w;
specifically, the convolution kernel is a matrix f = [f1, f2, …, fw], and convolution yields the feature sequence s = [s1, s2, …, sn];
Wherein,
s_i = g( Σ_j f_{j+1}^T x_{j+1} + b )
where b is a bias term and g is a non-linear function; different features can be obtained by using different convolution kernels and window sizes.
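A plain sliding-window convolution of this kind can be sketched as follows (a minimal numpy illustration; tanh is assumed for g, all sizes are hypothetical, and no padding is applied here, so the output has n − w + 1 entries rather than n):

```python
import numpy as np

def conv_features(X, kernel, b, w):
    """Score each window of w consecutive word encodings with one kernel,
    then apply the non-linear function g (tanh assumed here)."""
    n, d = X.shape
    s = []
    for i in range(n - w + 1):            # no padding: n - w + 1 positions
        window = X[i:i + w].ravel()       # local features inside the window
        s.append(np.tanh(kernel @ window + b))
    return np.array(s)

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 6))              # 10 words, 6-dim encodings (hypothetical)
kernel = rng.normal(size=6 * 3)           # flattened kernel for window size w = 3
s = conv_features(X, kernel, b=0.1, w=3)
print(s.shape)  # (8,)
```

Each output s_i is one kernel's response to one window; using several kernels and window sizes, as the claim notes, yields several such feature sequences.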
7. The convolutional neural network relation classification method combining forward and reverse examples according to claim 4, characterized in that in the pooling and softmax layer, the most important features are obtained with the max function, so that for each convolution kernel the convolution score is:
pf = max{s}
The pooling scores obtained from all convolution kernels are concatenated into the feature vector z = [p1, p2, …, pm] representing the sentence, where m is the number of convolution kernels;
finally, a non-linear function is applied to the feature vector as the output, and this output is the encoded representation of the input sentence.
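The pooling step can be sketched as follows (a minimal illustration; tanh is assumed for the final non-linearity, and the feature sequences are hypothetical values):

```python
import numpy as np

def sentence_encoding(feature_seqs):
    """p_f = max{s} for each kernel's feature sequence; concatenate the
    pooled scores into z and apply a final non-linearity (tanh assumed)."""
    z = np.array([seq.max() for seq in feature_seqs])
    return np.tanh(z)

# Hypothetical feature sequences from m = 3 convolution kernels.
seqs = [np.array([0.2, 0.9, -0.1]),
        np.array([-0.5, 0.3, 0.4]),
        np.array([1.2, 0.0, 0.7])]
z = sentence_encoding(seqs)
print(z.shape)  # (3,)
```

Max pooling collapses each variable-length feature sequence to a single score, so the sentence encoding z has a fixed length m regardless of sentence length.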
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710354990.2A CN107194422A (en) | 2017-06-19 | 2017-06-19 | A kind of convolutional neural networks relation sorting technique of the forward and reverse example of combination |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107194422A true CN107194422A (en) | 2017-09-22 |
Family
ID=59874194
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710354990.2A Pending CN107194422A (en) | 2017-06-19 | 2017-06-19 | A kind of convolutional neural networks relation sorting technique of the forward and reverse example of combination |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107194422A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106354710A (en) * | 2016-08-18 | 2017-01-25 | 清华大学 | Neural network relation extracting method |
Non-Patent Citations (1)
Title |
---|
LI Bo et al.: "Research on improved convolutional neural network relation classification methods", 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology) * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019085328A1 (en) * | 2017-11-02 | 2019-05-09 | 平安科技(深圳)有限公司 | Enterprise relationship extraction method and device, and storage medium |
CN108182394B (en) * | 2017-12-22 | 2021-02-02 | 浙江大华技术股份有限公司 | Convolutional neural network training method, face recognition method and face recognition device |
CN108182394A (*) | 2017-12-22 | 2018-06-19 | 浙江大华技术股份有限公司 | Convolutional neural network training method, face recognition method and device |
CN108734290A (*) | 2018-05-16 | 2018-11-02 | 湖北工业大学 | Convolutional neural network construction method based on attention mechanism and application thereof |
CN108734290B (*) | 2018-05-16 | 2021-05-18 | 湖北工业大学 | Convolutional neural network construction method based on attention mechanism and application |
CN109284378A (*) | 2018-09-14 | 2019-01-29 | 北京邮电大学 | Knowledge-graph-oriented relation classification method |
CN109493931A (*) | 2018-10-25 | 2019-03-19 | 平安科技(深圳)有限公司 | Patient file encoding method, server and computer-readable storage medium |
CN109493931B (*) | 2018-10-25 | 2024-06-04 | 平安科技(深圳)有限公司 | Medical record file encoding method, server and computer readable storage medium |
CN109582958A (*) | 2018-11-20 | 2019-04-05 | 厦门大学深圳研究院 | Disaster story line construction method and device |
CN109558605B (*) | 2018-12-17 | 2022-06-10 | 北京百度网讯科技有限公司 | Method and device for translating sentences |
CN109558605A (*) | 2018-12-17 | 2019-04-02 | 北京百度网讯科技有限公司 | Method and apparatus for translating sentences |
CN111753081A (*) | 2019-03-28 | 2020-10-09 | 百度(美国)有限责任公司 | Text classification system and method based on deep SKIP-GRAM network |
CN111753081B (*) | 2019-03-28 | 2023-06-09 | 百度(美国)有限责任公司 | System and method for text classification based on deep SKIP-GRAM network |
CN111291556B (*) | 2019-12-17 | 2021-10-26 | 东华大学 | Chinese entity relation extraction method based on character-word feature fusion with entity sense items |
CN111291556A (*) | 2019-12-17 | 2020-06-16 | 东华大学 | Chinese entity relation extraction method based on character-word feature fusion with entity sense items |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107194422A (en) | Convolutional neural network relation classification method combining forward and reverse examples | |
CN107180247A (en) | Relation classifier based on selective-attention convolutional neural networks and method thereof | |
CN108009285B (en) | Forest ecology human-machine interaction method based on natural language processing | |
CN106383816B (en) | Recognition method for place names in Chinese minority areas based on deep learning | |
CN108073711A (en) | Knowledge-graph-based relation extraction method and system | |
CN105631479B (en) | Deep convolutional network image annotation method and device based on imbalanced learning | |
CN107153642A (en) | Analysis method for recognizing the sentiment orientation of text comments based on neural networks | |
CN107526799A (en) | Knowledge graph construction method based on deep learning | |
CN112487143A (en) | Multi-label text classification method based on public-opinion big-data analysis | |
CN110298037A (en) | Text matching recognition method based on convolutional neural networks with an enhanced attention mechanism | |
CN108133038A (en) | Entity-level sentiment classification system and method based on dynamic memory networks | |
CN109635109A (en) | Sentence classification method based on LSTM combining part of speech and multiple attention mechanisms | |
CN109740148A (en) | Text sentiment analysis method combining BiLSTM with an attention mechanism | |
CN107330446A (en) | Optimization method for deep convolutional neural networks oriented to image classification | |
CN107153713A (en) | Overlapping community detection method and system based on inter-node similarity in social networks | |
CN107463609A (en) | Video question answering method using a hierarchical spatio-temporal attention encoder-decoder network mechanism | |
CN107038159A (en) | Neural machine translation method based on unsupervised domain adaptation | |
CN106570148A (en) | Attribute extraction method based on convolutional neural networks | |
CN109857871A (en) | Customer relationship discovery method based on massive social network context data | |
CN110209789A (en) | Multi-modal dialog system and method guided by user attention | |
CN111932026A (en) | Urban traffic pattern mining method based on data fusion and knowledge graph embedding | |
CN109711883A (en) | Internet advertisement click-through-rate estimation method based on U-Net networks | |
CN108052625A (en) | Fine-grained entity classification method | |
CN107992890A (en) | Multi-view classifier based on local features and design method thereof | |
CN108932278A (en) | Human-machine interaction method and system based on semantic frames |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170922 |