CN110334354A - Chinese relation extraction method - Google Patents

Chinese relation extraction method

Info

Publication number
CN110334354A
CN110334354A CN201910626307.5A
Authority
CN
China
Prior art keywords
word
vector
hidden state
meaning
rank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910626307.5A
Other languages
Chinese (zh)
Other versions
CN110334354B (en)
Inventor
丁宁
李自然
郑海涛
刘知远
沈颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Tsinghua University
Original Assignee
Shenzhen Graduate School Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Tsinghua University filed Critical Shenzhen Graduate School Tsinghua University
Priority to CN201910626307.5A priority Critical patent/CN110334354B/en
Publication of CN110334354A publication Critical patent/CN110334354A/en
Application granted granted Critical
Publication of CN110334354B publication Critical patent/CN110334354B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The present invention provides a Chinese relation extraction method comprising the following steps. S1, data preprocessing: multi-granularity pre-training processing is performed on the text of the input data to extract distributed vectors at three levels (character, word and word sense) from the text. S2, feature encoding: with a bidirectional long short-term memory network as the basic framework, hidden state vectors of characters and of words are obtained from the three-level distributed vectors, and the final character-level hidden state vectors are then obtained. S3, relation classification: the final character-level hidden state vectors are learned, and a character-level attention mechanism fuses them into one sentence-level hidden state vector. The method effectively solves the problems of segmentation ambiguity and polysemous-word ambiguity, greatly improves the performance of the model on relation extraction tasks, and improves the accuracy and robustness of Chinese relation extraction.

Description

Chinese relation extraction method
Technical field
The present invention relates to the technical field of computer applications, and in particular to a Chinese relation extraction method.
Background art
Natural language processing is a sub-field of artificial intelligence and an interdisciplinary area of computer science and computational linguistics. Relation extraction is one of the basic tasks of natural language processing: given a sentence and marked entities (usually nouns), its purpose is to accurately identify the relationship between the entities. Relation extraction technology can be used to build large-scale knowledge graphs; a knowledge graph is a semantic network composed of concepts, entities, entity attributes and entity relations, and is a structured representation of the real world. Building large-scale knowledge graphs provides comprehensive, structured external knowledge for artificial intelligence systems and thereby enables more powerful applications.
Traditional relation extraction approaches suffer from certain problems: they usually rely on manually designed features, so the models only run effectively on small, specific data sets, which limits the development of the relation extraction field.
At the same time, because of this dependence on manual features, traditional relation extraction techniques have poor robustness and extensibility, so the models cannot generalize across different data sets and corpora.
In recent years, relation extraction based on deep learning has made great progress, and compared with traditional relation extraction methods these approaches have many advantages. First, thanks to the application of neural networks, the models can automatically learn the semantic features of text, avoiding hand-designed features for specific data sets, reducing labor cost, and achieving better results; such neural network models provide an end-to-end solution that minimizes human involvement. At the same time, neural network models are more robust: for ever-changing natural language, they can learn mappings from different features to the output.
Even deep learning models, however, still face unsolved problems. For a language such as Chinese, which has no natural delimiters, current mainstream approaches work at either the character level or the word level. The former feeds the input sequence into the model character by character, which makes it hard for the model to learn word-level features in the semantic space; the resulting lack of information reduces the accuracy of relation extraction. The latter first segments the input sequence with a word segmentation tool and then feeds it into the model; although word-level information is considered, the external segmentation tool easily produces segmentation ambiguity, so the errors of the external tool propagate through the entire model, limiting the development of the relation extraction task. Moreover, neither character-level nor word-level models account for the polysemy of words: they represent word features with a single word vector, a strategy that cannot handle polysemous-word ambiguity and therefore lowers the ceiling of the model.
Summary of the invention
To solve the segmentation-ambiguity and polysemous-word-ambiguity problems of Chinese relation extraction in the prior art, the present invention provides a Chinese relation extraction method.
To solve the above problems, the technical solution adopted by the present invention is as follows:
A Chinese relation extraction method comprises the following steps. S1, data preprocessing: perform multi-granularity pre-training processing on the text of the input data to extract distributed vectors at three levels (character, word and word sense) from the text. S2, feature encoding: with a bidirectional long short-term memory network as the basic framework, obtain hidden state vectors of characters and of words from the three-level distributed vectors, and then obtain the final character-level hidden state vectors. S3, relation classification: learn the final character-level hidden state vectors and, using a character-level attention mechanism, fuse the character-level hidden state vectors into one sentence-level hidden state vector.
Preferably, extracting the character-level distributed vectors comprises extracting character vectors and position vectors. The character vectors: for the character-level sequence s = {c_1, ..., c_M} of the text of the given input data, with M characters in total, the word2vec method maps each character c_i to a character vector x_i^c ∈ R^{d_c}, where c_i denotes the i-th character, x_i^c is the character vector of the i-th character, R is the real number space, and d_c is the dimension of the character vector. The position vectors express the relative positions p_i^1 and p_i^2 of the character c_i to the two entities P1 and P2, where p_i^1 is calculated as follows:
p_i^1 = i - b_1, if i < b_1; p_i^1 = 0, if b_1 ≤ i ≤ e_1; p_i^1 = i - e_1, if i > e_1
where b_1 and e_1 are the start and end positions of the first entity P1, and p_i^2 is calculated in the same way as p_i^1. p_i^1 and p_i^2 are converted into the corresponding position vectors, which express the position features of the character-level sequence; d_p denotes the dimension of the position vectors.
The final character-level distributed vector is the concatenation of the character vector and the two position vectors, i.e. x_i = [x_i^c; p_i^1; p_i^2]. Then x_i ∈ R^d with d = d_c + 2*d_p, the total dimension after concatenating the character vector and the position vectors.
The character-level sequence of the text of the input data is then expressed as x = {x_1, ..., x_M}.
Preferably, extracting the word-level distributed vectors comprises: for the given character-level sequence s = {c_1, ..., c_M} of the text of the input data and its word-level sequence s = {w_1, ..., w_M}, a word is denoted by its start position b and end position e, i.e. w_{b,e}; the word w_{b,e} is converted into a word-level distributed vector x_{b,e}^w by the word2vec method.
Preferably, the sense set Sense(w_{b,e}) of each word w_{b,e} is obtained from the external semantic knowledge base HowNet, and each sense sen_{b,e}^k in the sense set is converted into a word-sense-level distributed vector x_{b,e,k}^{sen}, i.e. x_{b,e}^{sen} = {x_{b,e,1}^{sen}, ..., x_{b,e,K}^{sen}}, where K is the number of senses of the word w_{b,e}.
Preferably, step S2 comprises: S21: taking the character as the basic unit, the character-level sequence of the text of the input data is fed directly into the bidirectional LSTM network to obtain the hidden state vector of each character. S22: for the character-level sequence of the text of the input data, all sense vectors of every word that ends with each character are obtained through the external semantic knowledge base HowNet; the sense vectors are fed into the bidirectional LSTM network to compute sense-level hidden state vectors, and the hidden state vectors of all senses are fused by weighted summation into the hidden state vector of the word. S23: the weights of the character and of the words are computed with a gate unit, and the hidden state vector of the character and the hidden state vectors of the words are fused by weighted summation into the final hidden state vector of the character.
Preferably, step S21 comprises: for the j-th character of the character-level sequence of the text, the computation of the bidirectional LSTM network is as follows:
i_j = σ(U_i [x_j; h_{j-1}] + b_i)
f_j = σ(U_f [x_j; h_{j-1}] + b_f)
o_j = σ(U_o [x_j; h_{j-1}] + b_o)
ĉ_j = tanh(U_c [x_j; h_{j-1}] + b_c)
c_j = f_j ⊙ c_{j-1} + i_j ⊙ ĉ_j
h_j = o_j ⊙ tanh(c_j)
Here i is the input gate, controlling which information is stored; f is the forget gate, controlling which information is forgotten; o is the output gate, controlling which information is output; c is the cell unit; U and b are the parameters of the bidirectional LSTM network to be learned; and h denotes the hidden state vector, jointly determined by the hidden state of the previous moment and the input data of the current moment.
Preferably, in step S22, a word w_{b,e} beginning with subscript b and ending with subscript e, represented by the vector x_{b,e}^w, is fed into the bidirectional LSTM network, and the cell unit c_{b,e}^w of the word is calculated as follows:
i_{b,e}^w = σ(U_i^w [x_{b,e}^w; h_b] + b_i^w)
f_{b,e}^w = σ(U_f^w [x_{b,e}^w; h_b] + b_f^w)
ĉ_{b,e}^w = tanh(U_c^w [x_{b,e}^w; h_b] + b_c^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b + i_{b,e}^w ⊙ ĉ_{b,e}^w
For the k-th sense of the word w_{b,e}, with representation vector x_{b,e,k}^{sen}, the computation of the sense-level cell unit c_{b,e,k}^{sen} is as follows:
i_{b,e,k}^{sen} = σ(U_i^{sen} [x_{b,e,k}^{sen}; h_b] + b_i^{sen})
f_{b,e,k}^{sen} = σ(U_f^{sen} [x_{b,e,k}^{sen}; h_b] + b_f^{sen})
ĉ_{b,e,k}^{sen} = tanh(U_c^{sen} [x_{b,e,k}^{sen}; h_b] + b_c^{sen})
c_{b,e,k}^{sen} = f_{b,e,k}^{sen} ⊙ c_b + i_{b,e,k}^{sen} ⊙ ĉ_{b,e,k}^{sen}
An additional gate mechanism is introduced to control the contribution of each piece of sense information:
g_{b,e,k} = σ(U^g [x_{b,e}^w; c_{b,e,k}^{sen}] + b^g)
The word cell state fusing the multiple senses is calculated as follows:
c_{b,e}^w = Σ_{k=1}^{K} α_{b,e,k} ⊙ c_{b,e,k}^{sen}, with α_{b,e,k} = exp(g_{b,e,k}) / Σ_{k'=1}^{K} exp(g_{b,e,k'})
All sense cell units can thus be fused into one word cell state c_{b,e}^w.
For the character c_e, the calculation is as follows:
c_e = Σ_{b∈B_e} α_{b,e} ⊙ c_{b,e}^w + α_e ⊙ ĉ_e
where B_e denotes the set of start positions of the words ending at e, and α_{b,e} and α_e are the normalized expressions of the gates, calculated as follows:
α_{b,e} = exp(g_{b,e}) / (exp(i_e) + Σ_{b'∈B_e} exp(g_{b',e})), α_e = exp(i_e) / (exp(i_e) + Σ_{b'∈B_e} exp(g_{b',e})), with g_{b,e} = σ(U^c [x_e^c; c_{b,e}^w] + b^c)
The cell unit corresponding to each character thus fuses word-level and sense-level information, and the final hidden state vector of the character is then obtained:
h_e = o_e ⊙ tanh(c_e)
The final hidden state vectors of the characters are fed into the classifier and synthesized into the corresponding sentence-level feature representation.
Preferably, the sentence-level hidden state vector h* ∈ R^{d_h} is calculated from the character hidden states h as follows:
H = tanh(h)
α = softmax(w^T H)
h* = h α^T
h* is then fed into a softmax classification layer to compute the probability distribution over the classes:
o = W h* + b
p(y|s) = softmax(o)
For T training examples, the entire training process is optimized with the following cross-entropy loss function:
J(θ) = - Σ_{i=1}^{T} log p(y_i | s_i, θ)
where d_h is the dimension of the hidden state variables, M is the length of the input sequence, R is the real number space, T denotes the transpose, w is a parameter to be learned, α is the weight vector of h, W ∈ R^{Y×d_h} is the transfer matrix, b ∈ R^Y is the bias vector, Y denotes the total number of classes, p(y) denotes the predicted probability of a class, and θ denotes all the parameters in the model that need training.
Preferably, a dropout mechanism is used during training: each neuron of the bidirectional LSTM network has a 50% probability of being switched off while training.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the methods above.
The beneficial effects of the invention are as follows: a Chinese relation extraction method is provided in which multi-granularity pre-training processing of the input text extracts distributed vectors at the character, word and word-sense levels, so that semantic features are learned automatically and manual involvement is greatly reduced; the problems of segmentation ambiguity and polysemous-word ambiguity are effectively solved, the performance of the model on relation extraction tasks is greatly improved, and the accuracy and robustness of Chinese relation extraction are improved.
Detailed description of the invention
Fig. 1 is a schematic diagram of the Chinese relation extraction method in an embodiment of the present invention.
Fig. 2 is a schematic flowchart of the Chinese relation extraction method in an embodiment of the present invention.
Specific embodiment
In order to make the technical problems to be solved, the technical solutions and the beneficial effects of the embodiments of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
It should be noted that when an element is described as being "fixed to" or "arranged on" another element, it can be directly or indirectly on that other element. When an element is described as being "connected to" another element, it can be directly or indirectly connected to it. The connection may serve for fixation or for circuit communication.
It should be understood that orientation or position terms such as "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" are based on the orientations or positions shown in the drawings, are used only for convenience and simplicity of description, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation; they are therefore not to be construed as limiting the invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature defined with "first" or "second" may thus explicitly or implicitly include one or more such features. In the description of the embodiments of the present invention, "plurality" means two or more, unless specifically defined otherwise.
Embodiment 1
As shown in Fig. 1, the present invention provides a Chinese relation extraction method comprising the following steps:
S1, data preprocessing: perform multi-granularity pre-training processing on the text of the input data to extract distributed vectors at three levels (character, word and word sense) from the text;
S2, feature encoding: with a bidirectional LSTM network as the basic framework, obtain hidden state vectors from the three-level distributed vectors, and then obtain the final character-level hidden state vectors;
S3, relation classification: learn the final character-level hidden state vectors and, using a character-level attention mechanism, fuse the character-level hidden state vectors into one sentence-level hidden state vector.
The data preprocessing step performs multi-granularity pre-training processing on the text of the input data to extract distributed vectors at the character, word and word-sense levels. Traditional pre-training usually represents a word with only one corresponding word vector, whereas in the present invention a distributed sense vector is generated for each sense of each word.
The feature encoding step implements a multi-path lattice long short-term memory network that efficiently exploits semantic information at multiple levels and learns character-level hidden state variables, which can be regarded as features extracted automatically from the data. The hidden variables learned in the feature encoding step are fed into the relation classification step, where a gated attention mechanism automatically assigns weights to the hidden state sequence and fuses it, filtering out noise during the weighted fusion and retaining meaningful feature information, so that the final classification output yields a more accurate relation class.
The present invention comprises two stages: a training stage and a prediction stage. The training stage defines an initial model whose parameters are randomly initialized; during training, data with relation class labels are continuously fed to the model, which keeps learning from the training data and updating its parameters. Meanwhile, the cross entropy between the model's prediction and the correct answer is used as the loss function to measure the model's prediction quality; when the value of the loss function stabilizes, the model has converged, training ends, and a trained relation extractor is obtained. In the prediction stage, the data to be predicted are fed directly into the trained relation extractor, which outputs the corresponding predicted entity relation.
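The two stages can be sketched in a few lines of PyTorch; this is an illustrative outline only, and model, data and the hyperparameters are assumed placeholders rather than components specified by the patent:

    # Hedged sketch of the training stage (optimize cross entropy until convergence)
    # and the prediction stage (feed new data to the trained extractor).
    import torch
    import torch.nn.functional as F

    def train(model, data, epochs=10, lr=1e-3):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for sent, gold in data:                       # sentences with relation-label tensors
                logits = model(sent)
                loss = F.cross_entropy(logits.unsqueeze(0), gold.unsqueeze(0))
                opt.zero_grad()
                loss.backward()                           # gradient-descent parameter update
                opt.step()

    def predict(model, sent):
        with torch.no_grad():
            return model(sent).argmax().item()            # predicted relation class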
Data preprocessing:
The main purpose of this step is to convert the text of the input data into distributed vectors that a computer can read and operate on; these vectors contain implicit semantic information. At the same time, so that subsequent modules can use the multi-granularity information of characters, words and word senses in the text, this module learns vector representations for all three linguistic granularities.
For character vectors, this technique trains on a large-scale corpus with the common word2vec algorithm to obtain a hidden feature representation of each character. The representation exploits the contextual information around the character in the large corpus and can fully reflect the character's syntactic and semantic information.
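For illustration, character vectors of this kind could be pre-trained with the word2vec implementation in gensim by treating every character as a token; the toy corpus and hyperparameters below are assumptions, not values from the patent:

    # Hedged sketch: pre-training character vectors with word2vec (gensim assumed available).
    from gensim.models import Word2Vec

    # Each "sentence" is a list of characters, so word2vec learns character-level vectors.
    corpus = [list("达尔文研究杜鹃花"), list("杜鹃鸟把蛋下在别的鸟巢里")]

    model = Word2Vec(sentences=corpus, vector_size=100, window=5,
                     min_count=1, sg=1, epochs=10)  # sg=1: skip-gram objective
    char_vec = model.wv["杜"]                        # 100-dimensional character vector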
For word vectors, training is generally the same as for character vectors, using the word2vec algorithm; the only difference is that character vectors are trained with the character as the basic unit, whereas for word vectors the text is first segmented automatically with a segmentation tool and then trained with the word as the basic unit. In this way, however, each word corresponds to only one fixed word vector, ignoring the ambiguity of words; the present invention therefore chooses to build vector representations of word senses rather than of words.
For sense vectors, since one cannot tell from the surface form whether a word is polysemous or which senses it has, the senses are modeled through the external semantic knowledge base HowNet. In HowNet, the multiple senses of each word and their sememes (the minimal units of meaning) are all explicitly annotated by hand; from it, the senses of each word can be obtained and sense vectors can be trained. A word may thus be represented by several sense vectors, which are fed into the subsequent modules so that during training the model can dynamically select the semantics best suited to the current word in the current sentence, helping the model capture deeper semantic information and features in the sentence.
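A minimal sketch of attaching one vector per word sense is shown below; the sense inventory is a hypothetical stand-in for a HowNet lookup (e.g. via the OpenHowNet toolkit), and the random initialization only marks where pre-trained sense vectors would go:

    # Hedged sketch: one distributed vector per word sense.
    import numpy as np

    sense_inventory = {                 # word -> list of its senses (hypothetical entries)
        "杜鹃": ["azalea (flower)", "cuckoo (bird)"],
    }

    rng = np.random.default_rng(0)
    dim = 100
    # In the method these vectors would be pre-trained (e.g. with a skip-gram objective);
    # random initialization is used here only as a placeholder.
    sense_vectors = {(w, k): rng.normal(size=dim)
                     for w, senses in sense_inventory.items()
                     for k, _ in enumerate(senses)}

    print(len([k for k in sense_vectors if k[0] == "杜鹃"]))  # 2 senses -> 2 vectors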
Feature encoding:
The feature encoding step implements a neural network structure that can efficiently exploit multi-granularity features. Its basic framework is the bidirectional long short-term memory network (LSTM). Compared with the traditional recurrent neural network (RNN) model, an LSTM can handle contextual information more flexibly and effectively, retaining the important information in the input and forgetting the invalid information, and it avoids the vanishing-gradient and exploding-gradient problems that deep neural networks may encounter. A traditional LSTM model, however, cannot solve the segmentation-ambiguity and polysemous-word-ambiguity problems of Chinese relation extraction, so the present invention makes a series of improvements.
First, to avoid the error propagation of segmentation tools, the present invention takes the character as the basic unit, regards each sentence directly as a character-level sequence, and feeds it into bidirectional LSTM units to obtain its hidden state vectors. Then, in order to also consider word-level information during encoding, for each character in the sentence the present invention adds every word in the sentence that ends with that character into the LSTM computation. For example, in the sentence "Darwin studies all cuckoos", for the character that ends the word "cuckoo", "cuckoo" is a word in the sentence ending with that character. All words ending with the current character are then fed into another bidirectional LSTM unit to compute word-level hidden states. Finally, a gate unit computes the weights of the character and the words, and the hidden states of the character and the words are merged by weighted summation into the final hidden state vector of the current character, which thus contains character-level and word-level information simultaneously.
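The lattice construction step described here (collecting, for every character position, all dictionary words that end at that position) can be sketched as follows; the small lexicon and sentence are illustrative assumptions:

    # Hedged sketch of lattice construction: for each character position e,
    # collect all lexicon words that end exactly at e.
    lexicon = {"杜鹃", "杜鹃花", "研究"}
    sentence = "达尔文研究杜鹃花"

    words_ending_at = {e: [] for e in range(len(sentence))}
    max_len = max(len(w) for w in lexicon)
    for e in range(len(sentence)):
        for b in range(max(0, e - max_len + 1), e + 1):
            if sentence[b:e + 1] in lexicon:
                words_ending_at[e].append((b, e, sentence[b:e + 1]))

    # "研究" ends at position 4, "杜鹃" at position 6, "杜鹃花" at position 7.
    print(words_ending_at)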
Although the above method can effectively avoid the influence of segmentation errors by combining character and word information, it does not consider the presence of polysemous words in the sentence. In the example above, "cuckoo" (杜鹃) is exactly such a polysemous word, with two completely different senses, "azalea" and "cuckoo bird". The present invention therefore goes further and incorporates the senses of each word into the computation of the hidden state. Specifically, for each word ending with the current character, HowNet is first queried for all sense vectors of the word; these sense vectors, instead of a single word vector, are fed into a bidirectional LSTM unit to compute sense-level hidden states, and the states of all senses are fused by weighted summation into the hidden state of the word. Compared with obtaining the word hidden state directly from one word vector as before, this method fuses dynamically, selecting the most suitable sense to build the word hidden state. Once the word hidden states are obtained, they are merged with the character hidden states as described above, yielding the final hidden state vector of the current character.
Relation classification:
This step feeds the learned sentence feature representation into a classifier and obtains the predicted relation class label. In the previous module, the encoder has learned a feature representation (hidden state vector) for each character, but since relation extraction is performed per sentence, all character-level feature representations need to be fused into a corresponding sentence feature representation. The present invention introduces a gated attention mechanism that automatically assigns a weight to the feature representation of each character and then computes a weighted sum over all characters, yielding the final sentence feature representation. The intuition is that within a sentence the importance of each character differs: noise characters and common function words should be assigned smaller weights, while characters in keywords such as entity words and verbs should receive higher attention, so the fused sentence representation is more accurate.
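A hedged PyTorch sketch of this character-level attention fusion follows; the dimensions and random tensors are assumptions standing in for real model states:

    # Hedged sketch of the gated attention fusion described above.
    import torch

    d_h, M = 128, 20                      # hidden size, sequence length (illustrative)
    h = torch.randn(d_h, M)               # hidden state of every character, one column each
    w = torch.randn(d_h)                  # learned attention parameter

    H = torch.tanh(h)                     # bound values to [-1, 1]
    alpha = torch.softmax(w @ H, dim=-1)  # one weight per character, summing to 1
    s_repr = h @ alpha                    # weighted sum -> sentence vector, shape (d_h,)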
After fusion yields the sentence feature vector, the vector is fed into a fully connected layer that maps it to a new vector whose dimension equals the total number of relation classes. The new vector is then normalized by a softmax function so that every dimension holds a probability value in the interval from 0 to 1, representing the probability that the sentence is classified into the relation class corresponding to that dimension. For the training stage, the loss function of the model is defined as the cross entropy between the normalized vector and the indicator (one-hot) vector of the current correct relation class, and the parameters of the model are updated by gradient descent; for the prediction stage, the predicted relation class is the class corresponding to the dimension with the largest probability value in the normalized vector.
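The classification layer and its two uses (cross-entropy loss at training time, arg-max prediction at test time) can be sketched as follows; the class count and tensors are illustrative assumptions:

    # Hedged sketch of the fully connected layer, softmax normalization, loss and prediction.
    import torch
    import torch.nn.functional as F

    Y, d_h = 12, 128                        # number of relation classes (assumed), hidden size
    W = torch.randn(Y, d_h, requires_grad=True)
    b = torch.zeros(Y, requires_grad=True)

    s_repr = torch.randn(d_h)               # sentence vector from the attention step
    o = W @ s_repr + b                      # logits, one per relation class
    gold = torch.tensor(3)                  # index of the correct relation class

    loss = F.cross_entropy(o.unsqueeze(0), gold.unsqueeze(0))  # softmax + cross entropy
    loss.backward()                         # gradients for gradient-descent updates

    pred = torch.argmax(torch.softmax(o, dim=-1))  # prediction: most probable class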
Embodiment 2
As shown in Fig. 2, this embodiment uses the Chinese relation extraction method provided by the invention. The task is defined as: given a sentence s and two specified entities in the sentence, judge what relation the two entities have in sentence s. For example, given the sentence "Darwin studies all cuckoos" and the designated entities "Darwin" and "cuckoo", the goal is to judge what relation "Darwin" and "cuckoo" have in that sentence.
Step 1. Data preprocessing:
1.1 Character-level representation
For a given input sequence s = {c_1, ..., c_M} with M characters in total, the present invention uses the word2vec method to map each character c_i (the i-th character) to a character vector x_i^c ∈ R^{d_c}, where R is the real number space and d_c is the dimension of the character vector. Besides character vectors, this technique also uses position vectors to express the relative position of a character to the two entities. In particular, for the i-th character c_i, its relative positions to the two specified entities can be expressed as p_i^1 and p_i^2, where p_i^1 is calculated as follows:
p_i^1 = i - b_1, if i < b_1; p_i^1 = 0, if b_1 ≤ i ≤ e_1; p_i^1 = i - e_1, if i > e_1
Here b_1 and e_1 are the start and end positions of the first entity, and p_i^2 is calculated in almost the same way. In this way, p_i^1 and p_i^2 are converted into the corresponding position vectors, which express the position features of the character-level sequence and have dimension d_p.
In an embodiment of the present invention, the input is defined as one sentence plus two specified entities within it; in practice, if a sentence contains multiple entities, relation extraction is carried out for every pair of entities, and the result for each input is the relation of the two currently specified entities in the sentence.
Therefore, for the i-th input character c_i, its final representation is the concatenation of the character vector and the two position vectors, x_i = [x_i^c; p_i^1; p_i^2], so x_i ∈ R^d with d = d_c + 2*d_p, the total dimension after concatenation. The character representation of the input sequence thus becomes x = {x_1, ..., x_M} and is fed into the subsequent encoding step.
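A hedged sketch of assembling x_i = [x_i^c; p_i^1; p_i^2] with embedding tables follows; vocabulary sizes, dimensions and entity spans are assumptions for illustration:

    # Hedged sketch: character vector concatenated with two relative-position vectors.
    import torch
    import torch.nn as nn

    M, d_c, d_p, max_rel = 8, 100, 5, 60
    b1, e1, b2, e2 = 0, 2, 5, 6            # spans of the two entities (assumed)

    def rel_pos(i, b, e):
        # relative position of character i to an entity spanning [b, e]
        return i - b if i < b else (0 if i <= e else i - e)

    char_emb = nn.Embedding(5000, d_c)      # character vocabulary (size assumed)
    pos_emb = nn.Embedding(2 * max_rel + 1, d_p)

    char_ids = torch.arange(M)              # placeholder character ids
    p1 = torch.tensor([rel_pos(i, b1, e1) + max_rel for i in range(M)])  # shift to >= 0
    p2 = torch.tensor([rel_pos(i, b2, e2) + max_rel for i in range(M)])

    x = torch.cat([char_emb(char_ids), pos_emb(p1), pos_emb(p2)], dim=-1)
    print(x.shape)                          # (M, d_c + 2*d_p) = (8, 110)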
1.2 Word-level representation:
Although the input of the model is a character-level sequence, in order to obtain word-level features the present invention carries out word-level representation learning for all candidate words that may appear in the sentence. An input sequence s can be expressed not only as the character-level sequence s = {c_1, ..., c_M}, but also as a word-level sequence s = {w_1, ..., w_M}. In this section the present invention denotes a word by its start position b and end position e, i.e. w_{b,e}. Again through the word2vec method, the word w_{b,e} can be converted into a word vector x_{b,e}^w.
For each word, the present invention obtains its sense set Sense(w_{b,e}) from HowNet and then, by the skip-gram method, converts each sense in the set, sen_{b,e}^k, into a sense vector x_{b,e,k}^{sen} that individually represents that sense. A word may therefore be represented by multiple sense vectors (assuming it has K senses), i.e. x_{b,e}^{sen} = {x_{b,e,1}^{sen}, ..., x_{b,e,K}^{sen}}.
The sense vector representations are used during the training of the encoder module, allowing the model to exploit sense information dynamically.
Step 2. Feature encoding:
A common character-level LSTM unit consists mainly of the following three gates: an input gate i controls which information is stored; a forget gate f controls which information is forgotten; an output gate o controls which information is output. For the j-th character, the computation of the LSTM unit is as follows:
i_j = σ(U_i [x_j; h_{j-1}] + b_i)
f_j = σ(U_f [x_j; h_{j-1}] + b_f)
o_j = σ(U_o [x_j; h_{j-1}] + b_o)
ĉ_j = tanh(U_c [x_j; h_{j-1}] + b_c)
c_j = f_j ⊙ c_{j-1} + i_j ⊙ ĉ_j
h_j = o_j ⊙ tanh(c_j)
Here c denotes the cell unit, which stores the information of the sequence from its beginning up to the current position; h denotes the hidden state vector, jointly determined by the hidden state of the previous moment and the input of the current moment; U and b are the parameters to be learned in the LSTM.
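Written out as code, one such LSTM step looks as follows; this is an illustrative sketch with assumed dimensions, not the patented implementation itself:

    # Hedged sketch of one character-level LSTM step, gate by gate.
    import torch

    d_in, d_h = 110, 128
    U = torch.randn(4 * d_h, d_in + d_h) * 0.01   # stacked parameters for the four gates
    b = torch.zeros(4 * d_h)

    def lstm_step(x_j, h_prev, c_prev):
        z = U @ torch.cat([x_j, h_prev]) + b
        i, f, o, g = z.chunk(4)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)  # input/forget/output
        c = f * c_prev + i * torch.tanh(g)        # cell keeps memory up to position j
        h = o * torch.tanh(c)                     # hidden state from last state + current input
        return h, c

    h, c = lstm_step(torch.randn(d_in), torch.zeros(d_h), torch.zeros(d_h))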
For a word w_{b,e} beginning with subscript b and ending with subscript e, its vector representation is x_{b,e}^w. In the lattice LSTM, the cell unit of a word, c_{b,e}^w, is calculated as follows:
i_{b,e}^w = σ(U_i^w [x_{b,e}^w; h_b] + b_i^w)
f_{b,e}^w = σ(U_f^w [x_{b,e}^w; h_b] + b_f^w)
ĉ_{b,e}^w = tanh(U_c^w [x_{b,e}^w; h_b] + b_c^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b + i_{b,e}^w ⊙ ĉ_{b,e}^w
That is, for a character c_b, the present invention first finds all words that begin with it and match the external dictionary, and then computes the cell units c^w of these words. On this basis, the present invention extends the computation to the sense level, allocating an additional LSTM unit to every sense of every word. As mentioned in the representation learning module, the k-th sense of the word w_{b,e} is represented by the vector x_{b,e,k}^{sen}; the computation of the sense-level cell unit c_{b,e,k}^{sen} is therefore as follows:
i_{b,e,k}^{sen} = σ(U_i^{sen} [x_{b,e,k}^{sen}; h_b] + b_i^{sen})
f_{b,e,k}^{sen} = σ(U_f^{sen} [x_{b,e,k}^{sen}; h_b] + b_f^{sen})
ĉ_{b,e,k}^{sen} = tanh(U_c^{sen} [x_{b,e,k}^{sen}; h_b] + b_c^{sen})
c_{b,e,k}^{sen} = f_{b,e,k}^{sen} ⊙ c_b + i_{b,e,k}^{sen} ⊙ ĉ_{b,e,k}^{sen}
All sense cell units can then be fused into a single word cell unit, so that the model takes the ambiguity of words into account. The word cell unit after fusing the multiple senses is c_{b,e}^w. To compute c_{b,e}^w, an additional gate mechanism needs to be introduced to control the contribution of each piece of sense information:
g_{b,e,k} = σ(U^g [x_{b,e}^w; c_{b,e,k}^{sen}] + b^g)
The word cell state fusing the multiple senses is calculated as follows:
c_{b,e}^w = Σ_{k=1}^{K} α_{b,e,k} ⊙ c_{b,e,k}^{sen}, with α_{b,e,k} = exp(g_{b,e,k}) / Σ_{k'=1}^{K} exp(g_{b,e,k'})
Through the above computation, for every word w_{b,e} this model can calculate a cell state c_{b,e}^w that fuses multi-sense information.
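A hedged sketch of this gated multi-sense fusion follows; the shapes and the exact gate parameterization are assumptions consistent with the equations above:

    # Hedged sketch: merge K sense-level cell states into one word cell state
    # using softmax-normalized gates.
    import torch

    K, d_h = 3, 128
    sense_cells = torch.randn(K, d_h)             # c^sen_{b,e,k} for each sense of the word
    word_vec = torch.randn(d_h)                   # representation of the word itself
    Ug = torch.randn(d_h, 2 * d_h) * 0.01
    bg = torch.zeros(d_h)

    gates = torch.sigmoid(
        torch.stack([Ug @ torch.cat([word_vec, c]) + bg for c in sense_cells]))
    alpha = torch.softmax(gates, dim=0)           # normalize gate values across the K senses
    word_cell = (alpha * sense_cells).sum(dim=0)  # fused word cell state c^w_{b,e}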
Then, for the character c_e, the present invention merges the information of every word ending with c_e and obtains a completely new character-level cell state. The calculation is as follows:
c_e = Σ_{b∈B_e} α_{b,e} ⊙ c_{b,e}^w + α_e ⊙ ĉ_e
where B_e denotes the set of start positions of the words ending at e.
Here α_{b,e} and α_e are the normalized expressions of the gates, calculated as follows:
α_{b,e} = exp(g_{b,e}) / (exp(i_e) + Σ_{b'∈B_e} exp(g_{b',e})), α_e = exp(i_e) / (exp(i_e) + Σ_{b'∈B_e} exp(g_{b',e}))
with the word gate g_{b,e} = σ(U^c [x_e^c; c_{b,e}^w] + b^c) and i_e the input gate of the character c_e.
Through this computation, the cell unit corresponding to each character fuses word-level and sense-level information, and the final hidden state vector of the character is then computed:
h_e = o_e ⊙ tanh(c_e)
The final hidden state vectors of the characters are fed into the classifier, synthesized into a sentence-level representation, and the answer is then given.
Step 3. Relation classification:
After the hidden state vector h of each character has been learned, the present invention adopts a character-level attention mechanism to fuse the character-level hidden states into one sentence-level hidden state h*.
Here d_h is the dimension of the hidden state variables, M is the length of the input sequence, and h ∈ R^{d_h×M}; h* is calculated as a weighted sum with automatically assigned weights:
H = tanh(h)
α = softmax(w^T H)
h* = h α^T
Here T denotes the transpose, w is a parameter to be learned, α is the weight vector of h, and H is the value of h after the tanh transformation; by mapping every dimension of h into the range [-1, 1], the tanh function effectively alleviates problems such as gradient explosion during training.
h* is then fed into a softmax classification layer to compute the probability distribution over the classes:
o = W h* + b
p(y|s) = softmax(o)
Here W ∈ R^{Y×d_h} is the transfer matrix and b ∈ R^Y is the bias vector; Y denotes the total number of classes, and p(y) denotes the predicted probability of a class.
For T training examples, the entire training process is optimized with the following cross-entropy loss function:
J(θ) = - Σ_{i=1}^{T} log p(y_i | s_i, θ)
Here θ denotes all the parameters in the model that need training. Meanwhile, to prevent overfitting, the present invention also uses a dropout mechanism during training: each neuron has a 50% probability of being switched off (in every training step a random half of the hidden-layer nodes do not participate in the computation), whereas in the test stage all trained neurons participate in the computation.
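For illustration, the described dropout behaviour matches the standard PyTorch module with p=0.5:

    # Hedged sketch: dropout is active in training mode and disabled in eval mode.
    import torch
    import torch.nn as nn

    drop = nn.Dropout(p=0.5)      # each neuron is zeroed with 50% probability during training
    x = torch.ones(10)

    drop.train()
    print(drop(x))                # roughly half the entries are 0 (survivors scaled by 2)
    drop.eval()
    print(drop(x))                # identity at test time: all neurons participate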
The present invention may implement all or part of the processes of the above method embodiments through a computer program that instructs the relevant hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor it implements the steps of each of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The above content is a further detailed description of the present invention in combination with specific preferred embodiments, and it cannot be concluded that the specific implementation of the invention is limited to these descriptions. For those of ordinary skill in the art to which the present invention belongs, several equivalent substitutions or obvious modifications with identical performance or use can also be made without departing from the inventive concept, and all of them should be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A Chinese relation extraction method, characterized by comprising the following steps:
S1, data preprocessing: performing multi-granularity pre-training processing on the text of input data to extract the distributed vectors of three levels, namely character, word and word sense, from the text;
S2, feature encoding: with a bidirectional long short-term memory network as the basic framework, obtaining the hidden state vectors of characters and the hidden state vectors of words from the three-level distributed vectors, and then obtaining the final character-level hidden state vectors;
S3, relation classification: learning the final character-level hidden state vectors, and fusing the character-level hidden state vectors into one sentence-level hidden state vector by using a character-level attention mechanism.
2. The Chinese relation extraction method according to claim 1, characterized in that extracting the character-level distributed vectors comprises extracting character vectors and position vectors;
the character vectors: for the character-level sequence s = {c_1, ..., c_M} of the text of the given input data, with M characters in total, the word2vec method maps each character c_i to a character vector x_i^c ∈ R^{d_c}, where c_i denotes the i-th character, x_i^c is the character vector of the i-th character, R is the real number space, and d_c is the dimension of the character vector;
the position vectors express the relative positions p_i^1 and p_i^2 of the character c_i to the two entities P1 and P2, where p_i^1 is calculated as follows:
p_i^1 = i - b_1, if i < b_1; p_i^1 = 0, if b_1 ≤ i ≤ e_1; p_i^1 = i - e_1, if i > e_1
where b_1 and e_1 are the start and end positions of the first entity P1, p_i^2 is calculated in the same way as p_i^1, and p_i^1 and p_i^2 are converted into the corresponding position vectors, which express the position features of the character-level sequence and have dimension d_p;
the final character-level distributed vector is the concatenation of the character vector and the two position vectors, i.e. x_i = [x_i^c; p_i^1; p_i^2];
then x_i ∈ R^d with d = d_c + 2*d_p, the total dimension after concatenating the character vector and the position vectors;
the character-level sequence of the text of the input data is then expressed as x = {x_1, ..., x_M}.
3. The Chinese relation extraction method according to claim 1, characterized in that extracting the word-level distributed vectors comprises:
for the character-level sequence s = {c_1, ..., c_M} and the word-level sequence s = {w_1, ..., w_M} of the text of the given input data, denoting a word by its start position b and end position e, i.e. w_{b,e}, and converting the word w_{b,e} into a word-level distributed vector x_{b,e}^w by the word2vec method.
4. The Chinese relation extraction method according to claim 3, characterized in that the sense set Sense(w_{b,e}) of each word w_{b,e} is obtained from the external semantic knowledge base HowNet, and each sense sen_{b,e}^k in the sense set is converted into a word-sense-level distributed vector x_{b,e,k}^{sen}, i.e. x_{b,e}^{sen} = {x_{b,e,1}^{sen}, ..., x_{b,e,K}^{sen}},
where K is the number of senses of the word w_{b,e}.
5. The Chinese relation extraction method according to claim 1, characterized in that step S2 comprises:
S21: taking the character as the basic unit, feeding the character-level sequence of the text of the input data directly into the bidirectional long short-term memory network to obtain the hidden state vector of each character;
S22: for the character-level sequence of the text of the input data, obtaining through the external semantic knowledge base HowNet all sense vectors of every word that ends with each character, feeding the sense vectors into the bidirectional long short-term memory network to compute sense-level hidden state vectors, and fusing the hidden state vectors of all senses by weighted summation to obtain the hidden state vector of the word;
S23: computing the weights of the character and of the words with a gate unit, and fusing the hidden state vector of the character and the hidden state vectors of the words by weighted summation into the final hidden state vector of the character.
6. The Chinese relation extraction method according to claim 5, characterized in that step S21 comprises: for the j-th character of the character-level sequence of the text, the computation of the bidirectional long short-term memory network is as follows:
i_j = σ(U_i [x_j; h_{j-1}] + b_i)
f_j = σ(U_f [x_j; h_{j-1}] + b_f)
o_j = σ(U_o [x_j; h_{j-1}] + b_o)
ĉ_j = tanh(U_c [x_j; h_{j-1}] + b_c)
c_j = f_j ⊙ c_{j-1} + i_j ⊙ ĉ_j
h_j = o_j ⊙ tanh(c_j)
where i is the input gate, controlling which information is stored; f is the forget gate, controlling which information is forgotten; o is the output gate, controlling which information is output; c is the cell unit; U and b are the parameters to be learned in the bidirectional long short-term memory network; and h denotes the hidden state vector, jointly determined by the hidden state of the previous moment and the input data of the current moment.
7. The Chinese relation extraction method according to claim 6, characterized in that, in step S22, a word w_{b,e} beginning with subscript b and ending with subscript e, represented by the vector x_{b,e}^w, is fed into the bidirectional long short-term memory network, and the cell unit c_{b,e}^w of the word is calculated as follows:
i_{b,e}^w = σ(U_i^w [x_{b,e}^w; h_b] + b_i^w)
f_{b,e}^w = σ(U_f^w [x_{b,e}^w; h_b] + b_f^w)
ĉ_{b,e}^w = tanh(U_c^w [x_{b,e}^w; h_b] + b_c^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b + i_{b,e}^w ⊙ ĉ_{b,e}^w
for the k-th sense of the word w_{b,e}, with representation vector x_{b,e,k}^{sen}, the computation of the sense-level cell unit c_{b,e,k}^{sen} is as follows:
i_{b,e,k}^{sen} = σ(U_i^{sen} [x_{b,e,k}^{sen}; h_b] + b_i^{sen})
f_{b,e,k}^{sen} = σ(U_f^{sen} [x_{b,e,k}^{sen}; h_b] + b_f^{sen})
ĉ_{b,e,k}^{sen} = tanh(U_c^{sen} [x_{b,e,k}^{sen}; h_b] + b_c^{sen})
c_{b,e,k}^{sen} = f_{b,e,k}^{sen} ⊙ c_b + i_{b,e,k}^{sen} ⊙ ĉ_{b,e,k}^{sen}
an additional gate mechanism is introduced to control the contribution of each piece of sense information:
g_{b,e,k} = σ(U^g [x_{b,e}^w; c_{b,e,k}^{sen}] + b^g)
the word cell state fusing the multiple senses is calculated as follows:
c_{b,e}^w = Σ_{k=1}^{K} α_{b,e,k} ⊙ c_{b,e,k}^{sen}, with α_{b,e,k} = exp(g_{b,e,k}) / Σ_{k'=1}^{K} exp(g_{b,e,k'})
all sense cell units can thus be fused into one word cell state c_{b,e}^w;
for the character c_e, the calculation is as follows:
c_e = Σ_{b∈B_e} α_{b,e} ⊙ c_{b,e}^w + α_e ⊙ ĉ_e
where B_e denotes the set of start positions of the words ending at e, and α_{b,e} and α_e are the normalized expressions of the gates, calculated as follows:
α_{b,e} = exp(g_{b,e}) / (exp(i_e) + Σ_{b'∈B_e} exp(g_{b',e})), α_e = exp(i_e) / (exp(i_e) + Σ_{b'∈B_e} exp(g_{b',e})), with g_{b,e} = σ(U^c [x_e^c; c_{b,e}^w] + b^c)
the cell unit corresponding to each character thus fuses word-level and sense-level information, and the final hidden state vector of the character is then obtained:
h_e = o_e ⊙ tanh(c_e)
the final hidden state vectors of the characters are fed into the classifier and synthesized into the corresponding sentence-level feature representation.
8. The Chinese relation extraction method according to claim 7, characterized in that the sentence-level hidden state vector h* ∈ R^{d_h} is calculated as follows:
H = tanh(h)
α = softmax(w^T H)
h* = h α^T
h* is then fed into a softmax classification layer to compute the probability distribution over the classes:
o = W h* + b
p(y|s) = softmax(o)
for T training examples, the entire training process is optimized with the following cross-entropy loss function:
J(θ) = - Σ_{i=1}^{T} log p(y_i | s_i, θ)
where d_h is the dimension of the hidden state variables, M is the length of the input sequence, R is the real number space, T denotes the transpose, w is a parameter to be learned, α is the weight vector of h, W ∈ R^{Y×d_h} is the transfer matrix, b ∈ R^Y is the bias vector, Y denotes the total number of classes, p(y) denotes the predicted probability of a class, and θ denotes all the parameters in the model that need training.
9. The Chinese relation extraction method according to claim 8, characterized in that a dropout mechanism is used during training: each neuron of the bidirectional long short-term memory network has a 50% probability of being switched off while training.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 9 are implemented.
CN201910626307.5A 2019-07-11 2019-07-11 Chinese relation extraction method Active CN110334354B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910626307.5A CN110334354B (en) 2019-07-11 2019-07-11 Chinese relation extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910626307.5A CN110334354B (en) 2019-07-11 2019-07-11 Chinese relation extraction method

Publications (2)

Publication Number Publication Date
CN110334354A true CN110334354A (en) 2019-10-15
CN110334354B CN110334354B (en) 2022-12-09

Family

ID=68146526

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910626307.5A Active CN110334354B (en) 2019-07-11 2019-07-11 Chinese relation extraction method

Country Status (1)

Country Link
CN (1) CN110334354B (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111061843A (en) * 2019-12-26 2020-04-24 武汉大学 Knowledge graph guided false news detection method
CN111160017A (en) * 2019-12-12 2020-05-15 北京文思海辉金信软件有限公司 Keyword extraction method, phonetics scoring method and phonetics recommendation method
CN111274794A (en) * 2020-01-19 2020-06-12 浙江大学 Synonym expansion method based on transmission
CN111274394A (en) * 2020-01-16 2020-06-12 重庆邮电大学 Method, device and equipment for extracting entity relationship and storage medium
CN111291556A (en) * 2019-12-17 2020-06-16 东华大学 Chinese entity relation extraction method based on character and word feature fusion of entity meaning item
CN111428505A (en) * 2020-01-17 2020-07-17 北京理工大学 Entity relation extraction method fusing trigger word recognition features
CN111680510A (en) * 2020-07-07 2020-09-18 腾讯科技(深圳)有限公司 Text processing method and device, computer equipment and storage medium
CN111783418A (en) * 2020-06-09 2020-10-16 北京北大软件工程股份有限公司 Chinese meaning representation learning method and device
CN111859978A (en) * 2020-06-11 2020-10-30 南京邮电大学 Emotion text generation method based on deep learning
CN112015891A (en) * 2020-07-17 2020-12-01 山东师范大学 Method and system for classifying messages of network inquiry platform based on deep neural network
CN112380872A (en) * 2020-11-27 2021-02-19 深圳市慧择时代科技有限公司 Target entity emotional tendency determination method and device
CN112560487A (en) * 2020-12-04 2021-03-26 中国电子科技集团公司第十五研究所 Entity relationship extraction method and system based on domestic equipment
CN112883194A (en) * 2021-04-06 2021-06-01 安徽科大讯飞医疗信息技术有限公司 Symptom information extraction method, device, equipment and storage medium
CN112883153A (en) * 2021-01-28 2021-06-01 北京联合大学 Information-enhanced BERT-based relationship classification method and device
CN112948535A (en) * 2019-12-10 2021-06-11 复旦大学 Method and device for extracting knowledge triples of text and storage medium
CN113051371A (en) * 2021-04-12 2021-06-29 平安国际智慧城市科技股份有限公司 Chinese machine reading understanding method and device, electronic equipment and storage medium
CN113239663A (en) * 2021-03-23 2021-08-10 国家计算机网络与信息安全管理中心 Multi-meaning word Chinese entity relation identification method based on Hopkinson
CN113326676A (en) * 2021-04-19 2021-08-31 上海快确信息科技有限公司 Deep learning model device for structuring financial text into form
CN113392648A (en) * 2021-06-02 2021-09-14 北京三快在线科技有限公司 Entity relationship acquisition method and device
CN114372125A (en) * 2021-12-03 2022-04-19 北京北明数科信息技术有限公司 Government affair knowledge base construction method, system, equipment and medium based on knowledge graph
CN115034302A (en) * 2022-06-07 2022-09-09 四川大学 Relation extraction method, device, equipment and medium for optimizing information fusion strategy
CN115169326A (en) * 2022-04-15 2022-10-11 山西长河科技股份有限公司 Chinese relation extraction method, device, terminal and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080275694A1 (en) * 2007-05-04 2008-11-06 Expert System S.P.A. Method and system for automatically extracting relations between concepts included in text
CN108733792A (en) * 2018-05-14 2018-11-02 北京大学深圳研究生院 A kind of entity relation extraction method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080275694A1 (en) * 2007-05-04 2008-11-06 Expert System S.P.A. Method and system for automatically extracting relations between concepts included in text
CN108733792A (en) * 2018-05-14 2018-11-02 北京大学深圳研究生院 A kind of entity relation extraction method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PENG ZHOU ET AL.: "Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification", Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics *
SUN Ziyang (孙紫阳) et al.: "Chinese entity relation extraction method based on deep learning", Computer Engineering (计算机工程) *

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112948535A (en) * 2019-12-10 2021-06-11 复旦大学 Method and device for extracting knowledge triples of text and storage medium
CN112948535B (en) * 2019-12-10 2022-06-14 复旦大学 Method and device for extracting knowledge triples of text and storage medium
CN111160017A (en) * 2019-12-12 2020-05-15 北京文思海辉金信软件有限公司 Keyword extraction method, phonetics scoring method and phonetics recommendation method
CN111291556A (en) * 2019-12-17 2020-06-16 东华大学 Chinese entity relation extraction method based on character and word feature fusion of entity meaning item
CN111291556B (en) * 2019-12-17 2021-10-26 东华大学 Chinese entity relation extraction method based on character and word feature fusion of entity meaning item
CN111061843B (en) * 2019-12-26 2023-08-25 武汉大学 Knowledge-graph-guided false news detection method
CN111061843A (en) * 2019-12-26 2020-04-24 武汉大学 Knowledge graph guided false news detection method
CN111274394A (en) * 2020-01-16 2020-06-12 重庆邮电大学 Method, device and equipment for extracting entity relationship and storage medium
CN111428505A (en) * 2020-01-17 2020-07-17 北京理工大学 Entity relation extraction method fusing trigger word recognition features
CN111274794A (en) * 2020-01-19 2020-06-12 浙江大学 Synonym expansion method based on transmission
CN111274794B (en) * 2020-01-19 2022-03-18 浙江大学 Synonym expansion method based on transmission
CN111783418B (en) * 2020-06-09 2024-04-05 北京北大软件工程股份有限公司 Chinese word meaning representation learning method and device
CN111783418A (en) * 2020-06-09 2020-10-16 北京北大软件工程股份有限公司 Chinese meaning representation learning method and device
CN111859978B (en) * 2020-06-11 2023-06-20 南京邮电大学 Deep learning-based emotion text generation method
CN111859978A (en) * 2020-06-11 2020-10-30 南京邮电大学 Emotion text generation method based on deep learning
CN111680510A (en) * 2020-07-07 2020-09-18 腾讯科技(深圳)有限公司 Text processing method and device, computer equipment and storage medium
CN112015891A (en) * 2020-07-17 2020-12-01 山东师范大学 Method and system for classifying messages of network inquiry platform based on deep neural network
CN112380872A (en) * 2020-11-27 2021-02-19 深圳市慧择时代科技有限公司 Target entity emotional tendency determination method and device
CN112380872B (en) * 2020-11-27 2023-11-24 深圳市慧择时代科技有限公司 Method and device for determining emotion tendencies of target entity
CN112560487A (en) * 2020-12-04 2021-03-26 中国电子科技集团公司第十五研究所 Entity relationship extraction method and system based on domestic equipment
CN112883153B (en) * 2021-01-28 2023-06-23 北京联合大学 Relationship classification method and device based on information enhancement BERT
CN112883153A (en) * 2021-01-28 2021-06-01 北京联合大学 Information-enhanced BERT-based relationship classification method and device
CN113239663A (en) * 2021-03-23 2021-08-10 国家计算机网络与信息安全管理中心 Multi-meaning word Chinese entity relation identification method based on Hopkinson
CN113239663B (en) * 2021-03-23 2022-07-12 国家计算机网络与信息安全管理中心 Multi-meaning word Chinese entity relation identification method based on Hopkinson
CN112883194B (en) * 2021-04-06 2024-02-20 讯飞医疗科技股份有限公司 Symptom information extraction method, device, equipment and storage medium
CN112883194A (en) * 2021-04-06 2021-06-01 安徽科大讯飞医疗信息技术有限公司 Symptom information extraction method, device, equipment and storage medium
CN113051371A (en) * 2021-04-12 2021-06-29 平安国际智慧城市科技股份有限公司 Chinese machine reading understanding method and device, electronic equipment and storage medium
CN113326676A (en) * 2021-04-19 2021-08-31 上海快确信息科技有限公司 Deep learning model device for structuring financial text into form
CN113392648A (en) * 2021-06-02 2021-09-14 北京三快在线科技有限公司 Entity relationship acquisition method and device
CN114372125A (en) * 2021-12-03 2022-04-19 北京北明数科信息技术有限公司 Government affair knowledge base construction method, system, equipment and medium based on knowledge graph
CN115169326A (en) * 2022-04-15 2022-10-11 山西长河科技股份有限公司 Chinese relation extraction method, device, terminal and storage medium
CN115034302A (en) * 2022-06-07 2022-09-09 四川大学 Relation extraction method, device, equipment and medium for optimizing information fusion strategy
CN115034302B (en) * 2022-06-07 2023-04-11 四川大学 Relation extraction method, device, equipment and medium for optimizing information fusion strategy

Also Published As

Publication number Publication date
CN110334354B (en) 2022-12-09

Similar Documents

Publication Publication Date Title
CN110334354A (en) A kind of Chinese Relation abstracting method
CN107992597B (en) Text structuring method for power grid fault case
CN108733792B (en) Entity relation extraction method
CN108536754A (en) Electronic health record entity relation extraction method based on BLSTM and attention mechanism
CN110532557B (en) Unsupervised text similarity calculation method
CN110083831A (en) A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF
CN110209817A (en) Training method and device of text processing model and text processing method
CN109635124A (en) A kind of remote supervisory Relation extraction method of combination background knowledge
CN109858041A (en) A kind of name entity recognition method of semi-supervised learning combination Custom Dictionaries
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN110555084A (en) remote supervision relation classification method based on PCNN and multi-layer attention
CN113743099B (en) System, method, medium and terminal for extracting terms based on self-attention mechanism
CN109933792A (en) Viewpoint type problem based on multi-layer biaxially oriented LSTM and verifying model reads understanding method
CN114548099B (en) Method for extracting and detecting aspect words and aspect categories jointly based on multitasking framework
CN113255366B (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
CN115861995B (en) Visual question-answering method and device, electronic equipment and storage medium
CN113806494A (en) Named entity recognition method based on pre-training language model
CN111177402A (en) Evaluation method and device based on word segmentation processing, computer equipment and storage medium
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN114841151A (en) Medical text entity relation joint extraction method based on decomposition-recombination strategy
CN111783464A (en) Electric power-oriented domain entity identification method, system and storage medium
CN116757195B (en) Implicit emotion recognition method based on prompt learning
CN114239584A (en) Named entity identification method based on self-supervision learning
CN117216617A (en) Text classification model training method, device, computer equipment and storage medium
CN116362242A (en) Small sample slot value extraction method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant