CN110413999A - Entity relation extraction method, model training method and related apparatus - Google Patents

Entity relation extraction method, model training method and related apparatus

Info

Publication number
CN110413999A
CN110413999A (application CN201910645405.3A)
Authority
CN
China
Prior art keywords
training sample
vector
entity
target
relation extraction
Prior art date
Legal status
Granted
Application number
CN201910645405.3A
Other languages
Chinese (zh)
Other versions
CN110413999B (en)
Inventor
王李鹏
Current Assignee
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd filed Critical New H3C Big Data Technologies Co Ltd
Priority to CN201910645405.3A priority Critical patent/CN110413999B/en
Publication of CN110413999A publication Critical patent/CN110413999A/en
Application granted granted Critical
Publication of CN110413999B publication Critical patent/CN110413999B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Abstract

This application proposes an entity relation extraction method, a model training method and related apparatus in the field of natural language processing. When the multiple training samples contained in a target training sample set are learned, each training sample is learned thoroughly across multiple dimensions, and the semantic association vectors obtained from this learning are processed to produce multi-dimensional semantic feature vectors. These multi-dimensional semantic feature vectors are then processed to obtain the predicted entity relation corresponding to the target training sample set, and the model parameters of the entity relation extraction model are updated based on the predicted entity relation, the semantic association vectors and the training entity relation. Compared with the prior art, this enables the entity relation extraction model to learn semantic representations of the training samples under different dimensions, rather than only under a single dimension, so that entity relations can be determined by combining multiple dimensions of a sample, improving the accuracy of entity relation extraction.

Description

Entity relation extraction method, model training method and related apparatus
Technical field
This application relates to the field of natural language processing, and in particular to an entity relation extraction method, a model training method and related apparatus.
Background technique
Relation extraction (RE) is an application of natural language processing: given a sentence in which two entities are marked, relation extraction determines the semantic relation between the two entities.
For example, given the sentence "Li Xiang posted photos of her daughter doing housework; Wang Shiling looks charmingly naive, and the rose tucked behind her ear delighted many netizens", in which the two entities "Li Xiang" and "Wang Shiling" are marked, the relation extraction task must return the semantic relation between the two entities, such as "mother-daughter".
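As an illustration only (not part of the claimed method), the task above can be framed as mapping a sentence with two marked entities to a relation label. The classifier below is a hypothetical one-rule stand-in for a trained model; all names and the rule itself are made up:

```python
def extract_relation(sentence, entity1, entity2, classifier):
    """A relation extractor maps (sentence, entity pair) to a relation label."""
    return classifier(sentence, entity1, entity2)

def toy_classifier(sentence, e1, e2):
    """Hypothetical stand-in: one hard-coded lexical rule instead of a model."""
    if "daughter" in sentence:
        return "mother-daughter"
    return "unknown"

label = extract_relation(
    "Li Xiang posted photos of her daughter Wang Shiling doing housework",
    "Li Xiang", "Wang Shiling", toy_classifier)  # -> "mother-daughter"
```

In practice the classifier is the trained entity relation extraction model described in the rest of this document; the sketch only fixes the input/output contract.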
The results of relation extraction are commonly used in applications such as question answering systems and knowledge graphs. However, training a relation extraction model generally requires a large amount of labelled data, and with current methods of acquiring labelled data the training samples contain a large amount of noise, which often leads to low precision in entity relation extraction.
Summary of the invention
The purpose of this application is to provide an entity relation extraction method, a model training method and related apparatus that can improve the accuracy of entity relation extraction.
To achieve the above goals, the embodiments of this application adopt the following technical solutions:
In a first aspect, an embodiment of this application provides an entity relation extraction model training method, the method comprising:
obtaining a target training sample set and the training entity relation corresponding to the target training sample set, wherein the target training sample set includes multiple training samples;
vectorizing each training sample in the multiple training samples to obtain the feature embedding vector corresponding to each training sample;
extracting the semantic information of each feature embedding vector in multiple dimensions to obtain the multi-dimensional semantic feature vector corresponding to each training sample, wherein the multi-dimensional semantic feature vector characterizes the semantic result obtained after learning the semantic association vectors, and a semantic association vector characterizes the semantic result obtained after learning multiple dimensions of a training sample;
obtaining the predicted entity relation corresponding to the target training sample set according to all the multi-dimensional semantic feature vectors;
updating the model parameters of the entity relation extraction model based on the predicted entity relation, the semantic association vectors and the training entity relation.
In a second aspect, an embodiment of this application provides an entity relation extraction method, the method comprising:
receiving a sample to be predicted;
processing the sample to be predicted with an entity relation extraction model trained by the entity relation extraction model training method provided in the first aspect, to obtain the prediction extraction result corresponding to the sample to be predicted, wherein the prediction extraction result includes multiple entity relations and the classification probability corresponding to each of the multiple entity relations;
taking the entity relation with the largest classification probability as the predicted entity relation corresponding to the sample to be predicted.
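For illustration, the final selection step, taking the relation whose classification probability is largest, can be sketched as follows (the probability values below are made up):

```python
def predict_relation(class_probs):
    """Return the entity relation whose classification probability is largest."""
    return max(class_probs, key=class_probs.get)

# Illustrative prediction extraction result: one probability per candidate relation.
probs = {"founder": 0.82, "mother-daughter": 0.05, "NA": 0.13}
best = predict_relation(probs)  # -> "founder"
```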
In a third aspect, an embodiment of this application provides an entity relation extraction model training apparatus, the apparatus comprising:
a first processing module for obtaining a target training sample set and the training entity relation corresponding to the target training sample set, wherein the target training sample set includes multiple training samples;
the first processing module is also used to vectorize each training sample in the multiple training samples to obtain the feature embedding vector corresponding to each training sample;
the first processing module is also used to extract the semantic information of each feature embedding vector in multiple dimensions to obtain the multi-dimensional semantic feature vector corresponding to each training sample, wherein the multi-dimensional semantic feature vector characterizes the semantic result obtained after learning the semantic association vectors, and a semantic association vector characterizes the semantic result obtained after learning multiple dimensions of a training sample;
the first processing module is also used to obtain the predicted entity relation corresponding to the target training sample set according to all the multi-dimensional semantic feature vectors;
a parameter updating module for updating the model parameters of the entity relation extraction model based on the predicted entity relation, the semantic association vectors and the training entity relation.
In a fourth aspect, an embodiment of this application provides an entity relation extraction apparatus, the apparatus comprising:
a receiving module for receiving a sample to be predicted;
a second processing module for processing the sample to be predicted with an entity relation extraction model trained by the entity relation extraction model training method provided in the first aspect, to obtain the prediction extraction result corresponding to the sample to be predicted, wherein the prediction extraction result includes multiple entity relations and the classification probability corresponding to each of the multiple entity relations;
the second processing module is also used to take the entity relation with the largest classification probability as the predicted entity relation corresponding to the sample to be predicted.
In a fifth aspect, an embodiment of this application provides an electronic device comprising a memory for storing one or more programs, and a processor; when the one or more programs are executed by the processor, the above entity relation extraction model training method or entity relation extraction method is realized.
In a sixth aspect, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the above entity relation extraction model training method or entity relation extraction method is realized.
In the entity relation extraction method, model training method and related apparatus provided by the embodiments of this application, when the multiple training samples contained in a target training sample set are learned, each training sample is learned thoroughly from multiple dimensions, and the semantic association vectors obtained from this learning are processed to produce multi-dimensional semantic feature vectors. The multi-dimensional semantic feature vectors are then processed to obtain the predicted entity relation corresponding to the target training sample set, and the model parameters of the entity relation extraction model are updated based on the predicted entity relation, the semantic association vectors and the training entity relation. Compared with the prior art, this enables the entity relation extraction model to learn semantic representations of the training samples under different dimensions, rather than only under a single dimension, so that when performing entity relation extraction, entity relations can be determined by combining multiple dimensions of a sample, improving the accuracy of entity relation extraction.
To make the above objects, features and advantages of this application clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of this application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of this application and are therefore not to be regarded as limiting its scope; those of ordinary skill in the art can obtain other relevant drawings from these drawings without creative effort.
Fig. 1 is a schematic block diagram of an electronic device provided by an embodiment of this application;
Fig. 2 is a schematic flow chart of an entity relation extraction model training method provided by an embodiment of this application;
Fig. 3 is a schematic diagram of an entity relation extraction model;
Fig. 4 is a schematic flow chart of the sub-steps of S207 in Fig. 2;
Fig. 5 is a schematic diagram of the multi-dimensional semantic learning layer in Fig. 3;
Fig. 6 is a schematic flow chart of the sub-steps of S209 in Fig. 2;
Fig. 7 is a schematic flow chart of the sub-steps of S211 in Fig. 2;
Fig. 8 is a schematic flow chart of the sub-steps of S205 in Fig. 2;
Fig. 9 is another schematic flow chart of the entity relation extraction model training method provided by an embodiment of this application;
Fig. 10 is a schematic flow chart of an entity relation extraction method provided by an embodiment of this application;
Fig. 11 is a schematic diagram of an entity relation extraction model training apparatus provided by an embodiment of this application;
Fig. 12 is a schematic diagram of an entity relation extraction apparatus provided by an embodiment of this application.
In the figures: 100 - electronic device; 101 - memory; 102 - processor; 103 - communication interface; 400 - entity relation extraction model training apparatus; 401 - first processing module; 402 - parameter updating module; 500 - entity relation extraction apparatus; 501 - receiving module; 502 - second processing module.
Specific embodiment
To make the purposes, technical solutions and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. The components of the embodiments, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.
It should also be noted that similar reference numerals and letters indicate similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings. Meanwhile, in the description of this application, the terms "first", "second", etc. are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. In the absence of further restrictions, an element qualified by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes it.
As noted above, training a relation extraction model generally requires a large amount of labelled data; manual labelling consumes considerable human and material resources and yields only a limited amount of data, so obtaining labelled data is expensive. In addition, because the amount of labelled data is small, the effect of model training is likewise limited.
Therefore, a large number of labelled samples are currently obtained by distant supervision (Distant Supervision). Distant supervision uses a knowledge base to align text in order to obtain a large number of labelled samples; the knowledge base records the entity relation between two entities, for example as <entity 1, relation, entity 2>. By aligning the sample data with the knowledge base, the entity relation between two entities in the sample data is determined, so that the sample data is labelled and a large number of labelled samples are obtained.
For example, suppose a knowledge base includes the entity relation entry <Qiao Busi (Steve Jobs), founder, apple (Apple)>. The specific practice of aligning text with this entry is: as long as a sentence contains both "Qiao Busi" and "apple", the "Qiao Busi" in that sentence is regarded as a "founder" of "apple".
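A minimal sketch, under stated assumptions, of this distant-supervision alignment: any sentence containing both entities of a knowledge-base triple is labelled with that triple's relation. The sentences are illustrative, and the second one is a deliberately noisy match:

```python
def align(sentences, triple):
    """Label every sentence containing both entities with the triple's relation."""
    e1, relation, e2 = triple
    return [(s, relation) for s in sentences if e1 in s and e2 in s]

kb_triple = ("Qiao Busi", "founder", "apple")
sentences = [
    "Qiao Busi founded apple",
    "Qiao Busi ate an apple",      # contains both entities -> wrongly labelled "founder"
    "apple released a new phone",  # only one entity -> not aligned
]
labelled = align(sentences, kb_triple)
```

Note that the sketch reproduces exactly the weakness discussed next: the second sentence is labelled "founder" even though it does not express that relation.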
However, this alignment scheme is affected by noisy data. For example, suppose a sentence is "Qiao Busi has eaten an apple". Under the above text alignment scheme this sentence also contains both "Qiao Busi" and "apple", but obviously the relation between the two in this sentence is not "founder".
Therefore, on the basis of distant supervision, there is currently also an entity relation extraction scheme that adds an attention (Attention) mechanism to the distant supervision model. Take the above triple <Qiao Busi, founder, apple> and suppose the extracted sentences are "Qiao Busi has eaten an apple" and "Qiao Busi founded apple"; obviously, only one of the two sentences expresses the "founder" relation. In the distant supervision scheme based on the attention mechanism, attention is applied across the different sentences to reduce the weight coefficient of noisy data and thereby reduce its interference; in the above example, the weight coefficient of the sentence "Qiao Busi has eaten an apple" is reduced, thus reducing the influence of the noisy data.
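The sentence-weighting idea can be sketched as follows. This is a generic attention-pooling sketch, not the patented layer: sentence vectors and relevance scores are made-up stand-ins, and how scores are computed is left out:

```python
import math

def softmax(scores):
    """Turn raw relevance scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(sentence_vectors, scores):
    """Weight each sentence vector by its attention weight and sum, so that
    low-scoring (noisy) sentences contribute less to the bag representation."""
    weights = softmax(scores)
    dim = len(sentence_vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, sentence_vectors))
            for d in range(dim)]

# Two sentences for <Qiao Busi, founder, apple>; the noisy one
# ("Qiao Busi has eaten an apple") gets the low score (scores are illustrative).
vectors = [[1.0, 0.0], [0.0, 1.0]]
bag_vector = attend(vectors, scores=[2.0, -2.0])
```

With these scores the first sentence receives roughly 98% of the weight, so the bag vector is dominated by the non-noisy sentence.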
However, the inventor found in practice that the above distant supervision scheme based on the attention mechanism only reduces the weight coefficient of noisy data across different sentences after learning the sentences in a single dimension; because few dimensions are learned, the weight coefficient assigned to each sentence is rather coarse, the sentences are not learned sufficiently, and the precision of entity relation extraction remains low.
Therefore, based on the above drawbacks, a possible implementation provided by the embodiments of this application is as follows: when the multiple training samples contained in a target training sample set are learned, each training sample is learned thoroughly from multiple dimensions, and the semantic association vectors obtained from this learning are processed to produce multi-dimensional semantic feature vectors; the multi-dimensional semantic feature vectors are then processed to obtain the predicted entity relation corresponding to the target training sample set, so that the model parameters of the entity relation extraction model are updated based on the predicted entity relation, the semantic association vectors and the training entity relation, enabling the entity relation extraction model to learn semantic representations of the training samples under different dimensions.
Some embodiments of this application are elaborated below with reference to the accompanying drawings. In the absence of conflict, the features in the following embodiments can be combined with each other.
Referring to Fig. 1, Fig. 1 is a schematic block diagram of an electronic device 100 provided by an embodiment of this application. The electronic device 100 can serve as a device that trains the entity relation extraction model so as to realize the entity relation extraction model training method provided by the embodiments of this application, or as a device that runs an entity relation extraction model trained by that method so as to realize the entity relation extraction method provided by the embodiments of this application; it may be, for example, a mobile phone, a personal computer (PC), a tablet computer, a server, etc.
The electronic device 100 includes a memory 101, a processor 102 and a communication interface 103, which are directly or indirectly electrically connected to each other to realize the transmission or interaction of data. For example, these elements can be electrically connected to each other through one or more communication buses or signal wires.
The memory 101 can be used to store software programs and modules, such as the program instructions/modules corresponding to the entity relation extraction model training apparatus 400 or the entity relation extraction apparatus 500 provided by the embodiments of this application; the processor 102 executes the software programs and modules stored in the memory 101, thereby performing various function applications and data processing. The communication interface 103 can be used for signaling or data communication with other node devices.
The memory 101 can be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), etc.
The processor 102 can be an integrated circuit chip with signal processing capability. The processor 102 can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), other programmable logic devices, discrete gate or transistor logic, or discrete hardware components.
It can be understood that the structure shown in Fig. 1 is only illustrative; the electronic device 100 can also include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 can be realized in hardware, software or a combination thereof.
Below, with the electronic device 100 shown in Fig. 1 as an illustrative executing subject, the entity relation extraction model training method and the entity relation extraction method provided by the embodiments of this application are further explained.
Referring to Fig. 2, Fig. 2 is a schematic flow chart of an entity relation extraction model training method provided by an embodiment of this application, comprising the following steps:
S203: obtain a target training sample set and the training entity relation corresponding to the target training sample set;
S205: vectorize each training sample in the multiple training samples to obtain the feature embedding vector corresponding to each training sample;
S207: extract the semantic information of each feature embedding vector in multiple dimensions to obtain the multi-dimensional semantic feature vector corresponding to each training sample;
S209: obtain the predicted entity relation corresponding to the target training sample set according to all the multi-dimensional semantic feature vectors;
S211: update the model parameters of the entity relation extraction model based on the predicted entity relation, the semantic association vectors and the training entity relation.
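A high-level sketch of one training pass through steps S205-S211, under stated assumptions: the concrete layers (embedding, multi-dimensional semantic learning, attention, feed-forward, parameter update) are hypothetical stand-ins exercising only the control flow, not the patented design:

```python
def train_step(model, bag_of_samples, true_relation):
    """One training pass over a bag of samples sharing one training relation."""
    embeddings = [model.embed(s) for s in bag_of_samples]            # S205
    semantic_assoc = [model.learn_dimensions(e) for e in embeddings]  # S207 (part 1)
    features = [model.fuse(v) for v in semantic_assoc]                # S207 (part 2)
    predicted = model.classify(features)                              # S209
    model.update(predicted, semantic_assoc, true_relation)            # S211
    return predicted

class DummyModel:
    """Stand-in model so the control flow above can be exercised."""
    def embed(self, s): return [float(len(s))]
    def learn_dimensions(self, e): return [e[0], e[0] * 0.5]
    def fuse(self, v): return sum(v)
    def classify(self, feats): return "founder" if sum(feats) > 0 else "NA"
    def update(self, pred, assoc, truth):
        self.last_loss = 0.0 if pred == truth else 1.0
```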
In the embodiments of this application, the obtained target training sample set includes multiple training samples. In addition, the entity relation extraction model trained in the embodiments of this application can include multiple sample input interfaces, each of which can receive one training sample, so the model can receive multiple training samples at once for training.
In a particular training pass, the obtained target training sample set corresponds to one training entity relation; all training samples contained in the target training sample set correspond to that training entity relation.
For example, based on distant supervision, for a triple such as <Qiao Busi, founder, apple> in the above example, the k training samples aligned with the triple are obtained from a large number of training samples, and the set of the k obtained training samples is used as the target training sample set. Suppose the k training samples are denoted x1, x2, ..., xk; these k training samples jointly correspond to the same training entity relation "founder". When training the model, the k training samples are input into the entity relation extraction model at once; that is, one training pass of the entity relation extraction model is completed using the k training samples.
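As an illustration, grouping aligned samples into a "target training sample set" (a bag per training relation) might be sketched as below; keying bags by relation alone is a simplification, since in practice a bag would also be keyed by the entity pair:

```python
from collections import defaultdict

def group_into_bags(labelled_samples):
    """Group training samples sharing a relation label into one bag,
    i.e. one target training sample set of k samples."""
    bags = defaultdict(list)
    for sentence, relation in labelled_samples:
        bags[relation].append(sentence)
    return dict(bags)

samples = [("Qiao Busi founded apple", "founder"),
           ("Qiao Busi is apple's founder", "founder"),
           ("Li Xiang's daughter is Wang Shiling", "mother-daughter")]
bags = group_into_bags(samples)  # k = 2 samples in the "founder" bag
```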
In the embodiments of this application, for the multiple training samples contained in the target training sample set, each training sample needs to be vectorized to obtain the feature embedding vector corresponding to each training sample. For example, for "Qiao Busi has eaten an apple" and "Qiao Busi founded apple" in the above example, the two sentences need to be vectorized to obtain the feature embedding vectors corresponding to the two sentences.
The entity relation extraction model trained in the embodiments of this application can have various network structures. Illustratively, referring to Fig. 3, Fig. 3 is a schematic diagram of an entity relation extraction model, which can include an embedding layer, a multi-dimensional semantic learning layer, a sentence-level attention layer and a feed-forward layer. The embedding layer can be used to execute S205, vectorizing each of the multiple training samples contained in the target training sample set to obtain the feature embedding vector corresponding to each training sample.
It is worth noting that, when a large number of training samples are obtained for batch training of the model, a batch size (batch_size) generally needs to be set so that the feature embedding vectors processed by the model have the same dimensions.
However, in actual application scenarios, the size of the obtained training samples may differ from batch_size. For example, suppose batch_size is set to n × 20, i.e., a typical feature embedding vector is a matrix of n rows and 20 columns, and the corresponding sentence size is 20 words; but in the above example, "Qiao Busi has eaten an apple" contains only 9 words and "Qiao Busi founded apple" contains only 8 words, which does not match batch_size.
Therefore, as a possible implementation, all feature embedding vectors can be aligned with the set batch_size. In the above example, with batch_size set to n × 20, the feature embedding vector of "Qiao Busi has eaten an apple" originally has dimension n × 9, and that of "Qiao Busi founded apple" originally has dimension n × 8; the missing 11 columns of the former and the missing 12 columns of the latter can then be set to 0, so that the feature embedding vectors of the two sentences both have dimension n × 20.
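The zero-padding just described can be sketched on one row of the matrix (a sequence of token ids); the token ids below are made up, and truncation of over-long sentences is an added assumption not stated above:

```python
def pad_tokens(tokens, target_len, pad_value=0):
    """Pad (or truncate) a token-id sequence to the fixed column count
    expected by the batch, zero-filling the missing positions."""
    padded = tokens[:target_len]
    return padded + [pad_value] * (target_len - len(padded))

# "Qiao Busi has eaten an apple" -> 9 token ids, padded to 20 columns:
row = pad_tokens([11, 42, 7, 99, 3, 5, 8, 13, 2], target_len=20)
```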
Illustratively, suppose the entity relation extraction model shown in Fig. 3 receives k training samples for training. For a training sample (x, y), suppose the entity relation y can take n_classes values, i.e., there are n_classes possible entity relations; then y ∈ {y1, y2, ..., y_n_classes}. Suppose the feature embedding vectors of the training samples output by the embedding layer are e11, e12, ..., e1n; e21, e22, ..., e2n; ...; ek1, ek2, ..., ekn, where each e_ij has dimension h.
Thus, based on the feature embedding vector corresponding to each training sample obtained after vectorization in S205, S207 is executed to extract the semantic information of each feature embedding vector in multiple dimensions, so as to obtain the multi-dimensional semantic feature vector l_i, i = 1, 2, ..., k, corresponding to each training sample.
For example, in the entity relation extraction model shown in Fig. 3, the feature embedding vectors e11, e12, ..., e1n; e21, e22, ..., e2n; ...; ek1, ek2, ..., ekn output by the embedding layer are input to the multi-dimensional semantic learning layer, which extracts the semantic information of multiple dimensions and outputs the multi-dimensional semantic feature vector l_i corresponding to each training sample.
When extracting the semantic information of each feature embedding vector in multiple dimensions, each training sample can be learned in multiple dimensions to obtain a semantic association vector characterizing the semantic result of learning the multiple dimensions of the training sample, so that when a training sample is learned, it is learned thoroughly from multiple dimensions rather than being limited to associations between different sentences in a single dimension; then all semantic association vectors are learned again, and the semantic result obtained from learning all semantic association vectors is used as the multi-dimensional semantic feature vector corresponding to each training sample.
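A rough sketch, under assumptions, of this "multiple dimensions then fuse" idea: the same embedded sample is transformed by several independent projections (one per dimension) to give per-dimension semantic association vectors, which are then fused into one multi-dimensional semantic feature vector. The patent does not specify the layer's internals; the linear maps and mean fusion below are illustrative stand-ins:

```python
def project(vec, matrix):
    """Apply one per-dimension linear transform (a stand-in for learning
    the sample in that dimension)."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def multi_dimensional_features(embedding, projections):
    """Per-dimension semantic association vectors, then a fused feature vector."""
    semantic_assoc = [project(embedding, P) for P in projections]
    fused = [sum(col) / len(semantic_assoc)  # simple mean fusion (an assumption)
             for col in zip(*semantic_assoc)]
    return semantic_assoc, fused

P1 = [[1.0, 0.0], [0.0, 1.0]]   # identity "dimension"
P2 = [[0.0, 1.0], [1.0, 0.0]]   # swap "dimension"
assoc, feature = multi_dimensional_features([2.0, 4.0], [P1, P2])
```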
Thus, based on all the multi-dimensional semantic feature vectors l_i obtained in S207, S209 is executed: for example, the sentence-level attention layer and the feed-forward layer in the entity relation extraction model network structure shown in Fig. 3 process all the multi-dimensional semantic feature vectors l_i to obtain the predicted entity relation corresponding to the target training sample set; then S211 is executed to update the model parameters of the entity relation extraction model based on the predicted entity relation, the semantic association vectors and the training entity relation, thereby realizing the training of the entity relation extraction model.
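The last classification stage can be sketched as one linear layer plus a softmax over the n_classes relations, taking the attention-pooled bag vector as input. This is an illustrative stand-in for the feed-forward layer of Fig. 3, with made-up weights:

```python
import math

def feed_forward_classify(bag_vector, weight_rows, relations):
    """Map the pooled bag vector to a classification probability per relation."""
    logits = [sum(w * x for w, x in zip(row, bag_vector)) for row in weight_rows]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return dict(zip(relations, [e / total for e in exps]))

probs = feed_forward_classify([1.0, 0.0],
                              weight_rows=[[3.0, 0.0], [0.0, 3.0]],
                              relations=["founder", "NA"])
```

The relation with the largest probability would then be returned as the predicted entity relation at inference time, and compared against the training entity relation to drive the parameter update during training.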
It can be seen that, based on the above design, the entity relation extraction model training method provided by the embodiments of the present application, when learning the multiple training samples included in the target training sample set, fully learns each training sample from multiple dimensions, processes the semantic association vectors obtained after learning to obtain the multi-dimensional semantic feature vectors, and then processes the multi-dimensional semantic feature vectors to obtain the prediction entity relationship corresponding to the target training sample set, so that the model parameters of the entity relation extraction model are updated based on the prediction entity relationship, the semantic association vectors and the training entity relationship. Compared with the prior art, this enables the entity relation extraction model to learn the semantic representation of a training sample under different dimensions, rather than only its semantic representation under a single dimension, so that when performing entity relation extraction, the entity relationship can be determined by combining multiple dimensions of the sample, improving the accuracy of entity relation extraction.
It is worth noting that, as a possible implementation, the above S203, S205, S207 and S209 can be realized by structure layers contained in the entity relation extraction model itself. For example, in the entity relation extraction model shown in Fig. 3, the embedding layer can realize S203 and S205, the multi-dimensional semantic learning layer can realize S207, and the inter-sentence attention layer together with the feedforward neural network layer can realize S209. In some other possible application scenarios of the embodiments of the present application, S203, S205, S207 and S209 can also be realized by other functional modules, for example by treating S203, S205, S207 and S209 as preprocessing steps and finally using the obtained results to update the model parameters of the entity relation extraction model. The embodiments of the present application place no limitation on the relationship between the functional modules executing S203, S205, S207 and S209 and the entity relation extraction model, as long as the model parameters of the entity relation extraction model can be updated.
In addition, referring to Fig. 4, Fig. 4 is a schematic flow chart of the sub-steps of S207 in Fig. 2. To realize S207, as a possible implementation, taking one of the multiple training samples included in the target training sample set as the target training sample, S207 may include the following sub-steps:
S207-1: obtain the median feature vector corresponding to the target training sample according to the target feature embedding vector corresponding to the target training sample;
S207-2: perform learning processing of multiple dimensions on the median feature vector to obtain the target semantic association vector corresponding to the target training sample;
S207-3: obtain the target multi-dimensional semantic feature vector corresponding to the target training sample according to the target semantic association vector and the median feature vector.
When executing S207, the multi-dimensional semantic learning layer in Fig. 3 can take a variety of structural forms. Illustratively, referring to Fig. 5, Fig. 5 is a schematic diagram of the multi-dimensional semantic learning layer in Fig. 3. As a possible implementation, the multi-dimensional semantic learning layer in Fig. 3 can be constructed based on a BiLSTM (Bi-directional Long Short-Term Memory) network, a 2D-Attention mechanism and a feedforward neural network.
In the network structure of the multi-dimensional semantic learning layer shown in Fig. 5, the calculation process of S207 is illustrated below by taking the first of the k training samples as the target training sample.
When executing S207-1, assume that the target feature embedding vectors corresponding to the first training sample are denoted e_11, e_12, …, e_1n. These vectors are input into the BiLSTM, which learns the contextual information of the target training sample, and the semantic result obtained after learning is taken as the median feature vector corresponding to the target training sample. Assume the vectors output by the BiLSTM are denoted u_11, u_12, …, u_1n, and the output dimension of the LSTM is h; then:
u_1t = BiLSTM(u_1,t-1, e_1t), t = 1, 2, …, n;
where u_1t ∈ R^{2h}.
Then, for the entire sentence of the first training sample, the obtained median feature vector U is expressed as:
U = [u_11, u_12, …, u_1n];
where U ∈ R^{n×2h}, and [·] denotes the concatenation operation of vectors; for example, if a = (1, 2, 3) and b = (4, 5, 6), then [a, b] = (1, 2, 3, 4, 5, 6).
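The computation in S207-1 can be sketched as follows. This is a minimal NumPy illustration: a simple tanh recurrent cell stands in for a full LSTM cell (the gating equations are omitted for brevity), and the sentence length n, embedding dimension and hidden size h are assumed values for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 9, 6, 4            # sentence length, embedding dim, hidden size (illustrative)
E = rng.normal(size=(n, d))  # e_11 ... e_1n: feature embedding vectors of one sample

# Simplified recurrent cell standing in for an LSTM cell (gates omitted).
def rnn_pass(E, Wx, Wh):
    u, out = np.zeros(h), []
    for e in E:
        u = np.tanh(Wx @ e + Wh @ u)  # u_t depends on u_{t-1} and e_t
        out.append(u)
    return np.stack(out)

Wx_f, Wh_f = rng.normal(size=(h, d)), rng.normal(size=(h, h))
Wx_b, Wh_b = rng.normal(size=(h, d)), rng.normal(size=(h, h))

fwd = rnn_pass(E, Wx_f, Wh_f)              # forward pass over e_1 .. e_n
bwd = rnn_pass(E[::-1], Wx_b, Wh_b)[::-1]  # backward pass, re-reversed to align

U = np.concatenate([fwd, bwd], axis=1)     # median feature vector U in R^{n x 2h}
print(U.shape)  # (9, 8)
```

Concatenating the forward and backward states is what gives each row of U dimension 2h, matching U ∈ R^{n×2h} above.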
When executing S207-2, the 2D-Attention mechanism shown in Fig. 5 can be used to perform learning processing of multiple dimensions on the median feature vector, so as to obtain the target semantic association vector corresponding to the target training sample. The learning process can satisfy the following formula:
A = softmax(W_2 tanh(W_1 U^T));
where A denotes the target semantic association vector, A ∈ R^{r×n}, W_1 ∈ R^{d×2h}, W_2 ∈ R^{r×d}, and W_1 and W_2 are parameters to be learned.
It should be noted that, since the 2D-Attention mechanism used in S207-2 needs to learn the semantic representation of the target training sample vector in different dimensions, the object processed in S207-2, the median feature vector U, is a matrix containing n rows of sub-features, where n is an integer greater than 1.
Therefore, when executing S207-2, learning processing is performed on each row of sub-features of the median feature vector U, and the results obtained after processing all the rows of sub-features are merged to obtain the target semantic association vector.
For example, for the median feature vector U obtained by the above calculation, U is a matrix of n rows and 2h columns. When executing S207-2, the Attention mechanism performs learning processing on all the elements included in each row of the median feature vector U, obtaining a processing result for each row of elements; the results obtained after the learning processing of the n rows of elements are then merged, and the merged result is the target semantic association vector A, where each row of elements in the target semantic association vector A represents the semantic result learned under one dimension.
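The 2D-Attention computation of S207-2 can be sketched as follows, as a minimal NumPy version. It assumes a row-wise softmax, so that each of the r rows of A is a distribution over the n positions; all dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, h, d, r = 9, 4, 5, 3   # tokens, LSTM dim, attention dim, number of semantic dimensions

U = rng.normal(size=(n, 2 * h))   # median feature vector from the BiLSTM
W1 = rng.normal(size=(d, 2 * h))  # parameters to be learned
W2 = rng.normal(size=(r, d))

def softmax(x, axis=-1):
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

# A = softmax(W2 · tanh(W1 · U^T)), A in R^{r x n}
A = softmax(W2 @ np.tanh(W1 @ U.T), axis=1)
print(A.shape)        # (3, 9)
print(A.sum(axis=1))  # each of the r rows sums to 1
```

Each row of A is one attention distribution over the sentence, i.e. one learned "dimension" of the semantic association vector.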
In addition, when executing S207-3, the target semantic association vector A obtained in S207-2 is first used to further learn the median feature vector U obtained in S207-1; for example, in the structure shown in Fig. 5, m_r is the r-th row of elements of M, characterizing the semantic result learned under the r-th dimension. Then M is used as the input of the feedforward neural network, and after processing by the feedforward neural network, the target multi-dimensional semantic feature vector is obtained. Assume the output vector of the feedforward neural network is denoted l_1; then:
M = AU;
l_1 = W·flatten(M) + b;
where M ∈ R^{r×2h}; the flatten function characterizes flattening a matrix, i.e. dimensionality reduction, with flatten(M) ∈ R^{2hr×1}, characterizing that a matrix of dimension R^{r×2h} is reduced to R^{2hr×1}; W ∈ R^{L×2hr}, b ∈ R^L, where L is the output dimension of the feedforward neural network, and W and b are parameters to be learned.
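Continuing the sketch, S207-3 composes the two results: M = AU compresses the sentence into r dimension-specific summaries, and a feedforward layer maps the flattened M to the multi-dimensional semantic feature vector l_1. Dimensions below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, h, r, L = 9, 4, 3, 6  # tokens, LSTM dim, semantic dimensions, output dim

A = rng.dirichlet(np.ones(n), size=r)  # r x n attention matrix (rows sum to 1)
U = rng.normal(size=(n, 2 * h))        # n x 2h median feature vector

M = A @ U                              # r x 2h: one row per learned dimension
W = rng.normal(size=(L, 2 * h * r))    # feedforward parameters (to be learned)
b = rng.normal(size=L)

l1 = W @ M.flatten() + b               # multi-dimensional semantic feature vector l_1
print(M.shape, l1.shape)               # (3, 8) (6,)
```

The flatten call is exactly the R^{r×2h} → R^{2hr} reduction described above, which lets a single affine layer consume all r dimension summaries at once.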
Accordingly, by taking each of the k training samples as the target training sample and executing all the sub-steps of S207 above, the multi-dimensional semantic feature vectors l_1, l_2, …, l_k corresponding to the k training samples are obtained.
It should be noted that Fig. 5 is only an illustration, enumerating one structure of the multi-dimensional semantic learning layer. In some other possible application scenarios of the embodiments of the present application, the multi-dimensional semantic learning layer can also take other structures; for example, illustratively, the multi-dimensional semantic learning layer can also be constructed based on BiLSTMs alone, with multiple BiLSTMs training the training sample repeatedly, so that deeper semantic information of the training sample under different dimensions can be learned. The embodiments of the present application place no limitation on the structure of the multi-dimensional semantic learning layer, as long as the semantic information of the training sample can be learned in multiple dimensions.
In addition, based on the multi-dimensional semantic feature vectors l_1, l_2, …, l_k obtained above, to realize S209, referring to Fig. 6, Fig. 6 is a schematic flow chart of the sub-steps of S209 in Fig. 2. As a possible implementation, S209 may include the following sub-steps:
S209-1: process all the multi-dimensional semantic feature vectors using the attention (Attention) mechanism to obtain the attention feature vector corresponding to the target training sample set;
S209-2: obtain the prediction entity relationship corresponding to the target training sample set based on the attention feature vector.
For example, in the entity relation extraction model shown in Fig. 3, the inter-sentence attention layer and the feedforward neural network layer can cooperate to realize the calculation process of S209.
The inter-sentence attention layer can be used to realize the step of S209-1. The purpose of the inter-sentence attention layer is to assign larger weight coefficients to the real samples among the k training samples and smaller weight coefficients to the noisy samples among the k training samples, so as to reduce the influence of noise data during entity relation extraction training.
Assume the attention feature vector output by the inter-sentence attention layer is denoted v; then:
s_i = (w_a ⊙ l_i)·w_b, α = softmax(s_1, s_2, …, s_k), v = Σ_{i=1}^{k} α_i l_i;
where ⊙ denotes the element-wise (corresponding-position) multiplication, w_a and w_b are parameters to be learned, w_a ∈ R^{1×L}, w_b ∈ R^{L×1}, and v ∈ R^{1×L}.
The feedforward neural network layer can be used to realize the step of S209-2. Accordingly, when the feedforward neural network layer performs output learning on the attention feature vector v output by the inter-sentence attention layer, assume the obtained output vector is denoted o; then:
o = softmax(W_o v + b_o);
where W_o and b_o are parameters to be learned, W_o ∈ R^{L×n_classes}, b_o ∈ R^{n_classes}; o ∈ R^{n_classes}, and o_j = p(y = y_j | x), j = 1, 2, …, n_classes; that is, o_j denotes the probability that the entity relationship corresponding to training sample x is y_j.
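S209 can be sketched end-to-end as follows in NumPy. The selective-attention scoring form below, (w_a ⊙ l_i)·w_b with a softmax over the k sentences, is an assumption inferred from the stated parameter shapes, since the patent renders the formula only as a figure; all dimensions and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
k, L, n_classes = 5, 6, 8    # sentences in the sample set, feature dim, relation classes

l = rng.normal(size=(k, L))  # multi-dimensional semantic feature vectors l_1 .. l_k
wa = rng.normal(size=L)      # inter-sentence attention parameters (assumed form)
wb = rng.normal(size=L)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# S209-1: weight each sentence; noisy sentences should receive small alpha_i.
scores = (l * wa) @ wb       # (w_a ⊙ l_i) · w_b for each i
alpha = softmax(scores)      # k weights summing to 1
v = alpha @ l                # attention feature vector v, in R^L

# S209-2: feedforward output layer over the relation classes.
Wo = rng.normal(size=(n_classes, L))
bo = rng.normal(size=n_classes)
o = softmax(Wo @ v + bo)     # o_j = p(y = y_j | x)
print(o.shape, round(float(o.sum()), 6))  # (8,) 1.0
```

The weighted sum v lets one clean sentence dominate a set that also contains noisy distant-supervision sentences, which is the stated purpose of the inter-sentence attention layer.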
In addition, referring to Fig. 7, Fig. 7 is a schematic flow chart of the sub-steps of S211 in Fig. 2. To realize S211, as a possible implementation, S211 may include the following sub-steps:
S211-1: obtain the loss function value under each of the multiple dimensions based on the prediction entity relationship, the semantic association vector and the training entity relationship;
S211-2: update the model parameters of the entity relation extraction model according to the sum of the loss function values under all the dimensions.
As described above, the semantic association vector in the embodiments of the present application characterizes the semantic result obtained after learning multiple dimensions of a training sample. Therefore, when updating the model parameters of the entity relation extraction model, the semantic association vector can be used to calculate the loss function according to the semantic results obtained after the training sample is learned under different dimensions.
Illustratively, when updating the model parameters of the entity relation extraction model, the loss function value under each of the multiple dimensions can be calculated based on the prediction entity relationship, the semantic association vector and the training entity relationship; then, according to the sum of the loss function values under all the dimensions, an algorithm such as gradient optimization is used to minimize the sum of the loss function values, thereby updating the model parameters of the entity relation extraction model.
Illustratively, the embodiments of the present application can construct the loss function based on cross entropy. Therefore, as a possible implementation, for given samples (x_i, y_i), i = 1, 2, …, N, the sum of the loss function values under all the dimensions can satisfy the following formula:
J = −Σ_{i=1}^{N} log p(y_i | x_i) + α‖AA^T − I‖_F²;
where p(y_i | x_i) denotes the probability that the entity relationship of training sample x_i is y_i, ‖·‖_F denotes the Frobenius norm, α is a set adjustment coefficient, A denotes the semantic association vector, and I denotes the identity matrix.
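The loss of S211-1/S211-2 can be sketched as follows. The exact formula appears only as a figure in the patent, so the standard form is assumed here: a cross-entropy term plus a Frobenius-norm penalty α‖AAᵀ − I‖²_F, which pushes the r attention rows of A toward attending to different parts of the sentence. All values below are stand-ins.

```python
import numpy as np

rng = np.random.default_rng(4)
N, n_classes, r, n = 4, 8, 3, 9  # samples, classes, attention rows, tokens
alpha = 0.1                      # adjustment coefficient (set, not learned)

probs = rng.dirichlet(np.ones(n_classes), size=N)  # p(y|x) per sample (stand-in)
y = rng.integers(0, n_classes, size=N)             # training entity relationships
A = rng.dirichlet(np.ones(n), size=r)              # semantic association vector A

cross_entropy = -np.log(probs[np.arange(N), y]).sum()
penalty = np.linalg.norm(A @ A.T - np.eye(r), ord='fro') ** 2
loss = cross_entropy + alpha * penalty

print(loss > 0)  # True: both terms are non-negative here
```

Minimizing this sum with a gradient optimizer is then the parameter update of S211-2.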
Also, optionally, to realize S205, referring to Fig. 8, Fig. 8 is a schematic flow chart of the sub-steps of S205 in Fig. 2. As a possible implementation, each training sample includes two entities; likewise taking one of the multiple training samples included in the target training sample set as the target training sample, S205 may include the following sub-steps:
S205-1: obtain the first entity and the second entity in the target training sample;
S205-2: obtain the word vector corresponding to each text character in the target training sample, and the first position embedding vector and second position embedding vector corresponding to each text character;
S205-3: merge the word vector, the first position embedding vector and the second position embedding vector corresponding to each text character to obtain the word embedding vector corresponding to each text character;
S205-4: merge the word embedding vectors corresponding to all the text characters in the target training sample to obtain the feature embedding vector corresponding to the target training sample.
In the application scenario of entity relation extraction, each word in a sentence contributes to the entity relationship between the two entities, and generally the closer a word is to an entity, the greater its contribution to the entity relationship. For this reason, the embodiments of the present application vectorize the training samples by means of character embedding and position embedding, so as to obtain the feature embedding vector corresponding to each training sample.
Illustratively, when vectorizing the target training sample, the first entity and the second entity in the target training sample can first be obtained.
The manner of obtaining the first entity and the second entity can be realized based on distant supervision. For example, the input obtained by the entity relation extraction model is: x = ("Qiao Busi", "apple", sentence 1, sentence 2, …, sentence k); when vectorizing the target training sample, "Qiao Busi" and "apple" included in the target training sample can be compared by means of text alignment, so as to obtain the first entity and the second entity.
Then, each text character in the target training sample is converted into a corresponding word vector, and according to the first entity and the second entity in the target training sample, the first position embedding vector and second position embedding vector corresponding to each text character are obtained, where the first position embedding vector characterizes the relative position distance between each text character and the first entity, and the second position embedding vector characterizes the relative position distance between each text character and the second entity.
Illustratively, converting each text character in the target training sample into a corresponding word vector can be realized by means of a feature vector table stored in the electronic device. Specifically, the feature vector table stored in the electronic device characterizes a set of multiple word vectors, where the set of all elements in each column of the feature vector table represents one word; by looking up the feature vector table, the word vector corresponding to each text character in the target training sample can be obtained.
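Such a table lookup can be sketched as follows; the vocabulary, its column indices and the vector dimension are all hypothetical, chosen only to illustrate the column-per-word layout described above.

```python
import numpy as np

rng = np.random.default_rng(5)
vocab = {"Qiao": 0, "Bu": 1, "Si": 2, "chi": 3}  # hypothetical character-to-column map
dim1 = 4                                          # word-vector dimension (illustrative)

# Feature vector table: one column per character in the vocabulary.
table = rng.normal(size=(dim1, len(vocab)))

def word_vector(ch):
    # The column of the table corresponding to the character is its word vector.
    return table[:, vocab[ch]]

print(word_vector("Qiao").shape)  # (4,)
```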
In addition, when obtaining the first position embedding vector and second position embedding vector corresponding to each text character, the position subscript of each word in the sentence relative to the first entity and the second entity can be used to calculate, for each text character, its first position distance from the first entity and its second position distance from the second entity; then, by looking up a position embedding vector table, the first position distance and the second position distance are each vectorized, so as to obtain the first position embedding vector and second position embedding vector corresponding to each text character. The effect of the position embedding vector table is likewise to convert each position distance into a corresponding vector representation: by looking up the position embedding vector table, all the elements of the corresponding column in the table are taken as the position embedding vector corresponding to that position distance.
For example, in the above example sentence "Qiao Busi has eaten an apple", the position subscripts of the characters are "Qiao/0 Bu/1 Si/2 chi/3 le/4 yi/5 ge/6 ping/7 guo/8". Assume pos1 and pos2 respectively denote the position subscript of the first entity "Qiao Busi" and that of the second entity "apple"; then pos1 = 0 and pos2 = 7. Further, assume the position information of each text character in the sentence is expressed as position = (I_1, I_2, …, I_n); then the position information of all the text characters is pos = ((I_1−pos1, I_1−pos2), (I_2−pos1, I_2−pos2), …, (I_n−pos1, I_n−pos2)). For example, in the above example sentence, the position information of "Qiao" is expressed as (0, −7), and the position information of the character at subscript 4 is expressed as (4, −3).
In addition, to avoid negative values in the position information, the position information of all the text characters can be made positive, for example by adding a set value, such as 10, to each numerical value. In the above example, the position information of "Qiao" is then updated to (10, 3), and the position information of the character at subscript 4 is updated to (14, 7).
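The relative-position calculation above can be sketched as follows; the sentence length, entity subscripts and offset follow the worked example (pos1 = 0, pos2 = 7, offset 10):

```python
# Relative position information for each character, following the worked example.
n = 9              # characters in "Qiao Busi has eaten an apple"
pos1, pos2 = 0, 7  # position subscripts of the first entity and the second entity
offset = 10        # set value added to keep all position values non-negative

pos = [(i - pos1 + offset, i - pos2 + offset) for i in range(n)]

print(pos[0])  # (10, 3): position information of "Qiao"
print(pos[4])  # (14, 7): position information of the character at subscript 4
```

Each pair would then be looked up in the position embedding vector table to produce the two position embedding vectors of the character.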
Accordingly, the word vector, the first position embedding vector and the second position embedding vector corresponding to each text character in the obtained target training sample are merged, so as to obtain the word embedding vector corresponding to each text character in the target training sample.
It should be noted that, assuming the vector dimension of the word vector corresponding to each word is dim1 and the vector dimension corresponding to each position embedding is dim2, the vector dimension of the final word embedding vector corresponding to each word is dim1 + 2·dim2. For example, assuming the vector dimension of the word vector is 100, i.e. it includes 100 elements, and the vector dimension corresponding to each position embedding is 4, i.e. each position distance is represented by 4 elements, the vector dimension of the final word embedding vector corresponding to each word is then 108.
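The merge of S205-3 can be sketched as a simple concatenation, using the dimensions from the example (dim1 = 100, dim2 = 4); the vector contents are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(6)
dim1, dim2 = 100, 4               # word-vector and position-embedding dimensions

word_vec = rng.normal(size=dim1)  # word vector of one character
pos1_vec = rng.normal(size=dim2)  # first position embedding vector
pos2_vec = rng.normal(size=dim2)  # second position embedding vector

# S205-3: merge the three vectors into the character's word embedding vector.
word_embedding = np.concatenate([word_vec, pos1_vec, pos2_vec])
print(word_embedding.shape)  # (108,): dim1 + 2 * dim2
```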
Vector is embedded in based on the corresponding word of text each in target training sample obtained as a result, it will be all Word insertion vector merges, to obtain the corresponding feature insertion vector of target training sample.
For example, for using first training sample in k training sample as target training sample, if first training Each corresponding word insertion vector of word is respectively e in sample11、e12、…、e1n, then the corresponding feature of first training sample Insertion vector is expressed as [e11,e12,…,e1n]。
It should be noted that the above only takes one target training sample set as an example to detail one training pass of the entity relation extraction model training method provided by the embodiments of the present application. In actual application scenarios, multiple training sample sets are generally required to train the entity relation extraction model repeatedly, until the entity relation extraction model meets a set convergence condition.
For this purpose, optionally, on the basis of the process steps shown in Fig. 2, referring to Fig. 9, Fig. 9 is another schematic flow chart of the entity relation extraction model training method provided by the embodiments of the present application. As a possible implementation, before executing S203, the entity relation extraction method further includes the following steps:
S201: obtain multiple training sample sets in one-to-one correspondence with multiple entity relationships;
S202: take one of the multiple training sample sets as the target training sample set.
In the embodiments of the present application, a large number of training samples can be obtained, for example, by means such as a web crawler; then, by means such as distant supervision, using multiple triples (such as the above <Qiao Busi, founder, apple>), multiple training sample sets in one-to-one correspondence with multiple entity relationships are obtained by means of text alignment. For example, assume 10 triples are set in advance and 100,000 sentences are obtained by means of a web crawler as training samples; the sentences among the 100,000 that are aligned with the same triple by text alignment are classified into one training sample set, and so on, so that 10 training sample sets in one-to-one correspondence with the 10 triples are obtained from the 100,000 sentences, with each training sample set including multiple training samples.
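The distant-supervision grouping described above can be sketched as follows; the sentences and triples are hypothetical, and text alignment is simplified to checking that both entities of a triple appear in the sentence.

```python
# Group crawled sentences into one training sample set per triple,
# using simple text alignment: both entities must appear in the sentence.
triples = [("Qiao Busi", "apple", "founder"),
           ("rice", "China", "source area")]  # hypothetical triples

sentences = ["Qiao Busi founded apple in a garage",
             "rice originates in China and India",
             "Qiao Busi introduced the apple phone",
             "wheat is unrelated to either triple"]

bags = {t: [s for s in sentences if t[0] in s and t[1] in s] for t in triples}

print(len(bags[triples[0]]))  # 2 sentences aligned with the first triple
print(len(bags[triples[1]]))  # 1 sentence aligned with the second triple
```

Each resulting bag plays the role of one training sample set, labeled with the entity relationship of its triple; the inter-sentence attention layer later down-weights the noisy sentences that alignment inevitably admits.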
Then, one of the multiple obtained training sample sets is taken as the target training sample set, the target training sample set is used as the input of the entity relation extraction model, and the steps S203 to S211 above are executed in sequence, completing one training pass of the entity relation extraction model.
Then, by executing S212 in Fig. 9, it is judged whether the entity relation extraction model has met the set convergence condition, for example whether the number of training passes has reached a set threshold, whether the change in parameter values between two successive parameter updates is less than a set threshold, or whether the error is less than a set threshold, and so on. If the entity relation extraction model does not meet the set convergence condition, the method returns to S202 and continues to execute the entity relation extraction model training method, so as to continue updating the model parameters of the entity relation extraction model, until it is determined when executing S212 that the entity relation extraction model meets the set convergence condition, completing the training of the entity relation extraction model.
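The outer loop of Fig. 9 can be sketched as follows; a stub stands in for one training pass (S203-S211), and the convergence check S212 uses two of the criteria named above (pass-count threshold and parameter-change threshold) with illustrative values.

```python
# Outer training loop of Fig. 9 with a stub standing in for S203-S211.
max_passes, delta_threshold = 100, 1e-3  # illustrative convergence thresholds

params, passes = 10.0, 0
while True:
    new_params = params - 0.5 * params   # stub for one training pass (S203-S211)
    passes += 1
    delta = abs(new_params - params)     # parameter change between updates
    params = new_params
    # S212: convergence check on pass count and parameter change.
    if passes >= max_passes or delta < delta_threshold:
        break

print(passes)  # converges well before the pass-count threshold
```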
The entity relation extraction model obtained after training with the entity relation extraction model training method provided by the embodiments of the present application can be applied in multiple application scenarios such as intelligent question answering systems, entity relation extraction, and text classification.
For example, illustratively, taking the application scenario in which the entity relation extraction model trained with the above entity relation extraction model training method is applied to entity relation extraction, an entity relation extraction method provided by the embodiments of the present application is described below.
Referring to Fig. 10, Fig. 10 is a schematic flow chart of an entity relation extraction method provided by the embodiments of the present application, including the following steps:
S301: receive a sample to be predicted;
S303: process the sample to be predicted using the entity relation extraction model trained with the above entity relation extraction model training method, to obtain the prediction extraction result corresponding to the sample to be predicted;
S305: take the entity relationship corresponding to the maximum class probability as the prediction entity relationship corresponding to the sample to be predicted.
In the embodiments of the present application, the received sample to be predicted can be used as the input of the entity relation extraction model trained with the above entity relation extraction model training method, and the trained entity relation extraction model processes the sample to be predicted, so as to obtain the prediction extraction result corresponding to the sample to be predicted, where the prediction extraction result includes multiple entity relationships and the class probability corresponding to each of the multiple entity relationships.
Accordingly, using the class probability corresponding to each entity relationship, the entity relationship corresponding to the maximum class probability can be taken as the prediction entity relationship corresponding to the sample to be predicted.
Entity relation extraction can be applied in multiple fields. For example, in the medical field, entity relation extraction can be performed on electronic medical records to establish a medical knowledge graph; or, in the agricultural field, entity relation extraction can be performed on agricultural information texts to establish an agricultural knowledge graph.
For example, in the agricultural field, agricultural information data has already reached a very large scale at this stage, and obtaining a large amount of valuable agricultural data has become easier. However, most of the obtained agricultural information data is unstructured or semi-structured text data, which is difficult to use directly and effectively, and must be further understood and screened.
Entity relation extraction can quickly extract structured information from unstructured or semi-structured natural language text, and the structured data can be stored for convenient query and retrieval; for example, a user can obtain the required agricultural knowledge by querying the established agricultural knowledge graph. For example, given the sentence "corn, also named maize, is an important cereal crop and forage crop", where "corn" and "maize" represent entities and are the two known entities, the entity relationship class returned by entity relation extraction is "alias"; the entity relationship of the two entities returned by entity relation extraction is added to the agricultural knowledge graph, so that by querying the agricultural knowledge graph, a user can learn the nickname of "corn", or learn that the scientific name of "maize" is "corn".
Illustratively, assume the entity relationships of agricultural information texts are defined as 8 classes, including: alias, source area, ingredient, subclass, honorary title, value, classification grade and other. The entity relation extraction method provided by the embodiments of the present application is further illustrated below with the specific agricultural information text "rice is native to China and India" as the sample to be predicted.
When the trained entity relation extraction model processes the above agricultural information text, the prediction extraction result corresponding to the agricultural information text can be obtained, where the prediction extraction result includes multiple agricultural entity relationships and the class probability corresponding to each of the multiple agricultural entity relationships.
If the prediction extraction result corresponding to the agricultural information text includes the 8 classes of entity relationships in the above example, then, as an example, the prediction extraction result corresponding to the agricultural information text can be:
P (alias)=0.03;
P (source area)=0.7;
P (ingredient)=0.04;
P (subclass)=0.07;
P (honorary title)=0.03;
P (value)=0.08;
P (classification grade)=0.02;
P (other)=0.03;
Here, P(source area) = 0.7 is the maximum, so the corresponding agricultural entity relationship "source area" can be taken as the prediction entity relationship corresponding to the agricultural information text; that is, through the prediction of the entity relation extraction model, the prediction entity relationship corresponding to the agricultural information text "rice is native to China and India" is "source area".
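The selection in S305 is a simple argmax over the class probabilities; a sketch using the probabilities from the example:

```python
# S305: pick the entity relationship with the maximum class probability.
prediction = {"alias": 0.03, "source area": 0.7, "ingredient": 0.04,
              "subclass": 0.07, "honorary title": 0.03, "value": 0.08,
              "classification grade": 0.02, "other": 0.03}

relation = max(prediction, key=prediction.get)
print(relation)  # source area
```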
Thus, the obtained correspondence between the agricultural information text and the prediction entity relationship can be updated into the agricultural knowledge graph, so that a user can query the agricultural knowledge graph by retrieving entity keywords, helping the user quickly learn the required agricultural knowledge.
Based on the same inventive concept as the above entity relation extraction model training method provided by the embodiments of the present application, referring to Fig. 11, Fig. 11 is a schematic structural diagram of an entity relation extraction model training apparatus 400 provided by the embodiments of the present application. The entity relation extraction model training apparatus 400 includes a first processing module 401 and a parameter updating module 402.
The first processing module 401 is configured to obtain a target training sample set and the training entity relationship corresponding to the target training sample set, where the target training sample set includes multiple training samples.
The first processing module 401 is further configured to vectorize each of the multiple training samples to obtain the feature embedding vector corresponding to each training sample.
The first processing module 401 is further configured to extract semantic information in multiple dimensions from each feature embedding vector to obtain the multi-dimensional semantic feature vector corresponding to each training sample, where the multi-dimensional semantic feature vector characterizes the semantic result obtained after learning the semantic association vector, and the semantic association vector characterizes the semantic result obtained after learning multiple dimensions of the training sample.
The first processing module 401 is further configured to obtain the prediction entity relationship corresponding to the target training sample set according to all the multi-dimensional semantic feature vectors.
The parameter updating module 402 is configured to update the model parameters of the entity relation extraction model based on the prediction entity relationship, the semantic association vector and the training entity relationship.
For convenience and brevity of description, for the specific working process of the above entity relation extraction model training apparatus 400, please refer to the corresponding steps in the aforementioned entity relation extraction model training method; the embodiments of the present application do not repeat them here.
In addition, based on the same inventive concept as the above entity relation extraction method provided by the embodiments of the present application, referring to Fig. 12, Fig. 12 is a schematic diagram of an entity relation extraction apparatus 500 provided by the embodiments of the present application. The entity relation extraction apparatus 500 includes a receiving module 501 and a second processing module 502.
The receiving module 501 is configured to receive a sample to be predicted.
The second processing module 502 is configured to process the sample to be predicted using the entity relation extraction model trained with the above entity relation extraction model training method, to obtain the prediction extraction result corresponding to the sample to be predicted, where the prediction extraction result includes multiple entity relationships and the class probability corresponding to each of the multiple entity relationships.
The second processing module 502 is further configured to take the entity relationship corresponding to the maximum class probability as the prediction entity relationship corresponding to the sample to be predicted.
Wherein, for convenience and simplicity of description, the specific work process of above-mentioned entity relation extraction device 500, please refers to Corresponding step, the embodiment of the present application are no longer repeated herein in aforementioned corresponding entity relation extraction embodiment of the method.
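The inference path of modules 501 and 502 reduces to a softmax over relation scores followed by an argmax. A small sketch, in which the relation labels and scores are made up purely for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

RELATIONS = ["grows-in", "pest-of", "cures", "no-relation"]   # hypothetical label set

def predict(logits):
    """Turn relation scores into probabilities and pick the maximum-probability relation."""
    probs = softmax(np.asarray(logits, dtype=float))
    best = int(np.argmax(probs))
    return RELATIONS[best], probs

label, probs = predict([0.3, 2.1, -0.5, 0.9])   # label == "pest-of"
```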
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may also be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the apparatuses, methods, and computer program products according to the embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for implementing the specified logical function.
It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the accompanying drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disk.
In conclusion a kind of entity relation extraction method provided by the embodiments of the present application, model training method and related dress It sets, when by learning to multiple training samples that target training sample set includes, from multiple dimensions to each training sample Adequately learnt, and the semantic association vector obtained after study is handled, obtains multi-dimensional semantic feature vector, thus Multi-dimensional semantic feature vector is handled again, the corresponding prediction entity relationship of the target training sample set is obtained, so that being based on The prediction entity relationship, semantic association vector and training entity relationship, update the model parameter of entity relation extraction model, compare In the prior art, entity relation extraction model is enabled to learn the semantic expressiveness to training sample under different dimensions, rather than Only semantic expressiveness of the learning training sample under single dimension, thus when making entity relation extraction, it can be in conjunction with the multiple of sample Dimension determines entity relationship, promotes the accuracy of entity relation extraction.
Also, the entity relation extraction model also completed using training carries out the pumping of entity relationship to Agricultural Information text It takes, so that the corresponding prediction entity relationship of Agricultural Information text is obtained, so that user can pass through the side of retrieval entity key Formula, auxiliary user quickly understand required agricultural knowledge.
The above are merely preferred embodiments of the present application and are not intended to limit the present application. For those skilled in the art, various modifications and changes may be made to the present application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included within the protection scope of the present application.
It is obvious to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that the present application can be implemented in other specific forms without departing from the spirit or essential characteristics of the present application. Therefore, from whatever point of view, the embodiments should be regarded as exemplary and non-restrictive, and the scope of the present application is defined by the appended claims rather than by the above description. It is therefore intended that all changes falling within the meaning and scope of equivalents of the claims be included in the present application. Any reference signs in the claims shall not be construed as limiting the claims concerned.

Claims (14)

1. An entity relation extraction model training method, characterized in that the method comprises:
obtaining a target training sample set and a training entity relationship corresponding to the target training sample set, wherein the target training sample set comprises multiple training samples;
vectorizing each training sample in the multiple training samples to obtain a feature embedding vector corresponding to each training sample;
extracting semantic information of each feature embedding vector in multiple dimensions to obtain a multi-dimensional semantic feature vector corresponding to each training sample, wherein the multi-dimensional semantic feature vector characterizes a semantic result obtained after learning a semantic association vector, and the semantic association vector characterizes a semantic result obtained after learning multiple dimensions of the training sample;
obtaining, according to all the multi-dimensional semantic feature vectors, a predicted entity relationship corresponding to the target training sample set; and
updating model parameters of the entity relation extraction model based on the predicted entity relationship, the semantic association vector, and the training entity relationship.
2. The method according to claim 1, characterized in that the step of extracting semantic information of each feature embedding vector in multiple dimensions to obtain the multi-dimensional semantic feature vector corresponding to each training sample comprises:
obtaining, according to a target feature embedding vector corresponding to a target training sample, an intermediate feature vector corresponding to the target training sample, wherein the intermediate feature vector characterizes a semantic result obtained after learning contextual information of the target training sample, and the target training sample is one of the multiple training samples included in the target training sample set;
performing learning processing of multiple dimensions on the intermediate feature vector to obtain a target semantic association vector corresponding to the target training sample; and
obtaining, according to the target semantic association vector and the intermediate feature vector, a target multi-dimensional semantic feature vector corresponding to the target training sample.
3. The method according to claim 2, characterized in that the intermediate feature vector is a matrix comprising N rows of sub-features, where N is an integer greater than 1; and
the step of performing learning processing of multiple dimensions on the intermediate feature vector to obtain the target semantic association vector corresponding to the target training sample comprises:
performing learning processing on each row of sub-features of the intermediate feature vector respectively, and merging the results obtained after all the rows of sub-features are respectively processed, to obtain the target semantic association vector.
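The row-by-row processing described above can be sketched as follows; the per-row tanh layer and the use of concatenation as the merge step are assumptions chosen for illustration, not details fixed by the claim:

```python
import numpy as np

rng = np.random.default_rng(1)

N, d, r = 4, 6, 3                    # N rows of sub-features (hypothetical sizes)
H = rng.normal(size=(N, d))          # intermediate feature matrix, one sub-feature per row
W = 0.1 * rng.normal(size=(r, d))

# Perform learning processing on each row of sub-features separately...
processed = [np.tanh(W @ H[i]) for i in range(N)]
# ...then merge all the per-row results into the target semantic association vector.
A = np.concatenate(processed)        # shape (N * r,)
```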
4. The method according to claim 1, characterized in that the step of updating the model parameters of the entity relation extraction model based on the predicted entity relationship, the semantic association vector, and the training entity relationship comprises:
obtaining, based on the predicted entity relationship, the semantic association vector, and the training entity relationship, a loss function value under each of the multiple dimensions; and
updating the model parameters of the entity relation extraction model according to the sum of the loss function values under all the dimensions.
5. The method according to claim 4, characterized in that the sum of the loss function values under all the dimensions satisfies the following formula:
L = −Σᵢ log p(yᵢ|xᵢ) + α‖AAᵀ − I‖²_F
where p(yᵢ|xᵢ) denotes the probability that the entity relationship of training sample xᵢ is yᵢ, ‖·‖_F denotes the Frobenius norm, α is a set adjustment coefficient, A denotes the semantic association vector, and I denotes the identity matrix.
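Numerically, the objective combines the negative log-likelihood of the labelled relations with the Frobenius penalty on A. A sketch with made-up probabilities (a batch of three samples and r = 3 dimensions, both chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(2)

# Per-sample probability assigned to the labelled relation (made-up values).
p_correct = np.array([0.7, 0.5, 0.9])

r, T = 3, 5
A = np.abs(rng.normal(size=(r, T)))
A = A / A.sum(axis=1, keepdims=True)          # row-stochastic, as after a softmax

alpha = 0.05
cross_entropy = -np.log(p_correct).sum()                    # -sum_i log p(y_i | x_i)
penalty = np.linalg.norm(A @ A.T - np.eye(r), "fro") ** 2   # ||A A^T - I||_F^2
loss = cross_entropy + alpha * penalty
```

The penalty is smallest when the rows of A attend to disjoint parts of the sample, which is what pushes the dimensions to capture different semantics.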
6. The method according to claim 1, characterized in that the step of obtaining, according to all the multi-dimensional semantic feature vectors, the predicted entity relationship corresponding to the target training sample set comprises:
processing all the multi-dimensional semantic feature vectors using an attention (Attention) mechanism to obtain an attention feature vector corresponding to the target training sample set; and
obtaining, based on the attention feature vector, the predicted entity relationship corresponding to the target training sample set.
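A minimal sketch of this aggregation step: one attention weight per training sample, a weighted sum as the attention feature vector, then classification. The query vector q and all sizes are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

n, d, n_rel = 5, 8, 4              # 5 samples in the target set (hypothetical sizes)
S = rng.normal(size=(n, d))        # one multi-dimensional semantic feature vector per sample
q = rng.normal(size=d)             # learned query vector (assumption)

w = softmax(S @ q)                 # one attention weight per training sample
bag = w @ S                        # attention feature vector for the whole sample set

Wc = 0.1 * rng.normal(size=(n_rel, d))
probs = softmax(Wc @ bag)          # predicted entity-relationship distribution
pred = int(np.argmax(probs))
```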
7. The method according to claim 1, characterized in that each training sample comprises two entities; and
the step of vectorizing each training sample in the multiple training samples to obtain the feature embedding vector corresponding to each training sample comprises:
obtaining a first entity and a second entity in a target training sample, wherein the target training sample is one of the multiple training samples included in the target training sample set;
obtaining a word vector corresponding to each character in the target training sample, and a first position embedding vector and a second position embedding vector corresponding to each character, wherein the first position embedding vector characterizes the relative position distance between the character and the first entity, and the second position embedding vector characterizes the relative position distance between the character and the second entity;
merging the word vector, the first position embedding vector, and the second position embedding vector corresponding to each character to obtain a word embedding vector corresponding to each character; and
merging the word embedding vectors corresponding to all the characters in the target training sample to obtain the feature embedding vector corresponding to the target training sample.
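The embedding construction above can be sketched as follows. The sentence, entity spans, table sizes, and the signed-distance scheme are hypothetical; only the structure (word vector plus two position embedding vectors per token, concatenated, then stacked) follows the described steps:

```python
import numpy as np

rng = np.random.default_rng(4)

sentence = "blast fungus infects rice leaves".split()   # hypothetical sample
e1_span, e2_span = (0, 2), (3, 4)                       # "blast fungus" and "rice"

d_word, d_pos, max_dist = 6, 2, 10
word_table = {w: rng.normal(size=d_word) for w in set(sentence)}
pos_table = rng.normal(size=(2 * max_dist + 1, d_pos))

def rel_dist(i, span):
    """Signed distance from token i to the entity span, clipped and shifted to a table index."""
    lo, hi = span
    d = 0 if lo <= i < hi else (i - lo if i < lo else i - hi + 1)
    return int(np.clip(d, -max_dist, max_dist)) + max_dist

# Merge word vector + first and second position embedding vectors per token,
# then merge all tokens into the sample's feature embedding.
rows = [np.concatenate([word_table[w],
                        pos_table[rel_dist(i, e1_span)],
                        pos_table[rel_dist(i, e2_span)]])
        for i, w in enumerate(sentence)]
features = np.stack(rows)          # shape: (len(sentence), d_word + 2 * d_pos)
```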
8. The method according to claim 1, characterized in that, before the step of obtaining the target training sample set and the training entity relationship corresponding to the target training sample set, the method further comprises:
obtaining multiple training sample sets in one-to-one correspondence with multiple entity relationships, wherein each training sample set comprises multiple training samples; and
taking each training sample set in the multiple training sample sets in turn as the target training sample set, so as to update the model parameters of the entity relation extraction model, until the entity relation extraction model meets a set convergence condition.
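The outer loop above, in which each relation's sample set takes a turn as the target set until a convergence condition holds, can be sketched with a toy stand-in for the model; the halving "update" and the tolerance are invented purely to make the loop runnable:

```python
class ToyModel:
    """Stand-in for the extraction model; step() pretends to update and returns a loss."""
    def __init__(self):
        self.w = 4.0
    def step(self, sample_set):
        self.w *= 0.5
        return self.w

def train(model, sample_sets, max_epochs=10, tol=1e-2):
    prev = float("inf")
    for _ in range(max_epochs):
        for target_set in sample_sets:      # each set in turn is the target sample set
            loss = model.step(target_set)   # update model parameters on that set
        if abs(prev - loss) < tol:          # set convergence condition
            break
        prev = loss
    return model

sets = [["s1", "s2"], ["s3"], ["s4", "s5"]]  # one sample set per entity relationship
m = train(ToyModel(), sets)
```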
9. An entity relation extraction method, characterized in that the method comprises:
receiving a sample to be predicted;
processing the sample to be predicted using an entity relation extraction model trained by the method according to claim 1, to obtain a prediction extraction result corresponding to the sample to be predicted, wherein the prediction extraction result comprises multiple entity relationships and a classification probability corresponding to each entity relationship in the multiple entity relationships; and
taking the entity relationship corresponding to the maximum classification probability as the predicted entity relationship corresponding to the sample to be predicted.
10. The method according to claim 9, characterized in that the sample to be predicted is an agricultural information text;
the prediction extraction result comprises multiple agricultural entity relationships and a classification probability corresponding to each agricultural entity relationship in the multiple agricultural entity relationships; and
the predicted entity relationship corresponding to the agricultural information text is the agricultural entity relationship corresponding to the maximum classification probability.
11. An entity relation extraction model training apparatus, characterized in that the apparatus comprises:
a first processing module, configured to obtain a target training sample set and a training entity relationship corresponding to the target training sample set, wherein the target training sample set comprises multiple training samples;
the first processing module is further configured to vectorize each training sample in the multiple training samples to obtain a feature embedding vector corresponding to each training sample;
the first processing module is further configured to extract semantic information of each feature embedding vector in multiple dimensions to obtain a multi-dimensional semantic feature vector corresponding to each training sample, wherein the multi-dimensional semantic feature vector characterizes a semantic result obtained after learning a semantic association vector, and the semantic association vector characterizes a semantic result obtained after learning multiple dimensions of the training sample;
the first processing module is further configured to obtain, according to all the multi-dimensional semantic feature vectors, a predicted entity relationship corresponding to the target training sample set; and
a parameter updating module, configured to update model parameters of the entity relation extraction model based on the predicted entity relationship, the semantic association vector, and the training entity relationship.
12. An entity relation extraction apparatus, characterized in that the apparatus comprises:
a receiving module, configured to receive a sample to be predicted; and
a second processing module, configured to process the sample to be predicted using an entity relation extraction model trained by the method according to claim 1, to obtain a prediction extraction result corresponding to the sample to be predicted, wherein the prediction extraction result comprises multiple entity relationships and a classification probability corresponding to each entity relationship in the multiple entity relationships;
wherein the second processing module is further configured to take the entity relationship corresponding to the maximum classification probability as the predicted entity relationship corresponding to the sample to be predicted.
13. An electronic device, characterized by comprising:
a memory, configured to store one or more programs; and
a processor;
wherein, when the one or more programs are executed by the processor, the method according to any one of claims 1-10 is implemented.
14. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1-10 is implemented.
CN201910645405.3A 2019-07-17 2019-07-17 Entity relationship extraction method, model training method and related device Active CN110413999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910645405.3A CN110413999B (en) 2019-07-17 2019-07-17 Entity relationship extraction method, model training method and related device

Publications (2)

Publication Number Publication Date
CN110413999A true CN110413999A (en) 2019-11-05
CN110413999B CN110413999B (en) 2020-10-16

Family

ID=68361826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910645405.3A Active CN110413999B (en) 2019-07-17 2019-07-17 Entity relationship extraction method, model training method and related device

Country Status (1)

Country Link
CN (1) CN110413999B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090319257A1 (en) * 2008-02-23 2009-12-24 Matthias Blume Translation of entity names
CN109522557A (en) * 2018-11-16 2019-03-26 中山大学 Training method, device and the readable storage medium storing program for executing of text Relation extraction model
CN109800435A (en) * 2019-01-29 2019-05-24 北京金山数字娱乐科技有限公司 A kind of training method and device of language model
CN109902145A (en) * 2019-01-18 2019-06-18 中国科学院信息工程研究所 A kind of entity relationship joint abstracting method and system based on attention mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DONG-RU RUAN ET AL.: "The Semantic Similarity Relation of Entities Discovery: Using Word Embedding", 《THE 9TH INTERNATIONAL CONFERENCE ON MODELLING, IDENTIFICATION AND CONTROL》 *
E HAIHONG ET AL.: "Survey of Entity Relationship Extraction Based on Deep Learning", 《JOURNAL OF SOFTWARE》 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825827B (en) * 2019-11-13 2022-10-25 北京明略软件系统有限公司 Entity relationship recognition model training method and device and entity relationship recognition method and device
CN110825827A (en) * 2019-11-13 2020-02-21 北京明略软件系统有限公司 Entity relationship recognition model training method and device and entity relationship recognition method and device
CN112989032A (en) * 2019-12-17 2021-06-18 医渡云(北京)技术有限公司 Entity relationship classification method, apparatus, medium and electronic device
CN111177383B (en) * 2019-12-24 2024-01-16 上海大学 Text entity relation automatic classification method integrating text grammar structure and semantic information
CN111177383A (en) * 2019-12-24 2020-05-19 上海大学 Text entity relation automatic classification method fusing text syntactic structure and semantic information
CN111241838A (en) * 2020-01-15 2020-06-05 北京百度网讯科技有限公司 Text entity semantic relation processing method, device and equipment
CN111241838B (en) * 2020-01-15 2023-10-31 北京百度网讯科技有限公司 Semantic relation processing method, device and equipment for text entity
CN111274412A (en) * 2020-01-22 2020-06-12 腾讯科技(深圳)有限公司 Information extraction method, information extraction model training device and storage medium
CN111651994A (en) * 2020-06-03 2020-09-11 浙江同花顺智能科技有限公司 Information extraction method and device, electronic equipment and storage medium
CN111651994B (en) * 2020-06-03 2023-09-19 浙江同花顺智能科技有限公司 Information extraction method and device, electronic equipment and storage medium
CN111814476A (en) * 2020-06-09 2020-10-23 北京捷通华声科技股份有限公司 Method and device for extracting entity relationship
CN111814476B (en) * 2020-06-09 2024-04-16 北京捷通华声科技股份有限公司 Entity relation extraction method and device
CN111737416B (en) * 2020-06-29 2022-08-19 重庆紫光华山智安科技有限公司 Case processing model training method, case text processing method and related device
CN111737416A (en) * 2020-06-29 2020-10-02 重庆紫光华山智安科技有限公司 Case processing model training method, case text processing method and related device
CN112860889A (en) * 2021-01-29 2021-05-28 太原理工大学 BERT-based multi-label classification method
CN113010690B (en) * 2021-03-29 2022-11-18 华南理工大学 Method for enhancing entity embedding based on text information
CN113010690A (en) * 2021-03-29 2021-06-22 华南理工大学 Method for enhancing entity embedding based on text information
CN113836943B (en) * 2021-11-25 2022-03-04 中国电子科技集团公司第二十八研究所 Relation extraction method and device based on semantic level
CN113836943A (en) * 2021-11-25 2021-12-24 中国电子科技集团公司第二十八研究所 Relation extraction method and device based on semantic level
CN114579755A (en) * 2022-01-26 2022-06-03 北京博瑞彤芸科技股份有限公司 Method and device for constructing traditional Chinese medicine knowledge map
CN114328978B (en) * 2022-03-10 2022-05-24 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Relationship extraction method, device, equipment and readable storage medium
CN114328978A (en) * 2022-03-10 2022-04-12 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Relationship extraction method, device, equipment and readable storage medium
CN115130621A (en) * 2022-08-31 2022-09-30 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN115130621B (en) * 2022-08-31 2022-12-27 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN117221839A (en) * 2023-11-09 2023-12-12 北京中科网芯科技有限公司 5G signaling identification method and system thereof
CN117221839B (en) * 2023-11-09 2024-01-16 北京中科网芯科技有限公司 5G signaling identification method and system thereof

Also Published As

Publication number Publication date
CN110413999B (en) 2020-10-16

Similar Documents

Publication Publication Date Title
CN110413999A (en) Entity relation extraction method, model training method and relevant apparatus
CN111339774B (en) Text entity relation extraction method and model training method
WO2020232861A1 (en) Named entity recognition method, electronic device and storage medium
CN105404632B (en) System and method for carrying out serialized annotation on biomedical text based on deep neural network
CN104462066B (en) Semantic character labeling method and device
CN105139237A (en) Information push method and apparatus
CN109767312B (en) Credit evaluation model training and evaluation method and device
CN110427493A (en) Electronic health record processing method, model training method and relevant apparatus
CN108108354B (en) Microblog user gender prediction method based on deep learning
Vaferi et al. Application of recurrent networks to classification of oil reservoir models in well-testing analysis
CN110413769A (en) Scene classification method, device, storage medium and its electronic equipment
CN111666766B (en) Data processing method, device and equipment
CN111881671B (en) Attribute word extraction method
CN111292195A (en) Risk account identification method and device
CN112801762B (en) Multi-mode video highlight detection method and system based on commodity perception
US20210319280A1 (en) Interpretable node embedding
CN111710428A (en) Biomedical text representation method for modeling global and local context interaction
JP7081454B2 (en) Processing equipment, processing method, and processing program
Naved et al. IoT-Enabled Convolutional Neural Networks: Techniques and Applications
CN112131884B (en) Method and device for entity classification, method and device for entity presentation
CN113761188A (en) Text label determination method and device, computer equipment and storage medium
CN110705279A (en) Vocabulary selection method and device and computer readable storage medium
CN116258145B (en) Multi-mode named entity recognition method, device, equipment and storage medium
Shi et al. Boosting sparsity-induced autoencoder: A novel sparse feature ensemble learning for image classification
WO2023272563A1 (en) Intelligent triage method and apparatus, and storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant