CN110210035A - Sequence labeling method and apparatus, and training method of a sequence labeling model - Google Patents

Sequence labeling method and apparatus, and training method of a sequence labeling model

Info

Publication number
CN110210035A
CN110210035A
Authority
CN
China
Prior art keywords
label
sequence
score
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910481021.2A
Other languages
Chinese (zh)
Other versions
CN110210035B (en)
Inventor
李正华
黄德朋
张民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN201910481021.2A priority Critical patent/CN110210035B/en
Publication of CN110210035A publication Critical patent/CN110210035A/en
Application granted granted Critical
Publication of CN110210035B publication Critical patent/CN110210035B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses a sequence labeling method and apparatus, a training method and device for a sequence labeling model, and a computer-readable storage medium. In this scheme, the score layer of the sequence labeling model contains second score layers in one-to-one correspondence with the annotation guidelines, as well as a first score layer corresponding to all guidelines jointly. Owing to this design of the score layer, heterogeneous data following multiple annotation guidelines can be used as the training set of the model, which enlarges the training corpus, and the model can learn what the corpora of different guidelines have in common, improving its labeling performance under any single guideline. In addition, the model outputs a bundled label sequence, which amounts to obtaining the label sequences under all guidelines at once and makes it easy to convert text between different annotation guidelines.

Description

Sequence labeling method and apparatus, and training method of a sequence labeling model
Technical field
This application relates to the field of natural language processing, and in particular to a sequence labeling method and apparatus, a training method and device for a sequence labeling model, and a computer-readable storage medium.
Background technique
In natural language processing tasks, annotated data is usually required as training samples for a natural language processing model, and the scale of the annotated data strongly affects the model's performance. Because manually annotated data is very expensive to build, researchers have proposed enlarging the data scale with heterogeneous data resources. However, heterogeneous data sets follow different annotation guidelines, so they cannot simply be mixed together. How to use heterogeneous data effectively to improve model performance has therefore become a research question.
One current scheme that uses heterogeneous data to improve model performance generates additional features from one data resource inside another, similar to stacked learning. Taking CTB and PKU as an example, the model parameters are first trained on the CTB corpus alone; features from the PKU corpus are then added and training continues on the PKU corpus. However, because the two corpora were built for different research purposes and follow different part-of-speech annotation guidelines, this introduces noise into the model and fails to improve its performance.
As it can be seen that the scheme that the existing data using different labeled specification are trained model, the program is made an uproar in the presence of introducing The problem of sound, cannot achieve the purpose of lift scheme mark performance.
Summary of the invention
The purpose of this application is to provide a sequence labeling method and apparatus, a training method and device for a sequence labeling model, and a computer-readable storage medium, so as to solve the problem that existing schemes which train a model on data following different annotation guidelines fail to improve its labeling performance. The concrete scheme is as follows:
In a first aspect, this application provides a sequence labeling apparatus that includes a sequence labeling model, the sequence labeling model comprising:
Input layer: for obtaining the text to be labeled;
Representation layer: for determining the vector representation of each word of the text to be labeled, and sending the vector representation to the first score layer and to each of the second score layers;
First score layer: for determining, from the vector representation, the raw score of each bundled label in the bundled tag set;
Second score layer: for determining, from the vector representation, the score of each separate label in the tag set of the corresponding annotation guideline, wherein the second score layers correspond one-to-one with the annotation guidelines, and a bundled label is a combination containing one single separate label from each of the annotation guidelines;
Prediction layer: for determining the final score of each bundled label from its raw score and the scores of the separate labels corresponding to the bundled label, and determining the target bundled label of the word according to the final scores of the bundled labels;
Output layer: for outputting the target label sequence of the text to be labeled, the target label sequence consisting of the target bundled label of each word of the text to be labeled.
Preferably, the representation layer includes:
First encoding unit: for encoding each word of the text to be labeled to obtain the first vector of the word;
Second encoding unit: for encoding the characters of the word with a first bidirectional recurrent neural network to obtain the second vector of the word;
Representation unit: for determining the vector representation of the word from the first vector and the second vector, and sending the vector representation to the first score layer and to each of the second score layers.
Preferably, the representation unit is specifically configured to:
determine the vector representation of the word from the first vector and the second vector; encode the vector representation with a second bidirectional recurrent neural network to obtain global information; and send the global information to the first score layer and to each of the second score layers.
Preferably, the prediction layer is specifically configured to:
determine the final score of each bundled label from its raw score and the scores of the separate labels corresponding to it; normalize the final scores of the bundled labels with a softmax function to obtain the probability of each bundled label; and determine the target bundled label of the word according to these probabilities.
Preferably, the model further includes:
Loss layer: for determining, during training, a loss value from a target loss function, the predicted labeling result and the actual labeling result, so that the model parameters can be adjusted, wherein the target loss function is:

$Loss = -\log \sum_{j=1}^{k} y_j$

where k is the number of correct labels and $y_j$ is the probability of the j-th correct label output by the prediction layer.
In a second aspect, this application provides a training method for a sequence labeling model, applied to the sequence labeling model of the sequence labeling apparatus described above, comprising:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence;
inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model;
adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
In a third aspect, this application provides a training device for a sequence labeling model, applied to the sequence labeling model of the sequence labeling apparatus described above, comprising:
Memory: for storing a computer program;
Processor: for executing the computer program to perform the following steps:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence; inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model; and adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
In a fourth aspect, this application provides a computer-readable storage medium, applied to the sequence labeling model of the sequence labeling apparatus described above, on which a computer program is stored; when executed by a processor, the computer program implements the following steps:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence; inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model; and adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
In a fifth aspect, this application provides a sequence labeling method, comprising:
obtaining the text to be labeled;
determining the vector representation of each word of the text to be labeled;
determining, from the vector representation, the score of each separate label in the tag sets of the multiple annotation guidelines, and the raw score of each bundled label in the bundled tag set, a bundled label being a combination containing one single separate label from each of the annotation guidelines;
determining the final score of each bundled label from its raw score and the scores of the separate labels corresponding to it;
determining the target bundled label of the word according to the final scores of the bundled labels, so as to obtain the target label sequence of the text to be labeled.
This application thus provides a sequence labeling method and apparatus, a training method and device for a sequence labeling model, and a computer-readable storage medium. The scheme obtains the text to be labeled and determines the vector representation of each of its words; from the vector representation it determines the score of each separate label in the tag sets of the multiple annotation guidelines and the raw score of each bundled label in the bundled tag set; it then determines the final score of each bundled label from its raw score and the scores of the separate labels corresponding to it; finally it determines the target bundled label of each word from the final scores, obtaining the target label sequence of the text to be labeled.
As can be seen, in this scheme the score layer of the sequence labeling model contains score layers in one-to-one correspondence with the annotation guidelines, as well as a score layer corresponding to all guidelines jointly. Owing to this design, heterogeneous data following multiple annotation guidelines can serve as the training set of the model, which enlarges the training corpus, and the model can learn what the corpora of different guidelines have in common, improving its labeling performance under any single guideline. In addition, the model outputs a bundled label sequence, which amounts to obtaining the label sequences under all guidelines at once and makes it easy to convert text between different annotation guidelines.
Detailed description of the invention
To explain the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed in their description are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a functional block diagram of embodiment one of a sequence labeling apparatus provided by this application;
Fig. 2 is a schematic diagram of bundled labels in embodiment one of a sequence labeling apparatus provided by this application;
Fig. 3 is a schematic diagram of the vector representation of a word in embodiment two of a sequence labeling apparatus provided by this application;
Fig. 4 is an implementation flowchart of an embodiment of the training method of a sequence labeling model provided by this application;
Fig. 5 is a structural schematic diagram of an embodiment of the training device of a sequence labeling model provided by this application;
Fig. 6 is a flow diagram of an embodiment of the sequence labeling method provided by this application.
Specific embodiment
To help those skilled in the art better understand the scheme of this application, the application is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only a part of the embodiments of this application rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of this application.
At present, the scale of annotated data is often enlarged with heterogeneous data in order to improve model performance; however, existing schemes that use heterogeneous data introduce noise into the model and fail to improve its performance. In view of this problem, this application provides a sequence labeling method and apparatus, a training method and device for a sequence labeling model, and a computer-readable storage medium, built around a specially designed sequence labeling model that effectively improves model performance.
Embodiment one of the sequence labeling apparatus provided by this application is introduced below; embodiment one includes a sequence labeling model. Note that this embodiment uses a deep neural network as the sequence labeling model, avoiding the drawbacks of traditional feature-engineering models, such as a cumbersome feature-extraction process and the difficulty of guaranteeing reasonable feature templates. As a specific implementation, this embodiment chooses a BiLSTM (Bidirectional Long Short-Term Memory) network as the base model.
Referring to Fig. 1, the sequence labeling model specifically includes:
Input layer 101: for obtaining the text to be labeled;
Representation layer 102: for determining the vector representation of each word of the text to be labeled, and sending the vector representation to the first score layer 103 and to each of the second score layers 104.
In this embodiment the sequence labeling model can be used for processing such as named entity recognition, word segmentation and part-of-speech tagging of the text to be labeled. Since the focus of this embodiment is the process of assigning a label to each word of the text, operations such as named entity recognition and word segmentation are not described in detail. Before the text enters the score layers, each of its words must be converted into a vector representation. As a specific implementation, this embodiment obtains word embedding vectors by pre-training, that is, embeddings already trained by other models are loaded directly at initialization; for an unregistered word not found in the pre-training vocabulary, an embedding vector is generated at random to represent the current word.
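By way of illustration only, and not as part of the claimed apparatus, the following Python sketch shows this initialization strategy; the function name and the shape of the pretrained lookup table are assumptions of this description:

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, dim=100, seed=0):
    """Copy pre-trained vectors when a word is in the pre-training
    vocabulary; generate a random vector for each unregistered word."""
    rng = np.random.RandomState(seed)
    matrix = np.zeros((len(vocab), dim), dtype=np.float32)
    for idx, word in enumerate(vocab):
        if word in pretrained:                 # word covered by pre-training
            matrix[idx] = pretrained[word]
        else:                                  # unregistered word
            matrix[idx] = rng.uniform(-0.1, 0.1, size=dim)
    return matrix
```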
First score layer 103: for determining, from the vector representation, the raw score of each bundled label in the bundled tag set.
Second score layer 104: for determining, from the vector representation, the score of each separate label in the tag set of the corresponding annotation guideline, wherein the second score layers correspond one-to-one with the annotation guidelines, and a bundled label is a combination containing one single separate label from each of the annotation guidelines.
The annotation guidelines mentioned above are the rules and basis by which each word of a sentence is annotated, concretely in the form of labels assigned to words; well-known guidelines include CTB, PKU and MSR. Taking the two guidelines CTB and PKU as an example, for one and the same sentence (glossed word by word as "especially / China's / economy / declines / ."), the sequence obtained under the CTB guideline is shown in Table 1 and the sequence obtained under the PKU guideline is shown in Table 2. As can be seen, different guidelines yield different labeling results for the same text, i.e. different label sequences. The purpose of this embodiment is to use heterogeneous data to improve labeling performance, so it is built on multiple annotation guidelines, more specifically on two or more; which guidelines are chosen can be decided according to actual demand and is not limited in this embodiment.
Table 1
Word in the text (gloss):            especially | I | state | economy | declines | .
Annotation under the CTB guideline:  AD | PN | NN | NN | VV | PU

Table 2
Word in the text (gloss):            especially | is | China | economy | declines | .
Annotation under the PKU guideline:  d | v | n | n | v | w

(The word rows are word-by-word English glosses of the Chinese example sentence; the two guidelines segment and tag the same sentence differently, and the final column is the sentence-final punctuation.)
It should be noted that this embodiment bundles the labels of the multiple annotation guidelines into combinations that each contain one label from every guideline. For convenience of description, such a combination is called a bundled label, and a label within a single guideline is called a separate label; the construction of bundled labels is shown in Fig. 2. This embodiment models over the enlarged bundled tag set: by considering all possible bundled labels, the mapping from single separate labels is converted into a set of bundled labels.
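The bundled tag set is simply the Cartesian product of the guidelines' tag sets. A minimal illustrative sketch, using toy tag subsets rather than the full CTB and PKU inventories:

```python
from itertools import product

ctb_tags = ["AD", "PN", "NN", "VV", "PU"]   # separate labels of guideline 1 (toy subset)
pku_tags = ["d", "v", "n", "w"]             # separate labels of guideline 2 (toy subset)

# Each bundled label holds exactly one separate label from every guideline.
bundled_tags = list(product(ctb_tags, pku_tags))
print(len(bundled_tags))     # 5 * 4 = 20 bundled labels, e.g. ("NN", "n")
```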
The terms first score layer 103 and second score layer 104 merely distinguish the two kinds of score layers and imply neither quantity nor order. As shown in Fig. 1, the sequence labeling model of this embodiment has N+1 score layers in total: one first score layer and multiple second score layers. The first score layer determines, from the vector representation of a word, the raw score of each bundled label in the bundled tag set; the second score layers correspond one-to-one with the annotation guidelines, each determining, from the vector representation of a word, the score of each separate label in the tag set of its guideline. As a specific implementation, the first score layer and the second score layers can be distinct MLP (multilayer perceptron) layers.
Prediction layer 105: for determining the final score of each bundled label from its raw score and the scores of the separate labels corresponding to the bundled label, and determining the target bundled label of the word according to the final scores of the bundled labels.
As a specific implementation, this embodiment sums the raw score of a bundled label and the scores of the separate labels corresponding to it, takes the summed result as the final score of the bundled label, and finally determines the target bundled label of the word by comparing the final scores of the bundled labels.
Output layer 106: for outputting the target label sequence of the text to be labeled, the target label sequence consisting of the target bundled label of each word of the text to be labeled.
It should be noted that, owing to the design of its score layers, the sequence labeling model of this embodiment can take the corpora of multiple annotation guidelines as its training set, enlarging the data scale, and can learn what the corpora of different guidelines have in common, thereby improving labeling performance under any single guideline. In other words, the model realizes the labeling scheme of every one of the guidelines, and its labeling performance under each of them is improved. Concretely, at test time the model outputs the bundled label sequence of the text to be labeled, which contains the label sequences of all the guidelines; the label sequence under any single guideline is obtained by simply splitting the bundled label sequence. The sequence labeling model of this embodiment therefore does not require the expected annotation guideline of the input text to be specified in advance.
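Splitting a bundled sequence back into per-guideline sequences is a trivial projection; a sketch of the idea (the function name is assumed for illustration):

```python
def split_bundled_sequence(bundled_seq):
    """bundled_seq: one bundled label (a tuple with one tag per guideline)
    for each word; returns one label sequence per guideline."""
    n_guidelines = len(bundled_seq[0])
    return [[bundle[g] for bundle in bundled_seq] for g in range(n_guidelines)]

# [("AD", "d"), ("NN", "n")]  ->  [["AD", "NN"], ["d", "n"]]
```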
This embodiment provides a sequence labeling apparatus including a sequence labeling model whose score layer contains score layers in one-to-one correspondence with the annotation guidelines, as well as a score layer corresponding to all guidelines jointly. Owing to this design, heterogeneous data following multiple annotation guidelines can serve as the training set of the model, enlarging the training corpus, and the model can learn what the corpora of different guidelines have in common, improving its labeling performance under any single guideline. In addition, the model outputs a bundled label sequence, which amounts to obtaining the label sequences under all guidelines at once and makes it easy to convert text between different annotation guidelines.
Embodiment two of the sequence labeling apparatus provided by this application is described in detail below. Embodiment two is based on embodiment one and expands it to a certain extent.
Specifically, the sequence labeling apparatus of embodiment two includes a sequence labeling model comprising an input layer, a representation layer, an encoding layer, a first MLP layer, multiple second MLP layers, a prediction layer, an output layer and a loss layer. Each layer is introduced in turn below:
Input layer: for obtaining the text to be labeled.
Representation layer: for determining the vector representation of each word of the text to be labeled.
When converting a word into a vector, traditional schemes generally take the word's embedding vector directly as its representation. To let the vector representation of a word express the text information more fully, as a preferred implementation and as shown in Fig. 3, this embodiment builds the representation of a word jointly from a first vector and a second vector, which the representation layer determines separately. The first vector is the word embedding, obtained by pre-training (or by random initialization for unregistered words). For the second vector, the character vectors of the word's characters are first obtained by random initialization; then, as shown in Fig. 3, all character vectors are fed into one BiLSTM layer, and the last outputs of the two directions are taken and spliced together, giving the second vector. Because the output at the last character has learned information from the other characters, taking it as the second vector expresses the text information more fully. Finally, this embodiment splices the first vector and the second vector and takes the spliced result as the vector representation of the word.
Specifically, given a text to be labeled $S = \{w_1, w_2, \ldots, w_n\}$, where $w_i$ is the i-th word of the text and n is the number of words in the text to be labeled, each word is $w_i = \{c_{i1}, c_{i2}, \ldots, c_{im}\}$, where $c_{ij}$ is the j-th character of word $w_i$ and m is the number of characters in the word. All characters of $w_i$ are fed into the BiLSTM, and the outputs $h_{lm}$ and $h_{rm}$ at the last character are spliced behind the word vector of $w_i$, giving the vector representation $X_i$ of $w_i$:

$X_i = [e(w_i); h_{lm}; h_{rm}] \quad (1)$

where $e(w_i)$ is the word embedding (first vector) of $w_i$.
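A sketch of this representation in PyTorch follows; it is illustrative only, with dimensions and names assumed by this description, while the structure (word embedding spliced with the last forward and backward character states) is the one defined above:

```python
import torch
import torch.nn as nn

class WordRepresentation(nn.Module):
    """X_i = [word embedding ; h_lm ; h_rm], where h_lm and h_rm are the
    last hidden states of the forward and backward character LSTMs."""
    def __init__(self, n_words, n_chars, word_dim=100, char_dim=50, char_hidden=50):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)      # random init
        self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                 batch_first=True, bidirectional=True)

    def forward(self, word_ids, char_ids):
        # word_ids: (n,) word indices; char_ids: (n, m) character indices
        _, (h_n, _) = self.char_lstm(self.char_emb(char_ids))
        char_part = torch.cat([h_n[0], h_n[1]], dim=-1)      # splice h_lm and h_rm
        return torch.cat([self.word_emb(word_ids), char_part], dim=-1)
```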
In summary, the representation layer in this embodiment specifically includes:
First encoding unit: for encoding each word of the text to be labeled to obtain the first vector of the word;
Second encoding unit: for encoding the characters of the word with a first bidirectional recurrent neural network to obtain the second vector of the word;
Representation unit: for determining the vector representation of the word from the first vector and the second vector.
Encoding layer: encodes the vector representations of the words with a second bidirectional recurrent neural network to obtain global information, and sends the global information to the first MLP layer and to each of the second MLP layers.
Specifically, the encoding layer uses a BiLSTM to encode sentence-level information. This embodiment takes the output $X_i$ of the representation layer as the input of the LSTM and encodes the whole sentence sequence to obtain the global information $h_i$ of word $w_i$. The formulas involved are:

$i_i = \sigma(W_{in} \cdot [h_{i-1}, x_i] + b_{in}) \quad (2)$
$f_i = \sigma(W_{fg} \cdot [h_{i-1}, x_i] + b_{fg}) \quad (3)$
$o_i = \sigma(W_{out} \cdot [h_{i-1}, x_i] + b_{out}) \quad (4)$
$c_i = f_i \cdot c_{i-1} + i_i \cdot \tanh(W_c \cdot [h_{i-1}, x_i] + b_c) \quad (5)$
$h_i = o_i \cdot \tanh(c_i) \quad (6)$

where $i_i$, $f_i$, $o_i$ and $c_i$ are the outputs of the input gate, forget gate, output gate and cell state for the i-th word, $x_i$ and $h_i$ are the input and hidden output for the i-th word, $\sigma$ is the sigmoid activation function, and W and b are the weight and bias of the corresponding gate.
The hidden state of a single LSTM only carries information from the past. To encode the sentence information in both directions, this embodiment splices the hidden outputs of the forward and backward LSTMs, obtaining the BiLSTM hidden state $h_i$ of word $w_i$:

$h_i = [\overrightarrow{h_i}; \overleftarrow{h_i}] \quad (7)$
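A matching sketch of the encoding layer (again illustrative; the input size 200 matches the word representation sketched above, the hidden size is an assumption, and PyTorch's bidirectional LSTM performs the splicing of equation (7) internally):

```python
import torch.nn as nn

sentence_lstm = nn.LSTM(input_size=200, hidden_size=150,
                        batch_first=True, bidirectional=True)

def encode_sentence(x):
    """x: (1, sentence_length, 200) word representations X_1..X_n.
    Returns h: (1, sentence_length, 300), the global information h_i of
    every word, i.e. forward and backward states spliced together."""
    h, _ = sentence_lstm(x)
    return h
```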
First MLP layer: for determining, from the vector representation, the raw score of each bundled label in the bundled tag set.
Second MLP layer: for determining, from the vector representation, the score of each separate label in the tag set of the corresponding annotation guideline.
Specifically, in this embodiment the score layers compute the score of each label with MLPs; the sequence labeling model has N+1 MLP layers in total: N second MLP layers that determine the scores of the separate labels of the N annotation guidelines respectively, plus a first MLP layer that determines the scores of the bundled labels. Concretely, the output $h_i$ of the BiLSTM is taken as the input of the MLP to obtain the score $P_i$ of each label for each word of the sentence:

$P_i = W_{mlp} \cdot h_i + b_{mlp} \quad (8)$

where $W_{mlp}$ and $b_{mlp}$ are the weight and bias of the MLP layer.
Prediction layer: for determining the final score of each bundled label from its raw score and the scores of the separate labels corresponding to the bundled label, and determining the target bundled label of the word according to the final scores of the bundled labels.
Specifically, according to the coupling mapping between bundled labels and separate labels, this embodiment adds the raw score of a bundled label to the scores of the N separate labels corresponding to it, obtaining the final score of the bundled label. Taking N = 2 as an example, the score of labeling the i-th word of sentence S with the bundled label $[t_a, t_b]$ is:

$Score(S, i, [t_a, t_b]) = Score_{joint}(S, i, [t_a, t_b]) + Score_{sep\_a}(S, i, t_a) + Score_{sep\_b}(S, i, t_b) \quad (9)$

where $Score_{joint}(S, i, [t_a, t_b])$ is the raw score of labeling the i-th word of sentence S with the joint label $[t_a, t_b]$, $Score_{sep\_a}(S, i, t_a)$ is the score of the separate label $t_a$ in the tag set of the first annotation guideline, and $Score_{sep\_b}(S, i, t_b)$ is the score of the separate label $t_b$ in the tag set of the second guideline.
After the final scores of the bundled labels are obtained, as a specific implementation, this embodiment normalizes the computed scores of all bundled labels with the softmax function to obtain the probability of each bundled label, and predicts the target bundled label of each word accordingly:

$p_i = \exp(Score_i) \,/\, \sum_{j=1}^{n} \exp(Score_j) \quad (10)$

where $p_i$ is the normalized probability of the i-th bundled label in the bundled tag set, $Score_i$ is the score of the i-th bundled label, and n is the number of bundled labels in the bundled tag set.
In summary, the prediction layer of this embodiment is specifically configured to: determine the final score of each bundled label from its raw score and the scores of the separate labels corresponding to it; normalize the final scores with the softmax function to obtain the probability of each bundled label; and determine the target bundled label of the word according to these probabilities.
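A sketch of equations (9) and (10) for N = 2 follows; it is illustrative only, tensor names are assumptions of this description, and tags are represented as integer indices into the per-guideline score vectors:

```python
import torch

def predict_word(joint_scores, scores_a, scores_b, bundles):
    """joint_scores[i]: raw score of the i-th bundled label (first MLP layer);
    scores_a / scores_b: separate-label scores of the two guidelines;
    bundles[i] = (t_a, t_b): the separate labels inside bundled label i."""
    final = torch.stack([joint_scores[i] + scores_a[t_a] + scores_b[t_b]
                         for i, (t_a, t_b) in enumerate(bundles)])   # eq. (9)
    probs = torch.softmax(final, dim=0)                              # eq. (10)
    return probs, int(probs.argmax())    # probabilities and target bundle
```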
Output layer: for outputting the target label sequence of the text to be labeled, the target label sequence consisting of the target bundled label of each word of the text to be labeled.
Loss layer: for determining, during training, a loss value from a target loss function, the predicted labeling result and the actual labeling result, so that the model parameters can be adjusted.
In this field, models generally use the cross-entropy function as the objective function for parameter estimation; the objective is minimized to solve for the parameters and to assess model performance. The cross-entropy function is:

$Loss = -\sum_{i} y_i \log \hat{y}_i \quad (11)$

where $y_i$ is the correct label probability distribution and $\hat{y}_i$ is the label probability distribution after score normalization. Loss is the discrepancy between the gold-standard result and the model prediction and is back-propagated for parameter estimation, so the purpose of model training is to minimize Loss.
On this basis, this embodiment takes into account that each word has more than one correct part of speech. Suppose annotation guideline 1 has $|T_1|$ labels, guideline 2 has $|T_2|$ labels, ..., and guideline N has $|T_N|$ labels. Then a word annotated under guideline 1 has $|T_2| \times \ldots \times |T_N|$ correct bundled labels, a word annotated under guideline 2 has $|T_1| \times |T_3| \times \ldots \times |T_N|$, and so on; a word annotated under guideline N has $|T_1| \times \ldots \times |T_{N-1}|$ correct bundled labels. Therefore, as a preferred implementation, this embodiment proposes an improved objective function, specifically:

$Loss = -\log \sum_{j=1}^{k} y_j \quad (12)$

where k is the number of correct labels and $y_j$ is the softmax-normalized probability of the j-th correct label.
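A sketch of the improved objective of equation (12); here probs is assumed to be the softmax output of the prediction layer and correct_ids is assumed to index the k bundled labels consistent with the word's gold annotation:

```python
import torch

def multi_gold_loss(probs, correct_ids):
    """Equation (12): sum the probabilities of all k correct bundled
    labels for a word, then take the negative logarithm."""
    return -torch.log(probs[correct_ids].sum())

# A word annotated only under guideline 1 with tag "NN" has k = |T2|
# correct bundles: every ("NN", t) for t in the guideline-2 tag set.
```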
To demonstrate the performance gain of the sequence labeling model of this embodiment, it is compared with an existing model below:
Suppose an existing practical application aims to improve a model's labeling performance under the CTB guideline, and suppose this embodiment uses two annotation guidelines, CTB and PKU. On the experimental-data side, the data-set configuration of the existing model is shown in Table 3 and its input and output in Table 4. As can be seen, the existing model can only take the corpus of a single guideline as its training set, its input considers only word vectors, and it can only output the label sequence of a single guideline. Referring to Tables 5, 6 and 7, the sequence labeling model of this embodiment can take the corpora of multiple guidelines together as its training set, enlarging the data scale; its input considers both word vectors and character vectors, and the better vector representation raises the model's ability to learn from text, improving performance; and the model can output a bundled label sequence, i.e. it directly yields the label sequences under all guidelines, which makes conversion of text between guidelines simple and efficient.
Table 3 (data-set configuration of the existing model; the table content is not reproduced in this text)

Table 4
Input of the existing model (gloss):  especially | I | state | economy | declines | .
Output label of the existing model:   AD | PN | NN | NN | VV | PU

Table 5, Table 6 and Table 7 (data-set configuration, input and output of the model of this embodiment; the table contents are not reproduced in this text)
In conclusion a kind of sequence labelling device provided in this embodiment, passes through the score layer to sequence labelling model It improves, realizes using extensive isomery authority data and promote the purpose of sequence labelling performance.In addition, model directly export it is more The sequence label of kind Marking Guidelines, not only improves the accuracy rate of the mark under single Marking Guidelines, but also facilitate text not With the conversion between Marking Guidelines.
A training method for a sequence labeling model provided by an embodiment of this application is introduced below; it is applied to the sequence labeling model of the sequence labeling apparatus described above.
As shown in Fig. 4, the training method of the sequence labeling model includes:
Step S401: obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence;
Step S402: inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model;
Step S403: adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
Specifically, the parameter adjustment can be an automatic process. The preset termination condition under which training completes can be that the number of iterations reaches a preset maximum, or that the model's performance no longer improves as expected after a certain number of iterations; it is determined according to actual demand and is not limited in this embodiment.
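The procedure of steps S401 to S403 can be sketched as an ordinary training loop, reusing multi_gold_loss from the sketch above; all names are assumptions, the model is assumed to return one probability vector per word, and the fixed epoch count is one of the termination conditions mentioned above:

```python
def train(model, samples, optimizer, max_epochs=50):
    """samples mixes sentences from all annotation guidelines; each item is
    (text, correct_ids_per_word), where correct_ids_per_word lists, for each
    word, the bundled labels consistent with its gold annotation."""
    for epoch in range(max_epochs):              # preset termination condition
        for text, correct_ids_per_word in samples:
            optimizer.zero_grad()
            probs_per_word = model(text)         # predicted label distributions
            loss = sum(multi_gold_loss(p, ids)
                       for p, ids in zip(probs_per_word, correct_ids_per_word))
            loss.backward()                      # adjust the model parameters
            optimizer.step()
```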
A training device for a sequence labeling model provided by an embodiment of this application is introduced below; it is applied to the sequence labeling model of the sequence labeling apparatus described above.
As shown in Fig. 5, the training device of the sequence labeling model includes:
Memory 501: for storing a computer program;
Processor 502: for executing the computer program to perform the following steps:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence; inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model; and adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
A computer-readable storage medium provided by an embodiment of this application is introduced below; it is applied to the sequence labeling model of the sequence labeling apparatus described above.
Specifically, a computer program is stored on the computer-readable storage medium; when executed by a processor, the computer program implements the following steps:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence; inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model; and adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
A sequence labeling method provided by an embodiment of this application is introduced below. As shown in Fig. 6, the sequence labeling method includes:
Step S601: obtaining the text to be labeled;
Step S602: determining the vector representation of each word of the text to be labeled;
Step S603: determining, from the vector representation, the score of each separate label in the tag sets of the multiple annotation guidelines, and the raw score of each bundled label in the bundled tag set, a bundled label being a combination containing one single separate label from each of the annotation guidelines;
Step S604: determining the final score of each bundled label from its raw score and the scores of the separate labels corresponding to it;
Step S605: determining the target bundled label of the word according to the final scores of the bundled labels, so as to obtain the target label sequence of the text to be labeled.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Since the apparatus disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively simple; for details, see the description of the method.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The scheme provided by this application has been described in detail above. Specific examples are used herein to explain its principles and implementation, and the above embodiments are only intended to help in understanding the method of this application and its core idea. Meanwhile, for those of ordinary skill in the art, changes can be made to the specific implementation and the scope of application according to the idea of this application. In conclusion, the contents of this specification should not be construed as limiting this application.

Claims (9)

1. A sequence labeling apparatus, characterized by including a sequence labeling model, the sequence labeling model comprising:
an input layer, for obtaining the text to be labeled;
a representation layer, for determining the vector representation of each word of the text to be labeled and sending the vector representation to the first score layer and to each of the second score layers;
a first score layer, for determining, from the vector representation, the raw score of each bundled label in the bundled tag set;
second score layers, for determining, from the vector representation, the score of each separate label in the tag set of the corresponding annotation guideline, wherein the second score layers correspond one-to-one with the annotation guidelines, and a bundled label is a combination containing one single separate label from each of the annotation guidelines;
a prediction layer, for determining the final score of each bundled label from its raw score and the scores of the separate labels corresponding to the bundled label, and determining the target bundled label of the word according to the final scores of the bundled labels;
an output layer, for outputting the target label sequence of the text to be labeled, the target label sequence consisting of the target bundled label of each word of the text to be labeled.
2. The sequence labeling apparatus of claim 1, characterized in that the representation layer includes:
a first encoding unit, for encoding each word of the text to be labeled to obtain the first vector of the word;
a second encoding unit, for encoding the characters of the word with a first bidirectional recurrent neural network to obtain the second vector of the word;
a representation unit, for determining the vector representation of the word from the first vector and the second vector, and sending the vector representation to the first score layer and to each of the second score layers.
3. The sequence labeling apparatus of claim 2, characterized in that the representation unit is specifically configured to:
determine the vector representation of the word from the first vector and the second vector; encode the vector representation with a second bidirectional recurrent neural network to obtain global information; and send the global information to the first score layer and to each of the second score layers.
4. The sequence labeling apparatus of claim 1, characterized in that the prediction layer is specifically configured to:
determine the final score of each bundled label from its raw score and the scores of the separate labels corresponding to it; normalize the final scores of the bundled labels with a softmax function to obtain the probability of each bundled label; and determine the target bundled label of the word according to these probabilities.
5. The sequence labeling apparatus of claim 4, characterized by further including:
a loss layer, for determining, during training, a loss value from a target loss function, the predicted labeling result and the actual labeling result, so that the model parameters can be adjusted, wherein the target loss function is:

$Loss = -\log \sum_{j=1}^{k} y_j$

where k is the number of correct labels and $y_j$ is the probability of the j-th correct label output by the prediction layer.
6. A training method for a sequence labeling model, characterized by being applied to the sequence labeling model of the sequence labeling apparatus of any one of claims 1-5, comprising:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence;
inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model;
adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
7. A training device for a sequence labeling model, characterized by being applied to the sequence labeling model of the sequence labeling apparatus of any one of claims 1-5, comprising:
a memory, for storing a computer program;
a processor, for executing the computer program to perform the following steps:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence; inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model; and adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
8. A computer-readable storage medium, characterized by being applied to the sequence labeling model of the sequence labeling apparatus of any one of claims 1-5, the computer-readable storage medium storing a computer program which, when executed by a processor, implements the following steps:
obtaining training samples under multiple annotation guidelines, each training sample including a training text and its actual label sequence; inputting the training samples into the sequence labeling model to obtain the predicted label sequence output by the model; and adjusting the parameters of the sequence labeling model according to the predicted label sequence and the actual label sequence until a preset termination condition is reached, thereby training the sequence labeling model.
9. A sequence labeling method, characterized by comprising:
obtaining the text to be labeled;
determining the vector representation of each word of the text to be labeled;
determining, from the vector representation, the score of each separate label in the tag sets of the multiple annotation guidelines, and the raw score of each bundled label in the bundled tag set, a bundled label being a combination containing one single separate label from each of the annotation guidelines;
determining the final score of each bundled label from its raw score and the scores of the separate labels corresponding to it;
determining the target bundled label of the word according to the final scores of the bundled labels, so as to obtain the target label sequence of the text to be labeled.
CN201910481021.2A 2019-06-04 2019-06-04 Sequence labeling method and device and training method of sequence labeling model Active CN110210035B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910481021.2A CN110210035B (en) 2019-06-04 2019-06-04 Sequence labeling method and device and training method of sequence labeling model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910481021.2A CN110210035B (en) 2019-06-04 2019-06-04 Sequence labeling method and device and training method of sequence labeling model

Publications (2)

Publication Number Publication Date
CN110210035A true CN110210035A (en) 2019-09-06
CN110210035B CN110210035B (en) 2023-01-24

Family

ID=67790556

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910481021.2A Active CN110210035B (en) 2019-06-04 2019-06-04 Sequence labeling method and device and training method of sequence labeling model

Country Status (1)

Country Link
CN (1) CN110210035B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115391608A (en) * 2022-08-23 2022-11-25 哈尔滨工业大学 Automatic labeling conversion method for graph-to-graph structure
WO2023045949A1 (en) * 2021-09-27 2023-03-30 华为技术有限公司 Model training method and related device


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729312A (en) * 2017-09-05 2018-02-23 苏州大学 More granularity segmenting methods and system based on sequence labelling modeling
CN109800298A (en) * 2019-01-29 2019-05-24 苏州大学 A kind of training method of Chinese word segmentation model neural network based

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHENGHUA LI et al.: "Ambiguity-aware Ensemble Training for Semi-supervised Dependency Parsing", Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics *
WANG Renwu et al.: "A GRU+CRF method for entity-attribute extraction" (实体-属性抽取的GRU+CRF方法), Modern Information (现代情报) *


Also Published As

Publication number Publication date
CN110210035B (en) 2023-01-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant