CN106897265A

CN106897265A - Term vector training method and device

Info

Publication number: CN106897265A
Application number: CN201710022458.0A
Authority: CN
Inventors: 李建欣; 刘垚鹏; 彭浩; 张日崇; 陈汉腾
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2017-01-12
Filing date: 2017-01-12
Publication date: 2017-06-27
Anticipated expiration: 2037-01-12
Also published as: CN106897265B

Abstract

The present invention provides a kind of term vector training method and device, belongs to machine learning techniques field.The term vector training method includes：Newly-increased lexicon is obtained, the vocabulary increased newly in lexicon constitutes new term storehouse with the vocabulary in old lexicon, the corresponding term vector of haveing been friends in the past of the vocabulary in old lexicon；Initialization process is carried out to the vocabulary in new term storehouse so that the term vector for belonging to the vocabulary in old lexicon in new term storehouse is old term vector, it is random term vector that the vocabulary term vector in newly-increased lexicon is belonged in new term storehouse；Term vector according to corresponding first Huffman tree in new term storehouse and corresponding second Huffman tree of old lexicon respectively to vocabulary in new term storehouse is updated.Term vector training method and device that the present invention is provided, improve the training effectiveness of term vector.

Description

Term vector training method and device

Technical field

The present invention relates to machine learning techniques field, more particularly to a kind of term vector training method and device.

Background technology

In machine learning techniques, in order that machine understands the implication of human language, the vocabulary of neutral net language model Show that each vocabulary in human language is converted into instrument the form of term vector so that computer can be learnt by term vector The implication of each vocabulary in human language.

Using prior art, after new vocabulary is added in lexicon, it usually needs in relearning new lexicon All of vocabulary, the term vector new to obtain each vocabulary.But, using which so that the training effectiveness of term vector is relatively low.

The content of the invention

The present invention provides a kind of term vector training method and device, improves the training effectiveness of term vector.

The embodiment of the present invention provides a kind of term vector training method, including：

Newly-increased lexicon is obtained, the vocabulary in the newly-increased lexicon constitutes new term storehouse with the vocabulary in old lexicon, The corresponding term vector of haveing been friends in the past of vocabulary in the old lexicon；

Initialization process is carried out to the vocabulary in the new term storehouse so that belong to the old vocabulary in the new term storehouse The term vector of the vocabulary in storehouse is old term vector, and the vocabulary term vector belonged in the new term storehouse in the newly-increased lexicon is Random term vector；

According to corresponding first Huffman tree in the new term storehouse and corresponding second Huffman tree point of the old lexicon The other term vector to vocabulary in the new term storehouse is updated.

In an embodiment of the present invention, it is described according to corresponding first Huffman tree in the new term storehouse and the old vocabulary Term vector of corresponding second Huffman tree in storehouse respectively to vocabulary in the new term storehouse is updated, including：

The corresponding goal-selling function of first vocabulary is obtained, first vocabulary is the word in the new term storehouse Converge；

According to first vocabulary in the attribute of first Huffman tree and the attribute pair in second Huffman tree The goal-selling function carries out gradient treatment, obtains the corresponding term vector of first vocabulary.

In an embodiment of the present invention, the corresponding goal-selling function of acquisition first vocabulary, including：

If first vocabulary belongs to the old lexicon, the primal objective function according to Skip-gram models is to institute Stating the first vocabulary carries out factorization, obtains the corresponding goal-selling function of first vocabulary；

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is institute State the primal objective function of Skip-gram models.

If first vocabulary belongs to the old lexicon, the primal objective function according to CBOW models is to described first Vocabulary carries out factorization, obtains the corresponding goal-selling function of first vocabulary；

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is institute State the primal objective function of CBOW models.

In an embodiment of the present invention, the primal objective function according to Skip-gram models is carried out to first vocabulary Factorization, obtains the corresponding goal-selling function of first vocabulary, including：

If first vocabulary belongs to the old lexicon, basis Factorization is carried out to first vocabulary, the corresponding goal-selling function of first vocabulary is obtained；

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is Skip- The primal objective function of gram models

Wherein, w represents first vocabulary, and W represents the old lexicon, and Δ W represents the newly-increased lexicon, C (w) tables Show the lexicon that the corresponding vocabulary of w contexts is constituted, u represents the corresponding vocabulary of w contexts,N omicronn-leaf child node w is represented Two Huffman trees and the length of the Huffman encoding matched on the first Huffman tree, i represent that first vocabulary is described the I-th node on two Huffman trees, j represents that first vocabulary is j-th node on second Huffman tree, The term vector of -1 node of jth on the corresponding first Huffman paths of u is represented,Represent on the second Huffman path that u is represented J-th Huffman encoding of node,Activation primitive is represented, v (w) represents the corresponding term vectors of w.

In an embodiment of the present invention, the primal objective function according to CBOW models first vocabulary is carried out because Formula is decomposed, and obtains the corresponding goal-selling function of first vocabulary, including：

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is The primal objective function of CBOW models

Wherein,J-th Huffman encoding of node on the second Huffman path that w is represented is represented,In expression C (w) The corresponding term vector sum of all vocabulary.

In an embodiment of the present invention, it is described according to first vocabulary in the attribute of first Huffman tree and in institute The attribute for stating the second Huffman tree carries out gradient treatment to the goal-selling function, obtain the corresponding word of first vocabulary to Amount, including：

If first vocabulary belongs to the old lexicon, and first vocabulary in the coding of first Huffman tree There is same prefix part with the coding in second Huffman tree, then to first vocabulary in second Huffman tree On Huffman encoding different piece corresponding node vectorial basis Perform stochastic gradient rising treatment；To the different piece of Huffman encoding of first vocabulary on first Huffman tree The vectorial basis of correspondence node on second Huffman treePerform with The decline of machine gradient is processed；

If first vocabulary belongs to the newly-increased lexicon, to first vocabulary according toStochastic gradient rising treatment is performed, first vocabulary is obtained corresponding Term vector；

Wherein, η ' represents learning rate.

If first vocabulary belongs to the old lexicon, and first vocabulary in the coding of first Huffman tree There is same prefix part with the coding in second Huffman tree, then to first vocabulary in second Huffman tree On Huffman encoding different piece corresponding node vectorial basisPerform Stochastic gradient rising is processed；To the different piece correspondence of Huffman encoding of first vocabulary on first Huffman tree The vectorial basis of node on second Huffman treePerform stochastic gradient Decline is processed；

If first vocabulary belongs to the newly-increased lexicon, to first vocabulary according toPerform stochastic gradient rising treatment, obtain the corresponding word of first vocabulary to Amount；

Wherein,Represent the i-th -1 term vector of node on the corresponding first Huffman paths of w.

The embodiment of the present invention also provides a kind of term vector trainer, including：

Acquisition module, for obtaining newly-increased lexicon, the vocabulary in vocabulary in the newly-increased lexicon and old lexicon New term storehouse is constituted, the corresponding term vector of haveing been friends in the past of the vocabulary in the old lexicon；

Initialization module, for carrying out initialization process to the vocabulary in the new term storehouse so that the new term storehouse In to belong to the term vector of vocabulary in the old lexicon be old term vector, the newly-increased lexicon is belonged in the new term storehouse In vocabulary term vector be random term vector；

Update module, for according to corresponding first Huffman tree in the new term storehouse and the old lexicon corresponding Term vector of two Huffman trees respectively to vocabulary in the new term storehouse is updated.

In an embodiment of the present invention, the update module, specifically for obtaining the corresponding default mesh of first vocabulary Scalar functions, first vocabulary is the vocabulary in the new term storehouse；According to first vocabulary in first Huffman tree Attribute and gradient treatment is carried out to the goal-selling function in the attribute of second Huffman tree, obtain first word Converge corresponding term vector.

Term vector training method provided in an embodiment of the present invention and device, by obtaining newly-increased lexicon, and to new term Vocabulary in storehouse carries out initialization process so that belong in new term storehouse the term vector of vocabulary in old lexicon for old word to Amount, it is random term vector that the vocabulary term vector in newly-increased lexicon is belonged in new term storehouse；Further according to new term storehouse corresponding The term vector of one Huffman tree and corresponding second Huffman tree of old lexicon respectively to vocabulary in new term storehouse is updated, and carries The training effectiveness of term vector high.

Brief description of the drawings

In order to illustrate more clearly about the embodiment of the present invention or technical scheme of the prior art, below will be to embodiment or existing The accompanying drawing to be used needed for having technology description does one and simply introduces, it should be apparent that, drawings in the following description are this hairs Some bright embodiments, for those of ordinary skill in the art, without having to pay creative labor, can be with Other accompanying drawings are obtained according to these accompanying drawings.

Fig. 1 is a kind of schematic flow sheet of term vector training method provided in an embodiment of the present invention；

Fig. 2 is that the flow that is updated of term vector of vocabulary in a kind of storehouse to new term provided in an embodiment of the present invention is illustrated Figure；

Fig. 3 is a kind of structural representation of term vector trainer provided in an embodiment of the present invention.

Specific embodiment

To make the purpose, technical scheme and advantage of the embodiment of the present invention clearer, below in conjunction with the embodiment of the present invention In accompanying drawing, the technical scheme in the embodiment of the present invention is clearly and completely described, it is clear that described embodiment is A part of embodiment of the present invention, rather than whole embodiments.Based on the embodiment in the present invention, those of ordinary skill in the art The every other embodiment obtained under the premise of creative work is not made, belongs to the scope of protection of the invention.

Term " first ", " second ", " the 3rd ", " in description and claims of this specification and above-mentioned accompanying drawing Four " etc. (if present) is for distinguishing similar object, without for describing specific order or precedence.Should manage Solution so data for using can be exchanged in the appropriate case, so as to embodiments of the invention described herein, for example can be with Order in addition to those for illustrating herein or describing is implemented.Additionally, term " comprising " and " having " and they appoint What deforms, it is intended that covering is non-exclusive to be included, for example, contain the process of series of steps or unit, method, system, Product or equipment are not necessarily limited to those steps clearly listed or unit, but may include not list clearly or for These processes, method, product or other intrinsic steps of equipment or unit.

It should be noted that these specific embodiments can be combined with each other below, for same or analogous concept Or process may be repeated no more in certain embodiments.

Fig. 1 is a kind of schematic flow sheet of term vector training method provided in an embodiment of the present invention, the term vector training side Method can be performed by term vector trainer, the term vector trainer can with it is integrated within a processor, it is also possible to be separately provided, Here, the present invention is not particularly limited.Specifically, shown in Figure 1, the term vector training method can include：

S101, the newly-increased lexicon of acquisition.

Wherein, the vocabulary increased newly in lexicon constitutes new term storehouse, the word in old lexicon with the vocabulary in old lexicon The corresponding term vector of haveing been friends in the past of remittance.

In embodiments of the present invention, the vocabulary in old lexicon has been trained to corresponding old term vector, increases lexicon newly Vocabulary do not train corresponding term vector.For example：It is the existing lexicon for having trained term vector, new epexegesis in old lexicon Remittance storehouse includes newly-increased vocabulary, will now train the vocabulary in the old lexicon of term vector to merge into neologisms with newly-increased vocabulary Remittance storehouse.

S102, initialization process is carried out to the vocabulary in new term storehouse so that in belonging to old lexicon in new term storehouse The term vector of vocabulary is old term vector, and it is random term vector that the vocabulary term vector in newly-increased lexicon is belonged in new term storehouse.

Example, in embodiments of the present invention, remember that old lexicon is W, wherein, the vocabulary in old lexicon has been trained and obtained Corresponding term vector is designated as v (w), and it is Δ W to increase lexicon newly, then new term storehouse is W'=W+ Δ W, remembers that old lexicon W is corresponding Second Huffman tree is T, and W' corresponding first Huffman trees in new term storehouse are T'.Then judge the first vocabulary w in new term storehouse, If w is in old lexicon W, it was demonstrated that w trained corresponding term vector in old lexicon, then no longer the word is instructed Practice, but inherit original v (w)；If the first vocabulary w in new term storehouse belongs to newly-increased vocabulary, then in newly-increased lexicon The corresponding term vectors of random initializtion w.

For example, by taking the first vocabulary as an example, the first vocabulary is any vocabulary in new term storehouse, and the first vocabulary is in the first Hough Distribution on Man Shu can include two kinds of situations.The first situation：First vocabulary is the leaf node on the first Huffman tree；The Two kinds of situations：First vocabulary is the non-leaf nodes on the first Huffman tree.

The first situation：If the first vocabulary is the leaf node on the first Huffman tree, can be according to equation below 1 pair First vocabulary is initialized：

Wherein, w represents the first vocabulary, and v (w) represents term vectors of the w on the second Huffman tree T；V'(w) represent w the Term vector on one Huffman tree T'.

With reference to formula 1 as can be seen that if the first vocabulary belongs to old lexicon W, now the term vector of the first vocabulary is first Vocabulary corresponding old term vector in old lexicon, if it is newly-increased word that the first vocabulary is not belonging to old lexicon W, i.e. the first vocabulary Converge, at this point it is possible to the term vector that random initializtion, i.e. term vector now the first vocabulary are carried out to the term vector of first vocabulary is Random term vector.

Second situation：If the first vocabulary is the non-leaf nodes on the first Huffman tree, the n omicronn-leaf child node has one Individual parameter vector.For differentiation parameter vector, we set W₁Parameter on i-th node on the first Huffman path of correspondence Vector isW₂Parameter vector on the first Huffman path of correspondence on i-th node isWork as W₁And W₂On correspondence tree During same node, haveAssuming that vocabulary w being encoded to " 0010 " and being breathed out first on the second Huffman tree Coding on Fu Man trees is changed into " 00011 ", because the Huffman encoding of the two has same prefix " 00 ", this same prefix " 00 " Vector on corresponding node keeps constant.Here need to set mark L simultaneously^wAnd L'^wRepresent the first vocabulary w second respectively The code length of code length and the first vocabulary w on Huffman tree on the first Huffman tree.Then can be according to equation below 2 First vocabulary is initialized：

Wherein,I-th Huffman encoding of node on the first Huffman path that non-leaf nodes w is represented is represented, Represent i-th Huffman encoding of node on the second Huffman path that non-leaf nodes w is represented.At this point it is possible to non-leaf Node w corresponding Huffman encodings on the first Huffman tree are divided into prefix matching part With other nodes N omicronn-leaf child node w is represented in the second Huffman tree and first The length of the Huffman encoding matched on Huffman tree.

With reference to formula 2 as can be seen that if the first vocabulary is the non-leaf nodes on the first Huffman tree, the first vocabulary exists Vector corresponding with the prefix part matched on the second Huffman tree is existing parameter vector on first Huffman tree And the corresponding vector of coded portion is mismatched, it is initialized with null vector.

What deserves to be explained is, in embodiments of the present invention, for the first vocabulary, if the first vocabulary is the first Huffman Leaf node on tree, then using random initializtion；If n omicronn-leaf child node, then null vector is initialized as, specially：Initial word vector can be so set to fall into intervalWherein, m refers to word The length of vector.

After vocabulary in new term storehouse carries out initialization process, it is possible to the vocabulary correspondence in the new term storehouse Term vector be updated.

It is S103, right respectively according to corresponding first Huffman tree in new term storehouse and corresponding second Huffman tree of old lexicon The term vector of vocabulary is updated in new term storehouse.

Term vector training method provided in an embodiment of the present invention, by obtaining newly-increased lexicon, and in new term storehouse Vocabulary carries out initialization process so that the term vector for belonging to the vocabulary in old lexicon in new term storehouse is old term vector, neologisms It is random term vector to belong to the vocabulary term vector in newly-increased lexicon in remittance storehouse；Further according to corresponding first Huffman in new term storehouse Term vector of corresponding with old lexicon the second Huffman tree of tree respectively to vocabulary in new term storehouse is updated, improve word to The training effectiveness of amount.

Optionally, in embodiments of the present invention, S103 is according to corresponding first Huffman tree in new term storehouse and old lexicon Corresponding second Huffman tree the term vector of vocabulary in new term storehouse is updated respectively can by following possible realization, Specific shown in Figure 2, Fig. 2 is that the term vector of vocabulary in a kind of storehouse to new term provided in an embodiment of the present invention is updated Schematic flow sheet.

S201, the corresponding goal-selling function of the first vocabulary of acquisition.

Wherein, the first vocabulary is the vocabulary in new term storehouse.

Optionally, S201 obtains the corresponding goal-selling function of the first vocabulary, can be obtained by following two models：

For the first Skip-gram model, if the first vocabulary belongs to old lexicon, according to Skip-gram moulds The primal objective function of type carries out factorization to the first vocabulary, obtains the corresponding goal-selling function of the first vocabulary；If first Vocabulary belongs to newly-increased lexicon, then the corresponding goal-selling function of the first vocabulary is the primal objective function of Skip-gram models.

Example, in embodiments of the present invention, if the first vocabulary belongs to old lexicon, according to assembler code identical part Factorization is carried out to each word in W with different parts and can be obtained by the corresponding goal-selling function of the first vocabulary, i.e.,： According toFactorization is carried out to the first vocabulary, Obtain the corresponding goal-selling function of the first vocabulary；

If the first vocabulary belongs to newly-increased lexicon, the corresponding goal-selling function of the first vocabulary is Skip-gram models Primal objective function：

Wherein, w represents the first vocabulary, and W represents old lexicon, and Δ W represents newly-increased lexicon, and C (w) represents w contexts pair The lexicon that the vocabulary answered is constituted, u represents the corresponding vocabulary of w contexts,N omicronn-leaf child node w is represented in the second Huffman tree and The length of the Huffman encoding matched on the first Huffman tree, i represents that the first vocabulary is i-th section on the second Huffman tree Point, j represents that the first vocabulary is j-th node on the second Huffman tree,Represent on the corresponding first Huffman paths of u the The j-1 term vector of node,J-th Huffman encoding of node on the second Huffman path that u is represented is represented,Activation primitive is represented, v (w) represents the corresponding term vectors of w,The summation of same prefix coding is represented, Represent that the summation of zero n omicronn-leaf child node is inherited and be initialized as to other vocabulary in new term storehouse.

For second CBOW model, if the first vocabulary belongs to old lexicon, according to the original mesh of CBOW models Scalar functions carry out factorization to the first vocabulary, obtain the corresponding goal-selling function of the first vocabulary；If the first vocabulary belongs to new Increase lexicon, then the corresponding goal-selling function of the first vocabulary is the primal objective function of CBOW models.

Example, in embodiments of the present invention, if the first vocabulary belongs to old lexicon, according to the first vocabulary, then basis Coding identical part and different parts carry out factorization to each word in W, and to can be obtained by the first vocabulary corresponding pre- If object function, i.e.,：

According toFactorization is carried out to the first vocabulary, is obtained To the corresponding goal-selling function of the first vocabulary.

If the first vocabulary belongs to newly-increased lexicon, the corresponding goal-selling function of the first vocabulary is original for CBOW models Object function l (w, i)：

What deserves to be explained is, in embodiments of the present invention, by according to coding identical part and different parts in W Each word carry out factorization, the amount of calculation that can be saved during term vector, so as to improve computational efficiency.

After the corresponding goal-selling function of the first vocabulary is got, it is possible to according to the first vocabulary in the first Huffman The attribute of tree and gradient treatment is carried out to goal-selling function in the attribute of the second Huffman tree, so as to obtain the first vocabulary correspondence Term vector.

S202, according to the first vocabulary the attribute of the first Huffman tree and the second Huffman tree attribute to goal-selling Function carries out gradient treatment, obtains the corresponding term vector of the first vocabulary.

Incorporated by reference to step S201, can be by the following two kinds model realization：

For the first Skip-gram model, if the first vocabulary belongs to old lexicon, and the first vocabulary is breathed out first The coding of Fu Man trees has same prefix part with the coding in the second Huffman tree, then to the first vocabulary in the second Huffman tree On Huffman encoding different piece corresponding node vectorial basis Perform stochastic gradient rising treatment；To the different piece of Huffman encoding of first vocabulary on the first Huffman tree correspondence the The vectorial basis of node on two Huffman treesPerform at stochastic gradient descent Reason.

If the first vocabulary belongs to newly-increased lexicon, to the first vocabulary according to Stochastic gradient rising treatment is performed, the corresponding term vector of the first vocabulary is obtained；Wherein, η ' represents learning rate.

For example：Can be expressed as：

For second CBOW model, if the first vocabulary belongs to old lexicon, and the first vocabulary in the first Huffman The coding of tree has same prefix part with the coding in the second Huffman tree, then to the first vocabulary on the second Huffman tree The vectorial basis of the different piece corresponding node of Huffman encodingPerform random Gradient rising is processed；Different piece to Huffman encoding of first vocabulary on the first Huffman tree is corresponding in the second Huffman The vectorial basis of node on treePerform stochastic gradient descent treatment.

If the first vocabulary belongs to newly-increased lexicon, to the first vocabulary according to Stochastic gradient rising treatment is performed, the corresponding term vector of the first vocabulary is obtained.

For example, can be expressed as：

Wherein, what η ' was represented is learning rate.Example, initial learning rate η is set₀=0.025, often process 1000 Word, is once adjusted according to below equation to learning rate：

Wherein, word_count_actual represents current processed word number, and train_words+1 is to prevent denominator It is zero.A threshold value η is introduced simultaneously_min=10^-4*η₀, i.e. the minimum η of η_min, prevent the too small situation of learning rate.In increment During study hair, word number counter needs the word number plus original language material and combines η_minLimitation calculate η.

Fig. 3 is a kind of structural representation of term vector trainer 30 provided in an embodiment of the present invention, and certainly, the present invention is real It is to be illustrated by taking Fig. 3 as an example to apply example, but is not represented present invention is limited only by this.Please join shown in Fig. 3, term vector training Device 30 can include：

Acquisition module 301, for obtaining newly-increased lexicon, increases the vocabulary structure in vocabulary and the old lexicon in lexicon newly Into new term storehouse, the corresponding term vector that has been friends in the past of vocabulary in old lexicon

Initialization module 302, for carrying out initialization process to the vocabulary in new term storehouse so that new term belongs in storehouse The term vector of the vocabulary in old lexicon be old term vector, belong in new term storehouse the vocabulary term vector in newly-increased lexicon be with Machine term vector.

Update module 303, for being breathed out according to corresponding first Huffman tree in new term storehouse and old lexicon corresponding second Term vector of the Fu Man trees respectively to vocabulary in new term storehouse is updated.

Optionally, update module 303, specifically for obtaining the corresponding goal-selling function of the first vocabulary, the first vocabulary is Vocabulary in new term storehouse；According to the first vocabulary the attribute of the first Huffman tree and the second Huffman tree attribute to default Object function carries out gradient treatment, obtains the corresponding term vector of the first vocabulary.

Term vector trainer 30 shown in the embodiment of the present invention, can perform the term vector shown in above method embodiment The corresponding technical scheme of training method, its realization principle and beneficial effect are similar to, and are no longer repeated herein.

One of ordinary skill in the art will appreciate that：Realizing all or part of step of above-mentioned each method embodiment can lead to The related hardware of programmed instruction is crossed to complete.Foregoing program can be stored in a computer read/write memory medium.The journey Sequence upon execution, performs the step of including above-mentioned each method embodiment；And foregoing storage medium includes：ROM, RAM, magnetic disc or Person's CD etc. is various can be with the medium of store program codes.

Finally it should be noted that：Various embodiments above is merely illustrative of the technical solution of the present invention, rather than its limitations；To the greatest extent Pipe has been described in detail with reference to foregoing embodiments to the present invention, it will be understood by those within the art that：Its according to The technical scheme described in foregoing embodiments can so be modified, or which part or all technical characteristic are entered Row equivalent；And these modifications or replacement, the essence of appropriate technical solution is departed from various embodiments of the present invention technology The scope of scheme.

Claims

1. a kind of term vector training method, it is characterised in that including：

Newly-increased lexicon is obtained, the vocabulary in the newly-increased lexicon constitutes new term storehouse with the vocabulary in old lexicon, described The corresponding term vector of haveing been friends in the past of vocabulary in old lexicon；

Initialization process is carried out to the vocabulary in the new term storehouse so that in belonging to the old lexicon in the new term storehouse Vocabulary term vector be old term vector, the vocabulary term vector belonged in the new term storehouse in the newly-increased lexicon is random Term vector；

It is right respectively according to corresponding first Huffman tree in the new term storehouse and corresponding second Huffman tree of the old lexicon The term vector of vocabulary is updated in the new term storehouse.

2. method according to claim 1, it is characterised in that described according to corresponding first Huffman in the new term storehouse The term vector of tree and corresponding second Huffman tree of the old lexicon respectively to vocabulary in the new term storehouse is updated, bag Include：

The corresponding goal-selling function of first vocabulary is obtained, first vocabulary is the vocabulary in the new term storehouse；

According to first vocabulary the attribute of first Huffman tree and second Huffman tree attribute to described Goal-selling function carries out gradient treatment, obtains the corresponding term vector of first vocabulary.

3. method according to claim 2, it is characterised in that the corresponding goal-selling letter of acquisition first vocabulary Number, including：

If first vocabulary belongs to the old lexicon, the primal objective function according to Skip-gram models is to described One vocabulary carries out factorization, obtains the corresponding goal-selling function of first vocabulary；

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is described The primal objective function of Skip-gram models.

4. method according to claim 2, it is characterised in that the corresponding goal-selling letter of acquisition first vocabulary Number, including：

If first vocabulary belongs to the old lexicon, the primal objective function according to CBOW models is to first vocabulary Factorization is carried out, the corresponding goal-selling function of first vocabulary is obtained；

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is described The primal objective function of CBOW models.

5. method according to claim 3, it is characterised in that the primal objective function according to Skip-gram models is to institute Stating the first vocabulary carries out factorization, obtains the corresponding goal-selling function of first vocabulary, including：

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is Skip-gram The primal objective function of model

Wherein, w represents first vocabulary, and W represents the old lexicon, and Δ W represents the newly-increased lexicon, and C (w) represents w The lexicon that the corresponding vocabulary of context is constituted, u represents the corresponding vocabulary of w contexts,N omicronn-leaf child node w is represented to be breathed out second Fu Man trees and the length of the Huffman encoding matched on the first Huffman tree, i represent that first vocabulary is breathed out for described second I-th node on Fu Man trees, j represents that first vocabulary is j-th node on second Huffman tree,Represent u The term vector of -1 node of jth on corresponding first Huffman path,Represent j-th on the second Huffman path that u is represented The Huffman encoding of node,Activation primitive is represented, v (w) represents the corresponding term vectors of w.

6. method according to claim 4, it is characterised in that the primal objective function according to CBOW models is to described First vocabulary carries out factorization, obtains the corresponding goal-selling function of first vocabulary, including：

If first vocabulary belongs to the newly-increased lexicon, the corresponding goal-selling function of first vocabulary is CBOW moulds The primal objective function of type

Wherein,J-th Huffman encoding of node on the second Huffman path that w is represented is represented,Own in expression C (w) The corresponding term vector sum of vocabulary.

7. method according to claim 5, it is characterised in that it is described according to first vocabulary in first Huffman The attribute of tree and gradient treatment is carried out to the goal-selling function in the attribute of second Huffman tree, obtain described first The corresponding term vector of vocabulary, including：

If first vocabulary belongs to the old lexicon, and first vocabulary first Huffman tree coding with The coding of second Huffman tree has same prefix part, then to first vocabulary on second Huffman tree The vectorial basis of the different piece corresponding node of Huffman encodingPerform Stochastic gradient rising is processed；To the different piece correspondence of Huffman encoding of first vocabulary on first Huffman tree The vectorial basis of node on second Huffman treePerform boarding steps Degree decline treatment；

If first vocabulary belongs to the newly-increased lexicon, to first vocabulary according to Stochastic gradient rising treatment is performed, the corresponding term vector of first vocabulary is obtained；

Wherein, η ' represents learning rate.

8. method according to claim 6, it is characterised in that it is described according to first vocabulary in first Huffman The attribute of tree and gradient treatment is carried out to the goal-selling function in the attribute of second Huffman tree, obtain described first The corresponding term vector of vocabulary, including：

If first vocabulary belongs to the old lexicon, and first vocabulary first Huffman tree coding with The coding of second Huffman tree has same prefix part, then to first vocabulary on second Huffman tree The vectorial basis of the different piece corresponding node of Huffman encodingPerform boarding steps Degree rising treatment；Different piece to Huffman encoding of first vocabulary on first Huffman tree is corresponding described The vectorial basis of node on second Huffman treePerform at stochastic gradient descent Reason；

9. a kind of term vector trainer, it is characterised in that including：

Acquisition module, for obtaining newly-increased lexicon, the vocabulary in the newly-increased lexicon is constituted with the vocabulary in old lexicon New term storehouse, the corresponding term vector of haveing been friends in the past of vocabulary in the old lexicon；

Initialization module, for carrying out initialization process to the vocabulary in the new term storehouse so that belong in the new term storehouse The term vector of the vocabulary in the old lexicon is old term vector, in belonging to the newly-increased lexicon in the new term storehouse Vocabulary term vector is random term vector；

Update module, for being breathed out according to corresponding first Huffman tree in the new term storehouse and the old lexicon corresponding second Term vector of the Fu Man trees respectively to vocabulary in the new term storehouse is updated.

10. device according to claim 9, it is characterised in that

The update module, specifically for obtaining the corresponding goal-selling function of first vocabulary, first vocabulary is institute State the vocabulary in new term storehouse；According to first vocabulary in the attribute of first Huffman tree and in second Huffman The attribute of tree carries out gradient treatment to the goal-selling function, obtains the corresponding term vector of first vocabulary.