CN106897265A - Term vector training method and device - Google Patents
Term vector training method and device
- Publication number
- CN106897265A (publication number); CN201710022458.0A (application number)
- Authority
- CN
- China
- Prior art keywords
- vocabulary
- lexicon
- term vector
- huffman
- old
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
Abstract
The present invention provides a term vector training method and device, belonging to the field of machine learning. The term vector training method includes: obtaining a newly added lexicon, where the vocabulary in the newly added lexicon and the vocabulary in an old lexicon together constitute a new lexicon, and the vocabulary in the old lexicon already has corresponding old term vectors; initializing the vocabulary in the new lexicon so that the term vector of each word in the new lexicon that belongs to the old lexicon is its old term vector, and the term vector of each word that belongs to the newly added lexicon is a random term vector; and updating the term vectors of the vocabulary in the new lexicon according to a first Huffman tree corresponding to the new lexicon and a second Huffman tree corresponding to the old lexicon. The term vector training method and device provided by the present invention improve the training efficiency of term vectors.
Description
Technical field
The present invention relates to the field of machine learning, and in particular to a term vector training method and device.
Background technology
In machine learning, in order for a machine to understand the meaning of human language, the word-embedding tool of a neural network language model converts each word in human language into the form of a term vector, so that a computer can learn the meaning of each word through its term vector.
In the prior art, after new vocabulary is added to a lexicon, all the vocabulary in the new lexicon usually needs to be retrained to obtain a new term vector for each word. This approach makes the training of term vectors inefficient.
Summary of the invention
The present invention provides a term vector training method and device that improve the training efficiency of term vectors.
An embodiment of the present invention provides a term vector training method, including:
obtaining a newly added lexicon, where the vocabulary in the newly added lexicon and the vocabulary in an old lexicon constitute a new lexicon, and the vocabulary in the old lexicon has corresponding old term vectors;
initializing the vocabulary in the new lexicon so that the term vector of each word in the new lexicon that belongs to the old lexicon is its old term vector, and the term vector of each word in the new lexicon that belongs to the newly added lexicon is a random term vector;
updating the term vectors of the vocabulary in the new lexicon according to a first Huffman tree corresponding to the new lexicon and a second Huffman tree corresponding to the old lexicon.
In an embodiment of the present invention, updating the term vectors of the vocabulary in the new lexicon according to the first Huffman tree corresponding to the new lexicon and the second Huffman tree corresponding to the old lexicon includes:
obtaining a preset objective function corresponding to a first word, where the first word is a word in the new lexicon;
performing gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the term vector corresponding to the first word.
In an embodiment of the present invention, obtaining the preset objective function corresponding to the first word includes:
if the first word belongs to the old lexicon, performing factorization on the first word according to the original objective function of the Skip-gram model, to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the Skip-gram model.
In an embodiment of the present invention, obtaining the preset objective function corresponding to the first word includes:
if the first word belongs to the old lexicon, performing factorization on the first word according to the original objective function of the CBOW model, to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the CBOW model.
In an embodiment of the present invention, performing factorization on the first word according to the original objective function of the Skip-gram model to obtain the preset objective function corresponding to the first word includes:
if the first word belongs to the old lexicon, performing factorization on the first word according to
$$\mathcal{L}(w) = \sum_{u \in C(w)} \Big[ \sum_{j=2}^{l_c^u} \ell(w,u,j) + \sum_{j=l_c^u+1}^{l'^u} \ell(w,u,j) \Big],\qquad \ell(w,u,j) = (1-d_j^u)\log\sigma\big(v(w)^\top\theta_{j-1}^u\big) + d_j^u\log\big(1-\sigma(v(w)^\top\theta_{j-1}^u)\big),$$
to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the Skip-gram model
$$\mathcal{L}(w) = \sum_{u \in C(w)} \sum_{j=2}^{l'^u} \ell(w,u,j);$$
where w represents the first word, W represents the old lexicon, ΔW represents the newly added lexicon, C(w) represents the lexicon formed by the words in the context of w, u represents a word in the context of w, $l_c^u$ represents the length of the Huffman encoding of the non-leaf nodes that is matched between the second Huffman tree and the first Huffman tree, $l'^u$ represents the code length of u on the first Huffman tree, j indexes the j-th node on the Huffman path, $\theta_{j-1}^u$ represents the term vector of the (j−1)-th node on the first Huffman path corresponding to u, $d_j^u$ represents the j-th Huffman code of the node on the second Huffman path corresponding to u, σ represents the activation function, and v(w) represents the term vector corresponding to w.
In an embodiment of the present invention, performing factorization on the first word according to the original objective function of the CBOW model to obtain the preset objective function corresponding to the first word includes:
if the first word belongs to the old lexicon, performing factorization on the first word according to
$$\mathcal{L}(w) = \sum_{j=2}^{l_c^w} \ell(w,j) + \sum_{j=l_c^w+1}^{l'^w} \ell(w,j),\qquad \ell(w,j) = (1-d_j^w)\log\sigma\big(x_w^\top\theta_{j-1}^w\big) + d_j^w\log\big(1-\sigma(x_w^\top\theta_{j-1}^w)\big),$$
to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the CBOW model
$$\mathcal{L}(w) = \sum_{j=2}^{l'^w} \ell(w,j);$$
where $d_j^w$ represents the j-th Huffman code of the node on the second Huffman path corresponding to w, and $x_w$ represents the sum of the term vectors corresponding to all the words in C(w).
In an embodiment of the present invention, performing gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the term vector corresponding to the first word, includes:
if the first word belongs to the old lexicon, and the encoding of the first word on the first Huffman tree and its encoding on the second Huffman tree have a common prefix, performing stochastic gradient ascent on the vectors of the nodes corresponding to the differing part of the Huffman encoding of the first word on the second Huffman tree, according to
$$\theta_{j-1}^u := \theta_{j-1}^u + \eta'\big(1 - d_j^u - \sigma(v(w)^\top\theta_{j-1}^u)\big)\,v(w);$$
and performing stochastic gradient descent on the vectors of the nodes on the second Huffman tree corresponding to the differing part of the Huffman encoding of the first word on the first Huffman tree, according to
$$\theta_{j-1}^u := \theta_{j-1}^u - \eta'\big(1 - d_j^u - \sigma(v(w)^\top\theta_{j-1}^u)\big)\,v(w);$$
if the first word belongs to the newly added lexicon, performing stochastic gradient ascent on the first word according to
$$v(w) := v(w) + \eta' \sum_{j=2}^{l'^u} \big(1 - d_j^u - \sigma(v(w)^\top\theta_{j-1}^u)\big)\,\theta_{j-1}^u,$$
to obtain the term vector corresponding to the first word;
where η′ represents the learning rate.
In an embodiment of the present invention, performing gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the term vector corresponding to the first word, includes:
if the first word belongs to the old lexicon, and the encoding of the first word on the first Huffman tree and its encoding on the second Huffman tree have a common prefix, performing stochastic gradient ascent on the vectors of the nodes corresponding to the differing part of the Huffman encoding of the first word on the second Huffman tree, according to
$$\theta_{j-1}^w := \theta_{j-1}^w + \eta'\big(1 - d_j^w - \sigma(x_w^\top\theta_{j-1}^w)\big)\,x_w;$$
and performing stochastic gradient descent on the vectors of the nodes on the second Huffman tree corresponding to the differing part of the Huffman encoding of the first word on the first Huffman tree, according to
$$\theta_{j-1}^w := \theta_{j-1}^w - \eta'\big(1 - d_j^w - \sigma(x_w^\top\theta_{j-1}^w)\big)\,x_w;$$
if the first word belongs to the newly added lexicon, performing stochastic gradient ascent on the first word according to
$$v(u) := v(u) + \eta' \sum_{j=2}^{l'^w} \big(1 - d_j^w - \sigma(x_w^\top\theta_{j-1}^w)\big)\,\theta_{j-1}^w,\quad u \in C(w),$$
to obtain the term vector corresponding to the first word;
where $\theta_{i-1}^w$ represents the term vector of the (i−1)-th node on the first Huffman path corresponding to w.
An embodiment of the present invention further provides a term vector training device, including:
an acquisition module, configured to obtain a newly added lexicon, where the vocabulary in the newly added lexicon and the vocabulary in an old lexicon constitute a new lexicon, and the vocabulary in the old lexicon has corresponding old term vectors;
an initialization module, configured to initialize the vocabulary in the new lexicon so that the term vector of each word in the new lexicon that belongs to the old lexicon is its old term vector, and the term vector of each word in the new lexicon that belongs to the newly added lexicon is a random term vector;
an update module, configured to update the term vectors of the vocabulary in the new lexicon according to a first Huffman tree corresponding to the new lexicon and a second Huffman tree corresponding to the old lexicon.
In an embodiment of the present invention, the update module is specifically configured to obtain a preset objective function corresponding to a first word, where the first word is a word in the new lexicon, and to perform gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the term vector corresponding to the first word.
The term vector training method and device provided in the embodiments of the present invention obtain a newly added lexicon and initialize the vocabulary in the new lexicon so that the term vector of each word that belongs to the old lexicon is its old term vector and the term vector of each word that belongs to the newly added lexicon is a random term vector; the term vectors of the vocabulary in the new lexicon are then updated according to the first Huffman tree corresponding to the new lexicon and the second Huffman tree corresponding to the old lexicon, thereby improving the training efficiency of term vectors.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a term vector training method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of updating the term vectors of the vocabulary in the new lexicon according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a term vector training device according to an embodiment of the present invention.
Specific embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", and so on (if present) in the description, claims, and accompanying drawings of this specification are used to distinguish between similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that the data so used may be interchanged where appropriate, so that the embodiments of the invention described herein can, for example, be implemented in orders other than those illustrated or described herein. Moreover, the terms "comprising" and "having" and any variants thereof are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to those steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product, or device.
It should be noted that the specific embodiments below may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 1 is a schematic flowchart of a term vector training method according to an embodiment of the present invention. The term vector training method may be performed by a term vector training device, and the term vector training device may be integrated in a processor or provided separately; the present invention is not specifically limited in this respect. Specifically, as shown in Fig. 1, the term vector training method may include:
S101: obtain a newly added lexicon.
The vocabulary in the newly added lexicon and the vocabulary in an old lexicon constitute a new lexicon, and the vocabulary in the old lexicon has corresponding old term vectors.
In the embodiments of the present invention, the vocabulary in the old lexicon has already been trained to obtain corresponding old term vectors, while the vocabulary in the newly added lexicon has not been trained to obtain term vectors. For example, the old lexicon is an existing lexicon whose term vectors have been trained, and the newly added lexicon contains newly added vocabulary; the vocabulary in the old lexicon, whose term vectors have been trained, is merged with the newly added vocabulary into the new lexicon.
S102: initialize the vocabulary in the new lexicon so that the term vector of each word in the new lexicon that belongs to the old lexicon is its old term vector, and the term vector of each word in the new lexicon that belongs to the newly added lexicon is a random term vector.
For example, in the embodiments of the present invention, denote the old lexicon by W, where the trained term vector of a word w in the old lexicon is denoted v(w); denote the newly added lexicon by ΔW; then the new lexicon is W' = W + ΔW. Denote the second Huffman tree corresponding to the old lexicon W by T, and the first Huffman tree corresponding to the new lexicon W' by T'. Then, for a first word w in the new lexicon: if w is in the old lexicon W, w has already been trained to a corresponding term vector in the old lexicon, so the word is not trained again but inherits the original v(w); if the first word w belongs to the newly added vocabulary, the term vector corresponding to w is randomly initialized.
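The inherit-or-randomly-initialize rule of S102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the dictionary layout, the `init_new_vocab` helper, and the initialization interval [−0.5/m, 0.5/m] (the range used by the word2vec reference code, where m is the vector length) are all assumptions.

```python
import random

def init_new_vocab(old_vectors, added_words, dim, seed=0):
    """Build the merged lexicon W' = W + delta-W: words already in the old
    lexicon W inherit their trained vector v(w); newly added words get a
    small random vector, here uniform in [-0.5/dim, 0.5/dim]."""
    rng = random.Random(seed)
    merged = dict(old_vectors)  # inherit old term vectors unchanged
    for word in added_words:
        if word not in merged:  # only truly new words are randomly initialized
            merged[word] = [(rng.random() - 0.5) / dim for _ in range(dim)]
    return merged

old = {"king": [1.0] * 8, "queen": [2.0] * 8}
merged = init_new_vocab(old, ["king", "wizard"], dim=8)
```

Note that "king" appears in both lexicons but keeps its old vector, so no previously learned information is discarded.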
For example, taking the first word as an example, where the first word is any word in the new lexicon, the distribution of the first word on the first Huffman tree falls into two cases. In the first case, the first word is a leaf node on the first Huffman tree; in the second case, the first word is a non-leaf node on the first Huffman tree.
In the first case, if the first word is a leaf node on the first Huffman tree, the first word may be initialized according to the following Formula 1:
$$v'(w) = \begin{cases} v(w), & w \in W \\ \text{random vector}, & w \in \Delta W \end{cases} \qquad (1)$$
where w represents the first word, v(w) represents the term vector of w on the second Huffman tree T, and v'(w) represents the term vector of w on the first Huffman tree T'.
It can be seen from Formula 1 that if the first word belongs to the old lexicon W, its term vector is the old term vector corresponding to the first word in the old lexicon; if the first word does not belong to the old lexicon W, i.e. the first word is a newly added word, the term vector of the first word is randomly initialized, i.e. its term vector is a random term vector.
In the second case, if the first word is a non-leaf node on the first Huffman tree, the non-leaf node has a parameter vector. To distinguish the parameter vectors, let $\theta'_i$ denote the parameter vector on the i-th node of the corresponding Huffman path on the first Huffman tree, and $\theta_i$ denote the parameter vector on the i-th node of the corresponding Huffman path on the second Huffman tree; when the two paths pass through the same node of the tree, $\theta'_i = \theta_i$. Suppose the encoding of the word w on the second Huffman tree is "0010" and its encoding on the first Huffman tree becomes "00011". Since the two Huffman encodings have the common prefix "00", the vectors on the nodes corresponding to this common prefix "00" remain unchanged. Here, let $L_w$ and $L'_w$ denote the code length of the first word w on the second Huffman tree and on the first Huffman tree, respectively. The first word may then be initialized according to the following Formula 2:
$$\theta'_i = \begin{cases} \theta_i, & i \le l_c^w \\ \mathbf{0}, & l_c^w < i \le L'_w \end{cases} \qquad (2)$$
where $d'^w_i$ represents the i-th Huffman code of the node on the first Huffman path corresponding to the non-leaf node w, $d^w_i$ represents the i-th Huffman code of the node on the second Huffman path corresponding to the non-leaf node w, and $l_c^w$ represents the length of the Huffman encoding of the non-leaf node w that is matched between the second Huffman tree and the first Huffman tree. The Huffman encoding of the non-leaf node w on the first Huffman tree can thus be divided into a matched prefix part ($i \le l_c^w$) and the remaining nodes ($i > l_c^w$).
It can be seen from Formula 2 that, if the first word is a non-leaf node on the first Huffman tree, the vectors corresponding to the prefix part matched with the second Huffman tree are the existing parameter vectors, while the vectors corresponding to the unmatched coded part are initialized as zero vectors.
It is worth explaining that, in the embodiments of the present invention, for the first word: if the first word is a leaf node on the first Huffman tree, random initialization is used; if it is a non-leaf node, the unmatched parameter vectors are initialized as zero vectors. Specifically, the initial term vector may be set to fall into the interval $[-0.5/m,\ 0.5/m]$, where m is the length of the term vector.
After the vocabulary in the new lexicon has been initialized, the term vectors corresponding to the vocabulary in the new lexicon can be updated.
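The non-leaf-node initialization of Formula 2 — inherit the parameter vectors on the matched code prefix, zero the rest — can be sketched like this. The code-string representation, the helper names, and the simplification of one parameter vector per code position are illustrative assumptions:

```python
def common_prefix_len(old_code, new_code):
    """Length l_c of the shared prefix of two Huffman code strings."""
    n = 0
    for a, b in zip(old_code, new_code):
        if a != b:
            break
        n += 1
    return n

def init_inner_params(old_code, new_code, old_thetas, dim):
    """Parameter vectors along the new Huffman path: nodes on the matched
    prefix inherit the old vector; nodes on the differing suffix start as
    zero vectors (one vector per code position, for simplicity)."""
    lc = common_prefix_len(old_code, new_code)
    thetas = [old_thetas[i][:] if i < lc else [0.0] * dim
              for i in range(len(new_code))]
    return thetas, lc

# the example above: "0010" on the old tree becomes "00011" on the new tree
old_thetas = [[float(i + 1)] * 4 for i in range(4)]
thetas, lc = init_inner_params("0010", "00011", old_thetas, dim=4)
```

For that example the common prefix is "00", so the first two vectors are inherited and the remaining three are zero.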
S103: update the term vectors of the vocabulary in the new lexicon according to the first Huffman tree corresponding to the new lexicon and the second Huffman tree corresponding to the old lexicon.
In the term vector training method provided in this embodiment of the present invention, a newly added lexicon is obtained, and the vocabulary in the new lexicon is initialized so that the term vector of each word that belongs to the old lexicon is its old term vector and the term vector of each word that belongs to the newly added lexicon is a random term vector; the term vectors of the vocabulary in the new lexicon are then updated according to the first Huffman tree corresponding to the new lexicon and the second Huffman tree corresponding to the old lexicon, which improves the training efficiency of term vectors.
Optionally, in the embodiments of the present invention, the updating in S103 of the term vectors of the vocabulary in the new lexicon according to the first Huffman tree corresponding to the new lexicon and the second Huffman tree corresponding to the old lexicon may be implemented as follows. Specifically, as shown in Fig. 2, Fig. 2 is a schematic flowchart of updating the term vectors of the vocabulary in the new lexicon according to an embodiment of the present invention.
S201: obtain a preset objective function corresponding to a first word.
The first word is a word in the new lexicon.
Optionally, the preset objective function corresponding to the first word in S201 can be obtained via either of the following two models.
For the first model, the Skip-gram model: if the first word belongs to the old lexicon, factorization is performed on the first word according to the original objective function of the Skip-gram model to obtain the preset objective function corresponding to the first word; if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the Skip-gram model.
For example, in the embodiments of the present invention, if the first word belongs to the old lexicon, the preset objective function corresponding to the first word is obtained by performing factorization on each word in W according to the parts of the encoding that are the same and the parts that are different, i.e., factorization is performed on the first word according to
$$\mathcal{L}(w) = \sum_{u \in C(w)} \Big[ \sum_{j=2}^{l_c^u} \ell(w,u,j) + \sum_{j=l_c^u+1}^{l'^u} \ell(w,u,j) \Big],\qquad \ell(w,u,j) = (1-d_j^u)\log\sigma\big(v(w)^\top\theta_{j-1}^u\big) + d_j^u\log\big(1-\sigma(v(w)^\top\theta_{j-1}^u)\big),$$
to obtain the preset objective function corresponding to the first word.
If the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the Skip-gram model:
$$\mathcal{L}(w) = \sum_{u \in C(w)} \sum_{j=2}^{l'^u} \ell(w,u,j).$$
Here w represents the first word, W represents the old lexicon, ΔW represents the newly added lexicon, C(w) represents the lexicon formed by the words in the context of w, u represents a word in the context of w, $l_c^u$ represents the length of the Huffman encoding of the non-leaf nodes that is matched between the second Huffman tree and the first Huffman tree, $l'^u$ represents the code length of u on the first Huffman tree, $\theta_{j-1}^u$ represents the term vector of the (j−1)-th node on the first Huffman path corresponding to u, $d_j^u$ represents the j-th Huffman code of the node on the second Huffman path corresponding to u, σ represents the activation function, and v(w) represents the term vector corresponding to w. The first sum covers the nodes whose encoding prefix is shared between the two trees, and the second sum covers the remaining non-leaf nodes, which the other words in the new lexicon inherit initialized as zero vectors.
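As a concrete reading of the per-node term in the objective: each non-leaf node on a Huffman path contributes a logistic log-likelihood determined by its code bit. A minimal sketch (the function names are illustrative, and σ is assumed to be the usual sigmoid):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def node_log_likelihood(v_w, theta, d):
    """One inner node's contribution to the objective:
    (1 - d) * log(sigma(v.theta)) + d * log(1 - sigma(v.theta)),
    where d is the node's Huffman code bit (0 or 1)."""
    score = sum(a * b for a, b in zip(v_w, theta))
    s = sigmoid(score)
    return (1 - d) * math.log(s) + d * math.log(1.0 - s)

# with a zero score the model is maximally uncertain: either bit gives log(1/2)
val = node_log_likelihood([0.0, 0.0], [1.0, -1.0], d=0)
```

Summing this quantity over the path nodes (and, for Skip-gram, over the context words) yields the objective that the gradient steps below maximize.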
For the second model, the CBOW model: if the first word belongs to the old lexicon, factorization is performed on the first word according to the original objective function of the CBOW model to obtain the preset objective function corresponding to the first word; if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the CBOW model.
For example, in the embodiments of the present invention, if the first word belongs to the old lexicon, factorization is performed on each word in W according to the parts of the encoding that are the same and the parts that are different, which yields the preset objective function corresponding to the first word, i.e., factorization is performed on the first word according to
$$\mathcal{L}(w) = \sum_{j=2}^{l_c^w} \ell(w,j) + \sum_{j=l_c^w+1}^{l'^w} \ell(w,j),\qquad \ell(w,j) = (1-d_j^w)\log\sigma\big(x_w^\top\theta_{j-1}^w\big) + d_j^w\log\big(1-\sigma(x_w^\top\theta_{j-1}^w)\big),$$
to obtain the preset objective function corresponding to the first word.
If the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the CBOW model:
$$\mathcal{L}(w) = \sum_{j=2}^{l'^w} \ell(w,j).$$
Here $d_j^w$ represents the j-th Huffman code of the node on the second Huffman path corresponding to w, and $x_w$ represents the sum of the term vectors corresponding to all the words in C(w).
It is worth explaining that, in the embodiments of the present invention, by performing factorization on each word in W according to the parts of the encoding that are the same and the parts that are different, the amount of calculation during term vector training can be reduced, thereby improving computational efficiency.
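The CBOW input (the sum of the context words' term vectors, written $x_w$ above) is a plain element-wise sum; a trivial sketch with an illustrative helper name:

```python
def context_sum(context_vectors):
    """x_w in CBOW: element-wise sum of the term vectors of all words in C(w)."""
    dim = len(context_vectors[0])
    x = [0.0] * dim
    for v in context_vectors:
        for k in range(dim):
            x[k] += v[k]
    return x

x_w = context_sum([[1.0, 2.0], [3.0, 4.0], [0.5, 0.5]])
```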
After the preset objective function corresponding to the first word is obtained, gradient processing can be performed on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the term vector corresponding to the first word.
S202: perform gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the term vector corresponding to the first word.
With reference to step S201, this can be implemented with either of the two models.
For the first model, the Skip-gram model: if the first word belongs to the old lexicon, and the encoding of the first word on the first Huffman tree and its encoding on the second Huffman tree have a common prefix, stochastic gradient ascent is performed on the vectors of the nodes corresponding to the differing part of the Huffman encoding of the first word on the second Huffman tree, according to
$$\theta_{j-1}^u := \theta_{j-1}^u + \eta'\big(1 - d_j^u - \sigma(v(w)^\top\theta_{j-1}^u)\big)\,v(w);$$
and stochastic gradient descent is performed on the vectors of the nodes on the second Huffman tree corresponding to the differing part of the Huffman encoding of the first word on the first Huffman tree, according to
$$\theta_{j-1}^u := \theta_{j-1}^u - \eta'\big(1 - d_j^u - \sigma(v(w)^\top\theta_{j-1}^u)\big)\,v(w).$$
If the first word belongs to the newly added lexicon, stochastic gradient ascent is performed on the first word according to
$$v(w) := v(w) + \eta' \sum_{j=2}^{l'^u} \big(1 - d_j^u - \sigma(v(w)^\top\theta_{j-1}^u)\big)\,\theta_{j-1}^u,$$
to obtain the term vector corresponding to the first word; here η′ represents the learning rate.
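The stochastic-gradient-ascent step for a newly added Skip-gram word can be sketched as the standard hierarchical-softmax update; this follows the usual word2vec update rule (a sketch under that assumption, with illustrative helper names), where each inner node's vector and the word vector are adjusted by g = η′(1 − d_j − σ(v·θ)):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def skipgram_hs_step(v_w, thetas, code, lr):
    """One gradient-ascent step over one context word's Huffman path:
    g = lr * (1 - d_j - sigma(v_w . theta_j)); each inner-node vector gets
    theta_j += g * v_w, while the word vector accumulates e = sum(g * theta_j)
    and is updated once at the end (as in the word2vec reference code)."""
    e = [0.0] * len(v_w)
    for theta, d in zip(thetas, code):
        s = sigmoid(sum(a * b for a, b in zip(v_w, theta)))
        g = lr * (1 - d - s)
        for k in range(len(v_w)):
            e[k] += g * theta[k]      # accumulate the update for v(w)
            theta[k] += g * v_w[k]    # ascent on the inner-node vector
    for k in range(len(v_w)):
        v_w[k] += e[k]
    return v_w

v_w = skipgram_hs_step([0.0, 0.0], [[1.0, 0.0]], code=[0], lr=0.1)
```

The corresponding descent step used for the differing part of the old path is the same computation with the sign of g flipped.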
For the second model, the CBOW model: if the first word belongs to the old lexicon, and the encoding of the first word on the first Huffman tree and its encoding on the second Huffman tree have a common prefix, stochastic gradient ascent is performed on the vectors of the nodes corresponding to the differing part of the Huffman encoding of the first word on the second Huffman tree, according to
$$\theta_{j-1}^w := \theta_{j-1}^w + \eta'\big(1 - d_j^w - \sigma(x_w^\top\theta_{j-1}^w)\big)\,x_w;$$
and stochastic gradient descent is performed on the vectors of the nodes on the second Huffman tree corresponding to the differing part of the Huffman encoding of the first word on the first Huffman tree, according to
$$\theta_{j-1}^w := \theta_{j-1}^w - \eta'\big(1 - d_j^w - \sigma(x_w^\top\theta_{j-1}^w)\big)\,x_w.$$
If the first word belongs to the newly added lexicon, stochastic gradient ascent is performed on the first word according to
$$v(u) := v(u) + \eta' \sum_{j=2}^{l'^w} \big(1 - d_j^w - \sigma(x_w^\top\theta_{j-1}^w)\big)\,\theta_{j-1}^w,\quad u \in C(w),$$
to obtain the term vector corresponding to the first word, where $\theta_{i-1}^w$ represents the term vector of the (i−1)-th node on the first Huffman path corresponding to w.
Here η′ represents the learning rate. For example, the initial learning rate may be set to η₀ = 0.025, and every 1000 words processed, the learning rate is adjusted according to the following formula:
$$\eta = \eta_0 \Big(1 - \frac{\mathrm{word\_count\_actual}}{\mathrm{train\_words} + 1}\Big),$$
where word_count_actual represents the number of words processed so far, and train_words + 1 prevents the denominator from being zero. At the same time, a threshold η_min = 10⁻⁴ · η₀ is introduced, i.e. the minimum value of η is η_min, to prevent the learning rate from becoming too small. During incremental learning, the word counter needs to add the word count of the original corpus, and η is calculated subject to the η_min limit.
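The decay schedule and floor described above can be written directly; this matches the linear decay used in the word2vec reference implementation, with the +1 guarding the denominator and η_min = 10⁻⁴ · η₀ as the floor:

```python
def adjusted_lr(lr0, word_count_actual, train_words):
    """eta = eta0 * (1 - processed / (train_words + 1)), floored at
    eta_min = 1e-4 * eta0 so the learning rate never becomes too small."""
    eta_min = 1e-4 * lr0
    eta = lr0 * (1.0 - word_count_actual / (train_words + 1))
    return max(eta, eta_min)

start = adjusted_lr(0.025, 0, 10_000)      # nothing processed yet: full rate
late = adjusted_lr(0.025, 10_000, 10_000)  # near the end: clipped to the floor
```

For incremental training, `word_count_actual` would include the words of the original corpus, as the paragraph above notes.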
Fig. 3 is a schematic structural diagram of a term vector training device 30 according to an embodiment of the present invention. Of course, the embodiment of the present invention is described by taking Fig. 3 as an example, but this does not mean that the present invention is limited thereto. As shown in Fig. 3, the term vector training device 30 may include:
an acquisition module 301, configured to obtain a newly added lexicon, where the vocabulary in the newly added lexicon and the vocabulary in an old lexicon constitute a new lexicon, and the vocabulary in the old lexicon has corresponding old term vectors;
an initialization module 302, configured to initialize the vocabulary in the new lexicon so that the term vector of each word in the new lexicon that belongs to the old lexicon is its old term vector, and the term vector of each word in the new lexicon that belongs to the newly added lexicon is a random term vector;
an update module 303, configured to update the term vectors of the vocabulary in the new lexicon according to the first Huffman tree corresponding to the new lexicon and the second Huffman tree corresponding to the old lexicon.
Optionally, the update module 303 is specifically configured to obtain a preset objective function corresponding to a first word, where the first word is a word in the new lexicon, and to perform gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the term vector corresponding to the first word.
The term vector training device 30 shown in this embodiment of the present invention can execute the technical solution corresponding to the term vector training method shown in the foregoing method embodiments; its implementation principle and beneficial effects are similar and are not repeated here.
Persons of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments; the aforementioned storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements to some or all of the technical features therein, and such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A word vector training method, characterized by comprising:
acquiring a newly added lexicon, wherein the words in the newly added lexicon and the words in an old lexicon constitute a new lexicon, and the words in the old lexicon have existing word vectors;
initializing the words in the new lexicon so that the word vector of each word in the new lexicon that belongs to the old lexicon is its old word vector, and the word vector of each word in the new lexicon that belongs to the newly added lexicon is a random word vector;
updating the word vectors of the words in the new lexicon according to a first Huffman tree corresponding to the new lexicon and a second Huffman tree corresponding to the old lexicon.
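The acquisition and initialization steps of claim 1 can be sketched as follows (an illustrative Python sketch; the function name, the use of NumPy, and the random-initialization range are assumptions, not taken from the patent):

```python
import numpy as np

def build_new_lexicon(old_vectors, added_words, dim, seed=0):
    """Claim 1 sketch: the new lexicon is the union of the old lexicon and
    the newly added lexicon; words already in the old lexicon keep their
    existing vectors, newly added words get random initial vectors."""
    rng = np.random.default_rng(seed)
    # Old-lexicon words: reuse the trained ("old") word vectors.
    vectors = {w: np.asarray(v, dtype=float) for w, v in old_vectors.items()}
    # Newly added words: random initialization (range is illustrative).
    for w in added_words:
        if w not in vectors:
            vectors[w] = (rng.random(dim) - 0.5) / dim
    return vectors
```

Reusing the old vectors instead of retraining from scratch is what gives the incremental scheme its efficiency advantage.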
2. The method according to claim 1, characterized in that updating the word vectors of the words in the new lexicon according to the first Huffman tree corresponding to the new lexicon and the second Huffman tree corresponding to the old lexicon comprises:
obtaining a preset objective function corresponding to a first word, the first word being a word in the new lexicon;
performing gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the word vector corresponding to the first word.
3. The method according to claim 2, characterized in that obtaining the preset objective function corresponding to the first word comprises:
if the first word belongs to the old lexicon, factorizing the first word according to the original objective function of the Skip-gram model to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the Skip-gram model.
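Claims 3 and 4 select the objective by lexicon membership: an old-lexicon word gets a factorized objective (its Huffman code splits into a shared prefix and a differing suffix), while a newly added word keeps the original model objective over its whole path. A hedged sketch (all names are illustrative; the log-likelihood term follows the standard hierarchical-softmax convention, not the patent's exact formulas, which appear only as figures):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def path_log_likelihood(v_w, thetas, codes):
    """Standard hierarchical-softmax log-likelihood along one Huffman path:
    log(sigma(v_w . theta)) when the code bit is 0, log(1 - sigma) when 1."""
    total = 0.0
    for theta, d in zip(thetas, codes):
        s = sigmoid(sum(a * b for a, b in zip(v_w, theta)))
        total += math.log(s) if d == 0 else math.log(1.0 - s)
    return total

def split_path_by_lexicon(word, old_lexicon, path_codes, shared_prefix_len):
    """Old-lexicon words: split the Huffman code into the shared prefix
    (already trained) and the differing suffix (still to be updated).
    Newly added words: the whole path is trained with the original objective."""
    if word in old_lexicon:
        return path_codes[:shared_prefix_len], path_codes[shared_prefix_len:]
    return path_codes, []
```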
4. The method according to claim 2, characterized in that obtaining the preset objective function corresponding to the first word comprises:
if the first word belongs to the old lexicon, factorizing the first word according to the original objective function of the CBOW model to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the CBOW model.
5. The method according to claim 3, characterized in that factorizing the first word according to the original objective function of the Skip-gram model to obtain the preset objective function corresponding to the first word comprises:
if the first word belongs to the old lexicon, factorizing the first word according to the corresponding formula to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the Skip-gram model;
wherein w represents the first word, W represents the old lexicon, ΔW represents the newly added lexicon, C(w) represents the lexicon formed by the context words of w, u represents a context word of w, the code-length term represents the length of the Huffman code of non-leaf node w as matched on the second Huffman tree and the first Huffman tree, i indicates the i-th node of the first word on the second Huffman tree, j indicates the j-th node of the first word on the second Huffman tree, the vector term represents the word vector of the (j−1)-th node on the first Huffman path corresponding to u, the code term represents the Huffman code of the j-th node on the second Huffman path represented by u, the activation term represents the activation function, and v(w) represents the word vector corresponding to w.
6. The method according to claim 4, characterized in that factorizing the first word according to the original objective function of the CBOW model to obtain the preset objective function corresponding to the first word comprises:
if the first word belongs to the old lexicon, factorizing the first word according to the corresponding formula to obtain the preset objective function corresponding to the first word;
if the first word belongs to the newly added lexicon, the preset objective function corresponding to the first word is the original objective function of the CBOW model;
wherein the code term represents the Huffman code of the j-th node on the second Huffman path represented by w, and the sum term represents the sum of the word vectors corresponding to all words in C(w).
7. The method according to claim 5, characterized in that performing gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the word vector corresponding to the first word, comprises:
if the first word belongs to the old lexicon, and the code of the first word on the first Huffman tree and its code on the second Huffman tree have a common prefix part, performing stochastic gradient ascent processing, according to the corresponding formula, on the vectors of the nodes corresponding to the differing part of the Huffman code of the first word on the second Huffman tree, and performing stochastic gradient descent processing, according to the corresponding formula, on the vectors of the nodes on the second Huffman tree that correspond to the differing part of the Huffman code of the first word on the first Huffman tree;
if the first word belongs to the newly added lexicon, performing stochastic gradient ascent processing on the first word according to the corresponding formula, to obtain the word vector corresponding to the first word;
wherein η′ represents the learning rate.
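Claim 7's update rule (gradient ascent on one set of path nodes, gradient descent on the other) can be sketched with a generic hierarchical-softmax gradient step. This is a hedged sketch: the function names, the sigmoid activation, and the sign convention are assumptions following the usual word2vec formulation, since the patent's exact update formulas appear only as figures:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step_on_path(v_w, node_vecs, codes, eta, ascend=True):
    """One stochastic-gradient step for word vector v_w along a Huffman
    path segment (e.g. the differing part of the code).  Per node:
    g = eta * (1 - d - sigma(v_w . theta)); ascend adds the gradient,
    descend subtracts it."""
    v_w = np.array(v_w, dtype=float)
    sign = 1.0 if ascend else -1.0
    accum = np.zeros_like(v_w)
    for theta, d in zip(node_vecs, codes):
        g = eta * (1.0 - d - sigmoid(v_w @ theta))  # per-node gradient scale
        accum += g * theta                          # contribution to the word vector
        theta += sign * g * v_w                     # update the non-leaf node vector
    return v_w + sign * accum
```

Calling it with `ascend=True` on the differing nodes of the second tree and `ascend=False` on the nodes matched from the first tree's differing part mirrors the ascent/descent split in the claim.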
8. The method according to claim 6, characterized in that performing gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the word vector corresponding to the first word, comprises:
if the first word belongs to the old lexicon, and the code of the first word on the first Huffman tree and its code on the second Huffman tree have a common prefix part, performing stochastic gradient ascent processing, according to the corresponding formula, on the vectors of the nodes corresponding to the differing part of the Huffman code of the first word on the second Huffman tree, and performing stochastic gradient descent processing, according to the corresponding formula, on the vectors of the nodes on the second Huffman tree that correspond to the differing part of the Huffman code of the first word on the first Huffman tree;
if the first word belongs to the newly added lexicon, performing stochastic gradient ascent processing on the first word according to the corresponding formula, to obtain the word vector corresponding to the first word;
wherein the vector term represents the word vector of the (i−1)-th node on the first Huffman path corresponding to w.
9. A word vector training device, characterized by comprising:
an acquisition module, configured to acquire a newly added lexicon, wherein the words in the newly added lexicon and the words in an old lexicon constitute a new lexicon, and the words in the old lexicon have existing word vectors;
an initialization module, configured to initialize the words in the new lexicon so that the word vector of each word in the new lexicon that belongs to the old lexicon is its old word vector, and the word vector of each word in the new lexicon that belongs to the newly added lexicon is a random word vector;
an update module, configured to update the word vectors of the words in the new lexicon according to a first Huffman tree corresponding to the new lexicon and a second Huffman tree corresponding to the old lexicon.
10. The device according to claim 9, characterized in that the update module is specifically configured to obtain a preset objective function corresponding to a first word, the first word being a word in the new lexicon, and to perform gradient processing on the preset objective function according to the attributes of the first word in the first Huffman tree and in the second Huffman tree, to obtain the word vector corresponding to the first word.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710022458.0A CN106897265B (en) | 2017-01-12 | 2017-01-12 | Word vector training method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106897265A true CN106897265A (en) | 2017-06-27 |
CN106897265B CN106897265B (en) | 2020-07-10 |
Family
ID=59198669
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710022458.0A Active CN106897265B (en) | 2017-01-12 | 2017-01-12 | Word vector training method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106897265B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105930318A (en) * | 2016-04-11 | 2016-09-07 | 深圳大学 | Word vector training method and system |
CN106055623A (en) * | 2016-05-26 | 2016-10-26 | 《中国学术期刊(光盘版)》电子杂志社有限公司 | Cross-language recommendation method and system |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10769383B2 (en) | 2017-10-23 | 2020-09-08 | Alibaba Group Holding Limited | Cluster-based word vector processing method, device, and apparatus |
WO2019095836A1 (en) * | 2017-11-14 | 2019-05-23 | 阿里巴巴集团控股有限公司 | Method, device, and apparatus for word vector processing based on clusters |
US10846483B2 (en) | 2017-11-14 | 2020-11-24 | Advanced New Technologies Co., Ltd. | Method, device, and apparatus for word vector processing based on clusters |
CN110020303A (en) * | 2017-11-24 | 2019-07-16 | 腾讯科技(深圳)有限公司 | Determine the alternative method, apparatus and storage medium for showing content |
CN108509422A (en) * | 2018-04-04 | 2018-09-07 | 广州荔支网络技术有限公司 | A kind of Increment Learning Algorithm of term vector, device and electronic equipment |
CN108509422B (en) * | 2018-04-04 | 2020-01-24 | 广州荔支网络技术有限公司 | Incremental learning method and device for word vectors and electronic equipment |
CN110210557A (en) * | 2019-05-31 | 2019-09-06 | 南京工程学院 | A kind of online incremental clustering method of unknown text under real-time streams tupe |
CN110210557B (en) * | 2019-05-31 | 2024-01-12 | 南京工程学院 | Online incremental clustering method for unknown text in real-time stream processing mode |
CN111325026A (en) * | 2020-02-18 | 2020-06-23 | 北京声智科技有限公司 | Training method and system for word vector model |
CN111325026B (en) * | 2020-02-18 | 2023-10-10 | 北京声智科技有限公司 | Training method and system for word vector model |
US11822447B2 (en) | 2020-10-06 | 2023-11-21 | Direct Cursus Technology L.L.C | Methods and servers for storing data associated with users and digital items of a recommendation system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106897265A (en) | Term vector training method and device | |
CN103620624B (en) | For the method and apparatus causing the local competition inquiry learning rule of sparse connectivity | |
CN106802888A (en) | Term vector training method and device | |
CN108229582A (en) | Entity recognition dual training method is named in a kind of multitask towards medical domain | |
CN108665175A (en) | A kind of processing method, device and the processing equipment of insurance business risk profile | |
CN108334496A (en) | Human-computer dialogue understanding method and system and relevant device for specific area | |
CN109784474A (en) | A kind of deep learning model compression method, apparatus, storage medium and terminal device | |
Cho et al. | Exponentially increasing the capacity-to-computation ratio for conditional computation in deep learning | |
CN106980650A (en) | A kind of emotion enhancing word insertion learning method towards Twitter opinion classifications | |
CN108021908A (en) | Face age bracket recognition methods and device, computer installation and readable storage medium storing program for executing | |
CN109299264A (en) | File classification method, device, computer equipment and storage medium | |
CN107273352A (en) | A kind of word insertion learning model and training method based on Zolu functions | |
CN108197653A (en) | A kind of time series classification method based on convolution echo state network | |
CN107025463A (en) | Based on the bedroom apparatus for grouping and method for merging grouping algorithm | |
CN107194151A (en) | Determine the method and artificial intelligence equipment of emotion threshold value | |
CN109242089B (en) | Progressive supervised deep learning neural network training method, system, medium and device | |
CN114154839A (en) | Course recommendation method based on online education platform data | |
CN111324736B (en) | Man-machine dialogue model training method, man-machine dialogue method and system | |
CN110069781B (en) | Entity label identification method and related equipment | |
CN107886163A (en) | Single-object problem optimization method and device based on AGN and CNN | |
KR20180127890A (en) | Method and apparatus for user adaptive speech recognition | |
CN109871448A (en) | A kind of method and system of short text classification | |
CN110516228A (en) | Name entity recognition method, device, computer installation and computer readable storage medium | |
Tsihrintzis et al. | Surveys in artificial intelligence-based technologies | |
Shinde et al. | Mining classification rules from fuzzy min-max neural network |
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||