CN107818076A - Semantic processing for natural language - Google Patents

Semantic processing for natural language

Info

Publication number
CN107818076A
Authority
CN
China
Prior art keywords
item
quantized representation
semantic
group
item set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610818984.3A
Other languages
Chinese (zh)
Other versions
CN107818076B (en)
Inventor
Tao Qin (秦涛)
Tie-Yan Liu (刘铁岩)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC
Priority to CN201610818984.3A (patent CN107818076B)
Publication of CN107818076A
Application granted
Publication of CN107818076B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

In embodiments of the disclosure, a method and a device for semantic processing of natural language are proposed. After an item set including multiple items is obtained, one quantized representation of each item on one group of semantic dimensions and another quantized representation on another group of semantic dimensions are determined, and the semantic value of each item is then generated from the quantized representations on the two groups of dimensions. According to embodiments of the disclosure, the semantic values can be used to identify the semantic relatedness between different items in the item set, and each quantized representation can be shared by multiple items in the item set. By having multiple items share the same quantized representation, the disclosure can, during semantic processing of natural language, not only effectively reduce the size of the semantic model but also significantly increase the speed of semantic processing.

Description

Semantic processing for natural language
Background
Natural language processing refers to techniques for processing human language with a computer, enabling computers to understand human languages. A computer is trained on manually annotated or unannotated corpora to generate semantic representations of natural language. Natural language processing is an important direction in the field of machine learning and can be applied to semantic analysis, information retrieval, machine translation, language modeling, chatbots, and the like.
A recurrent neural network (RNN) is a neural network whose node connections form directed cycles, so that its internal state can express dynamic temporal sequences. Unlike ordinary acyclic neural networks, an RNN can use its internal memory to process input sequences of arbitrary length. An RNN memorizes preceding information and applies it to the computation of the current output: the nodes of the hidden layer are connected to each other, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step. RNNs are therefore well suited to predicting the word or text at the next time step, with applications such as text generation, machine translation, speech recognition, and image captioning. However, current RNN semantic processing consumes a large amount of storage space and is slow.
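To make the recurrent dependence concrete, the following is a minimal Python/NumPy sketch of a plain RNN step; the dimensions and the tanh activation are illustrative assumptions, not taken from the disclosure.

    import numpy as np

    # Minimal sketch of a recurrent step: the hidden layer receives the
    # current input AND the previous hidden state (illustrative sizes).
    n, m = 8, 16                            # input and hidden dimensions (assumed)
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(m, n))  # input-to-hidden weights
    U = rng.normal(scale=0.1, size=(m, m))  # hidden-to-hidden (recurrent) weights
    b = np.zeros(m)

    def step(x_t, h_prev):
        # h_t depends on both x_t and h_{t-1}, giving the network its memory
        return np.tanh(W @ x_t + U @ h_prev + b)

    h = np.zeros(m)
    for x_t in rng.normal(size=(5, n)):     # toy length-5 input sequence
        h = step(x_t, h)                    # h carries context across steps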
Summary
The inventors have noticed that large amounts of corpus data can be obtained from the network, and such data are easy to collect. A semantic model trained on a large corpus can cover most semantic scenarios and can therefore be applied effectively to practical natural language processing. Based on this insight, and unlike conventional methods that use a single group of multi-dimensional vectors to represent the semantic value of each item (for example, a word), embodiments of the disclosure represent the semantic value of an item using two or more groups of multi-dimensional vectors. By means of these two or more groups of multi-dimensional vectors, the model size and processing speed of the semantic model can be optimized, which differs markedly from any known solution in both working principle and mechanism.
For example, according to embodiments of the disclosure, an item set including multiple items can be obtained. Two or more subvectors are then used jointly to represent the semantic vector of each item, where the semantic vector can be used to identify the semantic relatedness between different items in the item set, and each subvector can be shared by multiple items in the item set. By having multiple items share the same subvector, embodiments of the disclosure can, during semantic processing of natural language, not only effectively reduce the size of the semantic model but also significantly increase the speed of semantic processing.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described in the detailed description below. This Summary is not intended to identify key features or essential features of the disclosure, nor is it intended to limit the scope of the disclosure.
Brief description of the drawings
The above and other features, advantages, and aspects of embodiments of the disclosure will become more apparent with reference to the accompanying drawings and the following detailed description. In the drawings, identical or similar reference numerals denote identical or similar elements, in which:
Fig. 1 shows a block diagram of a computing system/server in which one or more embodiments of the disclosure can be implemented;
Fig. 2 shows a flowchart of a method for generating semantic values according to an embodiment of the disclosure;
Figs. 3A and 3B show examples of a table for an item set according to an embodiment of the disclosure;
Fig. 4 shows a flowchart of a method for allocating multiple items in a table according to an embodiment of the disclosure;
Fig. 5 shows an example of some rows of a table for an item set according to an embodiment of the disclosure;
Fig. 6 shows a flowchart of a method for determining an associated item according to an embodiment of the disclosure; and
Fig. 7 shows an example of the prediction process of an RNN-based semantic model according to an embodiment of the disclosure.
Throughout the drawings, identical or similar reference numerals denote identical or similar elements.
Detailed description
Embodiments of the disclosure are described in more detail below with reference to the accompanying drawings. Although some embodiments of the disclosure are shown in the drawings, it should be understood that the disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided so that the disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the disclosure are for illustration only and are not intended to limit the scope of protection of the disclosure.
In general, the semantic modeling process for natural language described herein can be regarded as a machine learning process. Machine learning studies algorithms that automatically analyze data to obtain regularities and use those regularities to make predictions about unknown data. The term "semantic model" as used herein refers to a model built from prior knowledge associated with a specific language, such as its syntax, grammar, and morphology; it can be used to determine the semantic relatedness between words. The terms "training process" and "learning process" as used herein refer to the process of optimizing a semantic model using a corpus or training data. For example, a semantic model can gradually optimize the semantic value of each item through training, thereby improving the accuracy of the semantic model. In the context of the disclosure, and for convenience of discussion, the terms "training" and "learning" are used interchangeably. A "vector" as used herein is also called an "embedding vector"; it maps the semantics of each item into a multi-dimensional space to form a quantized semantic representation.
The term "include" and its variants as used herein are open-ended, that is, "include but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment". Definitions of other terms are given in the description below.
Traditionally, in a machine learning process, a semantic model is trained using a corpus over the items (such as words or other character strings) of an item set (such as a vocabulary) to determine the semantic value of each item, and the semantic value is usually represented by a vector. However, when the number of items in the corpus is too large, the size of the semantic model also becomes very large. For example, when training an RNN-based semantic model over an item set with many items, a one-hot vector must be generated whose dimension equals the number of items in the item set, and each item is usually also given its own embedding vector. When the number of items in the item set reaches the order of tens of millions, the size of the semantic model can reach tens of gigabytes (GB), far exceeding the memory size of existing computing devices (such as GPUs and mobile computing devices, including mobile phones and tablet computers), so that existing computing devices cannot train the semantic model. Traditional semantic modeling methods therefore make the size of the semantic model excessively large.
A traditional method generates, for each item in the vocabulary, one high-dimensional vector representing its semantic value. For example, Table 1 shows a traditional vector representation of items: a vector x1 is generated for the item "January", a vector x2 for the item "February", and so on, where the vectors x1, x2, etc. are the semantic representations of the corresponding items in a multi-dimensional space. As is well known, the distance between vectors can express the degree of semantic association between items: the shorter the distance between two vectors, the higher the relatedness between the two items. For example, since the semantic relatedness of "January" and "one" is greater than that of "January" and "two", the distance between the vectors x1 and x5 is smaller than the distance between the vectors x1 and x6.
Table 1

  Vector    Item
  x1        January
  x2        February
  x5        one
  x6        two
In addition, in an RNN-based semantic model, when predicting the item that follows the current item, the probability of every item in the item set must be computed. When the number of items in the item set reaches the order of tens of millions, the processing speed of the semantic model therefore becomes very slow. Even with the fastest processors currently available, completing the training process over such a huge item set could take up to decades. Traditional semantic modeling methods therefore require very long training times.
The present disclosure therefore proposes a semantic modeling method based on multiple groups of dimensions. According to embodiments of the disclosure, the semantic value of each item is generated using quantized representations on two or more groups of dimensions, so that a quantized representation on each group of dimensions can be shared by multiple items in the item set. As an example, Table 2 shows a vector representation of items according to an embodiment of the disclosure, in which each item is represented jointly by two vectors: the item "January" is represented jointly by the vectors x1 and y1, and the item "February" by the vectors x1 and y2, where x1 and x2 are quantized representations on the first group of dimensions and y1 and y2 are quantized representations on the second group of dimensions.
Table 2

  Vector      Item
  (x1, y1)    January
  (x1, y2)    February
  (x2, y1)    one
  (x2, y2)    two
In this way, multiple items share the same quantized representation; for example, the items "January" and "February" share the vector x1 on the first group of dimensions. During semantic modeling for natural language, the disclosure can therefore not only effectively reduce the size of the semantic model but also significantly increase its processing speed.
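As an illustration of the sharing in Table 2, the following Python sketch builds each item's semantic value from two shared subvectors; the item names follow Table 2, while the vector values and dimensions are made-up placeholders.

    import numpy as np

    dim = 4
    x = {1: np.full(dim, 0.1), 2: np.full(dim, 0.9)}   # first-group vectors
    y = {1: np.full(dim, 0.2), 2: np.full(dim, 0.8)}   # second-group vectors

    table = {                  # item -> (first-group index, second-group index)
        "January":  (1, 1),
        "February": (1, 2),    # shares x[1] with "January"
        "one":      (2, 1),    # shares y[1] with "January"
        "two":      (2, 2),
    }

    def semantic_value(item):
        i, j = table[item]
        return np.concatenate([x[i], y[j]])   # joint vector (x_i, y_j)

    print(semantic_value("January"))

Only two first-group vectors and two second-group vectors are stored here, yet they jointly distinguish all four items; this sharing is the source of the model-size reduction discussed below.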
The basic principles of the disclosure and some example implementations are explained below with reference to Figs. 1 to 7. Fig. 1 shows a block diagram of a computing system/server 100 in which one or more embodiments of the disclosure can be implemented. It should be understood that the computing system/server 100 shown in Fig. 1 is merely an example and should not impose any limitation on the function and scope of the embodiments described herein.
As shown in Fig. 1, the computing system/server 100 takes the form of a general-purpose computing device. Its components can include, but are not limited to, one or more processors or processing units 110, a memory 120, a storage device 130, one or more communication units 140, one or more input devices 150, and one or more output devices 160. The processing unit 110 can be a real or virtual processor and can perform various processes according to programs stored in the memory 120. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to increase the parallel processing capability of the computing system/server 100.
The computing system/server 100 typically includes multiple computer storage media. Such media can be any available media accessible to the computing system/server 100, including but not limited to volatile and non-volatile media and removable and non-removable media. The memory 120 can be volatile memory (such as registers, cache, or random access memory (RAM)), non-volatile memory (such as read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), or flash memory), or some combination thereof. The storage device 130 can be a removable or non-removable medium and can include machine-readable media, such as a flash drive, a magnetic disk, or any other medium that can be used to store information and/or data (such as the data 170) and that can be accessed within the computing system/server 100.
The computing system/server 100 may further include additional removable/non-removable, volatile/non-volatile storage media. Although not shown in Fig. 1, a disk drive for reading from or writing to a removable, non-volatile magnetic disk and an optical disc drive for reading from or writing to a removable, non-volatile optical disc can be provided. In these cases, each drive can be connected to a bus (not shown) via one or more data media interfaces. The memory 120 can include one or more program products 122 having one or more sets of program modules configured to perform the functions of the various embodiments described herein.
The communication unit 140 communicates with other computing devices over communication media. Additionally, the functions of the components of the computing system/server 100 can be realized in a single computing cluster or in multiple computing machines able to communicate with each other. The computing system/server 100 can therefore operate in a networked environment using logical connections to one or more other servers, network personal computers (PCs), or another general network node.
The input device 150 can be one or more of various input devices, such as a mouse, a keyboard, or a trackball. The output device 160 can be one or more output devices, such as a display, a loudspeaker, or a printer. As needed, the computing system/server 100 can also communicate through the communication unit 140 with one or more external devices (not shown), such as storage devices and display devices; with one or more devices that enable a user to interact with the computing system/server 100; or with any device (such as a network card or a modem) that enables the computing system/server 100 to communicate with one or more other computing devices. Such communication can be performed via an input/output (I/O) interface (not shown).
As shown in Fig. 1, data 170, including training data 172 (such as a corpus), are stored in the storage device 130. The computing system/server 100 can be trained using the training data so as to output a training result 180 through the output device 160. Logically, the training result 180 can be regarded as one or more tables that include the quantized representations (such as vectors) of each item in the training data on multiple groups of dimensions. Of course, it should be noted that outputting the training result 180 in the form of a table is merely an example and is not intended to limit the scope of the disclosure in any way; any other additional or alternative data format is feasible.
Some example embodiments of generating the table 180 based on the training data 172 are described in detail below. Fig. 2 shows a flowchart of a method 200 for generating semantic values according to an embodiment of the disclosure. It should be understood that the method 200 can be performed by the processing unit 110 described with reference to Fig. 1.
At 202, an item set including multiple items is obtained. For example, a corpus can be obtained as training data from the storage device 130, the communication unit 140, the input device 150, a network, or the like. A corpus stores linguistic data that has actually occurred in real language use and typically includes large, curated collections of text. In the corpus, each occurrence of an item is treated as a token. In some embodiments, the tokens can be deduplicated, and the deduplicated tokens used as the multiple items in the item set. In some embodiments, the item set can include: words, such as the English "one" and "January" or Chinese words (for example, the Chinese words for "year" and "hour"); phrases, such as "day-and-a-half"; and/or other character strings, for example numbers, URLs, and combinations of digits and letters.
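A minimal sketch of this step, assuming simple whitespace tokenization (the disclosure does not prescribe a particular tokenizer):

    # Every occurrence in the corpus is a token; the deduplicated tokens
    # form the item set (words, phrases, or other character strings).
    corpus = [
        "one January of a year",
        "one February of a year",
    ]
    tokens = [tok for line in corpus for tok in line.split()]
    item_set = sorted(set(tokens))       # deduplicated tokens form the item set
    print(len(tokens), len(item_set))    # token count vs. distinct items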
At 204, for each item in the item set, a quantized representation on one group of semantic dimensions (referred to as the "first group of semantic dimensions"), called the "first quantized representation", and a quantized representation on another group of semantic dimensions (referred to as the "second group of semantic dimensions"), called the "second quantized representation", are determined.
In some embodiments, the quantized representations can be implemented as vectors. For convenience of description, such embodiments are mainly described below. Of course, it should be understood that this is merely an example and is not intended to limit the scope of the disclosure in any way; any other appropriate data format is also feasible. In addition, for ease of distinction, a vector corresponding to a quantized representation is referred to as a "subvector". For example, the first quantized representation may be implemented as a first subvector x, and the second quantized representation as a second subvector y.
The term "a group of semantic dimensions" as used herein denotes a multi-dimensional space of a certain dimensionality. Correspondingly, the quantized representation of an item is a quantization of the item's semantic value in that multi-dimensional space, where each dimension expresses a sub-semantic value of the item in that dimension. Optionally, the first and second groups of semantic dimensions can have the same dimensionality, for example 1000 or 1024 dimensions, and the first group of semantic dimensions can correspond to the rows of a table while the second group corresponds to its columns. For example, a 1024-dimensional first subvector x and a 1024-dimensional second subvector y can be determined for each item in the item set. Example embodiments of determining the first and second subvectors in action 204 are further described below with reference to Fig. 4.
At 206, the semantic value (for example, a vector) of each item is generated based at least on the first quantized representation and the second quantized representation; for example, the semantic value of each item can be expressed as the vector (x, y). According to embodiments of the disclosure, the generated vector can be used to identify the semantic relatedness between the item and other items in the item set. As described above, the distance between vectors can express the degree of semantic association between items: the shorter the distance, the higher the relatedness between the two items. In other words, the semantic relation between items is quantized through their vectors.
For example, "image" and "picture" express similar meanings, so the vector distance between them is relatively short. As another example, because the relation of "Beijing" to "China" is similar to the relation of "Washington" to "the U.S.", the distance between the vector for "Beijing" and the vector for "China" is approximately equal to the distance between the vector for "Washington" and the vector for "the U.S.".
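The following sketch illustrates distance as a relatedness measure, using made-up toy vectors (in practice the vectors come from training):

    import numpy as np

    vec = {
        "image":   np.array([0.90, 0.10, 0.00]),
        "picture": np.array([0.85, 0.15, 0.05]),
        "two":     np.array([0.00, 0.20, 0.90]),
    }

    def dist(a, b):
        return np.linalg.norm(vec[a] - vec[b])   # Euclidean distance

    assert dist("image", "picture") < dist("image", "two")
    # Analogies behave like approximate vector offsets:
    # vec["Beijing"] - vec["China"] is close to vec["Washington"] - vec["U.S."]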
In addition, as shown in Table 2 above, the first subvector x and/or the second subvector y are shared by multiple items in the item set; that is, multiple items share the first subvector x, and multiple items share the second subvector y.
It should be understood that although the method 200 only shows the semantic value of each item being represented by two subvectors, representing the semantic value of each item by more than two subvectors is also possible. For example, the semantic value of each item can be represented by three subvectors (x, y, z), where z denotes a further quantized representation of the item on a third group of semantic dimensions. In other words, the scope of protection of the disclosure is not limited by the number of subvectors. Moreover, although the method 200 determines a vector for every item in the item set, the vectors may also be determined for only a subset of the item set.
In some embodiments, the subvectors on the first group of semantic dimensions and the subvectors on the second group of semantic dimensions can be represented in the form of a table; other data formats are also feasible. Referring to Fig. 3A, a table 300 for an item set according to an embodiment of the disclosure is shown. Example embodiments of generating the table 300 are described in detail below with reference to Fig. 4. In the embodiments described herein, a vector $x^r$ and a vector $x^c$ in the table 300 jointly represent one item; the vectors $x^r$ and $x^c$ are therefore collectively called subvectors, or respectively called the row vector and the column vector.
As shown in Fig. 3A, all items in the item set are organized into the table 300, with each position in the table 300 corresponding to one item. For example, in the table 300, the vector of the word "January" is represented jointly by the subvector $x^r_1$ of the first row and the subvector $x^c_1$ of the first column, and the vector of the word "February" is represented jointly by the subvector $x^r_1$ of the first row and the subvector $x^c_2$ of the second column. It can be seen that the subvector $x^r_1$ of the first row is shared at least by the words "January" and "February". In other words, the subvector of all items in the i-th row on the first group of semantic dimensions is $x^r_i$, and the subvector of all items in the j-th column on the second group of semantic dimensions is $x^c_j$; the word in the i-th row and j-th column of the table 300 is therefore represented jointly by the subvectors $x^r_i$ and $x^c_j$.
Fig. 3B shows a table 350 for an item set according to an embodiment of the disclosure. As shown in Fig. 3B, in the table 350 the word at the position of the i-th row and j-th column is represented jointly by the subvector $x^r_i$ of the i-th row and the subvector $x^c_j$ of the j-th column, that is, by the vector $(x^r_i, x^c_j)$, where $x^r_i$ and $x^c_j$ are high-dimensional (for example, 1024-dimensional) vectors. For example, according to embodiments of the disclosure, the vector of the word "January" can be expressed as $(x^r_1, x^c_1)$, and the vector of the word "February" as $(x^r_1, x^c_2)$.
Traditionally, for an item set with V items, one vector must be generated for each item to represent its semantic value, so V vectors are needed in total. By contrast, as described in the method 200 and Figs. 3A-3B, the method according to embodiments of the disclosure needs only about $\sqrt{V}$ row vectors and $\sqrt{V}$ column vectors, that is, $2\sqrt{V}$ vectors in total. In some cases, when the numbers of rows and columns of the table 300 are unequal, the number of vectors is slightly larger than $2\sqrt{V}$; but when V is on the order of tens of millions, the number of vectors in the disclosure is far smaller than that of traditional methods. The disclosure can therefore effectively reduce the number of vectors and thereby the size of the semantic model; for example, the disclosure can reduce the size of the semantic model from a traditional 80 gigabytes (GB) to 70 megabytes (MB).
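A back-of-envelope check of this size reduction, under assumed parameters (1024-dimensional float32 embeddings and a roughly square table); the exact 80 GB and 70 MB figures in the disclosure presumably also count other model parameters:

    import math

    V, dim, bytes_per_float = 10_000_000, 1024, 4

    conventional = V * dim * bytes_per_float                  # one vector per item
    shared = 2 * math.ceil(math.sqrt(V)) * dim * bytes_per_float

    print(f"conventional: {conventional / 2**30:.1f} GiB")    # ~38 GiB
    print(f"shared table: {shared / 2**20:.1f} MiB")          # ~25 MiB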
Fig. 4 shows a flowchart of a method 400 for allocating multiple items in a table according to an embodiment of the disclosure. The method 400 describes how the items are accurately allocated in a table (such as the table 300) based on training data, and how the vector values of each row and each column of the table are determined. It should be understood that the method 400 can be a sub-action of the action 204 in the method 200 described above with reference to Fig. 2 and can be performed by the processing unit 110 described with reference to Fig. 1.
At 402, the multiple items in the item set are organized into a table such that items in the same row of the table 300 have the same row vector and items in the same column have the same column vector. In some embodiments, items with the same prefix can be initially allocated to the same row of the table, and items with the same suffix initially allocated to the same column. In some embodiments, for English words or character strings, the prefix denotes the front part of the word or string and the suffix its rear part; for example, the words "react", "real", and "return" are allocated to the same row, and the words "billion", "million", and "trillion" to the same column.
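A minimal sketch of this initial allocation, assuming a square table and crude 3-character prefix/suffix keys; the grouping rule is illustrative, as the disclosure only requires that same-prefix items start in a shared row and same-suffix items in a shared column:

    import math

    items = ["react", "real", "return", "billion", "million", "trillion",
             "one", "two", "three"]
    side = math.ceil(math.sqrt(len(items)))       # table is side x side

    items.sort(key=lambda w: (w[:3], w[-3:]))     # cluster by prefix, then suffix
    table = [items[r * side:(r + 1) * side] for r in range(side)]
    for row in table:
        print(row)                                # "react"/"real"/"return" share a row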
Next, at 404-408, the allocation positions of the items in the table 300 are adjusted using the training data set. Specifically, at 404, the vectors are updated based on the allocation positions. For example, based on the existing allocation positions, all row vectors and all column vectors in the table 300 are trained using the corpus until the row vectors and/or the column vectors converge. For instance, the vector values of the similar items "image" and "picture" can be updated so that their vector-space distance becomes shorter. All or some of the row vectors $x^r$ and column vectors $x^c$ in the table 300 can thus be updated; for example, the value of a row vector may be updated from (0.025, 0.365, 0.263, ...) to (0.012, 0.242, 0.347, ...).
At 406, the allocation positions are adjusted based on the updated vectors. In some embodiments, based on the values of all row vectors and column vectors in the table 300 determined at 404, the allocation positions of all items in the table 300 can be adjusted using the corpus so that the loss function over all items is minimized. A loss function is a form of optimization objective used to learn or train a model; it expresses a measure of the overall loss to be optimized, where the loss can reflect factors such as classification error. The allocation positions can be adjusted by minimizing the negative log-likelihood of the next word in a sequence. For example, over a context of T words, the overall negative log-likelihood NLL can be expressed by formula (1):

$$\mathrm{NLL} = \sum_{w} \mathrm{NLL}_w \qquad (1)$$

where $\mathrm{NLL}_w$ denotes the negative log-likelihood for a particular word $w$.
For ease of description, $\mathrm{NLL}_w$ can be written as $l(w, r(w), c(w))$, where $(r(w), c(w))$ denotes the row and column position of the word $w$ in the table 300. Using $l_r(w, r(w))$ and $l_c(w, c(w))$ to denote the row loss and the column loss of the word $w$, the negative log-likelihood $\mathrm{NLL}_w$ of the word $w$ can be rewritten as formula (2):

$$\mathrm{NLL}_w = l(w, r(w), c(w)) = l_r(w, r(w)) + l_c(w, c(w)) \qquad (2)$$

where $S_w$ denotes the set of all positions of the word $w$ in the corpus, over which the row loss and the column loss are accumulated.
The position of the word $w$ in the table 300 can then be adjusted so that the loss function NLL is minimized. For example, suppose the word $w$ is moved from its original position $(r(w), c(w))$ to a new position $(i, j)$ in the table 300. Assuming that the allocation positions of the other words stay unchanged, the row loss $l_r(w, i)$ and the column loss $l_c(w, j)$ can be computed separately, and $l(w, i, j)$ is then defined, following formula (2), as $l_r(w, i) + l_c(w, j)$. Minimizing the loss function can then be converted into formula (3):

$$\min_{a} \sum_{w} \sum_{(i, j) \in S_r \times S_c} l(w, i, j)\, a(w, i, j) \qquad (3)$$

where $a(w, i, j) = 1$ indicates that the word $w$ is allocated to the new position $(i, j)$ in the table 300, and $S_r$ and $S_c$ denote the sets of rows and columns of the table, respectively.
The problem of minimizing the above loss function is equivalent to a standard minimum-weight perfect matching problem and can therefore be solved, for example, by a minimum-cost maximum-flow (MCMF) algorithm. Any MCMF method currently known or developed in the future can be used in combination with embodiments of the invention; the scope of the invention is not limited in this respect. Through the MCMF method, the allocation position of each item in the table can thus be adjusted using the corpus, so that the allocation positions of all items in the item set become more accurate.
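A minimal sketch of this reallocation step. The disclosure solves a minimum-weight perfect matching (for example via min-cost max-flow); here SciPy's Hungarian-style assignment solver stands in for it, and the loss matrix is a random placeholder for the corpus-derived l(w, i, j):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    V, cols = 9, 3                          # toy vocabulary, 3 x 3 table
    rng = np.random.default_rng(1)
    loss = rng.random((V, V))               # loss[w, p]: word w at flattened position p

    words, positions = linear_sum_assignment(loss)   # minimizes the total loss
    rows, col_idx = positions // cols, positions % cols
    print(list(zip(words, rows, col_idx)))           # new (row, column) per word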
At 408, it is judged whether a convergence condition is met. For example, the convergence condition can be that the number of iterations has reached a predetermined number. Alternatively or additionally, the convergence condition can be that the iteration time has reached a predetermined time. As a further example, the convergence condition can be that the loss function starts to converge, that is, its rate of change falls below a change threshold. If the convergence condition is not met, the actions 404-408 are repeated until it is. If the convergence condition has been met, then at 410 the allocation positions of the multiple items in the table 300 and the values of all row vectors and column vectors in the table 300 are stored for subsequent use.
In the above manner, all row vectors and column vectors in the table are trained using the corpus, and the allocation positions of all items in the table are adjusted using the corpus, so that every item in the item set can be allocated to the most suitable position in the table. The values of all row vectors and column vectors trained according to embodiments of the disclosure are therefore also very accurate, which effectively guarantees the accuracy of the semantic model.
Fig. 5 shows an example of some rows of a table for an item set according to an embodiment of the disclosure. After several rounds of iterative adjustment, for example, the items in the same row and/or the same column of the table 300 are semantically related, or related both semantically and grammatically. For example, in row 832 (510), the allocated items are place names; in row 887 (520), the allocated items are URLs; and in row 889 (530), the allocated items are expressions of time concepts. The semantic modeling method according to embodiments of the disclosure can therefore effectively discover the semantic relatedness between items, or the semantic and grammatical associations between items, and adjust the allocation position of each item in the table according to semantics. Accordingly, embodiments of the disclosure can not only guarantee the accuracy of the semantic model but also reduce its size.
Fig. 6 shows a flowchart of a method 600 for determining an associated item according to an embodiment of the disclosure. According to the method 600, another item (also called the "second item") associated with a given item (also called the "first item") can be determined. It should be understood that the method 600 can be performed after the action 206 of the method 200 described above with reference to Fig. 2 and can be performed by the processing unit 110 described with reference to Fig. 1.
At 602, a third quantized representation on the first group of semantic dimensions that is associated with the item is determined based at least on the second quantized representation. For example, suppose the first item is the word "January"; then the first quantized representation is the subvector $x^r_1$ and the second quantized representation is the subvector $x^c_1$. Based on the subvector $x^c_1$ and the current hidden-state vector of the RNN, the subvector of the row most related to the first item (for example, the row vector of the second row) is determined among all rows of the table 300. Then, at 604, a fourth quantized representation on the second group of semantic dimensions that is associated with the item is determined based at least on the third quantized representation. For example, based on the row vector determined at 602 and the current hidden-state vector of the RNN, the subvector of the column most related to the first item (for example, the column vector of the first column) is determined among all columns of the table 300. It should be understood that although in this example the row vector of the other item is determined first and its column vector second, the column vector can also be determined first and the row vector second.
At 606, the other item is determined according to the third quantized representation and the fourth quantized representation. For example, based on the row vector $x^r_2$ and the column vector $x^c_1$, the other item can be determined to be the item in the second row and first column of the table 300, namely the word "one". In some embodiments, the first item can be the current word in a sentence; based on the semantic vector of the current word, the next word to appear after the current word in the sentence can then be predicted.
Traditionally, for an item set with V items, determining the other item most related to a first item requires determining the correlation of every item in the item set with the first item and then selecting the other item according to all those correlations, which takes a total time of $V \cdot t_0$, where $t_0$ is the time to determine the correlation of one item with the first item. By contrast, the method according to embodiments of the disclosure organizes the V items into roughly $\sqrt{V}$ rows and $\sqrt{V}$ columns, first selects one row from the $\sqrt{V}$ rows and then one column from the $\sqrt{V}$ columns, and thus needs only about $2\sqrt{V} \cdot t_0$ in total. In some cases, when the numbers of rows and columns of the table 300 are unequal, the total time to determine the other item is slightly larger than $2\sqrt{V} \cdot t_0$; but when V is on the order of tens of millions, the processing time according to embodiments of the disclosure is far smaller than that of traditional methods, reducing the processing time, for example, from decades to several days or even several hours. The disclosure can therefore effectively increase the processing speed of the semantic model.
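A minimal sketch of this two-stage selection with random stand-in scores; in the full model, the column scores at the second stage use a hidden state updated with the chosen row vector, per formula (4) below:

    import numpy as np

    side, m = 1000, 16                       # table side (so V = side**2), hidden size
    rng = np.random.default_rng(2)
    h = rng.normal(size=m)                   # current hidden state (toy)
    Yr = rng.normal(size=(side, m))          # one output vector per row
    Yc = rng.normal(size=(side, m))          # one output vector per column

    best_row = int(np.argmax(Yr @ h))        # stage 1: sqrt(V) row scores
    best_col = int(np.argmax(Yc @ h))        # stage 2: sqrt(V) column scores
    print("predicted position:", (best_row, best_col))   # ~2*sqrt(V) work, not V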
Fig. 7 shows an example of an RNN-based prediction process according to an embodiment of the disclosure. For example, the RNN-based prediction process described in Fig. 7 can be used to predict the word at the current position t of a sentence based on the data at the previous position t-1. The hidden-state vector can be split into a hidden row-state vector $h^r$ and a hidden column-state vector $h^c$, and $w_t$ denotes the word at position t. Let n be the dimension of the input row and column vectors of the RNN semantic model, and m the dimension of the hidden-state vectors of the RNN semantic model, where the column input vector at position t-1 is $x^c_{t-1} \in \mathbb{R}^n$, the row input vector at position t is $x^r_t \in \mathbb{R}^n$, and the hidden row-state vector at position t-1 is $h^r_{t-1} \in \mathbb{R}^m$. The hidden column-state vector $h^c_{t-1}$ at position t-1 and the hidden row-state vector $h^r_t$ at position t can then be determined by formula (4):

$$h^c_{t-1} = f(W x^c_{t-1} + U h^r_{t-1} + b), \qquad h^r_t = f(W x^r_t + U h^c_{t-1} + b) \qquad (4)$$
where W, U, and b denote affine transformation parameters with $W \in \mathbb{R}^{m \times n}$, $U \in \mathbb{R}^{m \times m}$, and $b \in \mathbb{R}^m$, and f denotes a nonlinear activation function of the neural network (for example, the sigmoid function). In addition, the row vectors and column vectors are both drawn from input embedding matrices $X^r \in \mathbb{R}^{n \times |S_r|}$ and $X^c \in \mathbb{R}^{n \times |S_c|}$.
The row probability $P_r(w_t)$ of the word w at position t and the column probability $P_c(w_t)$ of the word w at position t can then be determined by formula (5):

$$P_r(w_t) = \frac{\exp\big(h^c_{t-1} \cdot y^r_{r(w)}\big)}{\sum_{i \in S_r} \exp\big(h^c_{t-1} \cdot y^r_i\big)}, \qquad P_c(w_t) = \frac{\exp\big(h^r_t \cdot y^c_{c(w)}\big)}{\sum_{i \in S_c} \exp\big(h^r_t \cdot y^c_i\big)} \qquad (5)$$

where $r(w)$ denotes the row index of the word w and $c(w)$ its column index, $y^r_i \in \mathbb{R}^m$ is the i-th vector of $Y^r \in \mathbb{R}^{m \times |S_r|}$, $y^c_i \in \mathbb{R}^m$ is the i-th vector of $Y^c \in \mathbb{R}^{m \times |S_c|}$, and $S_r$ and $S_c$ denote the sets of rows and columns of the table, respectively.
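A minimal sketch of formulas (4) and (5) with toy sizes and random stand-ins for the trained parameters (tanh is used here in place of the unspecified activation f):

    import numpy as np

    n, m, Sr, Sc = 8, 16, 5, 5               # input/hidden dims, table rows/columns
    rng = np.random.default_rng(3)
    W = rng.normal(scale=0.1, size=(m, n))
    U = rng.normal(scale=0.1, size=(m, m))
    b = np.zeros(m)
    Yr = rng.normal(size=(m, Sr))            # output vectors y^r_i, one per row
    Yc = rng.normal(size=(m, Sc))            # output vectors y^c_i, one per column
    f = np.tanh

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    x_c_prev = rng.normal(size=n)            # column input at position t-1
    x_r_t = rng.normal(size=n)               # row input at position t
    h_r_prev = np.zeros(m)                   # hidden row state at position t-1

    h_c_prev = f(W @ x_c_prev + U @ h_r_prev + b)   # formula (4), first equation
    h_r_t = f(W @ x_r_t + U @ h_c_prev + b)         # formula (4), second equation

    P_r = softmax(h_c_prev @ Yr)             # formula (5): probability of each row
    P_c = softmax(h_r_t @ Yc)                # formula (5): probability of each column
    # P(w_t) factorizes as P_r[r(w)] * P_c[c(w)]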
The probability that the next word appears in each row can thus be determined by formula (5), followed by the probability that it appears in each column, so that the word at the maximum-probability row and maximum-probability column position is determined as the most probable next word.
It can be seen that the hidden column-state vector $h^c_{t-1}$ at position t-1 can be determined from the column vector $x^c_{t-1}$ at position t-1 and the hidden row-state vector $h^r_{t-1}$ at position t-1; based on $h^c_{t-1}$, the row probability $P_r(w_t)$ of the word w at position t can then be determined. In other words, the row vector of the current word can be determined based at least on the column vector of the previous word. Likewise, the hidden row-state vector $h^r_t$ at position t can be determined from the row vector $x^r_t$ at position t and the hidden column-state vector $h^c_{t-1}$ at position t-1; based on $h^r_t$, the column probability $P_c(w_t)$ of the word w at position t can be determined. In other words, the column vector of the current word is determined based at least on the row vector of the current word. For an item set with V items, determining the probability of a word by computing the per-row and per-column probabilities separately therefore reduces the time to predict the next item from $V \cdot t_0$ to about $2\sqrt{V} \cdot t_0$, effectively increasing the processing speed of the RNN semantic model.
The methods and functions described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that can be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), and complex programmable logic devices (CPLDs).
Program code for implementing the disclosed methods can be written in any combination of one or more programming languages. The program code can be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, carries out the functions/operations specified in the flowcharts and/or block diagrams. The program code can execute entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.
In the context of this disclosure, a machine-readable medium can be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Furthermore, although the operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the discussion above contains several specific implementation details, these should not be construed as limitations on the scope of the disclosure. Certain features described in the context of separate implementations can also be realized in combination in a single implementation. Conversely, various features described in the context of a single implementation can also be realized in multiple implementations separately or in any suitable subcombination.
Some example implementations of the disclosure are listed below.
The disclosure can be implemented as an electronic device that includes a processing unit and a memory, the memory being coupled to the processing unit and storing instructions that, when executed by the processing unit, perform the following actions: obtaining an item set including multiple items; determining a first quantized representation of an item in the item set on a first group of semantic dimensions and a second quantized representation on a second group of semantic dimensions; and generating a semantic value of the item based at least on the first quantized representation and the second quantized representation, where the semantic value can be used to determine the semantic relatedness between the item and other items in the item set, and at least one of the first quantized representation and the second quantized representation is shared by the item and at least one other item in the item set.
In some embodiments, the semantic value is represented by a vector, the first quantized representation is represented by a first subvector, and the second quantized representation is represented by a second subvector.
In some embodiments, the actions further include: determining, based on the semantic value of the item, another item in the item set associated with the item.
In some embodiments, the item is a first word in a sentence, the other item is a second word that will appear after the first word in the sentence, and determining the other item includes: predicting the second word based on the semantic value of the first word.
In some embodiments, determining the other item includes: determining, based at least on the second quantized representation, a third quantized representation on the first group of semantic dimensions that is associated with the item; determining, based at least on the third quantized representation, a fourth quantized representation on the second group of semantic dimensions that is associated with the item; and determining the other item according to the third quantized representation and the fourth quantized representation.
In some embodiments, determining the first quantized representation and the second quantized representation includes: organizing the multiple items in the item set into a table such that items in the same row of the table have the same quantized representation on the first group of semantic dimensions and items in the same column have the same quantized representation on the second group of semantic dimensions; and adjusting the allocation positions of the items in the table using a training data set.
In some embodiments, organizing the multiple items in the item set into the table includes: allocating items with the same prefix to the same row of the table, the prefix denoting the front part of an item; and allocating items with the same suffix to the same column of the table, the suffix denoting the rear part of an item.
In some embodiments, adjusting the allocation positions of the items in the table using the training data set includes: iteratively performing the following operations at least once until a convergence condition is satisfied, the convergence condition relating to the iteration time, the number of iterations, or the parameter change of the trained model: updating the first quantized representation and the second quantized representation based on the allocation positions; and adjusting the allocation positions based on the updated first quantized representation and the updated second quantized representation.
The disclosure can be implemented as a computer-implemented method that includes: obtaining an item set including multiple items; determining a first quantized representation of an item in the item set on a first group of semantic dimensions and a second quantized representation on a second group of semantic dimensions; and generating a semantic value of the item based at least on the first quantized representation and the second quantized representation, where the semantic value can be used to determine the semantic relatedness between the item and other items in the item set, and at least one of the first quantized representation and the second quantized representation is shared by the item and at least one other item in the item set.
In some embodiments, the semantic value is represented by a vector, the first quantized representation is represented by a first subvector, and the second quantized representation is represented by a second subvector.
In some embodiments, the method further includes: determining, based on the semantic value of the item, another item in the item set associated with the item.
In some embodiments, the item is a first word in a sentence, the other item is a second word that will appear after the first word in the sentence, and determining the other item includes: predicting the second word based on the semantic value of the first word.
In some embodiments, determining the other item includes: determining, based at least on the second quantized representation, a third quantized representation on the first group of semantic dimensions that is associated with the item; determining, based at least on the third quantized representation, a fourth quantized representation on the second group of semantic dimensions that is associated with the item; and determining the other item according to the third quantized representation and the fourth quantized representation.
In some embodiments, determining the first quantized representation and the second quantized representation includes: organizing the multiple items in the item set into a table such that items in the same row of the table have the same quantized representation on the first group of semantic dimensions and items in the same column have the same quantized representation on the second group of semantic dimensions; and adjusting the allocation positions of the items in the table using a training data set.
In some embodiments, organizing the multiple items in the item set into the table includes: allocating items with the same prefix to the same row of the table, the prefix denoting the front part of an item; and allocating items with the same suffix to the same column of the table, the suffix denoting the rear part of an item.
In some embodiments, adjusting the allocation positions of the items in the table using the training data set includes: iteratively performing the following operations at least once until a convergence condition is satisfied, the convergence condition relating to the iteration time, the number of iterations, or the parameter change of the trained model: updating the first quantized representation and the second quantized representation based on the allocation positions; and adjusting the allocation positions based on the updated first quantized representation and the updated second quantized representation.
The disclosure can be implemented as a computer program product stored in a non-transitory computer storage medium and including machine-executable instructions that, when run in a device, cause the device to: obtain an item set including multiple items; determine a first quantized representation of an item in the item set on a first group of semantic dimensions and a second quantized representation on a second group of semantic dimensions; and generate a semantic value of the item based at least on the first quantized representation and the second quantized representation, where the semantic value can be used to determine the semantic relatedness between the item and other items in the item set, and at least one of the first quantized representation and the second quantized representation is shared by the item and at least one other item in the item set.
In some embodiments, the semantic value is represented by a vector, the first quantized representation is represented by a first subvector, and the second quantized representation is represented by a second subvector.
In some embodiments, the machine-executable instructions, when run in the device, further cause the device to: determine, based on the semantic value of the item, another item in the item set associated with the item.
In some embodiments, the item is a first word in a sentence, the other item is a second word that will appear after the first word in the sentence, and determining the other item includes: predicting the second word based on the semantic value of the first word.
In some embodiments, determining the other item includes: determining, based at least on the second quantized representation, a third quantized representation on the first group of semantic dimensions that is associated with the item; determining, based at least on the third quantized representation, a fourth quantized representation on the second group of semantic dimensions that is associated with the item; and determining the other item according to the third quantized representation and the fourth quantized representation.
In some embodiments, determining the first quantized representation and the second quantized representation includes: organizing the multiple items in the item set into a table such that items in the same row of the table have the same quantized representation on the first group of semantic dimensions and items in the same column have the same quantized representation on the second group of semantic dimensions; and adjusting the allocation positions of the items in the table using a training data set.
In some embodiments, organizing the multiple items in the item set into the table includes: allocating items with the same prefix to the same row of the table, the prefix denoting the front part of an item; and allocating items with the same suffix to the same column of the table, the suffix denoting the rear part of an item.
In some embodiments, adjusting the allocation positions of the items in the table using the training data set includes: iteratively performing the following operations at least once until a convergence condition is satisfied, the convergence condition relating to the iteration time, the number of iterations, or the parameter change of the trained model: updating the first quantized representation and the second quantized representation based on the allocation positions; and adjusting the allocation positions based on the updated first quantized representation and the updated second quantized representation.
Although the disclosure has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely exemplary forms of implementing the claims.

Claims (20)

1. An electronic device, comprising:
a processing unit; and
a memory coupled to the processing unit and storing instructions that, when executed by the processing unit, perform the following actions:
obtaining an item set including multiple items;
determining a first quantized representation of an item in the item set on a first group of semantic dimensions and a second quantized representation on a second group of semantic dimensions; and
generating a semantic value of the item based at least on the first quantized representation and the second quantized representation, the semantic value being usable to determine the semantic relatedness between the item and other items in the item set, at least one of the first quantized representation and the second quantized representation being shared by the item and at least one other item in the item set.
2. equipment according to claim 1, wherein the semantic values, by vector representation, first quantization means are by first Subvector is represented, and second quantization means are represented by the second subvector.
3. equipment according to claim 1, the action also includes:
The semantic values based on the project, determine another project associated with the project in the Item Sets.
4. equipment according to claim 3, wherein the project is the first word in sentence, another project is institute The second word after first word will be appeared in by stating in sentence, and determine that another project includes:
Second word is predicted based on the semantic values of first word.
5. equipment according to claim 3, wherein determining that another project includes:
Second quantization means are at least based on, it is determined that on first group of semantic dimension, associated with the project 3rd quantization means;
The 3rd quantization means are at least based on, it is determined that on second group of semantic dimension, associated with the project 4th quantization means;And
Another project is determined according to the 3rd quantization means and the 4th quantization means.
6. equipment according to claim 1, wherein determining that first quantization means and second quantization means include:
By the multiple project organization in the Item Sets into table so that the project being in a line in the table has Identical quantization means on first group of semantic dimension, and the project in same row has in second group of semantic dimensions Identical quantization means on degree;And
The dispensing position of the project in the table is adjusted using training dataset.
7. equipment according to claim 6, wherein the multiple project organization in the Item Sets is included into table:
By the same a line of the allocation of items with identical prefix in the table, the prefix represents the previous portion of the project Point;And
By the same row of the allocation of items with identical suffix in the table, the latter portion of project described in the postfix notation Point.
8. equipment according to claim 6, wherein adjusting point of the project in the table using training dataset Include with position:
Iteration performs following operation at least once, until the condition of convergence is satisfied, the condition of convergence and at least one following phase Close:The Parameters variation of iteration time, iterations and training pattern:
Based on the dispensing position, first quantization means and second quantization means are updated;And
Based on first quantization means after renewal and second quantization means, the dispensing position is adjusted.
9. A computer-implemented method, comprising:
obtaining an item set including a plurality of items;
determining a first quantized representation of an item in the item set on a first group of semantic dimensions and a second quantized representation of the item on a second group of semantic dimensions; and
generating a semantic value of the item based at least on the first quantized representation and the second quantized representation, the semantic value being usable to determine semantic relevance between the item and other items in the item set, at least one of the first quantized representation and the second quantized representation being shared by the item and at least one other item in the item set.
10. The method according to claim 9, wherein the semantic value is represented by a vector, the first quantized representation is represented by a first sub-vector, and the second quantized representation is represented by a second sub-vector.
11. The method according to claim 9, further comprising:
determining, based on the semantic value of the item, a further item in the item set that is associated with the item.
12. The method according to claim 11, wherein the item is a first word in a sentence and the further item is a second word that is to appear after the first word in the sentence, and wherein determining the further item comprises:
predicting the second word based on the semantic value of the first word.
13. The method according to claim 11, wherein determining the further item comprises:
determining, based at least on the second quantized representation, a third quantized representation on the first group of semantic dimensions that is associated with the item;
determining, based at least on the third quantized representation, a fourth quantized representation on the second group of semantic dimensions that is associated with the item; and
determining the further item according to the third quantized representation and the fourth quantized representation.
14. The method according to claim 9, wherein determining the first quantized representation and the second quantized representation comprises:
organizing the plurality of items in the item set into a table such that items in a same row of the table have an identical quantized representation on the first group of semantic dimensions and items in a same column have an identical quantized representation on the second group of semantic dimensions; and
adjusting allocation positions of the items in the table using a training dataset.
15. The method according to claim 14, wherein organizing the plurality of items in the item set into the table comprises:
allocating items having an identical prefix to a same row of the table, the prefix representing a front portion of an item; and
allocating items having an identical suffix to a same column of the table, the suffix representing a rear portion of an item.
16. The method according to claim 14, wherein adjusting the allocation positions of the items in the table using the training dataset comprises:
iteratively performing the following operations at least once, until a convergence condition is satisfied, the convergence condition being related to at least one of an iteration time, a number of iterations, and a parameter variation of a trained model:
updating the first quantized representation and the second quantized representation based on the allocation positions; and
adjusting the allocation positions based on the updated first quantized representation and the updated second quantized representation.
17. A computer program product, the computer program product being stored in a non-transitory computer storage medium and including machine-executable instructions which, when run in a device, cause the device to:
obtain an item set including a plurality of items;
determine a first quantized representation of an item in the item set on a first group of semantic dimensions and a second quantized representation of the item on a second group of semantic dimensions; and
generate a semantic value of the item based at least on the first quantized representation and the second quantized representation, the semantic value being usable to determine semantic relevance between the item and other items in the item set, at least one of the first quantized representation and the second quantized representation being shared by the item and at least one other item in the item set.
18. The computer program product according to claim 17, wherein the machine-executable instructions, when run in the device, further cause the device to:
determine, based at least on the second quantized representation, a third quantized representation on the first group of semantic dimensions that is associated with the item;
determine, based at least on the third quantized representation, a fourth quantized representation on the second group of semantic dimensions that is associated with the item; and
determine, according to the third quantized representation and the fourth quantized representation, a further item in the item set that is associated with the item.
19. The computer program product according to claim 17, wherein determining the first quantized representation and the second quantized representation comprises:
organizing the plurality of items in the item set into a table such that items in a same row of the table have an identical quantized representation on the first group of semantic dimensions and items in a same column have an identical quantized representation on the second group of semantic dimensions; and
adjusting allocation positions of the items in the table using a training dataset.
20. The computer program product according to claim 19, wherein adjusting the allocation positions of the items in the table using the training dataset comprises:
iteratively performing the following operations at least once, until a convergence condition is satisfied, the convergence condition being related to at least one of an iteration time, a number of iterations, and a parameter variation of a trained model:
updating the first quantized representation and the second quantized representation based on the allocation positions; and
adjusting the allocation positions based on the updated first quantized representation and the updated second quantized representation.
CN201610818984.3A 2016-09-12 2016-09-12 Semantic processing for natural language Active CN107818076B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610818984.3A CN107818076B (en) 2016-09-12 2016-09-12 Semantic processing for natural language

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610818984.3A CN107818076B (en) 2016-09-12 2016-09-12 Semantic processing for natural language

Publications (2)

Publication Number Publication Date
CN107818076A (en) 2018-03-20
CN107818076B CN107818076B (en) 2021-11-12

Family

ID=61601158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610818984.3A Active CN107818076B (en) 2016-09-12 2016-09-12 Semantic processing for natural language

Country Status (1)

Country Link
CN (1) CN107818076B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1310825A (en) * 1998-06-23 2001-08-29 微软公司 Methods and apparatus for classifying text and for building a text classifier
CN101796511A (en) * 2007-08-31 2010-08-04 微软公司 Identification of semantic relationships within reported speech
CN102646091A (en) * 2011-02-22 2012-08-22 日电(中国)有限公司 Dependence relationship labeling method, device and system
US20130151441A1 (en) * 2011-12-13 2013-06-13 Xerox Corporation Multi-task learning using bayesian model with enforced sparsity and leveraging of task correlations
CN105051721A (en) * 2013-01-29 2015-11-11 微软技术许可有限责任公司 Translating natural language descriptions to programs in a domain-specific language for spreadsheets
US20150074027A1 (en) * 2013-09-06 2015-03-12 Microsoft Corporation Deep Structured Semantic Model Produced Using Click-Through Data
CN104765728A (en) * 2014-01-08 2015-07-08 富士通株式会社 Method and device for training neural network and method for determining sparse feature vector
CN103955714A (en) * 2014-04-09 2014-07-30 中国科学院信息工程研究所 Navy detection model construction method and system and navy detection method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
TOMAS MIKOLOV ET AL.: "Recurrent neural network based language model", INTERSPEECH 2010 *
WENLIN CHEN ET AL.: "Strategies for Training Large Vocabulary Neural Language Models", arXiv:1512.04906v1 *
CHEN ENHONG ET AL.: "Word embedding: continuous-space representation of natural language" (in Chinese), Journal of Data Acquisition and Processing *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807207A (en) * 2019-10-30 2020-02-18 腾讯科技(深圳)有限公司 Data processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN107818076B (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
CN104615767B Training method, search processing method and device for a search ranking model
CN106663124A (en) Generating and using a knowledge-enhanced model
CN108628825A Text information similarity matching method, device, computer equipment and storage medium
CN105404632B (en) System and method for carrying out serialized annotation on biomedical text based on deep neural network
CN111191002B (en) Neural code searching method and device based on hierarchical embedding
CN110036399A (en) Neural Network Data input system
Kreutzer et al. Bandit structured prediction for neural sequence-to-sequence learning
CN109829162A Text segmentation method and device
Chien et al. Bayesian sparse topic model
US11373043B2 (en) Technique for generating and utilizing virtual fingerprint representing text data
CN113961666B (en) Keyword recognition method, apparatus, device, medium, and computer program product
CN113656547A (en) Text matching method, device, equipment and storage medium
CN112434134A (en) Search model training method and device, terminal equipment and storage medium
de Souza et al. A deep learning approach for sentiment analysis applied to hotel’s reviews
CN115329075A (en) Text classification method based on distributed machine learning
US20220284172A1 (en) Machine learning technologies for structuring unstructured data
CN113723077B (en) Sentence vector generation method and device based on bidirectional characterization model and computer equipment
CN112667797B (en) Question-answer matching method, system and storage medium for self-adaptive transfer learning
CN107818076A Semantic processing for natural language
WO2020215581A1 (en) Chinese coding method and apparatus based on bidirectional long short-term memory network model
CN112989801B (en) Sequence labeling method, device and equipment
CN115906861A (en) Statement emotion analysis method and device based on interaction aspect information fusion
CN109871414A (en) Biomedical entity relationship classification method based on the context vector kernel of graph
CN115983269A (en) Intelligent community data named entity identification method, terminal and computer medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant