CN109284510A - Text processing method, text processing system, and device for text processing - Google Patents
Text processing method, text processing system, and device for text processing
- Publication number: CN109284510A
- Application number: CN201710602815.0A
- Authority
- CN
- China
- Prior art keywords
- word
- source
- information
- decoding
- hidden layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
An embodiment of the invention provides a text processing method, a text processing system, and a device for text processing. The method comprises: receiving a source text, the source text having multiple source words; calling an encoder to encode the multiple source words into multiple vectors; when decoding the t-th target word, determining the center point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word; determining the local attention window based on the center point of the local attention window; and calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window. By comprehensively considering multiple sources of information, the accuracy of locating the center of attention is improved, thereby improving the quality of services such as translation.
Description
Technical field
The present invention relates to the technical field of language processing, and in particular to a text processing method, a text processing system, and a device for text processing.
Background technique
Machine translation, also known as automatic translation, uses the computational capability of a computer program to automatically convert one language into another; the former is called the source language and the latter the target language.
Currently, machine translation commonly uses a local attention model, which is an improvement on the attention model. In existing local attention methods, when predicting each target-language word, a feedforward neural network predicts a center of attention, and the attention within a window of a given size around that center point is used to compute the target-language word.
However, the feedforward neural network makes little use of the encoder's information, so the accuracy of locating the center of attention is low, resulting in poor translation quality.
Summary of the invention
In view of the above problems, and in order to solve the low accuracy of locating the center of attention, embodiments of the present invention propose a text processing method, a corresponding text processing system, and a device for text processing.
To solve the above problems, an embodiment of the invention discloses a text processing method, comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the center point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word;
determining the local attention window based on the center point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the step of determining the center point of the local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word comprises:
obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used for the other target words decoded before the t-th target word;
combining the first hidden-layer state, the second hidden-layer state, and the matrix connection to determine the position in the source text where attention is concentrated, as the center point of the local attention window.
Optionally, the step of obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used for the other target words decoded before the t-th target word comprises:
extracting first word information of the j-th source word and of the source words after the j-th source word, recorded when the source text is input in forward order;
extracting second word information of the j-th source word and of the source words before the j-th source word, recorded when the source text is input in reverse order;
combining the first word information and the second word information, and converting them into the first hidden-layer state of the encoder;
and/or
extracting the multiple weight matrices used for the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
Optionally, the step of combining the first hidden-layer state, the second hidden-layer state, and the matrix connection to determine the position in the source text where attention is concentrated, as the center point of the local attention window, comprises:
configuring a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state, and the matrix connection;
combining the weight-configured first hidden-layer state, second hidden-layer state, and matrix connection to obtain feature information;
applying a nonlinear activation to the feature information and configuring a weight matrix, to obtain activation information;
applying a nonlinear transformation to the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source text to obtain the center point of the local attention window.
Optionally, the step of determining the local attention window based on the center point of the local attention window comprises:
computing the difference between the center point and a preset center offset value as a first endpoint value;
computing the sum of the center point and the preset center offset value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
An embodiment of the invention also discloses a text processing system, comprising:
a source text receiving module for receiving a source text, the source text having multiple source words;
a vector encoding module for calling an encoder to encode the multiple source words into multiple vectors;
a center point determining module for, when decoding the t-th target word, determining the center point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word;
a local attention window determining module for determining the local attention window based on the center point of the local attention window;
a vector decoding module for calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the center point determining module comprises:
a reference information acquisition submodule for obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used for the other target words decoded before the t-th target word;
a reference information determining submodule for combining the first hidden-layer state, the second hidden-layer state, and the matrix connection to determine the position in the source text where attention is concentrated, as the center point of the local attention window.
Optionally, the reference information acquisition submodule comprises:
a first word information extraction unit for extracting first word information of the j-th source word and of the source words after the j-th source word, recorded when the source text is input in forward order;
a second word information extraction unit for extracting second word information of the j-th source word and of the source words before the j-th source word, recorded when the source text is input in reverse order;
a word information combining and converting unit for combining the first word information and the second word information and converting them into the first hidden-layer state of the encoder;
and/or
a weight matrix extraction unit for extracting the multiple weight matrices used for the other target words decoded before the t-th target word;
a weight matrix mapping unit for mapping the multiple weight matrices into weight matrices of a specified format;
a weight matrix addition unit for adding the weight matrices of the specified format to obtain the matrix connection.
Optionally, the reference information determining submodule comprises:
a weight matrix configuration unit for configuring a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state, and the matrix connection;
a reference information combining unit for combining the weight-configured first hidden-layer state, second hidden-layer state, and matrix connection to obtain feature information;
a nonlinear activation unit for applying a nonlinear activation to the feature information and configuring a weight matrix, to obtain activation information;
a nonlinear conversion unit for applying a nonlinear transformation to the activation information to obtain a feature value;
a rounding-down unit for rounding down the product of the feature value and the word length of the source text to obtain the center point of the local attention window.
Optionally, the local attention window determining module comprises:
a first endpoint value setting submodule for computing the difference between the center point and a preset center offset value as a first endpoint value;
a second endpoint value setting submodule for computing the sum of the center point and the preset center offset value as a second endpoint value;
a local attention window setting submodule for setting the span between the first endpoint value and the second endpoint value as the local attention window.
An embodiment of the invention also discloses a device for text processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the center point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word;
determining the local attention window based on the center point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the one or more processors are further configured to execute the one or more programs including instructions for:
obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used for the other target words decoded before the t-th target word;
combining the first hidden-layer state, the second hidden-layer state, and the matrix connection to determine the position in the source text where attention is concentrated, as the center point of the local attention window.
Optionally, the one or more processors are further configured to execute the one or more programs including instructions for:
extracting first word information of the j-th source word and of the source words after the j-th source word, recorded when the source text is input in forward order;
extracting second word information of the j-th source word and of the source words before the j-th source word, recorded when the source text is input in reverse order;
combining the first word information and the second word information, and converting them into the first hidden-layer state of the encoder;
and/or
extracting the multiple weight matrices used for the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
Optionally, the one or more processors are further configured to execute the one or more programs including instructions for:
configuring a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state, and the matrix connection;
combining the weight-configured first hidden-layer state, second hidden-layer state, and matrix connection to obtain feature information;
applying a nonlinear activation to the feature information and configuring a weight matrix, to obtain activation information;
applying a nonlinear transformation to the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source text to obtain the center point of the local attention window.
Optionally, the one or more processors are further configured to execute the one or more programs including instructions for:
computing the difference between the center point and a preset center offset value as a first endpoint value;
computing the sum of the center point and the preset center offset value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
Embodiments of the present invention include the following advantages:
The embodiment of the present invention introduces a local attention model into the encoder-decoder architecture. An encoder is called to encode the multiple source words of a received source text into multiple vectors. When decoding the t-th target word, the center point of a local attention window is determined according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word; the local attention window is determined from that center point; and a decoder is called to decode the vectors located within the local attention window into the t-th target word. The encoding state helps locate, within the source text, the position suited to the encoding; the decoding state helps locate the position where attention should concentrate when decoding the t-th target word; and the earlier center points help reduce the attention concentrated on positions already focused on before decoding the t-th target word and increase the attention on other positions in the source text. By comprehensively considering this information, the accuracy of locating the center of attention is improved, thereby improving the quality of services such as translation.
Detailed description of the invention
Fig. 1 is a flow chart of the steps of a text processing method according to an embodiment of the present invention;
Fig. 2 is a structural block diagram of a text processing system according to an embodiment of the present invention;
Fig. 3 is a block diagram of a device for text processing according to an exemplary embodiment;
Fig. 4 is a structural schematic diagram of a server in an embodiment of the present invention.
Specific embodiment
To make the above objectives, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flow chart of the steps of a text processing method according to an embodiment of the invention is shown; the method may specifically include the following steps:
Step 101: receive a source text.
The source text is the text on which a service (such as translation) is to be performed; in general, the source text has multiple source words. Correspondingly, the words produced after the service is performed are called target words.
It should be noted that, for the purposes of such services, a source word or target word denotes one unit of text: a punctuation mark, a number, a Chinese character, a phrase, or an English word may each be called a one-unit word.
Step 102: call an encoder to encode the multiple source words into multiple vectors.
In a concrete implementation, embodiments of the present invention may apply the Encoder-Decoder framework.
The Encoder-Decoder framework has an encoder and a decoder: the encoder may be used to convert the input sequence into a vector of fixed length, and the decoder may be used to convert that fixed vector into the output sequence.
The Encoder-Decoder framework can be applied to services such as translation, document extraction, and question answering. For example, in translation, the input sequence (i.e., the source text) is the text to be translated, in the first language, and the output sequence (i.e., the target words) is the translated text, in the second language; in question answering, the input sequence is the question posed and the output sequence is the answer.
It should be noted that the specific models used for the encoder and decoder may be chosen by those skilled in the art according to the actual situation, for example, CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), BiRNN (Bidirectional Recurrent Neural Network), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory), Deep LSTM, etc. These models may also be combined according to the actual situation, for example, a CNN encoder with an RNN decoder, or an RNN encoder with an RNN decoder, etc.; embodiments of the present invention are not limited in this respect.
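As a minimal illustration of the framework described above, the sketch below encodes a sequence of source-word vectors into per-step hidden states with a plain RNN; the tanh update and all weight shapes are illustrative assumptions, not the patent's specified model.

```python
import numpy as np

def rnn_encode(embeddings, W_h, W_x, b):
    # Fold a sequence of source-word vectors into hidden states;
    # the final state is the fixed-length summary of the input.
    h = np.zeros(W_h.shape[0])
    states = []
    for x in embeddings:
        h = np.tanh(W_h @ h + W_x @ x + b)
        states.append(h)
    return h, np.stack(states)

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 8))            # 10 source words, 8-dim embeddings
W_h = rng.normal(size=(16, 16))
W_x = rng.normal(size=(16, 8))
final, states = rnn_encode(emb, W_h, W_x, np.zeros(16))
```

A decoder of the same shape can then consume `final` (or, with attention, the whole `states` sequence) to produce target words.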
Step 103: when decoding the t-th target word, determine the center point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word.
In embodiments of the present invention, a local attention model is introduced into the Encoder-Decoder framework; the local attention model is a variant of the attention model.
The attention model is a soft alignment model. During a service such as translation, before each target word is generated, an attention alignment model is computed; this alignment model indicates which source words in the source text "attention" concentrates on when generating the current target word (the corresponding parts of the weight matrix have large probability values).
In the attention model, although "attention" falls on certain source words when each target word is generated, the other source words of the source text also receive corresponding probability, which may cause attention to be insufficiently concentrated. The local attention model ignores the source words outside a window, so that attention is more concentrated.
It should be noted that the local attention model does not require the encoder to encode all of the input information into one fixed-length vector. Rather, the encoder encodes the input sequence into a sequence of vectors, and during decoding each step selectively chooses a subset of that vector sequence for further processing. In this way, when generating each output, the information carried by the input sequence can be fully used.
In a concrete implementation, when the decoder decodes the t-th target word (t is a positive integer) at time t, it may determine the center point of the local attention window with reference to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word.
The encoding state helps locate, within the source text, the position suited to the encoding where attention is concentrated.
The decoding state when decoding the t-th target word helps locate, within the source text, the position where attention should concentrate when decoding the t-th target word.
The center points used before decoding the t-th target word help reduce the attention concentrated on those earlier center points and increase the attention on the positions other than them.
In one embodiment of the invention, step 103 may include the following sub-steps:
Sub-step S11: obtain one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used for the other target words decoded before the t-th target word.
1. The encoding state may be represented by the first hidden-layer state of the encoder.
In a concrete implementation, on the one hand, the first word information of the j-th source word (j is a positive integer) and of the source words after it, recorded when the source text is received in forward order, is extracted.
On the other hand, the second word information of the j-th source word and of the source words before it, recorded when the source text is received in reverse order, is extracted.
The first word information and the second word information are combined and converted into the first hidden-layer state of the encoder.
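One way to sketch this combination is to concatenate, for each source word j, the state from the forward reading (first word information) with the state from the reverse reading (second word information), in the manner of a bidirectional encoder; the concatenation itself is an assumption about how the two are "combined".

```python
import numpy as np

def first_hidden_state(fwd_states, bwd_states, j):
    # fwd_states[j]: recorded reading the text in forward order
    # bwd_states[j]: recorded reading the text in reverse order
    return np.concatenate([fwd_states[j], bwd_states[j]])

fwd = np.arange(12.0).reshape(4, 3)    # 4 source words, 3-dim forward states
bwd = -np.arange(12.0).reshape(4, 3)   # matching backward states
h_s = first_hidden_state(fwd, bwd, 2)  # state for the 3rd source word
```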
2. The decoding state when decoding the t-th target word may be represented by the second hidden-layer state of the decoder at that time.
In a concrete implementation, the hidden-layer state when decoding the (t-1)-th target word, the (t-1)-th target word, and the content vector may be extracted and converted by a function into the decoding state when decoding the t-th target word.
The content vector is obtained by adding, according to weights, the hidden vector sequence produced during encoding.
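A sketch of this recurrence: the content vector is a weighted sum of the encoder's hidden vectors, and the second hidden-layer state at step t is a function of the previous state, the previous target word, and that content vector. The single tanh layer below is a hypothetical stand-in for the unspecified conversion function.

```python
import numpy as np

def content_vector(enc_states, weights):
    # Weighted sum of the hidden vector sequence from encoding.
    return weights @ enc_states

def decoder_state(h_prev, y_prev, c, W_h, W_y, W_c):
    # s_t = f(s_{t-1}, y_{t-1}, c_t); f assumed to be a tanh layer here.
    return np.tanh(W_h @ h_prev + W_y @ y_prev + W_c @ c)

rng = np.random.default_rng(1)
enc = rng.normal(size=(5, 4))   # 5 source positions, 4-dim encoder states
w = np.full(5, 0.2)             # uniform attention weights, sum to 1
c = content_vector(enc, w)
h_t = decoder_state(np.zeros(4), rng.normal(size=4), c,
                    rng.normal(size=(4, 4)), rng.normal(size=(4, 4)),
                    rng.normal(size=(4, 4)))
```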
3. The center points used before decoding the t-th target word may be represented by the matrix connection of the weight matrices used for the other target words decoded before the t-th target word.
In a concrete implementation, the multiple weight matrices used for the other target words decoded before the t-th target word may be extracted.
Since different weight matrices have different dimensions, to facilitate adding them, the multiple weight matrices may be mapped into weight matrices of a specified format.
The weight matrices of the specified format are added to obtain the matrix connection.
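The mapping-then-adding step might look as follows, where past attention weights of differing lengths are brought into one "specified format" (here, a fixed length via zero-padding, a hypothetical choice) and summed into the matrix connection.

```python
import numpy as np

def to_specified_format(w, size):
    # Map a weight vector of any length to a fixed length (zero-padded).
    out = np.zeros(size)
    n = min(len(w), size)
    out[:n] = np.asarray(w, dtype=float)[:n]
    return out

def matrix_connection(past_weights, size):
    # Add the reformatted weights of all previously decoded target words.
    return sum(to_specified_format(w, size) for w in past_weights)

att = matrix_connection([[0.7, 0.3], [0.1, 0.6, 0.3]], size=4)
```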
Sub-step S12: combine the first hidden-layer state, the second hidden-layer state, and the matrix connection to determine the position in the source text where attention is concentrated, as the center point of the local attention window.
In embodiments of the present invention, one or more factors among the first hidden-layer state, the second hidden-layer state, and the matrix connection are comprehensively considered to determine the position in the source text where attention is concentrated, as the center point of the local attention window.
In one example of an embodiment of the present invention, sub-step S12 may include the following sub-steps:
Sub-step S121: configure a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state, and the matrix connection.
Sub-step S122: combine the weight-configured first hidden-layer state, second hidden-layer state, and matrix connection to obtain feature information.
Sub-step S123: apply a nonlinear activation to the feature information and configure a weight matrix, to obtain activation information.
Sub-step S124: apply a nonlinear transformation to the activation information to obtain a feature value.
Sub-step S125: round down the product of the feature value and the word length of the source text to obtain the center point of the local attention window.
When the first hidden-layer state, the second hidden-layer state, and the matrix connection are used together, the center point of the local attention window may be computed by the following formula (reconstructed here from the symbol descriptions, since the original rendering is not reproduced):

mid = Floor(|S| · sigmoid(v_p · tanh(W_pt·h_t + W_ps·h_s + W_a·Att_<t)))

where mid denotes the center point of the local attention window; the Floor() function rounds down; |S| denotes the word length of the source text; the tanh function performs the nonlinear activation and the sigmoid function performs the nonlinear transformation; v_p, W_pt, W_ps, W_a denote the four weight matrices; h_t denotes the second hidden-layer state of the decoder at time t (when decoding the t-th target word); h_s denotes the first hidden-layer state of the encoder; and Att_<t denotes the matrix connection of the weight matrices of all times before time t (covering the weight matrices configured for the first hidden-layer state, the second hidden-layer state, the matrix connection, and the nonlinear activation function).
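The center-point computation described by the symbols above can be sketched directly; all vector and matrix shapes here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def window_center(h_t, h_s, att, v_p, W_pt, W_ps, W_a, src_len):
    # mid = Floor(|S| * sigmoid(v_p . tanh(W_pt h_t + W_ps h_s + W_a att)))
    feat = np.tanh(W_pt @ h_t + W_ps @ h_s + W_a @ att)  # feature information
    value = sigmoid(v_p @ feat)                          # feature value in (0, 1)
    return int(np.floor(src_len * value))

rng = np.random.default_rng(2)
d, src_len = 6, 10
mid = window_center(rng.normal(size=d), rng.normal(size=d), rng.normal(size=d),
                    rng.normal(size=d), rng.normal(size=(d, d)),
                    rng.normal(size=(d, d)), rng.normal(size=(d, d)), src_len)
```

Because sigmoid maps into (0, 1), the result always lands inside the source text's length, matching the claim that the part beyond the source text is ignored.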
Step 104: determine the local attention window based on the center point of the local attention window.
In a concrete implementation, once the center point of the local attention window is determined, the region within a certain range of the center point can be used as the local attention window.
In one embodiment of the invention, step 104 may include the following sub-steps:
Sub-step S21: compute the difference between the center point and a preset center offset value as a first endpoint value.
Sub-step S22: compute the sum of the center point and the preset center offset value as a second endpoint value.
Sub-step S23: set the span between the first endpoint value and the second endpoint value as the local attention window.
In embodiments of the present invention, a center offset value can be preset, i.e., a value by which to deviate from the center point of the local attention window.
It should be noted that the center offset value can be a default value or can be computed according to the situation of the source text; embodiments of the present invention are not limited in this respect.
Assuming the center point is mid and the center offset value is w, the local attention window is:
[mid-w, mid+w]
Furthermore, regarding the above formula for the center point of the local attention window: since the sigmoid function converts any real number into a real number in (0, 1), the center point mid is an integer in (1, |S|), and the part beyond the source text is therefore ignored.
If the difference between the center point and the center offset value is less than 0, 0 is taken as the first endpoint value.
If the sum of the center point and the center offset value is greater than the word length |S| of the source text, |S| is taken as the second endpoint value.
The local attention window is then:
[max(0, mid-w), min(|S|, mid+w)]
where the min function takes the smaller value and the max function takes the larger value.
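The clamped window can be written as a small helper; treating the endpoints as inclusive word indices is an assumption.

```python
def local_window(mid, w, src_len):
    # [max(0, mid - w), min(|S|, mid + w)]
    return max(0, mid - w), min(src_len, mid + w)

inner = local_window(5, 2, 10)   # window fully inside the text
left = local_window(1, 3, 10)    # clipped at the left boundary
right = local_window(9, 3, 10)   # clipped at the right boundary
```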
Step 105: call a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
In the local attention model, the attention of the source words located within the local attention window to the t-th target word can be computed, and the decoder is called to decode the vectors into the t-th target word according to the source words within the local attention window and their configured attention.
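Restricting attention to the window can be sketched as a softmax over only the in-window encoder states, with the content vector for the t-th target word formed from those states alone; scoring by dot product is an assumed choice, not the patent's specified scoring.

```python
import numpy as np

def attend_in_window(enc_states, h_t, lo, hi):
    # Score only source words inside [lo, hi]; words outside are ignored,
    # which concentrates the attention as the local model intends.
    window = enc_states[lo:hi + 1]
    scores = window @ h_t
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights, weights @ window   # attention weights, content vector

rng = np.random.default_rng(3)
enc = rng.normal(size=(10, 4))         # encoder states for 10 source words
weights, ctx = attend_in_window(enc, rng.normal(size=4), lo=3, hi=7)
```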
The embodiment of the present invention introduces a local attention model into the encoder-decoder architecture. An encoder is called to encode the multiple source words of a received source text into multiple vectors. When decoding the t-th target word, the center point of a local attention window is determined according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the center points used before decoding the t-th target word; the local attention window is determined from that center point; and a decoder is called to decode the vectors located within the local attention window into the t-th target word. The encoding state helps locate, within the source text, the position suited to the encoding; the decoding state helps locate the position where attention should concentrate when decoding the t-th target word; and the earlier center points help reduce the attention concentrated on positions already focused on before decoding the t-th target word and increase the attention on other positions in the source text. By comprehensively considering this information, the accuracy of locating the center of attention is improved, thereby improving the quality of services such as translation.
In order to enable those skilled in the art to better understand the embodiments of the present invention, an illustration is given below by way of a translation example.
Assume the source text is the Chinese sentence glossed as "I | am | Chinese | person | , | like | eat | Chinese | dish | .", where "|" is the separator between source words; counting the punctuation marks, the source text contains 10 source words.
A human translator would render it as the English sentence "I am a Chinese, I like eating Chinese food."
A traditional local attention model may generate the following translation:
I am a Chinese, eating food.
At the 6th moment, that is, when generating "eating", the central point of the attention window is calculated as 7 (the source word "eat"); "eating" is produced, but the source word "like" is missed.
The local attention model of the embodiment of the present invention generates the following translation:
I am a Chinese, like eating Chinese food.
At the 6th moment, that is, when generating "like", the central point of the attention window is calculated as 6 (the source word "like"); "like" is produced, which improves the quality of the translation.
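The numbers in the corrected example can be checked against the window formula from the previous step; the half-width w = 2 is an assumed illustrative value (the patent leaves it as a preset center deviation value):

```python
# Positions follow the example: source word 6 is "like", source word 7 is "eat".
S = 10          # word length |S| of the source text
mid, w = 6, 2   # center from the improved model; w is an assumed half-width
window = (max(0, mid - w), min(S, mid + w))
print(window)   # (4, 8): position 6 ("like") lies inside the window
```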
It should be noted that, for simplicity of description, the method embodiments are described as a series of action combinations. However, those skilled in the art should understand that embodiments of the present invention are not limited by the described sequence of actions, because according to the embodiments of the present invention, some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 2, a structural block diagram of a text processing system according to an embodiment of the present invention is shown, which may specifically include the following modules:
a source text receiving module 201, for receiving a source text, the source text having multiple source words;
a vector encoding module 202, for calling an encoder to encode the multiple source words into multiple vectors;
a central point determining module 203, for determining, when decoding the t-th target word, the central point of the local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
a local attention window determining module 204, for determining the local attention window based on the central point of the local attention window;
a vector decoding module 205, for calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
In an embodiment of the present invention, the central point determining module 203 includes:
a reference information acquisition submodule, for acquiring one or more of the following information: the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
a reference information determining submodule, for determining, in combination with the first hidden layer state, the second hidden layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
In an embodiment of the present invention, the reference information acquisition submodule includes:
a first word information extraction unit, for extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and the source words located after the j-th source word;
a second word information extraction unit, for extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and the source words located before the j-th source word;
a word information combination conversion unit, for combining the first word information and the second word information and converting them into the first hidden layer state of the encoder;
and/or
a weight matrix extraction unit, for extracting the multiple weight matrices of the other target words decoded before the t-th target word;
a weight matrix mapping unit, for mapping the multiple weight matrices into weight matrices of multiple specified formats;
a weight matrix addition unit, for adding the weight matrices of the multiple specified formats to obtain the matrix connection.
In an embodiment of the present invention, the reference information determining submodule includes:
a weight matrix configuration unit, for configuring weight matrices respectively for one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
a reference information combining unit, for combining the one or more pieces of information, configured with weight matrices, among the first hidden layer state, the second hidden layer state and the matrix connection, to obtain feature information;
a nonlinear activation unit, for performing nonlinear activation on the feature information and configuring a weight matrix, to obtain activation information;
a nonlinear conversion unit, for performing nonlinear transformation on the activation information, to obtain a feature value;
a rounding-down unit, for rounding down the product of the feature value and the word length of the source text, to obtain the central point of the local attention window.
In an embodiment of the present invention, the local attention window determining module 204 includes:
a first endpoint value setting submodule, for calculating the difference between the central point and a preset center deviation value, as the first endpoint value;
a second endpoint value setting submodule, for calculating the sum of the central point and the preset center deviation value, as the second endpoint value;
a local attention window setting submodule, for setting the span between the first endpoint value and the second endpoint value as the local attention window.
As for the system in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method, and will not be elaborated here.
The embodiment of the present invention introduces a local attention model into the encoder-decoder architecture. The encoder is called to encode the multiple source words in the received source text into multiple vectors. When decoding the t-th target word, the central point of the local attention window is determined according to one or more of the following pieces of information: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word. The local attention window is determined from this central point, and the decoder is called to decode the vectors located within the local attention window into the t-th target word. This helps to find the position in the source text that suits the encoding state and the decoding state when decoding the t-th target word, and also helps to keep attention from staying concentrated on the central points used before decoding the t-th target word, raising the attention given to positions in the source text other than those earlier central points. By comprehensively considering multiple kinds of information, the accuracy of locating the center of attention is improved, thereby improving the quality of business processing such as translation.
Fig. 3 is a block diagram of a device 300 for text processing according to an exemplary embodiment. For example, the device 300 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to Fig. 3, the device 300 may include one or more of the following components: a processing component 302, a memory 304, a power supply component 306, a multimedia component 308, an audio component 310, an input/output (I/O) interface 312, a sensor component 314, and a communication component 316.
The processing component 302 generally controls the overall operations of the device 300, such as operations associated with display, telephone calls, data communication, camera operation and recording operation. The processing component 302 may include one or more processors 320 to execute instructions, so as to perform all or part of the steps of the methods described above. In addition, the processing component 302 may include one or more modules to facilitate interaction between the processing component 302 and other components. For example, the processing component 302 may include a multimedia module to facilitate interaction between the multimedia component 308 and the processing component 302.
The memory 304 is configured to store various types of data to support the operation of the device 300. Examples of such data include instructions of any application or method operated on the device 300, contact data, phone book data, messages, pictures, videos, etc. The memory 304 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
The power supply component 306 provides power for the various components of the device 300. The power supply component 306 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 300.
The multimedia component 308 includes a screen providing an output interface between the device 300 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 308 includes a front camera and/or a rear camera. When the device 300 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 310 is configured to output and/or input audio signals. For example, the audio component 310 includes a microphone (MIC); when the device 300 is in an operation mode, such as a call mode, a recording mode or a voice recognition mode, the microphone is configured to receive external audio signals. The received audio signal may be further stored in the memory 304 or transmitted via the communication component 316. In some embodiments, the audio component 310 also includes a speaker for outputting audio signals.
The I/O interface 312 provides an interface between the processing component 302 and peripheral interface modules, which may be a keyboard, a click wheel, buttons and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button and a lock button.
The sensor component 314 includes one or more sensors for providing state assessments of various aspects of the device 300. For example, the sensor component 314 may detect the open/closed state of the device 300 and the relative positioning of components, such as the display and the keypad of the device 300; the sensor component 314 may also detect a change in position of the device 300 or a component of the device 300, the presence or absence of user contact with the device 300, the orientation or acceleration/deceleration of the device 300, and a change in temperature of the device 300. The sensor component 314 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 316 is configured to facilitate wired or wireless communication between the device 300 and other devices. The device 300 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 316 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 316 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the above methods.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 304 including instructions, which can be executed by the processor 320 of the device 300 to complete the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium: when the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a text processing method, the method comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of the local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the step of determining the central point of the local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word includes:
acquiring one or more of the following information: the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
determining, in combination with the first hidden layer state, the second hidden layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
Optionally, the step of acquiring one or more of the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word includes:
extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and the source words located after the j-th source word;
extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and the source words located before the j-th source word;
combining the first word information and the second word information, and converting them into the first hidden layer state of the encoder;
and/or
extracting the multiple weight matrices of the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of multiple specified formats;
adding the weight matrices of the multiple specified formats to obtain the matrix connection.
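The two optional branches above can be sketched as follows. Concatenating the forward-order and reverse-order word information, and using a simple reshape as the mapping to a common "specified format", are illustrative assumptions; the patent leaves both operations open:

```python
import numpy as np

def first_hidden_state(h_forward, h_backward, j):
    """Combine the forward-order and reverse-order word information of the
    j-th source word into the encoder's first hidden layer state.
    Concatenation is one common choice of combination."""
    return np.concatenate([h_forward[j], h_backward[j]])

def matrix_connection(weight_matrices, shape):
    """Map each previous target word's weight matrix to a common specified
    format (here: a reshape, assuming compatible sizes), then add them."""
    mapped = [np.reshape(m, shape) for m in weight_matrices]
    return np.sum(mapped, axis=0)
```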
Optionally, the step of determining, in combination with the first hidden layer state, the second hidden layer state and the matrix connection, the center of attention in the source text, as the central point of the local attention window, includes:
configuring weight matrices respectively for one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
combining the one or more pieces of information, configured with weight matrices, among the first hidden layer state, the second hidden layer state and the matrix connection, to obtain feature information;
performing nonlinear activation on the feature information and configuring a weight matrix, to obtain activation information;
performing nonlinear transformation on the activation information, to obtain a feature value;
rounding down the product of the feature value and the word length of the source text, to obtain the central point of the local attention window.
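The center-point pipeline above can be sketched as follows. All weight names (W_e, W_d, W_m, v) are illustrative, and tanh/sigmoid stand in for the unnamed nonlinear activation and nonlinear transformation:

```python
import math
import numpy as np

def window_center(h_enc, h_dec, m_conn, W_e, W_d, W_m, v, source_len):
    """Sketch: weighted combination -> activation -> transform -> floor."""
    feature = W_e @ h_enc + W_d @ h_dec + W_m @ m_conn      # combine weighted info
    activation = np.tanh(feature)                           # nonlinear activation
    value = 1.0 / (1.0 + math.exp(-float(v @ activation)))  # transform into (0, 1)
    return math.floor(value * source_len)                   # round down the product
```

Because the transformed feature value lies in (0, 1), the rounded-down product always falls inside [0, |S|), so the central point is a valid source position.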
Optionally, the step of determining the local attention window based on the central point of the local attention window includes:
calculating the difference between the central point and a preset center deviation value, as the first endpoint value;
calculating the sum of the central point and the preset center deviation value, as the second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
Fig. 4 is a schematic structural diagram of a server in an embodiment of the present invention. The server 400 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 422 (for example, one or more processors), a memory 432, and one or more storage media 430 (such as one or more mass storage devices) storing application programs 442 or data 444. The memory 432 and the storage medium 430 may provide transient storage or persistent storage. The programs stored in the storage medium 430 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 422 may be configured to communicate with the storage medium 430 and execute, on the server 400, the series of instruction operations in the storage medium 430.
The server 400 may also include one or more power supplies 426, one or more wired or wireless network interfaces 450, one or more input/output interfaces 458, one or more keyboards 456, and/or one or more operating systems 441, such as Windows Server(TM), Mac OS X(TM), Unix(TM), Linux(TM), FreeBSD(TM), etc.
Those skilled in the art, after considering the specification and practicing the invention disclosed herein, will readily conceive of other embodiments of the present invention. The present invention is intended to cover any variations, uses or adaptive changes of the present invention, which follow the general principles of the present invention and include common knowledge or conventional technical means in the art not disclosed in this disclosure. The specification and the embodiments are to be regarded as exemplary only, and the true scope and spirit of the present invention are indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
The above descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
The embodiment of the present invention discloses A1, a text processing method, comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of the local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
A2. The method according to A1, wherein the step of determining the central point of the local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word includes:
acquiring one or more of the following information: the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
determining, in combination with the first hidden layer state, the second hidden layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
A3. The method according to A2, wherein the step of acquiring one or more of the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word includes:
extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and the source words located after the j-th source word;
extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and the source words located before the j-th source word;
combining the first word information and the second word information, and converting them into the first hidden layer state of the encoder;
and/or
extracting the multiple weight matrices of the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of multiple specified formats;
adding the weight matrices of the multiple specified formats to obtain the matrix connection.
A4. The method according to A2, wherein the step of determining, in combination with the first hidden layer state, the second hidden layer state and the matrix connection, the center of attention in the source text, as the central point of the local attention window, includes:
configuring weight matrices respectively for one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
combining the one or more pieces of information, configured with weight matrices, among the first hidden layer state, the second hidden layer state and the matrix connection, to obtain feature information;
performing nonlinear activation on the feature information and configuring a weight matrix, to obtain activation information;
performing nonlinear transformation on the activation information, to obtain a feature value;
rounding down the product of the feature value and the word length of the source text, to obtain the central point of the local attention window.
A5. The method according to any one of A1 to A4, wherein the step of determining the local attention window based on the central point of the local attention window includes:
calculating the difference between the central point and a preset center deviation value, as the first endpoint value;
calculating the sum of the central point and the preset center deviation value, as the second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
The embodiment of the present invention also discloses B6, a text processing system, comprising:
a source text receiving module, for receiving a source text, the source text having multiple source words;
a vector encoding module, for calling an encoder to encode the multiple source words into multiple vectors;
a central point determining module, for determining, when decoding the t-th target word, the central point of the local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
a local attention window determining module, for determining the local attention window based on the central point of the local attention window;
a vector decoding module, for calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
B7. The system according to B6, wherein the central point determining module includes:
a reference information acquisition submodule, for acquiring one or more of the following information: the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
a reference information determining submodule, for determining, in combination with the first hidden layer state, the second hidden layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
B8. The system according to B7, wherein the reference information acquisition submodule includes:
a first word information extraction unit, for extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and the source words located after the j-th source word;
a second word information extraction unit, for extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and the source words located before the j-th source word;
a word information combination conversion unit, for combining the first word information and the second word information and converting them into the first hidden layer state of the encoder;
and/or
a weight matrix extraction unit, for extracting the multiple weight matrices of the other target words decoded before the t-th target word;
a weight matrix mapping unit, for mapping the multiple weight matrices into weight matrices of multiple specified formats;
a weight matrix addition unit, for adding the weight matrices of the multiple specified formats to obtain the matrix connection.
B9. The system according to B7, wherein the reference information determining submodule includes:
a weight matrix configuration unit, for configuring weight matrices respectively for one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
a reference information combining unit, for combining the one or more pieces of information, configured with weight matrices, among the first hidden layer state, the second hidden layer state and the matrix connection, to obtain feature information;
a nonlinear activation unit, for performing nonlinear activation on the feature information and configuring a weight matrix, to obtain activation information;
a nonlinear conversion unit, for performing nonlinear transformation on the activation information, to obtain a feature value;
a rounding-down unit, for rounding down the product of the feature value and the word length of the source text, to obtain the central point of the local attention window.
B10. The system according to any one of B6 to B9, wherein the local attention window determining module includes:
a first endpoint value setting submodule, for calculating the difference between the central point and a preset center deviation value, as the first endpoint value;
a second endpoint value setting submodule, for calculating the sum of the central point and the preset center deviation value, as the second endpoint value;
a local attention window setting submodule, for setting the span between the first endpoint value and the second endpoint value as the local attention window.
The embodiment of the present invention also discloses C11, a device for text processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of the local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
C12. The device according to C11, wherein the one or more programs, configured to be executed by the one or more processors, also include instructions for performing the following operations:
acquiring one or more of the following information: the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
determining, in combination with the first hidden layer state, the second hidden layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
C13. The device according to C12, wherein the one or more programs, configured to be executed by the one or more processors, also include instructions for performing the following operations:
extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and the source words located after the j-th source word;
extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and the source words located before the j-th source word;
combining the first word information and the second word information, and converting them into the first hidden layer state of the encoder;
and/or
extracting the multiple weight matrices of the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of multiple specified formats;
adding the weight matrices of the multiple specified formats to obtain the matrix connection.
C14. The device according to C12, further configured such that the one or more programs executed by the one or more processors include instructions for performing the following operations:
Configuring weight matrices respectively for one or more of the first hidden layer state, the second hidden layer state and the matrix concatenation;
Combining, through the configured weight matrices, one or more of the first hidden layer state, the second hidden layer state and the matrix concatenation to obtain feature information;
Performing nonlinear activation on the feature information with a configured weight matrix to obtain activation information;
Performing nonlinear transformation on the activation information to obtain a feature value;
Rounding down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
C15. The device according to any one of C11 to C14, further configured such that the one or more programs executed by the one or more processors include instructions for performing the following operations:
Calculating the difference between the central point and a preset center deviation value as a first endpoint value;
Calculating the sum of the central point and the preset center deviation value as a second endpoint value;
Setting the span between the first endpoint value and the second endpoint value as the local attention window.
Claims (10)
1. A text processing method, characterized by comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of a local attention window according to one or more of: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located in the local attention window.
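For orientation only: a patent claim describes a procedure but does not implement it. The claim-1 pipeline (encode source words, pick a window center per target word, decode from the windowed vectors) can be sketched with toy numpy stand-ins; the helper names (`encode`, `predict_center`, `decode`), the averaging "decoder", and the half-width `D` are all our assumptions, not the patent's:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2  # assumed half-width of the local attention window

def encode(source_words, dim=8):
    """Toy encoder stand-in: one vector per source word."""
    return [rng.standard_normal(dim) for _ in source_words]

def predict_center(t, num_source_words):
    """Toy center predictor; the real one uses hidden states (claims 2-4)."""
    return min(t, num_source_words - 1)

def decode(source_words, vectors, num_target_words):
    targets = []
    for t in range(num_target_words):
        p = predict_center(t, len(source_words))               # central point
        lo = max(0, p - D)                                     # window start
        hi = min(len(source_words) - 1, p + D)                 # window end
        window = vectors[lo:hi + 1]                            # vectors in window
        context = np.mean(window, axis=0)                      # toy "decoding"
        targets.append((p, lo, hi, context.shape[0]))
    return targets

source = "the cat sat on the mat".split()
out = decode(source, encode(source), num_target_words=3)
```

The point of the sketch is the shape of the loop: only the `hi - lo + 1` vectors inside the window feed each decoding step, rather than all source vectors as in global attention.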
2. The method according to claim 1, characterized in that the step of determining the central point of the local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word comprises:
obtaining one or more of: the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix concatenation of the weight matrices from the other target words decoded before the t-th target word;
combining the first hidden layer state, the second hidden layer state and the matrix concatenation to determine the position in the source text where attention is concentrated, as the central point of the local attention window.
3. The method according to claim 2, characterized in that the step of obtaining one or more of the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix concatenation of the weight matrices comprises:
extracting the j-th source word recorded when the source text is input in forward order, together with first word information of the source words located after the j-th source word;
extracting the j-th source word recorded when the source text is input in reverse order, together with second word information of the source words located before the j-th source word;
combining the first word information and the second word information and converting them into the first hidden layer state of the encoder;
and/or,
extracting the multiple weight matrices from the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix concatenation.
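The two branches of claim 3 correspond to familiar constructions: a bidirectional pass over the source (forward and reverse readings combined per word) and an accumulation of earlier attention weight matrices into one summary matrix. A minimal numpy sketch, with all weights and shapes chosen by us for illustration (plain tanh-RNN cells, not whatever cell the patent's embodiment uses):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4
Wf, Uf = rng.standard_normal((dim, dim)), rng.standard_normal((dim, dim))
Wb, Ub = rng.standard_normal((dim, dim)), rng.standard_normal((dim, dim))

def run_rnn(vectors, W, U):
    """One tanh-RNN pass; returns the hidden state at every position."""
    h, states = np.zeros(dim), []
    for x in vectors:
        h = np.tanh(W @ x + U @ h)
        states.append(h)
    return states

embeds = [rng.standard_normal(dim) for _ in range(5)]  # toy source word vectors
fwd = run_rnn(embeds, Wf, Uf)               # source input in forward order
bwd = run_rnn(embeds[::-1], Wb, Ub)[::-1]   # source input in reverse order
# "first hidden layer state": forward and reverse word information combined
first_hidden = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# second branch: map earlier attention weight matrices to one common
# (1 x source-length) format and add them to obtain the matrix concatenation
prev_attn = [rng.random((1, 5)) for _ in range(3)]  # one per decoded word
matrix_concat = np.sum(prev_attn, axis=0)
```

The summed matrix is a cheap record of which source positions earlier decoding steps already attended to, which is the extra signal claim 2 feeds into the center prediction.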
4. The method according to claim 2, characterized in that the step of combining the first hidden layer state, the second hidden layer state and the matrix concatenation to determine the position in the source text where attention is concentrated, as the central point of the local attention window, comprises:
configuring weight matrices respectively for one or more of the first hidden layer state, the second hidden layer state and the matrix concatenation;
combining, through the configured weight matrices, one or more of the first hidden layer state, the second hidden layer state and the matrix concatenation to obtain feature information;
performing nonlinear activation on the feature information with a configured weight matrix to obtain activation information;
performing nonlinear transformation on the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
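The five steps of claim 4 have the shape of predictive local attention: a weighted combination of states, a tanh activation, a sigmoid squashing to (0, 1), and a floor of the product with the source length. A sketch under that reading, with every weight randomly initialized and the choice of tanh/sigmoid being our assumption for the claim's "nonlinear activation" and "nonlinear transformation":

```python
import math
import numpy as np

rng = np.random.default_rng(2)
S = 10                            # word length of the source text
h_enc = rng.standard_normal(4)    # first hidden layer state (encoder)
h_dec = rng.standard_normal(4)    # second hidden layer state (decoder, step t)
m_cat = rng.standard_normal(4)    # matrix concatenation of earlier weights

# steps 1-2: configure one weight matrix per input, combine -> feature info
W_enc, W_dec, W_cat = (rng.standard_normal((4, 4)) for _ in range(3))
feature = W_enc @ h_enc + W_dec @ h_dec + W_cat @ m_cat

# step 3: nonlinear activation with a configured weight vector -> activation
v = rng.standard_normal(4)
activation = v @ np.tanh(feature)

# step 4: nonlinear transformation (sigmoid) -> feature value in (0, 1)
feature_value = 1.0 / (1.0 + math.exp(-activation))

# step 5: multiply by the source length and round down -> window center
center = math.floor(S * feature_value)
```

Because the sigmoid output stays strictly inside (0, 1), the floored product always lands on a valid source position in [0, S - 1], which is why the claim multiplies by the word length before rounding down.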
5. The method according to claim 1, 2, 3 or 4, characterized in that the step of determining the local attention window based on the central point of the local attention window comprises:
calculating the difference between the central point and a preset center deviation value as a first endpoint value;
calculating the sum of the central point and the preset center deviation value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
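Claim 5 fixes the window as the span from (center − deviation) to (center + deviation). A minimal sketch; the clipping to the ends of the source text is our addition, since the claim itself only names the two endpoint values:

```python
def local_window(center, deviation, num_source_words):
    """Claim-5 style window endpoints, clipped to valid source positions
    (clipping is an assumption, not stated in the claim)."""
    first = max(0, center - deviation)                       # difference
    second = min(num_source_words - 1, center + deviation)   # sum
    return first, second

lo, hi = local_window(center=6, deviation=2, num_source_words=10)
```

With a deviation of 2 the window covers at most 5 source words, so each decoding step attends over a fixed-size neighborhood regardless of how long the source text is.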
6. A text processing system, characterized by comprising:
a source text receiving module, configured to receive a source text, the source text having multiple source words;
a vector encoding module, configured to call an encoder to encode the multiple source words into multiple vectors;
a central point determining module, configured to, when decoding the t-th target word, determine the central point of a local attention window according to one or more of: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
a local attention window determining module, configured to determine the local attention window based on the central point of the local attention window;
a vector decoding module, configured to call a decoder to decode the vectors into the t-th target word according to the source words located in the local attention window.
7. The system according to claim 6, characterized in that the central point determining module comprises:
a reference information obtaining submodule, configured to obtain one or more of: the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix concatenation of the weight matrices from the other target words decoded before the t-th target word;
a reference information determining submodule, configured to combine the first hidden layer state, the second hidden layer state and the matrix concatenation to determine the position in the source text where attention is concentrated, as the central point of the local attention window.
8. The system according to claim 7, characterized in that the reference information obtaining submodule comprises:
a first word information extraction unit, configured to extract the j-th source word recorded when the source text is input in forward order, together with first word information of the source words located after the j-th source word;
a second word information extraction unit, configured to extract the j-th source word recorded when the source text is input in reverse order, together with second word information of the source words located before the j-th source word;
a word information combining and converting unit, configured to combine the first word information and the second word information and convert them into the first hidden layer state of the encoder;
and/or,
a weight matrix extraction unit, configured to extract the multiple weight matrices from the other target words decoded before the t-th target word;
a weight matrix mapping unit, configured to map the multiple weight matrices into weight matrices of a specified format;
a weight matrix addition unit, configured to add the weight matrices of the specified format to obtain the matrix concatenation.
9. The system according to claim 7, characterized in that the reference information determining submodule comprises:
a weight matrix configuration unit, configured to configure weight matrices respectively for one or more of the first hidden layer state, the second hidden layer state and the matrix concatenation;
a reference information combining unit, configured to combine, through the configured weight matrices, one or more of the first hidden layer state, the second hidden layer state and the matrix concatenation to obtain feature information;
a nonlinear activation unit, configured to perform nonlinear activation on the feature information with a configured weight matrix to obtain activation information;
a nonlinear transformation unit, configured to perform nonlinear transformation on the activation information to obtain a feature value;
a rounding-down unit, configured to round down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
10. A device for text processing, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of a local attention window according to one or more of: the encoding state, the decoding state when decoding the t-th target word, and the central points determined before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located in the local attention window.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710602815.0A CN109284510B (en) | 2017-07-21 | 2017-07-21 | Text processing method and system and text processing device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109284510A true CN109284510A (en) | 2019-01-29 |
CN109284510B CN109284510B (en) | 2022-10-21 |
Family
ID=65185298
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710602815.0A Active CN109284510B (en) | 2017-07-21 | 2017-07-21 | Text processing method and system and text processing device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109284510B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110347790A (en) * | 2019-06-18 | 2019-10-18 | 广州杰赛科技股份有限公司 | Text duplicate checking method, apparatus, equipment and storage medium based on attention mechanism |
US20200133952A1 (en) * | 2018-10-31 | 2020-04-30 | International Business Machines Corporation | Natural language generation system using graph-to-sequence model |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060015323A1 (en) * | 2004-07-13 | 2006-01-19 | Udupa Raghavendra U | Method, apparatus, and computer program for statistical translation decoding |
CN102047680A (en) * | 2008-06-02 | 2011-05-04 | 皇家飞利浦电子股份有限公司 | Apparatus and method for adjusting the cognitive complexity of an audiovisual content to a viewer attention level |
CN102054178A (en) * | 2011-01-20 | 2011-05-11 | 北京联合大学 | Chinese painting image identifying method based on local semantic concept |
2017-07-21: Application CN201710602815.0A filed; granted as CN109284510B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN109284510B (en) | 2022-10-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11620984B2 (en) | Human-computer interaction method, and electronic device and storage medium thereof | |
CN105119812B (en) | In the method, apparatus and terminal device of chat interface change emoticon | |
CN107291690A (en) | Punctuate adding method and device, the device added for punctuate | |
CN105162693B (en) | message display method and device | |
US20220277752A1 (en) | Voice interaction method and related apparatus | |
US11138422B2 (en) | Posture detection method, apparatus and device, and storage medium | |
CN110909815B (en) | Neural network training method, neural network training device, neural network processing device, neural network training device, image processing device and electronic equipment | |
CN107992485A (en) | A kind of simultaneous interpretation method and device | |
CN113691833B (en) | Virtual anchor face changing method and device, electronic equipment and storage medium | |
CN109243430A (en) | A kind of audio recognition method and device | |
CN109871843A (en) | Character identifying method and device, the device for character recognition | |
CN108538284A (en) | Simultaneous interpretation result shows method and device, simultaneous interpreting method and device | |
CN108039995A (en) | Message sending control method, terminal and computer-readable recording medium | |
CN109961094A (en) | Sample acquiring method, device, electronic equipment and readable storage medium storing program for executing | |
CN108628813A (en) | Treating method and apparatus, the device for processing | |
CN109302528B (en) | Photographing method, mobile terminal and computer readable storage medium | |
CN108073572A (en) | Information processing method and its device, simultaneous interpretation system | |
CN109412929A (en) | The method, device and mobile terminal that expression adaptively adjusts in instant messaging application | |
CN110135349A (en) | Recognition methods, device, equipment and storage medium | |
CN110502648A (en) | Recommended models acquisition methods and device for multimedia messages | |
CN109886211A (en) | Data mask method, device, electronic equipment and storage medium | |
CN109388699A (en) | Input method, device, equipment and storage medium | |
CN111835621A (en) | Session message processing method and device, computer equipment and readable storage medium | |
CN109284510A (en) | A kind of text handling method, system and a kind of device for text-processing | |
CN112631435A (en) | Input method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |