CN109284510A - Text processing method and system, and device for text processing - Google Patents

Text processing method and system, and device for text processing

Info

Publication number
CN109284510A
CN109284510A
Authority
CN
China
Prior art keywords
word
source
information
decoding
hidden layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710602815.0A
Other languages
Chinese (zh)
Other versions
CN109284510B (en)
Inventor
程善伯
王宇光
姜里羊
陈伟
王砚峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201710602815.0A priority Critical patent/CN109284510B/en
Publication of CN109284510A publication Critical patent/CN109284510A/en
Application granted granted Critical
Publication of CN109284510B publication Critical patent/CN109284510B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/40 - Processing or translation of natural language
    • G06F40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/40 - Processing or translation of natural language
    • G06F40/42 - Data-driven translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Embodiments of the invention provide a text processing method, a text processing system, and a device for text processing. The method comprises: receiving a source text, the source text having multiple source words; calling an encoder to encode the multiple source words into multiple vectors; when decoding the t-th target word, determining the central point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word; determining the local attention window based on the central point of the local attention window; and calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window. By comprehensively considering multiple kinds of information, the accuracy of locating the center of attention is improved, thereby improving the quality of services such as translation.

Description

Text processing method and system, and device for text processing
Technical field
The present invention relates to the technical field of language processing, and more particularly to a text processing method, a text processing system, and a device for text processing.
Background
Machine translation, also known as automatic translation, uses computer programs to automatically convert one language into another; the former is called the source language and the latter the target language.
At present, machine translation commonly uses a local attention model, which is an improvement on the attention model. In existing local attention mechanisms, when each target-language word is predicted, a feed-forward neural network is used to predict a center of attention, and attention within a window of a certain size around that central point is used to compute the target-language word.
However, the feed-forward neural network makes little use of information from the encoder, so the accuracy of locating the center of attention is low, resulting in poor translation quality.
Summary of the invention
In view of the above problems, in order to solve the problem of low accuracy in locating the center of attention, embodiments of the present invention propose a text processing method, a corresponding text processing system, and a device for text processing.
To solve the above problems, an embodiment of the invention discloses a text processing method, comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of a local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the step of determining the central point of the local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word includes:
obtaining one or more of the following information: the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word;
determining, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
Optionally, the step of obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word includes:
extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and of the source words located after the j-th source word;
extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and of the source words located before the j-th source word;
combining the first word information and the second word information and converting them into the first hidden-layer state of the encoder;
and/or
extracting multiple weight matrices used when decoding other target words before the t-th target word;
mapping the multiple weight matrices into multiple weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
Optionally, the step of determining, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center of attention in the source text as the central point of the local attention window includes:
configuring a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection;
combining the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection with the configured weight matrices to obtain feature information;
performing nonlinear activation on the feature information and a configured weight matrix to obtain activation information;
performing a nonlinear transformation on the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source text to obtain the central point of the local attention window.
Optionally, the step of determining the local attention window based on the central point of the local attention window includes:
calculating the difference between the central point and a preset center deviation value as a first endpoint value;
calculating the sum of the central point and the preset center deviation value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
An embodiment of the invention also discloses a text processing system, comprising:
a source text receiving module, configured to receive a source text, the source text having multiple source words;
a vector encoding module, configured to call an encoder to encode the multiple source words into multiple vectors;
a central point determining module, configured to, when decoding the t-th target word, determine the central point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word;
a local attention window determining module, configured to determine the local attention window based on the central point of the local attention window;
a vector decoding module, configured to call a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the central point determining module includes:
a reference information acquisition submodule, configured to obtain one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word;
a reference information determining submodule, configured to determine, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
Optionally, the reference information acquisition submodule includes:
a first word information extraction unit, configured to extract the first word information, recorded when the source text is input in forward order, of the j-th source word and of the source words located after the j-th source word;
a second word information extraction unit, configured to extract the second word information, recorded when the source text is input in reverse order, of the j-th source word and of the source words located before the j-th source word;
a word information combining and converting unit, configured to combine the first word information and the second word information and convert them into the first hidden-layer state of the encoder;
and/or
a weight matrix extraction unit, configured to extract multiple weight matrices used when decoding other target words before the t-th target word;
a weight matrix mapping unit, configured to map the multiple weight matrices into multiple weight matrices of a specified format;
a weight matrix addition unit, configured to add the weight matrices of the specified format to obtain the matrix connection.
Optionally, the reference information determining submodule includes:
a weight matrix configuration unit, configured to configure a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection;
a reference information combination unit, configured to combine the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection with the configured weight matrices to obtain feature information;
a nonlinear activation unit, configured to perform nonlinear activation on the feature information and a configured weight matrix to obtain activation information;
a nonlinear conversion unit, configured to perform a nonlinear transformation on the activation information to obtain a feature value;
a rounding-down unit, configured to round down the product of the feature value and the word length of the source text to obtain the central point of the local attention window.
Optionally, the local attention window determining module includes:
a first endpoint value setting submodule, configured to calculate the difference between the central point and a preset center deviation value as a first endpoint value;
a second endpoint value setting submodule, configured to calculate the sum of the central point and the preset center deviation value as a second endpoint value;
a local attention window setting submodule, configured to set the span between the first endpoint value and the second endpoint value as the local attention window.
An embodiment of the invention also discloses a device for text processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of a local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the one or more programs are further configured to be executed by the one or more processors and include instructions for performing the following operations:
obtaining one or more of the following information: the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word;
determining, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
Optionally, the one or more programs are further configured to be executed by the one or more processors and include instructions for performing the following operations:
extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and of the source words located after the j-th source word;
extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and of the source words located before the j-th source word;
combining the first word information and the second word information and converting them into the first hidden-layer state of the encoder;
and/or
extracting multiple weight matrices used when decoding other target words before the t-th target word;
mapping the multiple weight matrices into multiple weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
Optionally, the one or more programs are further configured to be executed by the one or more processors and include instructions for performing the following operations:
configuring a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection;
combining the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection with the configured weight matrices to obtain feature information;
performing nonlinear activation on the feature information and a configured weight matrix to obtain activation information;
performing a nonlinear transformation on the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source text to obtain the central point of the local attention window.
Optionally, the one or more programs are further configured to be executed by the one or more processors and include instructions for performing the following operations:
calculating the difference between the central point and a preset center deviation value as a first endpoint value;
calculating the sum of the central point and the preset center deviation value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
Embodiments of the present invention have the following advantages:
The embodiment of the present invention introduces a local attention model into the encoder-decoder architecture. An encoder is called to encode the multiple source words of a received source text into multiple vectors. When decoding the t-th target word, the central point of a local attention window is determined according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word; the local attention window is then determined from that central point, and a decoder is called to decode the vectors into the t-th target word according to the source words located within the local attention window. This helps find the position in the source text on which attention should be concentrated to suit the encoding state and the decoding state when decoding the t-th target word, and it also helps reduce the attention concentrated on the positions of the central points used before decoding the t-th target word while increasing the attention concentrated on the other positions of the source text. By comprehensively considering multiple kinds of information, the accuracy of locating the center of attention is improved, thereby improving the quality of services such as translation.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of a text processing method according to an embodiment of the present invention;
Fig. 2 is a structural block diagram of a text processing system according to an embodiment of the present invention;
Fig. 3 is a block diagram of a device for text processing according to an exemplary embodiment;
Fig. 4 is a structural schematic diagram of a server in an embodiment of the present invention.
Detailed description of the embodiments
In order to make the above objectives, features and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flow chart of the steps of a text processing method according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 101: receiving a source text.
The source text is the text on which a service is to be performed; in general, the source text has multiple source words.
By contrast, a word obtained after the service has been performed is referred to as a target word.
It should be noted that a source word or a target word denotes a word of one unit for the purposes of the service; a punctuation mark, a number, a Chinese character, a phrase or an English word may each be regarded as a word of one unit.
Step 102: calling an encoder to encode the multiple source words into multiple vectors.
In a specific implementation, embodiments of the present invention may employ an Encoder-Decoder framework.
The Encoder-Decoder framework has an encoder and a decoder: the encoder may be used to convert an input sequence into a vector of fixed length, and the decoder may then be used to convert the fixed vector into an output sequence.
The Encoder-Decoder framework may be applied to services such as translation, document summarization and question answering. For example, in translation, the input sequence (i.e., the source text) is the text to be translated, in a first language, and the output sequence (i.e., the target words) is the translated text, in a second language; in question answering, the input sequence is the question asked and the output sequence is the answer.
It should be noted that the specific models used for the encoder and the decoder may be configured by those skilled in the art according to the actual situation, for example, CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), BiRNN (Bidirectional Recurrent Neural Network), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory), Deep LSTM, and so on. These models may also be combined according to the actual situation, for example, a CNN encoder with an RNN decoder, or an RNN encoder with an RNN decoder, and the embodiments of the present invention are not limited thereto.
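As an illustration of step 102, a minimal sketch is given below, assuming a plain tanh RNN encoder with randomly initialized weights; this is an assumption for illustration only, not the patented encoder, which may be any of the models listed above.

import numpy as np

# Sketch of step 102: a plain tanh RNN reads one embedding per source word and
# produces one hidden vector per source word (the "multiple vectors").
def rnn_encode(embeddings, W_in, W_rec, b):
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in embeddings:                       # one embedding per source word
        h = np.tanh(W_in @ x + W_rec @ h + b)  # hidden state after reading this word
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(2)
emb_dim, hid_dim, src_len = 6, 8, 10
enc_vectors = rnn_encode(rng.normal(size=(src_len, emb_dim)),
                         rng.normal(size=(hid_dim, emb_dim)),
                         rng.normal(size=(hid_dim, hid_dim)),
                         np.zeros(hid_dim))
print(enc_vectors.shape)  # (10, 8)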
Step 103: when decoding the t-th target word, determining the central point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word.
In the embodiments of the present invention, a local attention model is introduced into the Encoder-Decoder framework; the local attention model is a variant of the attention model.
The attention model is a soft alignment model: during the service (such as translation), before each target word is generated, an attention alignment model is computed, which indicates the source words in the source text on which "attention" is concentrated when the current target word is generated (the corresponding parts of the weight matrix with large probability values).
In the attention model, when each target word is generated, although "attention" is placed on certain source words, the other source words of the source text also receive corresponding probabilities, which may cause attention to be insufficiently concentrated. The local attention model ignores the source words outside a window, so that attention is more concentrated.
It should be noted that the local attention model does not require the encoder to encode all of the input information into one fixed-length vector. Instead, the encoder encodes the input sequence into a sequence of vectors, and at each decoding step a subset of the vector sequence is selectively chosen for further processing. In this way, when each output is generated, the information carried by the input sequence can be fully utilized.
In a specific implementation, when the decoder decodes the t-th target word (t is a positive integer), i.e., at time t, the central point of the local attention window may be determined with reference to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word.
The encoding state helps find the position in the source text on which attention should be concentrated to suit the encoding.
The decoding state when decoding the t-th target word helps find the position in the source text on which attention should be concentrated when decoding the t-th target word.
The central points used before decoding the t-th target word help reduce the attention concentrated on those central points and increase the attention concentrated on positions other than the central points used before decoding the t-th target word.
In an embodiment of the present invention, step 103 may include the following sub-steps:
Sub-step S11: obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word.
1. The encoding state may be represented by the first hidden-layer state of the encoder:
In a specific implementation, on the one hand, the first word information, recorded when the source text is received in forward order, of the j-th source word (j is a positive integer) and of the source words after the j-th source word is extracted.
On the other hand, the second word information, recorded when the source text is received in reverse order, of the j-th source word and of the source words before the j-th source word is extracted.
The first word information and the second word information are combined and converted into the first hidden-layer state of the encoder.
2. The decoding state when decoding the t-th target word may be represented by the second hidden-layer state of the decoder when decoding the t-th target word:
In a specific implementation, the first hidden-layer state, the (t-1)-th target word and the content vector used when decoding the (t-1)-th target word may be extracted, and the decoding state when decoding the t-th target word may be obtained through a function transformation.
The content vector is obtained by weighting and summing the sequence of hidden vectors produced during encoding.
3. The central points used before decoding the t-th target word may be represented by the matrix connection of the weight matrices used when decoding other target words before the t-th target word.
In a specific implementation, the multiple weight matrices used when decoding other target words before the t-th target word may be extracted.
Since different weight matrices have different dimensions, in order to add the weight matrices, the multiple weight matrices may be mapped into multiple weight matrices of a specified format.
The weight matrices of the specified format are added to obtain the matrix connection. A minimal sketch of items 1 and 3 is given after this list.
Sub-step S12: determining, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
In the embodiments of the present invention, one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection are comprehensively considered to determine the center on which attention is concentrated in the source text, as the central point of the local attention window.
In an example of an embodiment of the present invention, sub-step S12 may include the following sub-steps:
Sub-step S121: configuring a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection.
Sub-step S122: combining the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection with the configured weight matrices to obtain feature information.
Sub-step S123: performing nonlinear activation on the feature information and a configured weight matrix to obtain activation information.
Sub-step S124: performing a nonlinear transformation on the activation information to obtain a feature value.
Sub-step S125: rounding down the product of the feature value and the word length of the source text to obtain the central point of the local attention window.
When the first hidden-layer state, the second hidden-layer state and the matrix connection are used simultaneously, the central point of the local attention window may be calculated by the following formula:
mid = Floor(|S| * sigmoid(vp' * tanh(Wpt*ht + Wps*hs + Wa*Att<t)))
where mid denotes the central point of the local attention window, the Floor() function rounds down, |S| denotes the word length of the source text, the sigmoid function is used for the nonlinear transformation, the tanh function is used for the nonlinear activation, vp, Wpt, Wps and Wa denote four weight matrices (vp' denoting the transpose of vp), ht denotes the second hidden-layer state of the decoder at time t (when decoding the t-th target word), hs denotes the first hidden-layer state of the encoder, and Att<t denotes the matrix connection of the weight matrices of all times before time t (including the weight matrices given to the first hidden-layer state, the second hidden-layer state, the matrix connection and the nonlinear activation function).
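A minimal numeric sketch of this formula is given below; the vector and matrix shapes are illustrative assumptions rather than the configuration of the patent, and hs is treated here as a single encoder state vector.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def local_window_center(ht, hs, att_prev, W_pt, W_ps, W_a, v_p, source_len):
    feature = np.tanh(W_pt @ ht + W_ps @ hs + W_a @ att_prev)  # nonlinear activation (sub-steps S121-S123)
    value = sigmoid(v_p @ feature)                              # nonlinear transformation into (0, 1) (sub-step S124)
    return int(np.floor(source_len * value))                    # round down to the central point mid (sub-step S125)

rng = np.random.default_rng(0)
d = 8
mid = local_window_center(
    ht=rng.normal(size=d), hs=rng.normal(size=d), att_prev=rng.normal(size=d),
    W_pt=rng.normal(size=(d, d)), W_ps=rng.normal(size=(d, d)), W_a=rng.normal(size=(d, d)),
    v_p=rng.normal(size=d), source_len=10)
print(mid)  # an integer between 0 and 10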
Step 104: determining the local attention window based on the central point of the local attention window.
In a specific implementation, once the central point of the local attention window has been determined, the region within a certain range of the central point may be used as the local attention window.
In an embodiment of the present invention, step 104 may include the following sub-steps:
Sub-step S21: calculating the difference between the central point and a preset center deviation value as a first endpoint value.
Sub-step S22: calculating the sum of the central point and the preset center deviation value as a second endpoint value.
Sub-step S23: setting the span between the first endpoint value and the second endpoint value as the local attention window.
With the embodiment of the present invention, a center deviation value, i.e., a value by which the window deviates from the central point of the local attention window, may be preset.
It should be noted that the center deviation value may be a default value or may be calculated according to the situation of the source text, and the embodiments of the present invention are not limited thereto.
Assuming the central point is mid and the center deviation value is w, the local attention window is:
[mid-w, mid+w]
Furthermore, with regard to the above formula for calculating the central point of the local attention window, since the sigmoid function converts any real number into a real number between (0, 1), the central point mid is an integer between (1, |S|); the part of the window that extends beyond the source text is therefore ignored.
If the difference between the central point and the center deviation value is less than 0, 0 is taken as the first endpoint value.
If the sum of the central point and the center deviation value is greater than the word length |S| of the source text, |S| is taken as the second endpoint value.
In this case, the local attention window is:
[max(0, mid-w), min(|S|, mid+w)]
where the min function takes the smaller value and the max function takes the larger value.
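A short sketch of sub-steps S21-S23 with this clipping is given below; treating the deviation value w as a preset constant here is an assumption for illustration.

def local_attention_window(mid, source_len, w=2):
    first_endpoint = max(0, mid - w)            # difference, clipped below at 0
    second_endpoint = min(source_len, mid + w)  # sum, clipped above at |S|
    return first_endpoint, second_endpoint

print(local_attention_window(mid=6, source_len=10))  # (4, 8)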
Step 105: calling the decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
In the local attention model, the attention of the source words located within the local attention window with respect to the t-th target word may be calculated, and the decoder is called to decode the vectors into the t-th target word according to the source words that are located within the local attention window and configured with attention.
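The following is a minimal sketch of step 105, assuming attention scores are computed only for the encoder vectors inside the window, normalized with a softmax, and used to form a content vector from which the t-th target word is predicted; the scoring and output layers are illustrative assumptions rather than the patented decoder.

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def decode_step(enc_vectors, ht, window, W_score, W_out):
    lo, hi = window
    local = enc_vectors[lo:hi + 1]                  # only source words inside the local attention window
    scores = local @ (W_score @ ht)                 # attention of each windowed source word for the t-th target word
    weights = softmax(scores)
    context = weights @ local                       # weighted sum of the windowed vectors (content vector)
    logits = W_out @ np.concatenate([ht, context])  # predict the t-th target word
    return int(np.argmax(logits)), weights

rng = np.random.default_rng(1)
d, vocab, src_len = 8, 20, 10
enc = rng.normal(size=(src_len, d))
word_id, attn = decode_step(enc, rng.normal(size=d), (4, 8),
                            rng.normal(size=(d, d)), rng.normal(size=(vocab, 2 * d)))
print(word_id, attn.round(3))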
The embodiment of the present invention introduces a local attention model into the encoder-decoder architecture. An encoder is called to encode the multiple source words of a received source text into multiple vectors. When decoding the t-th target word, the central point of a local attention window is determined according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word; the local attention window is then determined from that central point, and a decoder is called to decode the vectors into the t-th target word according to the source words located within the local attention window. This helps find the position in the source text on which attention should be concentrated to suit the encoding state and the decoding state when decoding the t-th target word, and it also helps reduce the attention concentrated on the positions of the central points used before decoding the t-th target word while increasing the attention concentrated on the other positions of the source text. By comprehensively considering multiple kinds of information, the accuracy of locating the center of attention is improved, thereby improving the quality of services such as translation.
In order to enable those skilled in the art to better understand the embodiments of the present invention, an illustration is given below using the example of translation.
Assume the source text is the Chinese sentence "我|是|中国|人|，|喜欢|吃|中国|菜|。" (word for word: "I | am | China | person | , | like | eat | China | dish | ."), where "|" is the separator between source words, so the source text has 10 words in total, including punctuation.
A human translator would translate it into the English sentence "I am a Chinese, I like eating Chinese food."
If a traditional local attention model is used, the following translation is generated:
I am a Chinese, eating food.
At the 6th time step, i.e., when "eating" is generated, the calculated central point of the attention window is 7 ("吃", eat), so "eating" is produced and "喜欢" (like) is missed.
If the local attention model of the embodiment of the present invention is used, the following translation is generated:
I am a Chinese, like eating Chinese food.
At the 6th time step, i.e., when "like" is generated, the calculated central point of the attention window is 6 ("喜欢", like), so "like" is produced, which improves the quality of the translation.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of combinations of actions, but those skilled in the art should understand that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention, some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 2, a structural block diagram of a text processing system according to an embodiment of the present invention is shown, which may specifically include the following modules:
a source text receiving module 201, configured to receive a source text, the source text having multiple source words;
a vector encoding module 202, configured to call an encoder to encode the multiple source words into multiple vectors;
a central point determining module 203, configured to, when decoding the t-th target word, determine the central point of a local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word;
a local attention window determining module 204, configured to determine the local attention window based on the central point of the local attention window;
a vector decoding module 205, configured to call a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
In an embodiment of the present invention, the central point determining module 203 includes:
a reference information acquisition submodule, configured to obtain one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word;
a reference information determining submodule, configured to determine, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
In an embodiment of the present invention, the reference information acquisition submodule includes:
a first word information extraction unit, configured to extract the first word information, recorded when the source text is input in forward order, of the j-th source word and of the source words located after the j-th source word;
a second word information extraction unit, configured to extract the second word information, recorded when the source text is input in reverse order, of the j-th source word and of the source words located before the j-th source word;
a word information combining and converting unit, configured to combine the first word information and the second word information and convert them into the first hidden-layer state of the encoder;
and/or
a weight matrix extraction unit, configured to extract multiple weight matrices used when decoding other target words before the t-th target word;
a weight matrix mapping unit, configured to map the multiple weight matrices into multiple weight matrices of a specified format;
a weight matrix addition unit, configured to add the weight matrices of the specified format to obtain the matrix connection.
In an embodiment of the present invention, the reference information determining submodule includes:
a weight matrix configuration unit, configured to configure a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection;
a reference information combination unit, configured to combine the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection with the configured weight matrices to obtain feature information;
a nonlinear activation unit, configured to perform nonlinear activation on the feature information and a configured weight matrix to obtain activation information;
a nonlinear conversion unit, configured to perform a nonlinear transformation on the activation information to obtain a feature value;
a rounding-down unit, configured to round down the product of the feature value and the word length of the source text to obtain the central point of the local attention window.
In an embodiment of the present invention, the local attention window determining module 204 includes:
a first endpoint value setting submodule, configured to calculate the difference between the central point and a preset center deviation value as a first endpoint value;
a second endpoint value setting submodule, configured to calculate the sum of the central point and the preset center deviation value as a second endpoint value;
a local attention window setting submodule, configured to set the span between the first endpoint value and the second endpoint value as the local attention window.
With regard to the system in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method, and will not be elaborated here.
The embodiment of the present invention introduces a local attention model into the encoder-decoder architecture. An encoder is called to encode the multiple source words of a received source text into multiple vectors. When decoding the t-th target word, the central point of a local attention window is determined according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word; the local attention window is then determined from that central point, and a decoder is called to decode the vectors into the t-th target word according to the source words located within the local attention window. This helps find the position in the source text on which attention should be concentrated to suit the encoding state and the decoding state when decoding the t-th target word, and it also helps reduce the attention concentrated on the positions of the central points used before decoding the t-th target word while increasing the attention concentrated on the other positions of the source text. By comprehensively considering multiple kinds of information, the accuracy of locating the center of attention is improved, thereby improving the quality of services such as translation.
Fig. 3 is a block diagram of a device 300 for text processing according to an exemplary embodiment. For example, the device 300 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, and the like.
Referring to Fig. 3, the device 300 may include one or more of the following components: a processing component 302, a memory 304, a power component 306, a multimedia component 308, an audio component 310, an input/output (I/O) interface 312, a sensor component 314, and a communication component 316.
The processing component 302 generally controls the overall operation of the device 300, such as operations associated with display, telephone calls, data communication, camera operation and recording. The processing component 302 may include one or more processors 320 to execute instructions so as to perform all or part of the steps of the above method. In addition, the processing component 302 may include one or more modules to facilitate interaction between the processing component 302 and other components. For example, the processing component 302 may include a multimedia module to facilitate interaction between the multimedia component 308 and the processing component 302.
The memory 304 is configured to store various types of data to support the operation of the device 300. Examples of such data include instructions for any application or method operated on the device 300, contact data, phonebook data, messages, pictures, videos, and so on. The memory 304 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc.
The power component 306 provides power for the various components of the device 300. The power component 306 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 300.
The multimedia component 308 includes a screen providing an output interface between the device 300 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 308 includes a front camera and/or a rear camera. When the device 300 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front camera and the rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 310 is configured to output and/or input audio signals. For example, the audio component 310 includes a microphone (MIC). When the device 300 is in an operation mode, such as a call mode, a recording mode or a voice recognition mode, the microphone is configured to receive external audio signals. The received audio signals may be further stored in the memory 304 or sent via the communication component 316. In some embodiments, the audio component 310 further includes a loudspeaker for outputting audio signals.
The I/O interface 312 provides an interface between the processing component 302 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button and a lock button.
The sensor component 314 includes one or more sensors for providing status assessments of various aspects of the device 300. For example, the sensor component 314 may detect the open/closed state of the device 300 and the relative positioning of components, such as the display and keypad of the device 300; the sensor component 314 may also detect a change in the position of the device 300 or of a component of the device 300, the presence or absence of user contact with the device 300, the orientation or acceleration/deceleration of the device 300, and a change in the temperature of the device 300. The sensor component 314 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 316 is configured to facilitate wired or wireless communication between the device 300 and other devices. The device 300 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 316 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 316 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 300 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions, such as the memory 304 including instructions, is also provided, and the instructions may be executed by the processor 320 of the device 300 to perform the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer-readable storage medium is provided, wherein when the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a text processing method, the method comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of a local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
Optionally, the step of determining the central point of the local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word includes:
obtaining one or more of the following information: the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word;
determining, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
Optionally, the step of obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word includes:
extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and of the source words located after the j-th source word;
extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and of the source words located before the j-th source word;
combining the first word information and the second word information and converting them into the first hidden-layer state of the encoder;
and/or
extracting multiple weight matrices used when decoding other target words before the t-th target word;
mapping the multiple weight matrices into multiple weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
Optionally, the step of determining, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center of attention in the source text as the central point of the local attention window includes:
configuring a weight matrix for each of the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection;
combining the one or more of the first hidden-layer state, the second hidden-layer state and the matrix connection with the configured weight matrices to obtain feature information;
performing nonlinear activation on the feature information and a configured weight matrix to obtain activation information;
performing a nonlinear transformation on the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source text to obtain the central point of the local attention window.
Optionally, the step of determining the local attention window based on the central point of the local attention window includes:
calculating the difference between the central point and a preset center deviation value as a first endpoint value;
calculating the sum of the central point and the preset center deviation value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
Fig. 4 is a structural schematic diagram of a server in an embodiment of the present invention. The server 400 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPU) 422 (for example, one or more processors), a memory 432, and one or more storage media 430 (such as one or more mass storage devices) storing application programs 442 or data 444. The memory 432 and the storage media 430 may be transient storage or persistent storage. The programs stored in the storage media 430 may include one or more modules (not shown), and each module may include a series of instruction operations on the server. Furthermore, the central processing unit 422 may be configured to communicate with the storage media 430 and to execute, on the server 400, the series of instruction operations in the storage media 430.
The server 400 may also include one or more power supplies 426, one or more wired or wireless network interfaces 450, one or more input/output interfaces 458, one or more keyboards 456, and/or one or more operating systems 441, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present invention is intended to cover any variations, uses or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary technical means in the art not disclosed in this disclosure. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
The foregoing is merely preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
An embodiment of the invention discloses A1, a text processing method, comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding the t-th target word, determining the central point of a local attention window according to one or more of the following information: the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
A2. The method according to A1, wherein the step of determining the central point of the local attention window according to one or more of the encoding state, the decoding state when decoding the t-th target word, and the central points used before decoding the t-th target word includes:
obtaining one or more of the following information: the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word;
determining, in combination with the first hidden-layer state, the second hidden-layer state and the matrix connection, the center on which attention is concentrated in the source text, as the central point of the local attention window.
A3. The method according to A2, wherein the step of obtaining one or more of the first hidden-layer state of the encoder, the second hidden-layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices used when decoding other target words before the t-th target word includes:
extracting the first word information, recorded when the source text is input in forward order, of the j-th source word and of the source words located after the j-th source word;
extracting the second word information, recorded when the source text is input in reverse order, of the j-th source word and of the source words located before the j-th source word;
combining the first word information and the second word information and converting them into the first hidden-layer state of the encoder;
and/or
extracting multiple weight matrices used when decoding other target words before the t-th target word;
mapping the multiple weight matrices into multiple weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
A4. The method according to A2, wherein the step of combining the first hidden layer state, the second hidden layer state and the matrix connection to determine the position in the source text on which attention is concentrated, as the central point of the local attention window, comprises:
configuring weight matrices respectively for the one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
combining the weight-configured first hidden layer state, second hidden layer state and matrix connection to obtain feature information;
applying a nonlinear activation to the feature information and a configured weight matrix to obtain activation information;
applying a nonlinear transformation to the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
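These steps resemble predictive position estimation in local attention: each information source is projected by its own weight matrix, the projections are combined, a nonlinearity is applied, a second transformation squashes the result to a scalar in (0, 1), and multiplying by the source length and rounding down gives an integer center. The sketch below is one plausible reading of those steps; the choice of tanh and sigmoid, the parameter names, and the interpretation of "word length" as the number of source words are assumptions.

import numpy as np

def window_center(h_enc, h_dec, attn_history, W_e, W_d, W_a, v, source_len):
    """Predict the integer center of the local attention window.

    h_enc        : encoder first hidden layer state
    h_dec        : decoder second hidden layer state at step t
    attn_history : summed attention weights from earlier target words
    W_e, W_d, W_a, v : illustrative projection parameters
    """
    feature = W_e @ h_enc + W_d @ h_dec + W_a @ attn_history   # weighted combination
    activated = np.tanh(feature)                                # nonlinear activation
    score = 1.0 / (1.0 + np.exp(-(v @ activated)))              # squash to (0, 1)
    return int(np.floor(score * source_len))                    # scale by length, round down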
A5. The method according to any one of A1 to A4, wherein the step of determining the local attention window based on the central point of the local attention window comprises:
calculating the difference between the central point and a preset center deviation value as a first endpoint value;
calculating the sum of the central point and the preset center deviation value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
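With the center fixed, the window is simply the span one preset deviation on either side of it. A short sketch follows; the clipping to sentence boundaries is an added safeguard of mine, not part of the claim.

def local_window(center, deviation, source_len):
    """Endpoints of the local attention window around the given center.
    Clipping to [0, source_len - 1] is an added safeguard, not stated in the claim."""
    first = max(0, center - deviation)                # central point minus preset deviation
    second = min(source_len - 1, center + deviation)  # central point plus preset deviation
    return first, second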
The embodiment of the invention also discloses B6, a text processing system, comprising (a compositional sketch follows the list):
a source text receiving module, configured to receive a source text, the source text having multiple source words;
a vector encoding module, configured to call an encoder to encode the multiple source words into multiple vectors;
a central point determining module, configured to, when decoding a t-th target word, determine the central point of a local attention window according to one or more pieces of information among an encoding state, a decoding state when decoding the t-th target word, and the central point used before decoding the t-th target word;
a local attention window determining module, configured to determine the local attention window based on the central point of the local attention window;
a vector decoding module, configured to call a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
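The system claims mirror the method as five cooperating modules; the following sketch shows one possible way such modules could be composed. The class name, method names, and decoding budget are illustrative and are not taken from the patent.

class TextProcessingSystem:
    """Illustrative composition of the five modules described above."""
    def __init__(self, receiver, encoder_mod, center_mod, window_mod, decoder_mod):
        self.receiver = receiver        # source text receiving module
        self.encoder_mod = encoder_mod  # vector encoding module
        self.center_mod = center_mod    # central point determining module
        self.window_mod = window_mod    # local attention window determining module
        self.decoder_mod = decoder_mod  # vector decoding module

    def process(self, raw_text):
        source_words = self.receiver.receive(raw_text)
        vectors = self.encoder_mod.encode(source_words)
        target = []
        for t in range(len(source_words) * 2):          # generous decoding budget
            center = self.center_mod.locate(vectors, target, t)
            window = self.window_mod.build(center, len(source_words))
            word = self.decoder_mod.decode(vectors, window, t)
            if word is None:                            # decoder signals end of output
                break
            target.append(word)
        return target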
B7. The system according to B6, wherein the central point determining module comprises:
a reference information acquisition submodule, configured to obtain one or more pieces of information among the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
a reference information determining submodule, configured to combine the first hidden layer state, the second hidden layer state and the matrix connection to determine the position in the source text on which attention is concentrated, as the central point of the local attention window.
B8. The system according to B7, wherein the reference information acquisition submodule comprises:
a first word information extraction unit, configured to extract the j-th source word recorded when the source text is input in forward order, and first word information of the source words located after the j-th source word;
a second word information extraction unit, configured to extract the j-th source word recorded when the source text is input in reverse order, and second word information of the source words located before the j-th source word;
a word information combination and conversion unit, configured to combine the first word information and the second word information and convert them into the first hidden layer state of the encoder;
and/or
a weight matrix extraction unit, configured to extract multiple weight matrices for the other target words decoded before the t-th target word;
a weight matrix mapping unit, configured to map the multiple weight matrices into weight matrices of a specified format;
a weight matrix addition unit, configured to add the weight matrices of the specified format to obtain the matrix connection.
B9. The system according to B7, wherein the reference information determining submodule comprises:
a weight matrix configuration unit, configured to configure weight matrices respectively for the one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
a reference information combination unit, configured to combine the weight-configured first hidden layer state, second hidden layer state and matrix connection to obtain feature information;
a nonlinear activation unit, configured to apply a nonlinear activation to the feature information and a configured weight matrix to obtain activation information;
a nonlinear conversion unit, configured to apply a nonlinear transformation to the activation information to obtain a feature value;
a rounding-down unit, configured to round down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
B10. The system according to any one of B6 to B9, wherein the local attention window determining module comprises:
a first endpoint value setting submodule, configured to calculate the difference between the central point and a preset center deviation value as a first endpoint value;
a second endpoint value setting submodule, configured to calculate the sum of the central point and the preset center deviation value as a second endpoint value;
a local attention window setting submodule, configured to set the span between the first endpoint value and the second endpoint value as the local attention window.
The embodiment of the invention also discloses C11, a device for text processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding a t-th target word, determining the central point of a local attention window according to one or more pieces of information among an encoding state, a decoding state when decoding the t-th target word, and the central point used before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
C12. The device according to C11, wherein the one or more programs executed by the one or more processors further include instructions for performing the following operations:
obtaining one or more pieces of information among the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
combining the first hidden layer state, the second hidden layer state and the matrix connection to determine the position in the source text on which attention is concentrated, as the central point of the local attention window.
C13. The device according to C12, wherein the one or more programs executed by the one or more processors further include instructions for performing the following operations:
extracting the j-th source word recorded when the source text is input in forward order, and first word information of the source words located after the j-th source word;
extracting the j-th source word recorded when the source text is input in reverse order, and second word information of the source words located before the j-th source word;
combining the first word information and the second word information, and converting them into the first hidden layer state of the encoder;
and/or
extracting multiple weight matrices for the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
C14. The device according to C12, wherein the one or more programs executed by the one or more processors further include instructions for performing the following operations:
configuring weight matrices respectively for the one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
combining the weight-configured first hidden layer state, second hidden layer state and matrix connection to obtain feature information;
applying a nonlinear activation to the feature information and a configured weight matrix to obtain activation information;
applying a nonlinear transformation to the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
C15. The device according to any one of C11 to C14, wherein the one or more programs executed by the one or more processors further include instructions for performing the following operations:
calculating the difference between the central point and a preset center deviation value as a first endpoint value;
calculating the sum of the central point and the preset center deviation value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.

Claims (10)

1. A text processing method, characterized by comprising:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding a t-th target word, determining the central point of a local attention window according to one or more pieces of information among an encoding state, a decoding state when decoding the t-th target word, and the central point used before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
2. The method according to claim 1, characterized in that the step of determining the central point of the local attention window according to one or more pieces of information among the encoding state, the decoding state when decoding the t-th target word, and the central point used before decoding the t-th target word comprises:
obtaining one or more pieces of information among a first hidden layer state of the encoder, a second hidden layer state of the decoder when decoding the t-th target word, and a matrix connection of the weight matrices of the other target words decoded before the t-th target word;
combining the first hidden layer state, the second hidden layer state and the matrix connection to determine the position in the source text on which attention is concentrated, as the central point of the local attention window.
3. The method according to claim 2, characterized in that the step of obtaining one or more pieces of information among the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word comprises:
extracting the j-th source word recorded when the source text is input in forward order, and first word information of the source words located after the j-th source word;
extracting the j-th source word recorded when the source text is input in reverse order, and second word information of the source words located before the j-th source word;
combining the first word information and the second word information, and converting them into the first hidden layer state of the encoder;
and/or
extracting multiple weight matrices for the other target words decoded before the t-th target word;
mapping the multiple weight matrices into weight matrices of a specified format;
adding the weight matrices of the specified format to obtain the matrix connection.
4. The method according to claim 2, characterized in that the step of combining the first hidden layer state, the second hidden layer state and the matrix connection to determine the position in the source text on which attention is concentrated, as the central point of the local attention window, comprises:
configuring weight matrices respectively for the one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
combining the weight-configured first hidden layer state, second hidden layer state and matrix connection to obtain feature information;
applying a nonlinear activation to the feature information and a configured weight matrix to obtain activation information;
applying a nonlinear transformation to the activation information to obtain a feature value;
rounding down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
5. The method according to any one of claims 1 to 4, characterized in that the step of determining the local attention window based on the central point of the local attention window comprises:
calculating the difference between the central point and a preset center deviation value as a first endpoint value;
calculating the sum of the central point and the preset center deviation value as a second endpoint value;
setting the span between the first endpoint value and the second endpoint value as the local attention window.
6. A text processing system, characterized by comprising:
a source text receiving module, configured to receive a source text, the source text having multiple source words;
a vector encoding module, configured to call an encoder to encode the multiple source words into multiple vectors;
a central point determining module, configured to, when decoding a t-th target word, determine the central point of a local attention window according to one or more pieces of information among an encoding state, a decoding state when decoding the t-th target word, and the central point used before decoding the t-th target word;
a local attention window determining module, configured to determine the local attention window based on the central point of the local attention window;
a vector decoding module, configured to call a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
7. The system according to claim 6, characterized in that the central point determining module comprises:
a reference information acquisition submodule, configured to obtain one or more pieces of information among the first hidden layer state of the encoder, the second hidden layer state of the decoder when decoding the t-th target word, and the matrix connection of the weight matrices of the other target words decoded before the t-th target word;
a reference information determining submodule, configured to combine the first hidden layer state, the second hidden layer state and the matrix connection to determine the position in the source text on which attention is concentrated, as the central point of the local attention window.
8. The system according to claim 7, characterized in that the reference information acquisition submodule comprises:
a first word information extraction unit, configured to extract the j-th source word recorded when the source text is input in forward order, and first word information of the source words located after the j-th source word;
a second word information extraction unit, configured to extract the j-th source word recorded when the source text is input in reverse order, and second word information of the source words located before the j-th source word;
a word information combination and conversion unit, configured to combine the first word information and the second word information and convert them into the first hidden layer state of the encoder;
and/or
a weight matrix extraction unit, configured to extract multiple weight matrices for the other target words decoded before the t-th target word;
a weight matrix mapping unit, configured to map the multiple weight matrices into weight matrices of a specified format;
a weight matrix addition unit, configured to add the weight matrices of the specified format to obtain the matrix connection.
9. The system according to claim 7, characterized in that the reference information determining submodule comprises:
a weight matrix configuration unit, configured to configure weight matrices respectively for the one or more pieces of information among the first hidden layer state, the second hidden layer state and the matrix connection;
a reference information combination unit, configured to combine the weight-configured first hidden layer state, second hidden layer state and matrix connection to obtain feature information;
a nonlinear activation unit, configured to apply a nonlinear activation to the feature information and a configured weight matrix to obtain activation information;
a nonlinear conversion unit, configured to apply a nonlinear transformation to the activation information to obtain a feature value;
a rounding-down unit, configured to round down the product of the feature value and the word length of the source words to obtain the central point of the local attention window.
10. A device for text processing, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
receiving a source text, the source text having multiple source words;
calling an encoder to encode the multiple source words into multiple vectors;
when decoding a t-th target word, determining the central point of a local attention window according to one or more pieces of information among an encoding state, a decoding state when decoding the t-th target word, and the central point used before decoding the t-th target word;
determining the local attention window based on the central point of the local attention window;
calling a decoder to decode the vectors into the t-th target word according to the source words located within the local attention window.
CN201710602815.0A 2017-07-21 2017-07-21 Text processing method and system and text processing device Active CN109284510B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710602815.0A CN109284510B (en) 2017-07-21 2017-07-21 Text processing method and system and text processing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710602815.0A CN109284510B (en) 2017-07-21 2017-07-21 Text processing method and system and text processing device

Publications (2)

Publication Number Publication Date
CN109284510A true CN109284510A (en) 2019-01-29
CN109284510B CN109284510B (en) 2022-10-21

Family

ID=65185298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710602815.0A Active CN109284510B (en) 2017-07-21 2017-07-21 Text processing method and system and text processing device

Country Status (1)

Country Link
CN (1) CN109284510B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060015323A1 (en) * 2004-07-13 2006-01-19 Udupa Raghavendra U Method, apparatus, and computer program for statistical translation decoding
CN102047680A (en) * 2008-06-02 2011-05-04 皇家飞利浦电子股份有限公司 Apparatus and method for adjusting the cognitive complexity of an audiovisual content to a viewer attention level
CN102054178A (en) * 2011-01-20 2011-05-11 北京联合大学 Chinese painting image identifying method based on local semantic concept

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200133952A1 (en) * 2018-10-31 2020-04-30 International Business Machines Corporation Natural language generation system using graph-to-sequence model
CN110347790A (en) * 2019-06-18 2019-10-18 广州杰赛科技股份有限公司 Text duplicate checking method, apparatus, equipment and storage medium based on attention mechanism
CN110347790B (en) * 2019-06-18 2021-08-10 广州杰赛科技股份有限公司 Text duplicate checking method, device and equipment based on attention mechanism and storage medium

Also Published As

Publication number Publication date
CN109284510B (en) 2022-10-21

Similar Documents

Publication Publication Date Title
US11620984B2 (en) Human-computer interaction method, and electronic device and storage medium thereof
CN105119812B (en) In the method, apparatus and terminal device of chat interface change emoticon
CN107291690A (en) Punctuate adding method and device, the device added for punctuate
CN105162693B (en) message display method and device
US20220277752A1 (en) Voice interaction method and related apparatus
US11138422B2 (en) Posture detection method, apparatus and device, and storage medium
CN110909815B (en) Neural network training method, neural network training device, neural network processing device, neural network training device, image processing device and electronic equipment
CN107992485A (en) A kind of simultaneous interpretation method and device
CN113691833B (en) Virtual anchor face changing method and device, electronic equipment and storage medium
CN109243430A (en) A kind of audio recognition method and device
CN109871843A (en) Character identifying method and device, the device for character recognition
CN108538284A (en) Simultaneous interpretation result shows method and device, simultaneous interpreting method and device
CN108039995A (en) Message sending control method, terminal and computer-readable recording medium
CN109961094A (en) Sample acquiring method, device, electronic equipment and readable storage medium storing program for executing
CN108628813A (en) Treating method and apparatus, the device for processing
CN109302528B (en) Photographing method, mobile terminal and computer readable storage medium
CN108073572A (en) Information processing method and its device, simultaneous interpretation system
CN109412929A (en) The method, device and mobile terminal that expression adaptively adjusts in instant messaging application
CN110135349A (en) Recognition methods, device, equipment and storage medium
CN110502648A (en) Recommended models acquisition methods and device for multimedia messages
CN109886211A (en) Data mask method, device, electronic equipment and storage medium
CN109388699A (en) Input method, device, equipment and storage medium
CN111835621A (en) Session message processing method and device, computer equipment and readable storage medium
CN109284510A (en) A kind of text handling method, system and a kind of device for text-processing
CN112631435A (en) Input method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant