Specific embodiments
The solutions provided in this specification are described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. In this scenario, a user can interact with a computing platform to provide it with description information of a target to be pushed, and to obtain from it push information for that target. The computing platform can be a hardware device such as a computer, tablet computer, or smartphone, or an application running on such a device; this scenario places no limitation on its form. In one embodiment, the computing platform is a terminal, or software running on a terminal, and the user interacts with it directly. In another embodiment, the computing platform is a server that supports an application running on a terminal; the user interacts directly with the terminal, the terminal interacts with the server over a network, and the user thus interacts with the computing platform indirectly.
In the scenario shown in Fig. 1, the computing platform obtains the description information of the target to be pushed based on the user's input. The computing platform then splits the description information into at least one character unit, for example taking each word or each character as one character unit. Each character unit corresponds to a character vector (such as a word vector), and the character vectors of the multiple character units form a character vector sequence. Further, the computing platform processes the character vector sequence through an encoder-decoder network to generate at least one character sequence, from which the push information for the target to be pushed is determined. The push information is, for example, an advertising slogan for the target to be pushed, intended to promote it briefly and eye-catchingly. In one embodiment, each character sequence can further be scored by a pre-trained prediction model that estimates its probability of occurring as a sentence, and a character sequence with a higher score is selected as the push information for the target to be pushed.
In this way, an encoder-decoder network based on natural language processing converts description information expressed in natural language into brief, eye-catching push information, improving the efficiency with which push information is generated.
The process of generating push information is described in detail below.
Fig. 2 shows a flow chart of a method for generating push information according to one embodiment. The method can be executed by any system, device, platform, or server with computing and processing capability, for example the computing platform shown in Fig. 1. As shown in Fig. 2, the method includes the following steps. In step 201, description information describing a target to be pushed in natural language is obtained. In step 202, the description information is split into multiple character units, the character vector corresponding to each character unit is determined, and the character vectors are arranged into a vector sequence according to the order of the character units. In step 203, the vector sequence is processed with a pre-trained encoder-decoder network to obtain at least one character sequence no longer than a predetermined length. In step 204, push information for the target to be pushed is determined based on the obtained character sequences.
First, in step 201, description information describing the target to be pushed in natural language is obtained. It can be understood that the target to be pushed is the object to be promoted in an information-push process, for example a physical or virtual commodity, an application (such as a terminal APP), a website, or a person.
A target to be pushed can correspond to certain description information. The description information can be provided, in picture or text form, by the business party performing the information push. When the description information is in natural language form, its content typically covers aspects of the target to be pushed such as its shape, structure, usage, functions, and effects. The natural language used to describe the target can take various forms, such as Chinese, English, pinyin, numbers, or symbols. For example, the description information for a children's education APP might read: "×× is an outstanding application for cultivating a child's early word-recognition ability. It teaches Chinese characters through a completely new right-brain figurative-thinking approach, ingeniously converting each Chinese character into a specific animated graphic, and organically combines animation, sound, pictures, and text to stimulate the child's vision, hearing, touch, and other senses, quickly activating the right brain and sparking the child's interest in learning." It can be appreciated that such description information is usually lengthy and verbose, because it tries to describe the target to be pushed from as many aspects as possible.
When the description information is in text form, it can be obtained directly. When it appears in a picture, the corresponding text can be obtained by means such as optical character recognition (OCR).
The description information can be obtained in real time, for example as the user types it on a keyboard; it can also be stored locally in advance and read directly from local storage.
Then, in step 202, the description information is split into multiple character units, the character vector corresponding to each character unit is determined, and the character vectors are arranged into a vector sequence according to the order of the character units.
As can be seen from step 201, the description information ultimately obtained is text. This text can be split into character units such as single characters or words. In one embodiment, for description information written in a script such as Chinese, Korean, or Japanese, the split can be performed at the level of single characters (char) and/or vocabulary items formed from one or more characters (word); no limitation is imposed here. For example, the sentence "×× is an outstanding application for cultivating a child's early word-recognition ability" in the example above can be split into character units such as: "××", "is", "an", "cultivating", "child", "early stage", "word-recognition", "ability", "outstanding", "application", "software". In other embodiments, for description information written in languages such as English or French, the text can be split at the level of character strings with independent meaning, i.e. words. For example, "Knowledge is power" can be split into "Knowledge", "is", "power".
When splitting into character units, maximum-length matching against a preset dictionary can be used (starting from a given character, the longest vocabulary item found in the dictionary is taken as one character unit), or the text can be separated at special characters (such as the spaces in English); no limitation is imposed here.
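The maximum-length dictionary matching described above can be sketched as follows. This is a minimal illustration using a hypothetical toy dictionary of letter strings in place of a real vocabulary; the function name is an assumption, not part of any embodiment.

```python
def max_match_split(text, dictionary, max_len=4):
    """Forward maximum-length matching: from each position, take the
    longest piece found in the dictionary as one character unit;
    fall back to a single character when nothing matches."""
    units = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                units.append(piece)
                i += size
                break
    return units

toy_dict = {"ab", "abc"}                 # hypothetical dictionary entries
print(max_match_split("abcd", toy_dict))  # ['abc', 'd']
```

With a real segmentation dictionary, each matched entry would be a word of the description information rather than a run of letters.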
Each character unit can then be converted into a character vector, which describes the semantics of the character unit in vector form. Character vectors can be generated by statistical methods (such as co-occurrence matrices or SVD decomposition), or by neural-network language models of various structures, for example word2vec (word embeddings) or GloVe (Global Vectors for word representation).
Taking a neural-network language model such as word2vec as an example, each vocabulary item can first be given a one-hot representation: each word is assigned a unique number string that distinguishes it from all others. For example, "banana" may be represented as [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 ...] and "apple" as [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ...]. That is, each vocabulary item in the corpus (such as banana or apple) corresponds to one vector in which exactly one element is 1 and all others are 0, and this vector is the unique number string of that item. If the vector is read as a binary number, each vocabulary item also corresponds to a decimal or hexadecimal number, and different items correspond to different numbers. Such a number can be called a digital ID; for example, the unique number string [0 0 0 0 0 0 0 0 0 0 0 ... 1 0 0 0] can correspond to the digital ID "8". The dimension of a vector obtained this way equals the number of vocabulary items in the corpus, so if the corpus vocabulary is large, the vector dimension is very large. A word-vector model instead maps vocabulary items with similar meanings or strong relevance to nearby positions in a low-dimensional vector space.
In word2vec, as shown in Fig. 3, the input layer can be the one-hot representation (digital ID) of a vocabulary item, and the output layer corresponds to the word vector of that item, where each output element corresponds to one vocabulary dimension. For example, if the output elements correspond to the vocabulary [apple, banana, orange, rice, ...], the value of each element can indicate the degree of correlation between the input vocabulary item and the item corresponding to that element. During model training, the word vector of a sample vocabulary item can be represented by its degrees of correlation with the items corresponding to the output elements; for a sample item, these degrees of correlation can be determined by counting the contextual relationships between vocabulary items in the corpus. The more often two items appear together as context (for example, adjacent to each other), the stronger their correlation. The value on each element can lie between 0 and 1. The one-hot digital ID of a sample item is connected to the hidden layer, and the weights connecting the input layer to the hidden layer serve as the word vectors. The activation of the hidden layer can be, for example, a linear weighted sum over its nodes (no nonlinear activation function such as sigmoid or tanh is used). The hidden-layer nodes are then fed to a softmax (normalized exponential) output layer. During training, for the vocabulary items appearing in the corpus, the weights (model parameters) of the neural network are continually adjusted so that, for each item fed to the input layer, words with a stronger degree of correlation receive a higher output probability.
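A minimal sketch of this architecture, under the simplifying assumptions of a tiny vocabulary, random untrained weights, and a single-word input: multiplying the one-hot input by the input weight matrix simply selects one row (the word vector), a linear hidden layer follows (no sigmoid or tanh), and a softmax output layer scores every vocabulary item. All names and dimensions are hypothetical.

```python
import math
import random

random.seed(0)
vocab = ["apple", "banana", "orange", "rice"]
V, D = len(vocab), 3                 # vocabulary size, embedding dimension

# Input->hidden weights: row i is the word vector learned for vocab[i].
W_in = [[random.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(V)]
# Hidden->output weights: project the embedding back to one score per word.
W_out = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(D)]

def word_vector(word):
    # One-hot input times W_in selects exactly one row: the word vector.
    return W_in[vocab.index(word)]

def context_distribution(word):
    """Linear hidden layer, then a softmax output layer giving the
    model's estimated probability of each vocabulary item as context."""
    h = word_vector(word)
    scores = [sum(h[d] * W_out[d][j] for d in range(D)) for j in range(V)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Training would adjust W_in and W_out so that words co-occurring in the corpus receive higher probabilities; the rows of W_in are then used as the character vectors.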
In this way, the character vector corresponding to each character unit can be obtained. Arranging the character vectors according to the order of the character units in the description information forms the vector sequence for the description information.
Next, in step 203, the vector sequence is processed with a pre-trained encoder-decoder network to obtain at least one character sequence no longer than a predetermined length. An encoder-decoder (Encoder-Decoder) network is a neural network suited to sequence-to-sequence problems. Encoding converts the input sequence into a vector of fixed length; decoding then converts that fixed vector back into an output sequence. In other words, the encoding process parses the linguistic meaning of the input character sequence, and the decoding process expresses that meaning through another character sequence.
As shown in Fig. 4, an encoder-decoder network can be implemented with two neural networks: a first neural network 401 implements the encoder (Encoder), and a second neural network 402 implements the decoder (Decoder). The semantic vector C output by neural network 401 serves as the input of neural network 402. In a specific implementation, the encoder network 401 and the decoder network 402 can each be chosen from architectures such as CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), BiRNN (Bi-directional Recurrent Neural Network), GRU (Gated Recurrent Unit), or LSTM (Long Short-Term Memory). The encoder network 401 and the decoder network 402 are connected by passing the semantic vector C. The input of the encoder network 401 is the character vector sequence X1, X2, X3, ...; the output of the decoder network 402 corresponds to the character sequence Y1, Y2, Y3, .... The output of the decoder network 402 can be a character sequence directly, or a sequence of one-hot digital IDs in which each digital ID corresponds to a character and thus to a character sequence; no limitation is imposed here.
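A minimal sketch of this encoder-decoder structure, assuming a plain tanh recurrent step, random fixed weights, and a toy output vocabulary (all hypothetical): the encoder folds a variable-length sequence of input vectors into one fixed-length semantic vector C, and the decoder emits output units from C step by step until an end token or a length cap.

```python
import math
import random

random.seed(1)
D = 4                                    # state / vector dimension
out_vocab = ["<eos>", "YA", "YB", "YC"]  # hypothetical output character units

def rand_mat(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W_enc = rand_mat(D, 2 * D)               # encoder step weights
W_dec = rand_mat(D, 2 * D)               # decoder step weights
W_out = rand_mat(len(out_vocab), D)      # decoder output projection
emb = rand_mat(len(out_vocab), D)        # embeddings fed back while decoding

def rnn_step(W, state, x):
    """One tanh recurrent step: new_state = tanh(W @ [state; x])."""
    z = state + x                        # list concatenation, length 2*D
    return [math.tanh(sum(W[i][j] * z[j] for j in range(2 * D)))
            for i in range(D)]

def encode(xs):
    """Fold the whole input vector sequence into a single fixed-length
    semantic vector C, however many vectors come in."""
    state = [0.0] * D
    for x in xs:
        state = rnn_step(W_enc, state, x)
    return state

def decode(C, max_len=5):
    """Greedy decoding from C: emit units until <eos> or the cap."""
    state, x, out = C, [0.0] * D, []     # the zero vector plays the <go> role
    for _ in range(max_len):
        state = rnn_step(W_dec, state, x)
        scores = [sum(W_out[k][j] * state[j] for j in range(D))
                  for k in range(len(out_vocab))]
        k = max(range(len(out_vocab)), key=lambda i: scores[i])
        if out_vocab[k] == "<eos>":
            break
        out.append(out_vocab[k])
        x = emb[k]                       # feed the predicted unit back in
    return out
```

With untrained weights the output is meaningless; the point is only the shape of the computation, which matches the C-passing connection between networks 401 and 402 in Fig. 4.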
According to one embodiment, the encoder-decoder network can be trained end to end as a single neural network. First, sequence pairs corresponding to multiple push targets can be used as training samples. Each push target corresponds to one sequence pair, consisting of a source sequence (source) and a target sequence (target). The source sequence can be the character vector sequence obtained by splitting the description information of a target (such as a commodity or a person). The target sequence can be the character units obtained by splitting a manually written piece of push information for that description information, or their character vectors, or their digital IDs. Table 1 gives examples of the description information and push information corresponding to samples. The process of extracting the source sequence from each piece of description information is similar to steps 201 and 202 and is not repeated here. The process of extracting the target sequence from each piece of push information can be similar to that of extracting the source sequence: when the target sequence consists of character units, splitting the push information into character units yields the target sequence; when it consists of one-hot digital IDs, the digital IDs of the split character units are further obtained.
Table 1: Examples of sample description information and push information
The sample examples in Table 1 are description information for terminal applications (APPs) together with the corresponding application advertising slogans (push information). As Table 1 shows, the description information can be detailed and verbose, whereas the push information is concise, easy to understand, and clearly themed, so that the function of the push target can be grasped at a glance.
The process of training the encoder-decoder network with the training samples is described below.
For clarity, refer to Figs. 4 and 5. Fig. 5 is a schematic diagram in which, during the processing of one specific sample by the encoder-decoder network, the hidden units (neurons) of neural networks 401 and 402 in Fig. 4 are unrolled over time. In fact, when neural networks 401 and 402 are implemented with recurrent networks such as RNN or LSTM, each may contain only a single recurrent unit, which receives different input data at successive moments and produces different outputs.
In Fig. 5, suppose the source sequence of a sequence pair is A, B, C and the target sequence is W, X, Y, Z. First, A, B, and C are fed into neural network 401 as features at successive moments, and neural network 401 converts the input sequence into a semantic vector of fixed length. This semantic vector is then passed to neural network 402, which propagates it through its successive time steps.
For neural network 402, the target sequence is first preprocessed. To make the output more accurate, during model training the inputs of the hidden unit of neural network 402 at successive time steps are arranged so that each input is the previous character unit from the target sequence (or its digital ID), rather than the output of the previous moment. As shown in Fig. 5, the inputs received by the unit of neural network 402 at successive moments are a sequence start token (such as "go"), then W, X, Y, Z, where the start token marks the beginning of the predicted sequence. Meanwhile, the fixed-length semantic vector produced by neural network 401 is carried through the successive moments of the unit. In Fig. 5, at the initial moment the unit receives the start token "go" as input and predicts the first character unit (the one corresponding to "W") from the semantic vector. At the next moment, the unit receives the character vector or digital ID corresponding to "W" together with the semantic vector passed on from the initial moment, and predicts the second character unit (the one corresponding to "X"), which forms a sequence with the first unit "W", so that the sequence W, X is output. This continues until the unit receives the character vector or digital ID of the last character unit "Z" and, using the semantic vector passed on from the previous moment, predicts the final unit (the one corresponding to the end-of-sequence token "eos").
During training, for each sequence pair in the training samples, the source sequence is input to the encoder network 401 of the encoder-decoder network, and the target sequence, with a start token (such as "go") prepended, is input to the decoder network 402. The character sequence or digital-ID sequence output by the decoder network 402 at successive moments is compared, unit by unit, against the target sequence with an end token (such as "eos") appended, and the model parameters are adjusted so that the value of the loss function decreases. As can be seen in Fig. 5, because the target sequence fed to the decoder network 402 has the start token "go" prepended, while the target sequence used for comparison has the end token "eos" appended, the inputs and outputs of the decoder unit are offset by one character unit relative to the target sequence. Thus, each time a previous character unit is input, the resulting prediction is compared against the next character unit, which completes the training of the encoder-decoder network.
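The one-unit offset between decoder inputs and comparison targets can be sketched as follows; the token names `<go>` and `<eos>` follow the start and end markers in Fig. 5, and the helper name is an assumption.

```python
def decoder_io(target_units, go="<go>", eos="<eos>"):
    """Build the decoder's training input and comparison target:
    prepend the start token to the inputs and append the end token
    to the targets, so input i lines up with the unit to predict."""
    dec_inputs = [go] + list(target_units)
    dec_targets = list(target_units) + [eos]
    return dec_inputs, dec_targets

inp, tgt = decoder_io(["W", "X", "Y", "Z"])
print(inp)   # ['<go>', 'W', 'X', 'Y', 'Z']
print(tgt)   # ['W', 'X', 'Y', 'Z', '<eos>']
```

At every position the decoder sees the previous ground-truth unit and its prediction is scored against the next one, exactly the dislocation described above.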
When the trained encoder-decoder network processes the vector sequence obtained in step 202, the character vectors are fed, in order, to the unit of the encoder network 401 at successive time steps, and the outputs of the unit of the decoder network 402 at successive moments are collected. Each output of neural network 402 can be a character unit, a character vector, or the digital ID of a character unit, and the character units corresponding to the outputs form a character sequence. The output of neural network 402 at each moment can also be passed on to the next moment, so that the network ultimately outputs the predicted sequence of character units or digital IDs directly, corresponding to a character sequence. First, from the semantic vector output by neural network 401 and the preset start token, the unit of the decoder network 402 can predict a predetermined number (for example 10) of candidate first characters. Then, at moment t, from the semantic vector output by neural network 401 and the characters or character sequences already predicted by the decoder network 402 at moment t-1, the unit of the decoder network 402 can predict the current character sequences of the above predetermined number, where t is a natural number greater than 1. The state of the unit of the decoder network 402 at moment t can be understood as the state of the t-th unit after the decoder network 402 is unrolled over time as in Fig. 5.
In determining character sequences from neural network 402, methods such as beam search (beamsearch) or a greedy algorithm (Greedy algorithm) can be used. Taking beam search as an example, vocabulary items are selected to form sequences according to their probabilities over a vocabulary, where the vocabulary can be obtained in advance by counting a corpus.
For convenience, the following illustrates beam search with a beam size of 2 over a vocabulary containing only the three items "I", "is", and "student".
Still referring to Fig. 4, at the first moment the unit of neural network 402, based on the input semantic vector, outputs the 2 (2 being the beam size) most probable vocabulary items, "I" (with probability, say, 0.5) and "is" (with probability, say, 0.4), as candidate first characters. That is, Y1 has two possible values, corresponding to the items "I" and "is" in the vocabulary. "I" and "is" are then each used as the unit's input at the second moment, while the semantic vector obtained from neural network 401 is retained. With "I" as the input at the second moment, a probability distribution over the vocabulary for Y2 is obtained, for example: "I" 0.3, "is" 0.6, "student" 0.1. With "is" as the input at the second moment, the distribution for Y2 might be: "I" 0.3, "is" 0.3, "student" 0.4.
Since the beam size is set to 2, the two sequences with the highest probabilities are retained. On the basis of the two possible values of Y1 at the first moment, the probabilities of all candidate sequences are:
the probability of "I I" is 0.5 × 0.3 = 0.15;
the probability of "I is" is 0.5 × 0.6 = 0.3;
the probability of "I student" is 0.5 × 0.1 = 0.05;
the probability of "is I" is 0.4 × 0.3 = 0.12;
the probability of "is is" is 0.4 × 0.3 = 0.12;
the probability of "is student" is 0.4 × 0.4 = 0.16.
The two sequences with the highest probabilities are "I is" and "is student"; these are the sequences predicted at the second moment. At subsequent moments, the unit of neural network 402 repeats this process until an end token is encountered, yielding the two most probable sequences. In practice the vocabulary is usually very large and the computation correspondingly more complex, but the principle is the same. The number of sequences finally obtained is determined by the beam size; when the beam size is set to 10, the 10 most probable sequences are obtained.
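The beam-size-2 search above can be sketched as follows. The `next_probs` distributions are hard-coded to mirror the probabilities in the example; in a real system they would come from the decoder network at each moment.

```python
# Hypothetical next-token distributions mirroring the example above.
def next_probs(prefix):
    if not prefix:
        return {"I": 0.5, "is": 0.4, "student": 0.1}
    if prefix[-1] == "I":
        return {"I": 0.3, "is": 0.6, "student": 0.1}
    if prefix[-1] == "is":
        return {"I": 0.3, "is": 0.3, "student": 0.4}
    return {"I": 1.0 / 3, "is": 1.0 / 3, "student": 1.0 / 3}

def beam_search(steps, beam_size=2):
    """Keep only the beam_size most probable prefixes at every step."""
    beams = [((), 1.0)]                  # (sequence so far, probability)
    for _ in range(steps):
        candidates = [
            (prefix + (word,), p * q)
            for prefix, p in beams
            for word, q in next_probs(prefix).items()
        ]
        beams = sorted(candidates, key=lambda c: -c[1])[:beam_size]
    return beams

for seq, p in beam_search(2):
    print(" ".join(seq), round(p, 4))
# I is 0.3
# is student 0.16
```

A greedy algorithm is the beam_size=1 special case: it commits to the single best word at every step and would here produce only "I is".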
It is worth noting that during decoding, the input of the unit at moment i includes the output of the unit at moment i-1, so a different output at one moment naturally leads to a different output probability distribution at the next moment, because the output at moment i-1 enters the computation at moment i as an input. Therefore, when different character units or character sequences are selected at moment i-1, the output probability distributions at moment i also differ.
It can be appreciated that the number of character units in a given piece of description information is finite, so the length of the corresponding character vector sequence is also finite. That is, when the character vectors are input in the order of their character units, the input sequence length is bounded. Therefore, for the encoder network 401, there is no need to limit the number of hidden layers; for recurrent networks such as RNN or LSTM, this can equivalently be understood as not needing to limit the number of hidden units (unrolled time steps).
In one embodiment, for consistency of model training, the input sequence can also be padded so that it contains a fixed number of character vectors. For example, a shorter vector sequence can be extended with predetermined vectors appended at the end, padding it to a vector sequence of the predetermined length.
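Such padding can be sketched as follows; the padding vector, its dimension, and the function name are hypothetical.

```python
PAD = [0.0, 0.0, 0.0]   # hypothetical predetermined padding vector (dim 3)

def pad_to_length(vectors, target_len, pad=PAD):
    """Right-pad a short character-vector sequence with the preset
    vector so every input sequence has the same fixed length."""
    return (vectors + [pad] * (target_len - len(vectors)))[:target_len]

seq = [[0.2, 0.1, 0.7], [0.9, 0.0, 0.3]]
print(len(pad_to_length(seq, 5)))   # 5
```

The slice also truncates sequences that are already longer than the target length, which keeps every input the encoder sees at a uniform size.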
For the decoder network 402, however, the result is not known in advance, so the number of characters in an output character sequence cannot be predetermined. If the number of output characters is too large, the computation grows accordingly and could even run without bound. Moreover, push information is usually required to be concise, general, and eye-catching; too many characters make it verbose and unfit for information push. Therefore, the number of hidden layers in the decoder network, or for a recurrent network the number of hidden units (unrolled time steps), can be preset. This guarantees that the predicted character sequences do not exceed a predetermined length.
In step 203, a predetermined number (at least one) of character sequences can thus be predicted according to the occurrence probabilities of character units. Then, in step 204, push information for the target to be pushed is determined based on the obtained character sequences. All of the character sequences predicted in step 203 can be taken as push information, or the predicted sequences can be evaluated for quality in various ways and one or more of them selected as the push information.
According to one embodiment, among the predetermined number of most probable sequences finally output in step 203, the one or more sequences with the highest probabilities can be taken as the push information. In the earlier example, for instance, the two most probable sequences are "I is" with probability 0.3 and "is student" with probability 0.16, so the sequence "I is" can be selected directly.
According to another embodiment, a pre-trained language model can be used to predict the probability that a character sequence obtained in step 203 occurs as a sentence. The language model can be trained on a given corpus, such as an ×× encyclopedia or a news corpus. For a given sentence consisting of the character unit sequence S = W1, W2, ..., Wk, the probability of the sentence can be expressed as:

P(S) = P(W1, W2, ..., Wk) = p(W1) p(W2|W1) ... p(Wk|W1, W2, ..., Wk-1),

where the probability value P(S) falls in the interval [0, 1], p(W1) is the probability of W1 appearing at the start of a sentence, p(W2|W1) is the probability of W2 appearing after W1, and p(Wk|W1, W2, ..., Wk-1) is the probability of Wk appearing given W1, W2, ..., Wk-1.
In one implementation, the language model can be a statistical model. That is, for each character unit in the corpus, the probabilities of it appearing after other character units or character sequences, of it appearing at the start of a sentence, and so on, are counted. For example, for the two character units "I" and "is", the probability of "is" appearing after "I" can be the number of times "is" appears after "I" in the corpus divided by the number of times "I" appears in the corpus. In this way, for a character unit sequence S = W1, W2, ..., Wk, the values p(W1), p(W2|W1), ..., p(Wk|W1, W2, ..., Wk-1) can be looked up and their product taken as P(S).
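A statistical language model of this kind can be sketched with a bigram approximation of the chain rule, i.e. conditioning each unit only on its predecessor rather than on the whole prefix; the toy corpus of pre-split sentences is hypothetical.

```python
from collections import Counter

# Toy corpus of pre-split sentences; a real corpus would be far larger.
corpus = [["I", "is", "student"], ["I", "is", "I"]]

starts = Counter(s[0] for s in corpus)
unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter((s[i], s[i + 1]) for s in corpus for i in range(len(s) - 1))

def sentence_prob(units):
    """Chain rule with a bigram approximation:
    P(S) ~ p(W1) * p(W2|W1) * ... * p(Wk|Wk-1),
    where p(cur|prev) = count(prev cur) / count(prev)."""
    p = starts[units[0]] / len(corpus)
    for prev, cur in zip(units, units[1:]):
        if unigrams[prev] == 0:
            return 0.0
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

print(sentence_prob(["I", "is"]))  # 1.0 * 2/3: "I" always starts; 2 of 3 "I"s precede "is"
```

Candidate sequences with a higher P(S) are more sentence-like and are preferred as push information.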
In another implementation, the language model can be obtained by training a machine learning model in advance, in which case every sentence in the corpus can serve as a training sample. The language model can be one of an n-gram model, an NNLM (Neural Network Language Model), an LSTM neural-network language model, and so on. Taking an LSTM network as an example, the character vectors corresponding to the character units split from each training sentence are input to the LSTM unit in order, the output at each moment influencing that of subsequent moments, and the model parameters are adjusted using the known probability 1 as the label, thereby training the LSTM language model. For a character unit sequence S = W1, W2, ..., Wk, the corresponding character sequence is input in order to the trained LSTM model, which produces a prediction score for the sequence indicating the probability P(S) of the character sequence occurring as a sentence.
In further implementations, many other kinds of language model can be trained; they are not enumerated here. The larger the predicted value of P(S), the more likely the character sequence is to be a sentence. Therefore, among the character sequences obtained in step 203, the predetermined number (for example 1) of sequences with the highest probabilities predicted by the prediction model can be selected as the push information corresponding to the description information of step 201; alternatively, the sequences whose predicted probabilities exceed a preset threshold (for example 0.5) can be selected as that push information.
According to another embodiment, the sequences determined in step 203 can also first be screened by whether they form complete sentences. For example, where the decoder unit in Fig. 3 unrolls to at most 10 steps, a predicted character sequence truncated mid-phrase (such as one ending in "cultivate your, just") is, judged by its syntax, clearly not a complete sentence and can be screened out first. Push information is then selected from the remaining sequences in any of the ways described above.
According to other embodiments, the character sequences obtained in step 203 can also be screened manually, selecting those with the characteristics of an advertising slogan as the push information for the target to be pushed.
To describe the effect of the embodiments of this specification more intuitively, refer to Fig. 6, which shows a specific implementation scenario of this specification. The scenario shown in Fig. 6 generates a recommendation phrase for an APP.
First, a Seq2Seq (Sequence to Sequence) model serving as the encoding-decoding network is trained in advance through steps 1, 2 and 3, and a language model is trained on a text corpus through steps 4, 5 and 6, for scoring character sequences.
In the process of generating the recommendation phrase, the APP serves as the object to be pushed, and in step 7 the APP details input by the user, that is, the description information, are received. Then, in step 8, the vector sequence corresponding to the APP details is input into the pre-trained Seq2Seq model. According to the input vector sequence, the Seq2Seq model predicts a predetermined number of character sequences. In step 9, each predicted character sequence is scored by the pre-trained language model. In step 10, the predicted character sequences are ranked according to the scores of step 9, and the character sequence with the highest score is selected and output. The selected highest-scoring character sequence is the recommendation phrase of the APP.
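The runtime portion of the Fig. 6 flow (steps 7 to 10) can be sketched end to end. The predictor and scorer below are trivial stubs standing in for the trained Seq2Seq model and language model; their names and behavior are invented for illustration.

```python
# End-to-end sketch of steps 7-10: receive APP details, predict candidate
# slogans, score them with a language model, and output the best one.
def seq2seq_predict(description, n=3):
    """Stub for the trained Seq2Seq model: return n candidate slogans."""
    return [f"{description} candidate {i}" for i in range(n)]

def lm_score(sequence):
    """Stub for the trained language model: shorter scores higher here."""
    return 1.0 / (1.0 + len(sequence))

def generate_recommendation(app_details, n_candidates=3):
    # Steps 7-8: receive the APP details and predict candidate sequences.
    candidates = seq2seq_predict(app_details, n_candidates)
    # Steps 9-10: score each candidate, rank, and output the top one.
    return max(candidates, key=lm_score)
```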
Reviewing the above process: in the pushed-information generating process, only the natural language description information of the target to be pushed needs to be obtained; the description information is split into character units to determine the corresponding vector sequence; the vector sequence is processed by the encoding-decoding network for natural language sequences to predict at least one character sequence; and the pushed information is determined from the predicted character sequences. Since manual participation is reduced, the process is not restricted by the level of human editors; the encoding-decoding network is innovatively introduced to process the description information of the target to be pushed and generate the pushed information, which can improve the efficiency and effectiveness of pushed-information generation.
According to an embodiment of another aspect, an apparatus for generating pushed information is further provided. Fig. 7 shows a schematic block diagram of an apparatus 700 for generating pushed information according to an embodiment. As shown in Fig. 7, the apparatus 700 includes:
an acquiring unit 71, configured to acquire description information describing a target to be pushed in natural language;
a pre-processing unit 72, configured to split the description information acquired by the acquiring unit 71 into multiple character units, and determine the character vector corresponding to each character unit, the character vectors forming a vector sequence according to the order of the multiple character units;
a predicting unit 73, configured to process, through a pre-trained encoding-decoding network, the vector sequence obtained by the pre-processing unit 72, to obtain at least one character sequence not exceeding a predetermined length; and
a determining unit 74, configured to determine, from the character sequences predicted by the predicting unit 73, the pushed information for the target to be pushed.
In one embodiment, the above character unit may be one of the following:
a single character, a word formed by at least one character, or a character string with an independent meaning.
According to one embodiment, the multiple character units include a first character unit;
the pre-processing unit 72 is further configured to:
query the first number ID corresponding to the one-hot representation of the first character unit; and
input the first number ID into a pre-trained character vector model, so as to determine the character vector corresponding to the first character unit according to the output result of the character vector model.
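The ID-to-vector lookup above can be sketched as follows, assuming a toy vocabulary and using a random matrix as a stand-in for the trained character vector model.

```python
# Mapping a character unit to its character vector: look up the number ID
# of its one-hot representation, then index a pre-trained embedding matrix.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {"push": 0, "target": 1, "info": 2}   # assumed toy vocabulary
EMBED = rng.standard_normal((len(VOCAB), 4))  # stand-in trained embeddings

def character_vector(unit):
    """Return the character vector for one character unit via its ID."""
    unit_id = VOCAB[unit]   # the number ID of the one-hot representation
    return EMBED[unit_id]   # embedding lookup = one_hot(unit_id) @ EMBED

def to_vector_sequence(units):
    """Stack character vectors in the units' original order."""
    return np.stack([character_vector(u) for u in units])
```

Indexing the embedding matrix by the number ID is equivalent to multiplying the one-hot vector by the matrix, which is why only the ID needs to be stored.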
In a possible design, the encoding-decoding network includes an encoding neural network and a decoding neural network;
the predicting unit 73 is further configured to:
convert the vector sequence into a semantic vector through the encoding neural network; and
predict at least one character sequence from the semantic vector through the decoding neural network.
In a further embodiment, the encoding neural network or the decoding neural network is respectively one of a recurrent neural network, a bidirectional recurrent neural network, a gated recurrent unit, and a long short-term memory model.
In a further embodiment, the predicting unit 73 is further configured to:
predict, at the first moment, a predetermined number of first characters through the neurons of the decoding neural network according to the semantic vector and a start mark; and
predict, at moment t, a predetermined number of current character sequences through the neurons of the decoding neural network according to the semantic vector and the characters or character sequences already predicted by moment t-1, where t is a natural number greater than 1.
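The step-by-step prediction can be sketched as a greedy decoding loop: the first step consumes the semantic vector and the start mark, and each later step t consumes what was predicted up to t-1. The decoder step below is a deterministic stub with invented tokens, not a trained neural network, and greedy decoding (predetermined number = 1) stands in for the more general beam case.

```python
# Greedy decoding sketch: start mark at t=1, then feed back the prefix.
def decoder_step(semantic_vector, prefix):
    """Stub decoder neuron: next character from a fixed transition table."""
    table = {"<s>": "good", "good": "deal", "deal": "</s>"}
    return table[prefix[-1]]

def greedy_decode(semantic_vector, max_len=10):
    prefix = ["<s>"]                      # start mark at the first moment
    for _ in range(max_len):
        nxt = decoder_step(semantic_vector, prefix)
        prefix.append(nxt)                # prediction up to moment t
        if nxt == "</s>":                 # stop once the end mark appears
            break
    return prefix[1:-1]                   # drop start and end marks
```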
According to a possible design, the apparatus 700 further includes a training unit (not shown), configured to train the encoding-decoding network in the following manner:
taking the sequence pairs respectively corresponding to multiple push targets as training samples, where each sequence pair includes a source sequence and a target sequence, the source sequence being the character vector sequence corresponding to the description information of the corresponding push target, and the target sequence being the number ID sequence or character sequence corresponding to the push phrase of the corresponding push target; and
taking the source sequence of each sequence pair as the input of the encoding neural network, inputting into the decoding neural network the target sequence with a sequence start mark added at the front, and adjusting the model parameters of the encoding-decoding network according to the comparison between the output result of the decoding neural network and the target sequence with a sequence end mark added at the end.
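The arrangement of one training pair can be sketched as follows, assuming "<s>" and "</s>" as the start and end marks (the embodiment does not fix their concrete form): the decoder reads the start-prefixed target and its output is compared against the end-suffixed target, i.e. teacher forcing.

```python
# Building one (encoder input, decoder input, expected output) triple for
# training the encoding-decoding network, with assumed start/end marks.
START, END = "<s>", "</s>"

def make_training_pair(source_sequence, target_sequence):
    """Arrange a sequence pair for teacher-forced training."""
    decoder_input = [START] + list(target_sequence)     # start mark in front
    decoder_expected = list(target_sequence) + [END]    # end mark at the end
    return source_sequence, decoder_input, decoder_expected
```

At each step the decoder thus sees the ground-truth previous character, and the loss compares its output to the same sequence shifted by one position.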
In a possible embodiment, the determining unit 74 is further configured to:
predict, through a pre-trained language model, the probability of each character sequence occurring as a sentence; and
select, according to the probabilities, the character sequences meeting one of the following conditions as the pushed information of the target to be pushed:
belonging to a predetermined number of character sequences with the highest probabilities, or having a probability exceeding a predetermined probability threshold.
In one embodiment, the apparatus 700 further includes a recognizing unit, configured to obtain the description information of the target to be pushed from a picture of the target to be pushed by means of optical character recognition.
It is worth noting that the apparatus 700 shown in Fig. 7 is an apparatus embodiment corresponding to the method embodiment shown in Fig. 2; the corresponding description in the method embodiment shown in Fig. 2 is equally applicable to the apparatus 700, and the details are not repeated here.
According to an embodiment of another aspect, a computer-readable storage medium is further provided, on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to perform the method described in conjunction with Fig. 2.
According to an embodiment of yet another aspect, a computing device is further provided, including a memory and a processor, where executable code is stored in the memory, and when the processor executes the executable code, the method described in conjunction with Fig. 2 is implemented.
Those skilled in the art should appreciate that, in one or more of the above examples, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored in a computer-readable medium, or transmitted as one or more instructions or code on a computer-readable medium.
The specific embodiments described above further explain in detail the objectives, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing is merely specific embodiments of the present invention and is not intended to limit the protection scope of the present invention; any modification, equivalent replacement, improvement, and the like made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.