CN107526725B - Method and device for generating text based on artificial intelligence - Google Patents

Method and device for generating text based on artificial intelligence

Info

Publication number
CN107526725B
CN107526725B
Authority
CN
China
Prior art keywords
identification information
sequence
information sequence
text
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710787262.0A
Other languages
Chinese (zh)
Other versions
CN107526725A (en
Inventor
刘毅 (Liu Yi)
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201710787262.0A
Publication of CN107526725A
Application granted
Publication of CN107526725B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Abstract

The embodiment of the application discloses a method and a device for generating a text based on artificial intelligence. One embodiment of the method comprises: acquiring a text to be expanded; segmenting a text to be expanded to obtain a word sequence of the text to be expanded; determining an identification information sequence corresponding to the word sequence according to a pre-stored corresponding relation between the words and the identification information; inputting the determined identification information sequence into a pre-trained text extension model to generate an identification information sequence of the extended text; and generating an expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information. This embodiment increases the diversity of text generation.

Description

Method and device for generating text based on artificial intelligence
Technical Field
The application relates to the technical field of computers, in particular to the technical field of internet, and particularly relates to a method and a device for generating texts based on artificial intelligence.
Background
Artificial Intelligence, abbreviated AI, is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems, among others.
At present, when a text is expanded, the expansion is mainly realized based on a pre-established offline database, that is, words in the text to be expanded are replaced by words with similar semantics to the words in the offline database, so as to generate the expanded text.
However, with the currently adopted text generation method, the offline database is costly to maintain and limited in data, so the text generation results are limited and the diversity of text generation suffers.
Disclosure of Invention
An object of the embodiments of the present application is to provide an improved method and apparatus for generating text based on artificial intelligence, so as to solve the technical problems mentioned in the above background.
In a first aspect, the present application provides a method for generating text based on artificial intelligence, the method comprising: acquiring a text to be expanded; segmenting a text to be expanded to obtain a word sequence of the text to be expanded; determining an identification information sequence corresponding to the word sequence according to a pre-stored corresponding relation between the words and the identification information; inputting the determined identification information sequence into a pre-trained text extension model, and generating an identification information sequence of the extended text, wherein the text extension model is used for representing the corresponding relation between the identification information sequence of the text to be extended and the identification information sequence of the extended text; and generating an expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information.
In some embodiments, the text extension model includes a coding model and a decoding model, where the coding model is used to represent the correspondence between an identification information sequence and a coding information sequence, and the decoding model is used to represent the correspondence between the identification information of a preset start word together with a coding information sequence, and an identification information sequence. Inputting the determined identification information sequence into the pre-trained text extension model and generating an identification information sequence of the extended text includes: inputting the determined identification information sequence into the coding model to generate a coding information sequence of the text to be expanded; and inputting the generated coding information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
In some embodiments, inputting the determined identification information sequence into the coding model and generating a coding information sequence of the text to be expanded includes: inputting each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagation recurrent neural network for coding, to generate a first reference coding information sequence; inputting each piece of identification information in the determined identification information sequence, in reverse order, into a back-propagation recurrent neural network for coding, to generate a second reference coding information sequence; and generating the coding information sequence of the text to be expanded from the first and second reference coding information sequences.
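The forward/backward encoding described above can be sketched in pure Python. This is a toy scalar illustration, not the patented implementation: the tanh cell, the arbitrary recurrence weights, and the per-position pairing of the two reference sequences are all assumptions (the text leaves the combination rule open).

```python
import math

def run_rnn(ids, w_x=0.3, w_h=0.7):
    """One-direction recurrent encoding: each state mixes the current
    identification information with the previous hidden state."""
    h, states = 0.0, []
    for x in ids:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

def bidirectional_encode(ids):
    forward = run_rnn(ids)               # first reference coding sequence
    backward = run_rnn(ids[::-1])[::-1]  # second, re-aligned to input order
    # Combine the two reference sequences per position (here simply paired;
    # the patent does not fix how they are merged).
    return list(zip(forward, backward))
```

Each output position thus carries context from both the left and the right of the input identification information sequence.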
In some embodiments, inputting the generated coding information sequence and the identification information of the start word into the decoding model and generating an identification information sequence of the expanded text includes: predicting identification information sequences of candidate subsequent word sequences of the start word based on the recurrent neural network for decoding and the generated coding information sequence; calculating the occurrence probability of each predicted identification information sequence from the occurrence probabilities of the identification information it includes; and selecting, in descending order of occurrence probability, a preset number of the predicted identification information sequences as the identification information sequences of the expanded text.
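The probability-based selection above can be illustrated with a hypothetical sketch in which each candidate sequence carries the per-token probabilities the decoder assigned to it, and the sequence probability is taken as their product (a usual modeling assumption; the text does not fix the combination rule).

```python
import math

def top_k_sequences(candidates, k):
    """candidates: list of (id_sequence, per_token_probabilities).
    Rank by the product of token probabilities and keep the k best,
    in descending order of occurrence probability."""
    def seq_prob(item):
        return math.prod(item[1])
    ranked = sorted(candidates, key=seq_prob, reverse=True)
    return [ids for ids, _ in ranked[:k]]
```

In practice this selection is interleaved with prediction (as in beam search) rather than applied to a fully enumerated candidate set.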
In some embodiments, predicting the identification information sequences of candidate subsequent word sequences of the start word based on the recurrent neural network for decoding and the generated coding information sequence includes: determining, according to an attention model, the weights of the generated coding information sequence for each prediction; weighting the generated coding information sequence with those weights; and predicting the identification information sequences of candidate subsequent word sequences of the start word based on the recurrent neural network for decoding and the weighted coding information sequence.
In some embodiments, the text extension model is trained via the following steps: combining, pairwise, the query sentences corresponding to the same clicked link in the click log of a search engine into sample groups; segmenting the query sentences included in each sample group to obtain segmented words; selecting a preset number of words from the segmented words in descending order of occurrence frequency; allocating identification information to each selected word, and storing the correspondence between the words and the identification information; determining the identification information sequence corresponding to each query sentence in each sample group according to the correspondence between words and identification information; and training the text extension model by taking the identification information sequences corresponding to the two query sentences in each sample group as input and output, respectively.
In some embodiments, the text to be expanded is generated according to query information input by the terminal; and after generating the expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information, the method further comprises the following steps: performing searching operation based on the generated text to obtain searching result information; and pushing the search result information to the terminal.
In a second aspect, the present application provides an apparatus for generating text based on artificial intelligence, the apparatus comprising: the acquiring unit is used for acquiring a text to be expanded; the segmentation unit is used for segmenting the text to be expanded to obtain a word sequence of the text to be expanded; the determining unit is used for determining an identification information sequence corresponding to the word sequence according to the corresponding relation between the pre-stored words and the identification information; the first generation unit is used for inputting the determined identification information sequence into a pre-trained text extension model and generating an identification information sequence of the extended text, wherein the text extension model is used for representing the corresponding relation between the identification information sequence of the text to be extended and the identification information sequence of the extended text; and the second generation unit is used for generating the expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information.
In some embodiments, the text extension model includes a coding model and a decoding model, the coding model is used for representing the corresponding relation between the identification information sequence and the coding information sequence, and the decoding model is used for representing the corresponding relation between the identification information of the preset starting word, the coding information sequence and the identification information sequence; and a first generation unit including: the coding subunit is used for inputting the determined identification information sequence into a coding model and generating a coding information sequence of the text to be expanded; and the decoding subunit is used for inputting the generated coding information sequence and the identification information of the start word into a decoding model and generating an identification information sequence of the expanded text.
In some embodiments, the encoding subunit is further configured to: input each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagation recurrent neural network for coding, to generate a first reference coding information sequence; input each piece of identification information in the determined identification information sequence, in reverse order, into a back-propagation recurrent neural network for coding, to generate a second reference coding information sequence; and generate the coding information sequence of the text to be expanded from the first and second reference coding information sequences.
In some embodiments, the decoding subunit is further configured to: predict identification information sequences of candidate subsequent word sequences of the start word based on the recurrent neural network for decoding and the generated coding information sequence; calculate the occurrence probability of each predicted identification information sequence from the occurrence probabilities of the identification information it includes; and select, in descending order of occurrence probability, a preset number of the predicted identification information sequences as the identification information sequences of the expanded text.
In some embodiments, the decoding subunit is further configured to: determine, according to an attention model, the weights of the generated coding information sequence for each prediction; weight the generated coding information sequence with those weights; and predict the identification information sequences of candidate subsequent word sequences of the start word based on the recurrent neural network for decoding and the weighted coding information sequence.
In some embodiments, the apparatus further comprises a training unit configured to: combine, pairwise, the query sentences corresponding to the same clicked link in the click log of a search engine into sample groups; segment the query sentences included in each sample group to obtain segmented words; select a preset number of words from the segmented words in descending order of occurrence frequency; allocate identification information to each selected word, and store the correspondence between the words and the identification information; determine the identification information sequence corresponding to each query sentence in each sample group according to the correspondence; and train the text extension model by taking the identification information sequences corresponding to the two query sentences in each sample group as input and output, respectively.
In some embodiments, the text to be expanded is generated according to query information input by the terminal; and the device further comprises a pushing unit, wherein the pushing unit is used for: performing searching operation based on the generated text to obtain searching result information; and pushing the search result information to the terminal.
In a third aspect, the present application provides an apparatus comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method according to the first aspect.
In a fourth aspect, the present application provides a computer readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to the first aspect.
According to the method and device for generating text based on artificial intelligence provided by the embodiments of the present application, a text to be expanded is obtained and segmented to obtain its word sequence; the identification information sequence corresponding to the word sequence is determined and input into a pre-trained text expansion model to generate the identification information sequence of the expanded text; finally, the expanded text is generated from the generated identification information sequence and the correspondence between words and identification information. This improves the diversity of text generation.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a schematic flow chart diagram illustrating one embodiment of a method for generating text based on artificial intelligence in accordance with the present application;
FIG. 3 is a schematic diagram of an application scenario of an artificial intelligence based method for generating text according to the present application;
FIG. 4 is a schematic flow chart diagram illustrating yet another embodiment of an artificial intelligence based method for generating text in accordance with the present application;
FIG. 5 is an exemplary block diagram of one embodiment of an artificial intelligence based apparatus for generating text according to the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a server according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the artificial intelligence based method for generating text or the artificial intelligence based apparatus for generating text of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and servers 105, 106. The network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the servers 105, 106. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user 110 may use the terminal devices 101, 102, 103 to interact with the servers 105, 106 via the network 104 to receive or transmit data or the like. Various applications may be installed on the terminal devices 101, 102, 103, such as a web browser application, a search engine type application, a map type application, a payment type application, a social type application, a shopping type application, an instant messaging tool, a cell phone assistant type application, etc.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting a search function, including but not limited to a smart phone, a tablet computer, an e-book reader, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a laptop portable computer, a desktop computer, and the like.
The servers 105, 106 may be servers providing various services, such as background servers providing support for the terminal devices 101, 102, 103. The background server may analyze and process the received data such as the request, and feed back the processing result to the terminal device, for example, the background server may generate an expanded text according to the text to be expanded sent by the terminal.
It should be noted that the artificial intelligence based method for generating text provided by the embodiment of the present application may be executed by the servers 105 and 106, and accordingly, the artificial intelligence based apparatus for generating text may be disposed in the servers 105 and 106.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of an artificial intelligence based method for generating text in accordance with the present application is shown. The method for generating the text based on the artificial intelligence comprises the following steps:
step 201, acquiring a text to be expanded.
In this embodiment, the electronic device (for example, the server shown in fig. 1) on which the artificial intelligence based method for generating text runs may obtain the text to be expanded locally or from another electronic device. The text to be expanded may be any text of analytical value that is available to the electronic device: for example, historical query information (queries) entered by users and stored locally in advance, query information included in a query request sent in real time by a user through a terminal, or text input by other users.
Step 202, segmenting the text to be expanded to obtain a word sequence of the text to be expanded.
In this embodiment, the electronic device may segment the text to be expanded obtained in step 201 to obtain its word sequence. The segmentation is a word-segmentation operation performed on the text to be expanded; the text may be split into words by a full-segmentation method or the like, yielding the word sequence of the text to be expanded. For example, "what seafood is eaten by pregnant women to supplement zinc" can be divided into "pregnant women, what, seafood, supplement, zinc".
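As a sketch of this segmentation step, the following greedy maximum-matching segmenter splits a sentence against a toy dictionary. The patent only names "a full segmentation method or the like", so this particular algorithm and the toy lexicon (built from the patent's own Chinese example sentence) are illustrative assumptions.

```python
def max_match_segment(text, dictionary, max_word_len=4):
    """Greedy forward maximum matching: at each position take the longest
    dictionary word; fall back to a single character if nothing matches."""
    words, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_word_len), i, -1):
            if text[i:j] in dictionary or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

# Toy lexicon covering the patent's example sentence ("what seafood is
# eaten by pregnant women to supplement zinc").
toy_dict = {"孕妇", "吃", "什么", "海鲜", "补", "锌"}
```

A production system would use a full lexicon or a statistical segmenter rather than this toy dictionary.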
Step 203, determining an identification information sequence corresponding to the word sequence according to the corresponding relationship between the pre-stored words and the identification information.
In this embodiment, the electronic device may determine the identification information sequence corresponding to the word sequence obtained in step 202 according to a pre-stored correspondence between words and identification information. The identification information is another representation of a word and may be composed of letters and/or numbers; for example, it may be the word's serial number in a preset dictionary, and any word absent from the dictionary may be represented by uniform identification information such as "UNKNOWN". The preset dictionary may be obtained by performing word segmentation on a corpus, counting the frequency of occurrence of the words, and storing the words with high frequency of occurrence.
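A minimal sketch of building such a dictionary and the word-to-identification-information correspondence: serial numbers as identification information and the uniform out-of-dictionary token follow the description above, while the vocabulary size and corpus are placeholders.

```python
from collections import Counter

UNK = "UNKNOWN"  # uniform identification info for out-of-dictionary words

def build_vocab(corpus_words, vocab_size):
    """Keep the most frequent words and map each to its serial number."""
    counts = Counter(corpus_words)
    return {w: i for i, (w, _) in enumerate(counts.most_common(vocab_size), start=1)}

def words_to_ids(words, vocab):
    """Turn a word sequence into its identification information sequence."""
    return [vocab.get(w, UNK) for w in words]
```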
And 204, inputting the determined identification information sequence into a pre-trained text extension model, and generating an identification information sequence of the extended text.
In this embodiment, the electronic device may input the identifier information sequence determined in step 203 into a text extension model trained in advance, and generate an identifier information sequence of an extended text. The text extension model can be used for representing the corresponding relation between the identification information sequence of the text to be extended and the identification information sequence of the extended text.
As an example, the text extension model may include one or more neural network models, such as a Recurrent Neural Network (RNN) model. In a recurrent neural network, the connections between hidden nodes form a cycle, so the network not only learns from the input at the current time step but also depends on the preceding sequence information; this structure gives the network a form of memory, which makes RNNs well suited to time-series and language-text problems. Further, the text extension model may also be composed of one or more RNN variants such as the Long Short-Term Memory network (LSTM) or the Gated Recurrent Unit (GRU). Alternatively, the text expansion model may be an operation formula preset by a technician based on statistics over a large amount of data and stored in the electronic device, used to operate on one or more pieces of identification information in the identification information sequence of the text to be expanded to obtain the identification information sequence of the expanded text.
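The "memory" property attributed to the recurrent connections can be seen in a one-line scalar recurrence; the weights here are arbitrary toy values, not trained parameters, and a real RNN cell operates on vectors.

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.8):
    """Elman-style update: the new hidden state depends on the current
    input AND the previous state, so earlier sequence info persists."""
    return math.tanh(w_x * x + w_h * h)

def encode_sequence(xs):
    h, states = 0.0, []
    for x in xs:
        h = rnn_step(x, h)
        states.append(h)
    return states
```

With input [1.0, 0.0] the second hidden state is nonzero only because the first input is remembered through h, illustrating the dependence on previous sequence information.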
In some optional implementations of this embodiment, the text extension model includes a coding model and a decoding model, where the coding model is used to represent the correspondence between an identification information sequence and a coding information sequence, and the decoding model is used to represent the correspondence between the identification information of a preset start word together with a coding information sequence, and an identification information sequence. Inputting the determined identification information sequence into the pre-trained text extension model and generating an identification information sequence of the extended text then includes: inputting the determined identification information sequence into the coding model to generate a coding information sequence of the text to be expanded; and inputting the generated coding information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
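The split into a coding model and a decoding model can be sketched as glue logic: encode once, then feed the start word's identification information to the decoder and append predictions until an end marker appears. The sentinel ids, the callback signatures, and the length cap are all hypothetical.

```python
START, END = 0, -1  # assumed sentinel identification information

def expand_ids(input_ids, encode, decode_step, max_len=10):
    """Encode the input id sequence once, then repeatedly feed the last
    predicted id (starting from the start word's id) to the decoder until
    it emits the end marker or the length cap is hit."""
    encoded = encode(input_ids)
    out, token = [], START
    for _ in range(max_len):
        token = decode_step(token, encoded)
        if token == END:
            break
        out.append(token)
    return out

def scripted_decoder(script):
    """Toy stand-in for a trained decoding model: emits a fixed script."""
    it = iter(script)
    return lambda prev_token, encoded: next(it)
```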
In this implementation, encoding converts the input sequence into a fixed-length vector, and decoding converts that fixed-length vector into an output sequence; the encode-store-decode pipeline mimics the brain's read-memorize-output process. Beyond the plain "encoding-decoding" mechanism, the mapping between the identification information sequence of the text to be expanded and that of the expanded text can also use an attention model, which does not require the encoder to compress all input information into a single fixed-length vector; the information carried by the input sequence can thus be fully utilized when each output is generated. The start word may be set according to actual needs and may be, for example, "START".
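The attention mechanism mentioned above can be sketched as: score each encoded position against the current decoder state, normalize the scores with a softmax, and take the weighted sum as the context for one prediction. The dot-product score function is an assumption; the text only says the weights come from an attention model.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoded_seq):
    """Return (weights, context): per-position weights over the coding
    information sequence and the weighted combination for one prediction."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoded_seq]
    weights = softmax(scores)
    dim = len(encoded_seq[0])
    context = [sum(w * enc[k] for w, enc in zip(weights, encoded_seq))
               for k in range(dim)]
    return weights, context
```

Because the weights are recomputed at every decoding step, each output can draw on different parts of the input sequence instead of one fixed-length summary.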
In some optional implementations of the present embodiment, the text extension model is trained via the following steps: query sentences corresponding to the same click link in the click log of the search engine are pairwise combined into a sample group; segmenting the query sentences included in each sample group to obtain segmented words; selecting a preset number of words from the segmented words according to the sequence of the occurrence times from large to small; allocating identification information to each selected word, and storing the corresponding relation between the word and the identification information; determining an identification information sequence corresponding to the query sentence included in each sample group according to the corresponding relation between the words and the identification information; and respectively taking the identification information sequences corresponding to the two query sentences included in each sample group as input and output, and training to obtain the text extension model.
In this implementation, the loss function for model training may be determined from the occurrence probabilities of words; the text extension model may first be initialized at random and then trained on the training data with mini-batch stochastic gradient descent so that the empirical risk is minimized. Because search-engine click logs are rich in content, the relation between query sentences corresponding to the same clicked link goes beyond simple semantic similarity, so the generated expanded texts are richer, the model's output is closer to real query sentences, and a subsequent search based on the expanded text performs better. Besides query sentences corresponding to the same clicked link, the corpus used to train the text extension model may also include related texts submitted by other users or generated by a machine.
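The construction of sample groups from click logs can be sketched directly; the record shape (query_sentence, clicked_link) is an assumed simplification of a real search-engine click log.

```python
from collections import defaultdict
from itertools import permutations

def build_sample_groups(click_log):
    """click_log: iterable of (query_sentence, clicked_link) records.
    Queries that led to the same clicked link are combined pairwise into
    (input_query, output_query) training pairs."""
    by_link = defaultdict(set)
    for query, link in click_log:
        by_link[link].add(query)
    samples = []
    for queries in by_link.values():
        samples.extend(permutations(sorted(queries), 2))
    return samples
```

Both orderings of each pair are kept, since either query of a sample group may serve as the input and the other as the output.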
Step 205, generating an expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information.
In this embodiment, the electronic device may determine, according to the correspondence between the words and the identification information, the words corresponding to the identification information in the identification information sequence generated in step 204, so as to obtain the expanded text.
In some optional implementation manners of the embodiment, the text to be expanded is generated according to query information input by the terminal; and after generating the expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information, the method further comprises the following steps: performing searching operation based on the generated text to obtain searching result information; and pushing the search result information to the terminal.
In this implementation, a user may input query information through a terminal in the form of voice, images, text, or the like; the electronic device may convert the query information into text and treat the converted text as the text to be expanded. The search result information pushed to the terminal can supplement the search results for the original text to be expanded, further saving the user time in acquiring information.
The method provided by the embodiment of the application obtains the text to be expanded, segments the text to be expanded to obtain the word sequence of the text to be expanded, inputs the identification information sequence corresponding to the word sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text, and finally generates the expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information, thereby improving the diversity of text generation.
With continued reference to FIG. 3, a schematic diagram of an application scenario of the artificial intelligence based method for generating text according to the present application is shown. In the application scenario of fig. 3, the server 301 first obtains a text 303 to be expanded, "what seafood should a pregnant woman eat to supplement zinc", uploaded by a user through a terminal 302; the server then performs word segmentation and other processing on the text and inputs the processed information into a pre-trained text extension model, finally obtaining expanded texts 304, which include "what trace elements does a pregnant woman supplement by eating seafood" and "what food should a pregnant woman eat to supplement zinc".
Referring to fig. 4, fig. 4 is a flowchart illustrating a method for generating a text based on artificial intelligence according to another embodiment of the present invention.
In FIG. 4, the artificial intelligence based method 400 for generating text includes the steps of:
step 401, obtaining a text to be expanded.
In this embodiment, the electronic device (for example, the server shown in fig. 1) on which the artificial intelligence based method for generating text runs may obtain the text to be expanded from a local or other electronic device.
Step 402, segmenting the text to be expanded to obtain a word sequence of the text to be expanded.
In this embodiment, the electronic device may segment the text to be expanded obtained in step 401 to obtain a word sequence of the text to be expanded.
Step 403, determining an identification information sequence corresponding to the word sequence according to the correspondence between the pre-stored words and the identification information.
In this embodiment, the electronic device may determine the identification information sequence corresponding to the word sequence obtained in step 402 according to a pre-stored correspondence between words and identification information.
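The lookup in steps 402–403 can be sketched as follows; the small vocabulary and the convention of mapping out-of-vocabulary words to an assumed UNK identifier are illustrative, not taken from the embodiment:

```python
# Hypothetical sketch: mapping a segmented word sequence to an
# identification information sequence via a pre-stored correspondence.
# The vocabulary entries and the UNK convention are assumptions.
word_to_id = {"pregnant": 1, "woman": 2, "eat": 3, "seafood": 4, "zinc": 5}
UNK_ID = 0  # assumed identifier for words outside the stored correspondence

def to_id_sequence(words):
    """Look each word up in the stored correspondence, in order."""
    return [word_to_id.get(w, UNK_ID) for w in words]

ids = to_id_sequence(["pregnant", "woman", "eat", "seafood"])
```

The inverse correspondence (identification information back to words, used in step 410) is simply the reversed dictionary.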
And step 404, inputting each identification information in the determined identification information sequence into a forward propagation recurrent neural network for coding in a forward sequence to generate a first reference coding information sequence.
In this embodiment, the electronic device may input each identification information in the identification information sequence determined in step 403 into a forward propagation recurrent neural network for encoding in a forward sequence, and generate a first reference encoded information sequence. Taking the example that the recurrent neural network is LSTM recurrent neural network, the LSTM includes an input gate, a forgetting gate, and an output gate, and the first reference encoded information sequence can be calculated by the following formula:
i_{enc,t} = σ(W_{enc,i} x_{enc} + U_{enc,i} h_{t-1} + b_{enc,i})    (1)

f_{enc,t} = σ(W_{enc,f} x_{enc} + U_{enc,f} h_{t-1} + b_{enc,f})    (2)

o_{enc,t} = σ(W_{enc,o} x_{enc} + U_{enc,o} h_{t-1} + b_{enc,o})    (3)

c̃_{enc,t} = tanh(W_{enc,c} x_{enc} + U_{enc,c} h_{t-1} + b_{enc,c})    (4)

c_{enc,t} = f_{enc,t} ⊙ c_{enc,t-1} + i_{enc,t} ⊙ c̃_{enc,t}    (5)

h_t = o_{enc,t} ⊙ tanh(c_{enc,t})    (6)

That is, the information x_t of the current word and the information h_{t-1} of the preceding word sequence are jointly modeled to generate the information h_t of the current word sequence.

Here the symbol ⊙ denotes dimension-by-dimension (element-wise) multiplication, tanh() the hyperbolic tangent function, and σ the S-type (sigmoid) function. t denotes the current time; x_{enc} denotes the input of the neural network for encoding at the current time, and the inputs over all times constitute the identification information sequence determined in step 403. W_{enc,i}, W_{enc,f}, W_{enc,o}, W_{enc,c}, U_{enc,i}, U_{enc,f}, U_{enc,o}, U_{enc,c} are the weight matrices of the neural network for encoding, and b_{enc,i}, b_{enc,f}, b_{enc,o}, b_{enc,c} its bias terms. i_{enc,t}, f_{enc,t}, and o_{enc,t} denote the values of the input gate, forgetting gate, and output gate of the neural network for encoding at the current time, respectively. h_t and h_{t-1} denote the output states of the neural network for encoding at the current and previous time; c_{enc,t} and c_{enc,t-1} denote its state information at the current and previous time; and c̃_{enc,t} denotes the newly emerging state information (candidate state) in the neural network for encoding.
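A minimal sketch of one encoding step following equations (1)–(6), with assumed dimensions and random weights standing in for the trained parameters:

```python
import numpy as np

# Sketch of a single forward LSTM encoding step per equations (1)-(6).
# Dimensions and random parameters are illustrative assumptions.
def lstm_step(x, h_prev, c_prev, W, U, b):
    """W, U, b are dicts keyed by gate name: 'i', 'f', 'o', 'c'."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    i = sigmoid(W['i'] @ x + U['i'] @ h_prev + b['i'])        # eq. (1), input gate
    f = sigmoid(W['f'] @ x + U['f'] @ h_prev + b['f'])        # eq. (2), forgetting gate
    o = sigmoid(W['o'] @ x + U['o'] @ h_prev + b['o'])        # eq. (3), output gate
    c_tilde = np.tanh(W['c'] @ x + U['c'] @ h_prev + b['c'])  # eq. (4), candidate state
    c = f * c_prev + i * c_tilde                              # eq. (5), new cell state
    h = o * np.tanh(c)                                        # eq. (6), output state
    return h, c

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
W = {k: rng.standard_normal((d_h, d_in)) for k in 'ifoc'}
U = {k: rng.standard_normal((d_h, d_h)) for k in 'ifoc'}
b = {k: np.zeros(d_h) for k in 'ifoc'}

# Encode a 5-step sequence of (embedded) identification information:
h, c = np.zeros(d_h), np.zeros(d_h)
for x in rng.standard_normal((5, d_in)):
    h, c = lstm_step(x, h, c, W, U, b)
```

Collecting each step's h would yield the first reference encoded information sequence of step 404.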
And step 405, inputting each identification information in the determined identification information sequence into a back propagation recurrent neural network for coding in a reverse order, and generating a second reference coding information sequence.
In this embodiment, the electronic device may input each piece of identification information in the identification information sequence determined in step 403 into a back-propagation recurrent neural network for encoding in reverse order, generating a second reference encoded information sequence. Suitable parameters of the back-propagation recurrent neural network can be obtained through multiple rounds of iteration of a gradient descent method.
And 406, generating a coding information sequence of the text sequence to be extended according to the first reference coding information sequence and the second reference coding information sequence.
In this embodiment, the electronic device may generate the coding information sequence of the text sequence to be extended according to the first reference coding information sequence generated in step 404 and the second reference coding information sequence generated in step 405. The electronic device can generate the coding information sequence of the text sequence to be expanded by setting the weight matrix and weighting the first reference coding information sequence and the second reference coding information sequence according to the weight matrix. The weight matrix may be preset or determined by a machine learning method.
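One possible form of the weighted combination described above might look like the following sketch; the weighted-sum formulation and all shapes are assumptions for illustration:

```python
import numpy as np

# Hedged sketch of step 406: combining the first (forward) and second
# (backward) reference encoding sequences via weight matrices, which the
# text says may be preset or learned. Shapes are assumptions.
rng = np.random.default_rng(1)
T, d = 5, 3
h_fwd = rng.standard_normal((T, d))   # first reference encoded sequence
h_bwd = rng.standard_normal((T, d))   # second reference encoded sequence

W_fwd = rng.standard_normal((d, d))   # weight matrix for the forward pass
W_bwd = rng.standard_normal((d, d))   # weight matrix for the backward pass

# One encoded vector per position of the text sequence to be expanded:
encoded = h_fwd @ W_fwd.T + h_bwd @ W_bwd.T
```

A common alternative in bidirectional encoders is simple concatenation of the two sequences; the patent's weighted form is retained here.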
Step 407, predicting the identification information sequence of the alternative subsequent word sequence of the initial word based on the recurrent neural network for decoding and the generated coding information sequence.
In this embodiment, the electronic device may predict the identification information sequence of the candidate subsequent word sequence of the starting word based on the recurrent neural network for decoding and the encoded information sequence generated in step 406. Unlike encoding, during decoding the hidden-layer result produced while traversing the sequence is used to predict a corresponding target word, and that target word can serve as the input of the next iteration. In addition, the electronic device may further determine the weight of the generated encoded information sequence in each prediction according to an attention model, weight the generated encoded information sequence according to these weights, and predict the identification information sequence of the candidate subsequent word sequence of the starting word based on the recurrent neural network for decoding and the weighted encoded information sequence. The results of the encoding end are weighted and summed according to the attention model to generate context information.
Taking the cyclic neural network as LSTM cyclic neural network as an example, the identification information sequence of the alternative subsequent word sequence of the starting word can be predicted according to the following formula:
i_{dec,t} = σ(W_{dec,i} x_{dec} + U_{dec,i} s_{t-1} + A_i a_t + b_{dec,i})    (7)

f_{dec,t} = σ(W_{dec,f} x_{dec} + U_{dec,f} s_{t-1} + A_f a_t + b_{dec,f})    (8)

o_{dec,t} = σ(W_{dec,o} x_{dec} + U_{dec,o} s_{t-1} + A_o a_t + b_{dec,o})    (9)

c̃_{dec,t} = tanh(W_{dec,c} x_{dec} + U_{dec,c} s_{t-1} + A_c a_t + b_{dec,c})    (10)

c_{dec,t} = f_{dec,t} ⊙ c_{dec,t-1} + i_{dec,t} ⊙ c̃_{dec,t}    (11)

s_t = o_{dec,t} ⊙ tanh(c_{dec,t})    (12)

where t denotes the current time and x_{dec} the input of the neural network for decoding at the current time. W_{dec,i}, W_{dec,f}, W_{dec,o}, W_{dec,c}, U_{dec,i}, U_{dec,f}, U_{dec,o}, U_{dec,c}, A_i, A_f, A_o, A_c are the weight matrices of the neural network for decoding, and b_{dec,i}, b_{dec,f}, b_{dec,o}, b_{dec,c} its bias terms. i_{dec,t}, f_{dec,t}, and o_{dec,t} denote the values of the input gate, forgetting gate, and output gate of the neural network for decoding at the current time, respectively. c_{dec,t} and c_{dec,t-1} denote the state information of the neural network for decoding at the current and previous time, and c̃_{dec,t} denotes its newly emerging state information. s_t and s_{t-1} denote the output states of the neural network for decoding at the current and previous time. a_t denotes the attention allocation value.
a_t can be calculated as follows:

v_{it} = V_a tanh(W_a ch_i + U_a s_{t-1})    (13)

w_{it} = exp(v_{it}) / Σ_j exp(v_{jt})    (14)

a_t = Σ_i w_{it} ch_i    (15)

where i and j both index the times corresponding to the pieces of encoded information in the encoded information sequence of the text sequence to be expanded. exp() denotes the exponential function with the natural constant e as base. V_a, W_a, and U_a are weight matrices. v_{it} and v_{jt} are intermediate values used to determine which inputs should be aligned with the output of the recurrent neural network for decoding at the current time. w_{it} denotes the weight, at the current time, of the encoded information at time i in the encoded information sequence of the text sequence to be expanded, and ch_i denotes the encoded information at time i in that sequence.
And step 408, calculating the occurrence probability of each identification information sequence according to the predicted occurrence probability of the identification information included in the identification information sequence.
In this embodiment, the electronic device may calculate the probability of occurrence of each identification information sequence according to the probabilities of occurrence of the identification information included in the sequences predicted in step 407. The output state s_t may be projected by a linear transformation into a space of the vocabulary size, and the probability of the next word predicted by a softmax operation.
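The projection and softmax step might be sketched as follows, with an assumed small vocabulary size and random projection matrix:

```python
import numpy as np

# Sketch: project the decoder state s_t into vocabulary space and apply
# softmax to obtain next-word probabilities. Sizes are assumptions.
rng = np.random.default_rng(3)
d, vocab_size = 3, 6
s_t = rng.standard_normal(d)
W_proj = rng.standard_normal((vocab_size, d))

logits = W_proj @ s_t
logits -= logits.max()                         # subtract max for numerical stability
probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary

# The probability of a whole identification information sequence is the
# product of its per-step word probabilities, typically accumulated as
# a sum of log-probabilities.
log_probs = np.log(probs)
```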
Step 409, selecting a predetermined number of identification information sequences from the predicted identification information sequences as the identification information sequences of the expanded text according to the descending order of the occurrence probability.
In this embodiment, the electronic device may select a predetermined number of identification information sequences as the identification information sequences of the expanded text from the predicted identification information sequences in descending order of the probability of occurrence calculated in step 408. The predetermined number may be set according to actual needs.
Optionally, the electronic device may also generate the predetermined number of identification information sequences with the highest probability by means of a beam search algorithm. As an example, the sequence start word START may first be given as the input at time 0, and a probability distribution over the next word is then generated through the decoding-end operations. The predetermined number of words with the highest probability are selected from this distribution, each taken as the next word of a decoded sequence and used as an input at time 1. Then, from the distributions generated by each of these branches, the predetermined number of words whose products with the preceding sequence probabilities are largest are selected as candidate inputs at time 2, and the above operation is repeated. Whenever the beam search yields a sequence ending with the word END, the beam width is decreased by one, and the search continues until the beam width becomes 0 or the maximum sequence generation length is reached. In this way, the predetermined number of identification information sequences can be obtained as the identification information sequences of the expanded texts.
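The beam search procedure above can be sketched with a toy next-word distribution; the START/END symbols, beam width, and fixed distribution are illustrative assumptions, not the actual decoder:

```python
import math

# Toy beam-search sketch. A real implementation would call the decoder
# for next_probs; here a fixed distribution stands in for it.
def next_probs(seq):
    # assumption: same distribution at every step, over tokens A, B, END
    return {"A": 0.5, "B": 0.3, "END": 0.2}

def beam_search(width=2, max_len=4):
    beams = [(["START"], 0.0)]           # (sequence, log-probability)
    finished = []
    while beams and len(finished) < width:
        candidates = []
        for seq, lp in beams:
            for tok, p in next_probs(seq).items():
                candidates.append((seq + [tok], lp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, lp in candidates[:width]:
            if seq[-1] == "END" or len(seq) >= max_len:
                finished.append((seq, lp))   # finished sequences leave the
            else:                            # beam, shrinking its width
                beams.append((seq, lp))
    return sorted(finished, key=lambda c: c[1], reverse=True)[:width]

results = beam_search()
```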
Step 410, generating an expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information.
In this embodiment, the electronic device may determine, according to the correspondence between the words and the identification information, the words corresponding to the identification information in the identification information sequence obtained in step 409, so as to obtain the expanded text.
The implementation details and technical effects of step 401, step 402, step 403, and step 410 may refer to the descriptions in step 201, step 202, step 203, and step 205, which are not described herein again.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the method provided by the above embodiment of the present application encodes based on the outputs of both the forward-propagation and back-propagation recurrent neural networks and then decodes through a recurrent neural network, so that the expanded text generated by the text extension model formed of recurrent neural networks is richer and more accurate.
With further reference to fig. 5, as an implementation of the method described above, the present application provides an embodiment of an apparatus for generating a text based on artificial intelligence, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be applied to various electronic devices.
As shown in fig. 5, the apparatus 500 for generating text based on artificial intelligence of the present embodiment includes: the text expansion method comprises an acquisition unit 510, a segmentation unit 520, a determination unit 530, a first generation unit 540 and a second generation unit 550, wherein the acquisition unit 510 is used for acquiring a text to be expanded; the segmentation unit 520 is configured to segment the text to be expanded to obtain a word sequence of the text to be expanded; a determining unit 530, configured to determine an identification information sequence corresponding to the word sequence according to a correspondence between a pre-stored word and identification information; a first generating unit 540, configured to input the determined identification information sequence into a pre-trained text extension model, and generate an identification information sequence of an extended text, where the text extension model is used to represent a correspondence between the identification information sequence of the text to be extended and the identification information sequence of the extended text; and a second generating unit 550, configured to generate an expanded text according to the generated identification information sequence and the correspondence between the word and the identification information.
In this embodiment, the specific processing of the obtaining unit 510, the splitting unit 520, the determining unit 530, the first generating unit 540, and the second generating unit 550 may refer to detailed descriptions of step 201, step 202, step 203, step 204, and step 205 in the embodiment of fig. 2, and is not described herein again.
In some optional implementations of this embodiment, the text extension model includes a coding model and a decoding model, where the coding model is used to represent the correspondence between the identification information sequence and the coding information sequence, and the decoding model is used to represent the correspondence among the identification information of a preset start word, the coding information sequence, and the identification information sequence; and the first generating unit 540 includes: an encoding subunit 541 configured to input the determined identification information sequence into the coding model and generate the coding information sequence of the text to be expanded; and a decoding subunit 542 configured to input the generated coding information sequence and the identification information of the start word into the decoding model and generate the identification information sequence of the expanded text.
In some optional implementations of this embodiment, the encoding subunit 541 is further configured to: inputting each identification information in the determined identification information sequence into a forward propagation recurrent neural network for coding in a forward sequence manner to generate a first reference coding information sequence; inputting each identification information in the determined identification information sequence into a back propagation recurrent neural network for coding in a reverse order to generate a second reference coding information sequence; and generating a coding information sequence of the text sequence to be expanded according to the first reference coding information sequence and the second reference coding information sequence.
In some optional implementations of this embodiment, the decoding subunit 542 is further configured to: predict the identification information sequence of the candidate subsequent word sequence of the starting word based on the recurrent neural network for decoding and the generated coding information sequence; calculate the probability of occurrence of each identification information sequence according to the predicted probabilities of occurrence of the identification information it includes; and select a predetermined number of identification information sequences from the predicted identification information sequences, in descending order of probability of occurrence, as the identification information sequences of the expanded text.
In some optional implementations of this embodiment, the decoding subunit 542 is further configured to: determining the weight of the coding information sequence generated in each prediction according to the attention model; weighting the generated coding information sequence according to the weight; and predicting the identification information sequence of the alternative subsequent word sequence of the initial word based on the cyclic neural network for decoding and the weighted coding information sequence.
In some optional implementations of this embodiment, the apparatus further includes a training unit 560, and the training unit 560 is configured to: query sentences corresponding to the same click link in the click log of the search engine are pairwise combined into a sample group; segmenting the query sentences included in each sample group to obtain segmented words; selecting a preset number of words from the segmented words according to the sequence of the occurrence times from large to small; allocating identification information to each selected word, and storing the corresponding relation between the word and the identification information; determining an identification information sequence corresponding to the query sentence included in each sample group according to the corresponding relation between the words and the identification information; and respectively taking the identification information sequences corresponding to the two query sentences included in each sample group as input and output, and training to obtain the text extension model.
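The pairwise combination of queries sharing a click link, as performed by the training unit, might be sketched as follows; the log format and example queries are assumptions:

```python
from collections import defaultdict
from itertools import permutations

# Hedged sketch: queries that correspond to the same click link in a
# search-engine click log are pairwise combined into sample groups.
click_log = [
    ("what seafood helps pregnant women get zinc", "link_1"),
    ("zinc rich food for pregnancy", "link_1"),
    ("pregnant woman seafood zinc", "link_1"),
    ("best strollers 2017", "link_2"),
]

by_link = defaultdict(list)
for query, link in click_log:
    by_link[link].append(query)

# Each ordered pair of distinct queries under one link becomes one
# (input, output) training sample for the text extension model.
samples = [pair
           for queries in by_link.values()
           for pair in permutations(queries, 2)]
```

Here the three queries under link_1 yield six ordered samples, while the lone query under link_2 yields none, matching the "pairwise combination" rule.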
In some optional implementation manners of the embodiment, the text to be expanded is generated according to query information input by the terminal; and the apparatus further comprises a pushing unit 570, the pushing unit 570 being configured to: performing searching operation based on the generated text to obtain searching result information; and pushing the search result information to the terminal.
As can be seen from fig. 5, the apparatus 500 for generating text based on artificial intelligence in this embodiment obtains a text to be expanded; segments it to obtain a word sequence of the text to be expanded; determines the identification information sequence corresponding to the word sequence; inputs that sequence into a pre-trained text extension model to generate the identification information sequence of the expanded text; and generates the expanded text according to the generated identification information sequence and the correspondence between words and identification information, thereby improving the diversity of text generation. Those skilled in the art will understand that the first generating unit and the second generating unit merely denote two different generating units: the first generating unit inputs the determined identification information sequence into the pre-trained text extension model to generate the identification information sequence of the expanded text, and the second generating unit generates the expanded text according to the generated identification information sequence and the correspondence between words and identification information; "first" and "second" do not constitute a special limitation on the generating units.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing a server according to embodiments of the present application. The server shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as needed.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a unit, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes an acquisition unit, a slicing unit, a determination unit, a first generation unit, and a second generation unit. The names of the units do not form a limitation to the unit itself in some cases, and for example, the acquiring unit may also be described as "a unit that acquires text to be expanded".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquiring a text to be expanded; segmenting a text to be expanded to obtain a word sequence of the text to be expanded; determining an identification information sequence corresponding to the word sequence according to a pre-stored corresponding relation between the words and the identification information; inputting the determined identification information sequence into a pre-trained text extension model, and generating an identification information sequence of the extended text, wherein the text extension model is used for representing the corresponding relation between the identification information sequence of the text to be extended and the identification information sequence of the extended text; and generating an expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (14)

1. A method for generating text based on artificial intelligence, the method comprising:
acquiring a text to be expanded;
segmenting the text to be expanded to obtain a word sequence of the text to be expanded;
determining an identification information sequence corresponding to the word sequence according to a corresponding relation between pre-stored words and identification information;
inputting the determined identification information sequence into a pre-trained text extension model, and generating an identification information sequence of the extended text, wherein the text extension model is used for representing the corresponding relation between the identification information sequence of the text to be extended and the identification information sequence of the extended text;
generating an expanded text according to the generated identification information sequence and the corresponding relation between the words and the identification information;
wherein the text extension model is trained via:
query sentences corresponding to the same click link in the click log of the search engine are pairwise combined into a sample group; segmenting the query sentences included in each sample group to obtain segmented words; selecting a preset number of words from the segmented words according to the sequence of the occurrence times from large to small; allocating identification information to each selected word, and storing the corresponding relation between the word and the identification information; determining an identification information sequence corresponding to the query sentence included in each sample group according to the corresponding relation between the words and the identification information; respectively taking the identification information sequences corresponding to the two query sentences included in each sample group as input and output, and training to obtain the text extension model;
wherein the text extension model comprises a coding model and a decoding model, the coding model characterizing the correspondence between an identification information sequence and a coding information sequence, and the decoding model characterizing the correspondence between the identification information of a preset start word, a coding information sequence, and an identification information sequence;
wherein inputting the determined identification information sequence into the pre-trained text extension model to generate the identification information sequence of the expanded text comprises:
inputting each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagation recurrent neural network for encoding, to generate a first reference coding information sequence;
inputting each piece of identification information in the determined identification information sequence, in reverse order, into a back-propagation recurrent neural network for encoding, to generate a second reference coding information sequence;
generating the coding information sequence of the text to be expanded according to the first reference coding information sequence and the second reference coding information sequence;
predicting identification information sequences of candidate word sequences following the start word, based on a recurrent neural network for decoding and the generated coding information sequence;
calculating the occurrence probability of each predicted identification information sequence according to the predicted occurrence probabilities of the identification information it includes; and
selecting, from the predicted identification information sequences, a preset number of identification information sequences in descending order of occurrence probability as the identification information sequences of the expanded text.
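The training-data preparation recited above (pairing click-log queries that share a clicked link, building a frequency-ranked vocabulary, and mapping queries to identification information sequences) can be illustrated with a minimal sketch. The click-log records, the whitespace split standing in for a real word segmenter, and all names below are assumptions for illustration, not the patented implementation:

```python
from collections import Counter
from itertools import combinations

# Hypothetical click-log records: (query sentence, clicked link).
click_log = [
    ("cheap flights to paris", "travel.example/paris"),
    ("paris flight deals", "travel.example/paris"),
    ("budget airfare paris", "travel.example/paris"),
    ("python tutorial", "docs.example/python"),
    ("learn python basics", "docs.example/python"),
]

def build_training_data(log, vocab_size=8):
    # Group query sentences by the link that was clicked.
    by_link = {}
    for query, link in log:
        by_link.setdefault(link, []).append(query)

    # Pairwise-combine queries sharing a clicked link into sample groups.
    sample_groups = []
    for queries in by_link.values():
        sample_groups.extend(combinations(queries, 2))

    # Segment the queries (whitespace split as a stand-in) and keep a
    # preset number of words in descending order of occurrence count.
    counts = Counter(w for q, _ in log for w in q.split())
    kept = [w for w, _ in counts.most_common(vocab_size)]

    # Allocate identification information (integer IDs) to kept words.
    word2id = {w: i for i, w in enumerate(kept)}

    # Map each query to its identification information sequence,
    # dropping out-of-vocabulary words.
    def to_ids(query):
        return [word2id[w] for w in query.split() if w in word2id]

    # Each sample group yields an (input sequence, target sequence) pair.
    return [(to_ids(a), to_ids(b)) for a, b in sample_groups], word2id

pairs, word2id = build_training_data(click_log)
```

Each resulting pair would then serve as one (input, output) example for training the sequence-to-sequence text extension model.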
2. The method of claim 1, wherein inputting the determined identification information sequence into the pre-trained text extension model to generate the identification information sequence of the expanded text comprises:
inputting the determined identification information sequence into the coding model to generate the coding information sequence of the text to be expanded; and
inputting the generated coding information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
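The two-stage encode-then-decode flow of claim 2 can be sketched as follows. The toy vocabulary, the random embeddings, and the stub coding/decoding models are placeholders assumed for illustration; a trained model would replace both stubs:

```python
import numpy as np

# Toy word-to-ID table standing in for the pre-stored correspondence.
word2id = {"cheap": 0, "flights": 1, "paris": 2, "deals": 3, "<s>": 4}
id2word = {i: w for w, i in word2id.items()}

rng = np.random.default_rng(0)
embed = rng.normal(size=(len(word2id), 6))  # one embedding row per ID

def coding_model(ids):
    # Stub encoder: one encoded vector per identification information item.
    return embed[ids]

def decoding_model(start_id, encoded, steps=3):
    # Stub decoder (ignores start_id): score every vocabulary ID against
    # the mean of the encoded sequence and emit the top `steps` IDs.
    context = encoded.mean(axis=0)
    scores = embed @ context
    return [int(i) for i in np.argsort(-scores)[:steps]]

def expand(text):
    ids = [word2id[w] for w in text.split() if w in word2id]  # words -> IDs
    encoded = coding_model(ids)                                # stage 1: coding model
    out_ids = decoding_model(word2id["<s>"], encoded)          # stage 2: decoding model
    return " ".join(id2word[i] for i in out_ids)               # IDs -> words

result = expand("cheap flights paris")
```

The point of the sketch is the composition: the coding model sees only identification information, the decoding model sees only the start word's ID plus the coding information sequence.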
3. The method of claim 2, wherein inputting the determined identification information sequence into the coding model to generate the coding information sequence of the text to be expanded comprises:
inputting each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagation recurrent neural network for encoding, to generate a first reference coding information sequence;
inputting each piece of identification information in the determined identification information sequence, in reverse order, into a back-propagation recurrent neural network for encoding, to generate a second reference coding information sequence; and
generating the coding information sequence of the text to be expanded according to the first reference coding information sequence and the second reference coding information sequence.
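The bidirectional encoding in claim 3 can be sketched with a minimal tanh RNN. The random weights are stand-ins for a trained forward/backward-propagation network, and aligning the two reference sequences per step by re-reversing the backward states (then concatenating) is one common combination choice, assumed here:

```python
import numpy as np

rng = np.random.default_rng(42)
emb_dim, hid = 4, 5
W_in = rng.normal(scale=0.1, size=(hid, emb_dim))   # input-to-hidden weights
W_rec = rng.normal(scale=0.1, size=(hid, hid))      # hidden-to-hidden weights

def rnn_states(embedded_ids):
    # Run a simple tanh RNN and collect the hidden state at every step.
    h = np.zeros(hid)
    states = []
    for x in embedded_ids:
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h)
    return states

def bidirectional_encode(embedded_ids):
    # First reference coding sequence: inputs in forward order.
    fwd = rnn_states(embedded_ids)
    # Second reference coding sequence: inputs in reverse order, then
    # re-reversed so step t of both sequences describes input t.
    bwd = rnn_states(embedded_ids[::-1])[::-1]
    # Coding information sequence: concatenate the two states per step.
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

seq = [rng.normal(size=emb_dim) for _ in range(3)]  # embedded ID sequence
encoded = bidirectional_encode(seq)
```

Each position in the resulting sequence thus carries context from both the words before it and the words after it.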
4. The method of claim 2, wherein inputting the generated coding information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text comprises:
predicting identification information sequences of candidate word sequences following the start word, based on a recurrent neural network for decoding and the generated coding information sequence;
calculating the occurrence probability of each predicted identification information sequence according to the predicted occurrence probabilities of the identification information it includes; and
selecting, from the predicted identification information sequences, a preset number of identification information sequences in descending order of occurrence probability as the identification information sequences of the expanded text.
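The selection step of claim 4 can be sketched as follows: each candidate identification information sequence carries the per-ID occurrence probabilities predicted by the decoder, the sequence probability is their product, and a preset number of sequences is kept in descending order. The candidate IDs and probabilities below are made-up numbers for illustration:

```python
import math

# (identification information sequence, per-ID occurrence probabilities)
candidates = [
    ([5, 9, 2], [0.6, 0.5, 0.9]),
    ([5, 3],    [0.6, 0.4]),
    ([7, 1, 8], [0.3, 0.8, 0.7]),
]

def top_sequences(cands, preset_number=2):
    scored = []
    for ids, probs in cands:
        # Occurrence probability of the sequence: product of the
        # occurrence probabilities of the identification information it
        # includes, computed in log space for numerical stability.
        log_p = sum(math.log(p) for p in probs)
        scored.append((log_p, ids))
    # Descending order of occurrence probability; keep a preset number.
    scored.sort(key=lambda t: t[0], reverse=True)
    return [ids for _, ids in scored[:preset_number]]

best = top_sequences(candidates)
```

Note that a raw product favors shorter sequences; practical decoders often length-normalize the score, a refinement the claim does not specify.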
5. The method of claim 4, wherein predicting the identification information sequences of candidate word sequences following the start word, based on the recurrent neural network for decoding and the generated coding information sequence, comprises:
determining, according to an attention model, a weight for the generated coding information sequence at each prediction step;
weighting the generated coding information sequence according to the weight; and
predicting the identification information sequences of candidate word sequences following the start word based on the recurrent neural network for decoding and the weighted coding information sequence.
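The attention weighting of claim 5 can be sketched as follows: at each prediction step the attention model scores every vector in the coding information sequence against the current decoder state, and the softmax-weighted sum becomes the context for that step. Dot-product scoring is one common attention choice, assumed here along with the random stand-in vectors:

```python
import numpy as np

rng = np.random.default_rng(7)
encoded = rng.normal(size=(4, 6))    # coding information sequence (4 steps)
decoder_state = rng.normal(size=6)   # decoder hidden state at this step

def attention_context(encoded_seq, state):
    scores = encoded_seq @ state                 # one score per position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax -> attention weights
    # Weight the coding information sequence and sum into one context vector.
    return weights, weights @ encoded_seq

weights, context = attention_context(encoded, decoder_state)
```

Recomputing the weights at every step is what lets the decoder focus on different parts of the input text as it emits each word of the expanded text.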
6. The method according to any one of claims 1-5, wherein the text to be expanded is generated according to query information input by a terminal; and
after generating the expanded text according to the generated identification information sequence and the correspondence between words and identification information, the method further comprises:
performing a search operation based on the generated expanded text to obtain search result information; and
pushing the search result information to the terminal.
7. An apparatus for generating text based on artificial intelligence, the apparatus comprising:
an acquiring unit, configured to acquire a text to be expanded;
a segmentation unit, configured to segment the text to be expanded to obtain a word sequence of the text to be expanded;
a determining unit, configured to determine the identification information sequence corresponding to the word sequence according to a pre-stored correspondence between words and identification information;
a first generation unit, configured to input the determined identification information sequence into a pre-trained text extension model to generate an identification information sequence of the expanded text, wherein the text extension model characterizes the correspondence between the identification information sequence of a text to be expanded and the identification information sequence of the corresponding expanded text;
a second generation unit, configured to generate the expanded text according to the generated identification information sequence and the correspondence between words and identification information;
wherein the apparatus further comprises a training unit configured to:
combine, pairwise into sample groups, the query sentences corresponding to the same clicked link in a search engine click log; segment the query sentences included in each sample group to obtain words; select a preset number of words from the segmented words in descending order of occurrence count; allocate identification information to each selected word, and store the correspondence between words and identification information; determine, according to the correspondence between words and identification information, the identification information sequence corresponding to each query sentence included in each sample group; and train the text extension model by taking the identification information sequences corresponding to the two query sentences included in each sample group as input and output, respectively;
wherein the text extension model comprises a coding model and a decoding model, the coding model characterizing the correspondence between an identification information sequence and a coding information sequence, and the decoding model characterizing the correspondence between the identification information of a preset start word, a coding information sequence, and an identification information sequence;
wherein the first generation unit is further configured to: input each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagation recurrent neural network for encoding, to generate a first reference coding information sequence; input each piece of identification information in the determined identification information sequence, in reverse order, into a back-propagation recurrent neural network for encoding, to generate a second reference coding information sequence; generate the coding information sequence of the text to be expanded according to the first reference coding information sequence and the second reference coding information sequence; predict identification information sequences of candidate word sequences following the start word, based on a recurrent neural network for decoding and the generated coding information sequence; calculate the occurrence probability of each predicted identification information sequence according to the predicted occurrence probabilities of the identification information it includes; and select, from the predicted identification information sequences, a preset number of identification information sequences in descending order of occurrence probability as the identification information sequences of the expanded text.
8. The apparatus of claim 7, wherein the first generation unit comprises:
a coding subunit, configured to input the determined identification information sequence into the coding model to generate the coding information sequence of the text to be expanded; and
a decoding subunit, configured to input the generated coding information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
9. The apparatus of claim 8, wherein the coding subunit is further configured to:
input each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagation recurrent neural network for encoding, to generate a first reference coding information sequence;
input each piece of identification information in the determined identification information sequence, in reverse order, into a back-propagation recurrent neural network for encoding, to generate a second reference coding information sequence; and
generate the coding information sequence of the text to be expanded according to the first reference coding information sequence and the second reference coding information sequence.
10. The apparatus of claim 8, wherein the decoding subunit is further configured to:
predict identification information sequences of candidate word sequences following the start word, based on a recurrent neural network for decoding and the generated coding information sequence;
calculate the occurrence probability of each predicted identification information sequence according to the predicted occurrence probabilities of the identification information it includes; and
select, from the predicted identification information sequences, a preset number of identification information sequences in descending order of occurrence probability as the identification information sequences of the expanded text.
11. The apparatus of claim 10, wherein the decoding subunit is further configured to:
determine, according to an attention model, a weight for the generated coding information sequence at each prediction step;
weight the generated coding information sequence according to the weight; and
predict the identification information sequences of candidate word sequences following the start word based on the recurrent neural network for decoding and the weighted coding information sequence.
12. The apparatus according to any one of claims 7-11, wherein the text to be expanded is generated according to query information input by a terminal; and
the apparatus further comprises a pushing unit configured to:
perform a search operation based on the generated expanded text to obtain search result information; and
push the search result information to the terminal.
13. An apparatus, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
14. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-6.
CN201710787262.0A 2017-09-04 2017-09-04 Method and device for generating text based on artificial intelligence Active CN107526725B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710787262.0A CN107526725B (en) 2017-09-04 2017-09-04 Method and device for generating text based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710787262.0A CN107526725B (en) 2017-09-04 2017-09-04 Method and device for generating text based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN107526725A CN107526725A (en) 2017-12-29
CN107526725B true CN107526725B (en) 2021-08-24

Family

ID=60683533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710787262.0A Active CN107526725B (en) 2017-09-04 2017-09-04 Method and device for generating text based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN107526725B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509413A (en) * 2018-03-08 2018-09-07 平安科技(深圳)有限公司 Digest extraction method, device, computer equipment and storage medium
CN110309407A (en) * 2018-03-13 2019-10-08 优酷网络技术(北京)有限公司 Viewpoint extracting method and device
CN110362810B (en) * 2018-03-26 2022-06-14 阿里巴巴(中国)有限公司 Text analysis method and device
CN110555104B (en) * 2018-03-26 2022-06-17 阿里巴巴(中国)有限公司 Text analysis method and device
CN110362809B (en) * 2018-03-26 2022-06-14 阿里巴巴(中国)有限公司 Text analysis method and device
CN110362808B (en) * 2018-03-26 2022-06-14 阿里巴巴(中国)有限公司 Text analysis method and device
CN108932326B (en) * 2018-06-29 2021-02-19 北京百度网讯科技有限公司 Instance extension method, device, equipment and medium
CN110852093B (en) * 2018-07-26 2023-05-16 腾讯科技(深圳)有限公司 Poem generation method, device, computer equipment and storage medium
CN110874771A (en) * 2018-08-29 2020-03-10 北京京东尚科信息技术有限公司 Method and device for matching commodities
CN111209725B (en) * 2018-11-19 2023-04-25 阿里巴巴集团控股有限公司 Text information generation method and device and computing equipment
CN109284367B (en) * 2018-11-30 2021-05-18 北京字节跳动网络技术有限公司 Method and device for processing text
CN109800421A (en) * 2018-12-19 2019-05-24 武汉西山艺创文化有限公司 A kind of game scenario generation method and its device, equipment, storage medium
CN109858004B (en) * 2019-02-12 2023-08-01 四川无声信息技术有限公司 Text rewriting method and device and electronic equipment
US11069346B2 (en) 2019-04-22 2021-07-20 International Business Machines Corporation Intent recognition model creation from randomized intent vector proximities
CN110188204B (en) * 2019-06-11 2022-10-04 腾讯科技(深圳)有限公司 Extended corpus mining method and device, server and storage medium
CN110851673B (en) * 2019-11-12 2022-08-09 西南科技大学 Improved cluster searching method and question-answering system
CN111783422B (en) 2020-06-24 2022-03-04 北京字节跳动网络技术有限公司 Text sequence generation method, device, equipment and medium
CN111859888B (en) * 2020-07-22 2024-04-02 北京致医健康信息技术有限公司 Diagnosis assisting method, diagnosis assisting device, electronic equipment and storage medium
CN113392639B (en) * 2020-09-30 2023-09-26 腾讯科技(深圳)有限公司 Title generation method, device and server based on artificial intelligence

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106919702A (en) * 2017-02-14 2017-07-04 北京时间股份有限公司 Keyword method for pushing and device based on document

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101593179B (en) * 2008-05-26 2011-08-10 国际商业机器公司 Document retrieval method, document retrieval device and document processor
CN106407381B (en) * 2016-09-13 2019-10-25 北京百度网讯科技有限公司 A kind of method and apparatus of the pushed information based on artificial intelligence
CN106503255B (en) * 2016-11-15 2020-05-12 科大讯飞股份有限公司 Method and system for automatically generating article based on description text
CN106980683B (en) * 2017-03-30 2021-02-12 中国科学技术大学苏州研究院 Blog text abstract generating method based on deep learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106919702A (en) * 2017-02-14 2017-07-04 北京时间股份有限公司 Keyword method for pushing and device based on document

Also Published As

Publication number Publication date
CN107526725A (en) 2017-12-29

Similar Documents

Publication Publication Date Title
CN107526725B (en) Method and device for generating text based on artificial intelligence
US11501182B2 (en) Method and apparatus for generating model
CN107168952B (en) Information generation method and device based on artificial intelligence
CN107705784B (en) Text regularization model training method and device, and text regularization method and device
CN107273503B (en) Method and device for generating parallel text in same language
CN111444340B (en) Text classification method, device, equipment and storage medium
CN107491547B (en) Search method and device based on artificial intelligence
CN110534087B (en) Text prosody hierarchical structure prediction method, device, equipment and storage medium
JP7112536B2 (en) Method and apparatus for mining entity attention points in text, electronic device, computer-readable storage medium and computer program
CN109214386B (en) Method and apparatus for generating image recognition model
CN107680580B (en) Text conversion model training method and device, and text conversion method and device
CN111460807B (en) Sequence labeling method, device, computer equipment and storage medium
US20180357225A1 (en) Method for generating chatting data based on artificial intelligence, computer device and computer-readable storage medium
CN110275939B (en) Method and device for determining conversation generation model, storage medium and electronic equipment
CN109740167B (en) Method and apparatus for generating information
CN111709240A (en) Entity relationship extraction method, device, equipment and storage medium thereof
CN108121699B (en) Method and apparatus for outputting information
CN111753551B (en) Information generation method and device based on word vector generation model
CN111428010A (en) Man-machine intelligent question and answer method and device
US20240078385A1 (en) Method and apparatus for generating text
CN111597807B (en) Word segmentation data set generation method, device, equipment and storage medium thereof
CN114445832A (en) Character image recognition method and device based on global semantics and computer equipment
CN113947095A (en) Multilingual text translation method and device, computer equipment and storage medium
CN112188311B (en) Method and apparatus for determining video material of news
CN113947091A (en) Method, apparatus, device and medium for language translation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant