CN107526725A - Method and apparatus for generating text based on artificial intelligence - Google Patents
Method and apparatus for generating text based on artificial intelligence
- Publication number
- CN107526725A CN107526725A CN201710787262.0A CN201710787262A CN107526725A CN 107526725 A CN107526725 A CN 107526725A CN 201710787262 A CN201710787262 A CN 201710787262A CN 107526725 A CN107526725 A CN 107526725A
- Authority
- CN
- China
- Prior art keywords
- identification information
- text
- sequence
- word
- information sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application disclose a method and apparatus for generating text based on artificial intelligence. One embodiment of the method includes: acquiring a text to be expanded; segmenting the text to be expanded to obtain a word sequence of the text to be expanded; determining, according to a prestored correspondence between words and identification information, an identification information sequence corresponding to the word sequence; inputting the determined identification information sequence into a pre-trained text expansion model to generate an identification information sequence of the expanded text; and generating the expanded text according to the generated identification information sequence and the correspondence between words and identification information. This embodiment improves the diversity of text generation.
Description
Technical field
The present application relates to the field of computer technology, in particular to the field of Internet technology, and more particularly to a method and apparatus for generating text based on artificial intelligence.
Background
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. As a branch of computer science, it attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, expert systems, and the like.
At present, text expansion is mainly implemented on the basis of a pre-built offline database: words in the text to be expanded are replaced with semantically similar words from the offline database to generate the expanded text.
However, with this approach the offline database is costly to maintain and its data is limited, so the text generation results are rather limited, which affects the diversity of text generation.
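The prior-art approach described above amounts to a dictionary lookup; a minimal sketch follows, in which the synonym table and words are hypothetical illustrations, not data from the application:

```python
# Prior-art style expansion: replace each word with a semantically
# similar word from a pre-built offline database (a toy dict here).
OFFLINE_DB = {
    "seafood": "marine products",
    "supplement": "boost",
}

def expand_offline(words):
    """Replace each word with its offline-database synonym, if any."""
    return [OFFLINE_DB.get(w, w) for w in words]

print(expand_offline(["eat", "seafood", "to", "supplement", "zinc"]))
```

Because every replacement must already exist in the database, coverage is bounded by the size of the table, which is exactly the diversity limitation the application targets.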
Summary of the invention
An object of embodiments of the present application is to propose an improved method and apparatus for generating text based on artificial intelligence, so as to solve the technical problems mentioned in the Background section above.
In a first aspect, the present application provides a method for generating text based on artificial intelligence, the method including: acquiring a text to be expanded; segmenting the text to be expanded to obtain a word sequence of the text to be expanded; determining, according to a prestored correspondence between words and identification information, an identification information sequence corresponding to the word sequence; inputting the determined identification information sequence into a pre-trained text expansion model to generate an identification information sequence of the expanded text, where the text expansion model characterizes the correspondence between the identification information sequence of a text to be expanded and the identification information sequence of the expanded text; and generating the expanded text according to the generated identification information sequence and the correspondence between words and identification information.
In some embodiments, the text expansion model includes an encoding model and a decoding model. The encoding model characterizes the correspondence between identification information sequences and encoded information sequences; the decoding model characterizes the correspondence between the identification information of a preset start word together with an encoded information sequence on the one hand, and an identification information sequence on the other. Inputting the determined identification information sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text includes: inputting the determined identification information sequence into the encoding model to generate an encoded information sequence of the text to be expanded; and inputting the generated encoded information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
In some embodiments, inputting the determined identification information sequence into the encoding model to generate the encoded information sequence of the text to be expanded includes: inputting each piece of identification information in the determined sequence, in forward order, into a forward recurrent neural network for encoding to generate a first reference encoded information sequence; inputting each piece of identification information in the determined sequence, in reverse order, into a backward recurrent neural network for encoding to generate a second reference encoded information sequence; and generating the encoded information sequence of the text to be expanded according to the first reference encoded information sequence and the second reference encoded information sequence.
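The bidirectional encoding just described can be illustrated with a toy scalar recurrence; the update rule and weights below are a stand-in for a trained RNN cell, not the application's actual model:

```python
import math

def rnn_states(ids, w=0.5, u=0.3):
    """Toy scalar RNN: h_t = tanh(w*x_t + u*h_{t-1}); returns every hidden state."""
    h, states = 0.0, []
    for x in ids:
        h = math.tanh(w * x + u * h)
        states.append(h)
    return states

def bidirectional_encode(ids):
    """First reference sequence: a forward pass; second reference sequence:
    a pass over the reversed input, re-reversed so positions line up.
    The two directions are paired per position (a stand-in for concatenation)."""
    forward = rnn_states(ids)
    backward = rnn_states(ids[::-1])[::-1]
    return list(zip(forward, backward))

print(bidirectional_encode([3, 1, 4]))
```

Each position thus carries context from both the words before it (forward state) and the words after it (backward state).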
In some embodiments, inputting the generated encoded information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text includes: predicting, based on a recurrent neural network for decoding and the generated encoded information sequence, identification information sequences of candidate word sequences following the start word; calculating the probability of each predicted identification information sequence according to the probabilities of occurrence of the pieces of identification information it includes; and selecting, from the predicted identification information sequences, a preset number of sequences in descending order of probability as the identification information sequences of the expanded text.
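The selection step above can be sketched as scoring each candidate sequence by the product of its per-token probabilities and keeping the most probable ones; the candidate sequences and probabilities here are invented for illustration:

```python
from math import prod

def select_top(candidates, k):
    """candidates: list of (id_sequence, per-token probabilities).
    Score each sequence by the product of its token probabilities and
    return the k highest-scoring id sequences, best first."""
    scored = [(prod(probs), seq) for seq, probs in candidates]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [seq for _, seq in scored[:k]]

candidates = [
    ([7, 2, 9], [0.6, 0.5, 0.9]),   # product 0.27
    ([7, 4, 1], [0.6, 0.8, 0.7]),   # product 0.336
    ([3, 2, 2], [0.2, 0.9, 0.9]),   # product 0.162
]
print(select_top(candidates, 2))  # → [[7, 4, 1], [7, 2, 9]]
```

In practice a beam-search decoder interleaves this scoring with the step-by-step prediction rather than scoring complete sequences after the fact.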
In some embodiments, predicting the identification information sequences of candidate word sequences following the start word, based on the recurrent neural network for decoding and the generated encoded information sequence, includes: determining, at each prediction step, weights for the generated encoded information sequence according to an attention model; weighting the generated encoded information sequence with these weights; and predicting the identification information sequences of the candidate word sequences following the start word based on the recurrent neural network for decoding and the weighted encoded information sequence.
In some embodiments, the text expansion model is trained by the following steps: forming, from the click logs of a search engine, sample groups of pairs of query statements corresponding to the same clicked link; segmenting the query statements included in each sample group to obtain the segmented words; selecting a preset number of words from the segmented words in descending order of occurrence count; allocating identification information to each selected word and storing the correspondence between words and identification information; determining, according to that correspondence, the identification information sequences corresponding to the query statements included in each sample group; and training the text expansion model by using the identification information sequences corresponding to the two query statements of each sample group as input and output, respectively.
In some embodiments, the text to be expanded is generated according to query information input from a terminal; and after the expanded text is generated according to the generated identification information sequence and the correspondence between words and identification information, the method further includes: performing a search operation based on the generated text to obtain search result information; and pushing the search result information to the terminal.
In a second aspect, the present application provides an apparatus for generating text based on artificial intelligence, the apparatus including: an acquiring unit for acquiring a text to be expanded; a segmenting unit for segmenting the text to be expanded to obtain a word sequence of the text to be expanded; a determining unit for determining, according to a prestored correspondence between words and identification information, an identification information sequence corresponding to the word sequence; a first generating unit for inputting the determined identification information sequence into a pre-trained text expansion model to generate an identification information sequence of the expanded text, where the text expansion model characterizes the correspondence between the identification information sequence of a text to be expanded and the identification information sequence of the expanded text; and a second generating unit for generating the expanded text according to the generated identification information sequence and the correspondence between words and identification information.
In some embodiments, the text expansion model includes an encoding model and a decoding model, the encoding model characterizing the correspondence between identification information sequences and encoded information sequences, and the decoding model characterizing the correspondence between the identification information of a preset start word together with an encoded information sequence on the one hand, and an identification information sequence on the other; and the first generating unit includes: an encoding subunit for inputting the determined identification information sequence into the encoding model to generate an encoded information sequence of the text to be expanded; and a decoding subunit for inputting the generated encoded information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
In some embodiments, the encoding subunit is further configured to: input each piece of identification information in the determined sequence, in forward order, into a forward recurrent neural network for encoding to generate a first reference encoded information sequence; input each piece of identification information in the determined sequence, in reverse order, into a backward recurrent neural network for encoding to generate a second reference encoded information sequence; and generate the encoded information sequence of the text to be expanded according to the first reference encoded information sequence and the second reference encoded information sequence.
In some embodiments, the decoding subunit is further configured to: predict, based on a recurrent neural network for decoding and the generated encoded information sequence, identification information sequences of candidate word sequences following the start word; calculate the probability of each predicted identification information sequence according to the probabilities of occurrence of the pieces of identification information it includes; and select, from the predicted identification information sequences, a preset number of sequences in descending order of probability as the identification information sequences of the expanded text.
In some embodiments, the decoding subunit is further configured to: determine, at each prediction step, weights for the generated encoded information sequence according to an attention model; weight the generated encoded information sequence with these weights; and predict the identification information sequences of the candidate word sequences following the start word based on the recurrent neural network for decoding and the weighted encoded information sequence.
In some embodiments, the apparatus further includes a training unit configured to: form, from the click logs of a search engine, sample groups of pairs of query statements corresponding to the same clicked link; segment the query statements included in each sample group to obtain the segmented words; select a preset number of words from the segmented words in descending order of occurrence count; allocate identification information to each selected word and store the correspondence between words and identification information; determine, according to that correspondence, the identification information sequences corresponding to the query statements included in each sample group; and train the text expansion model by using the identification information sequences corresponding to the two query statements of each sample group as input and output, respectively.
In some embodiments, the text to be expanded is generated according to query information input from a terminal; and the apparatus further includes a pushing unit configured to: perform a search operation based on the generated text to obtain search result information; and push the search result information to the terminal.
In a third aspect, the present application provides a device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the method described in the first aspect.
According to the method and apparatus for generating text based on artificial intelligence provided by the embodiments of the present application, a text to be expanded is acquired and segmented to obtain its word sequence; the identification information sequence corresponding to the word sequence is then input into a pre-trained text expansion model to generate the identification information sequence of the expanded text; finally, the expanded text is generated according to the generated identification information sequence and the correspondence between words and identification information, which improves the diversity of text generation.
Brief description of the drawings
Other features, objects and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 is an exemplary system architecture diagram to which the present application may be applied;
Fig. 2 is a schematic flowchart of one embodiment of the method for generating text based on artificial intelligence according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the method for generating text based on artificial intelligence according to the present application;
Fig. 4 is a schematic flowchart of another embodiment of the method for generating text based on artificial intelligence according to the present application;
Fig. 5 is an exemplary structural diagram of one embodiment of the apparatus for generating text based on artificial intelligence according to the present application;
Fig. 6 is a schematic structural diagram of a computer system adapted to implement a server of embodiments of the present application.
Detailed description of embodiments
The present application will be described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention, and are not limitations of the invention. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.
It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method or apparatus for generating text based on artificial intelligence of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104, and servers 105 and 106. The network 104 serves as a medium providing communication links between the terminal devices 101, 102 and 103 and the servers 105 and 106. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user 110 may use the terminal devices 101, 102 and 103 to interact with the servers 105 and 106 through the network 104, so as to receive or send data. Various applications may be installed on the terminal devices 101, 102 and 103, such as web browser applications, search engine applications, map applications, payment applications, social applications, shopping applications, instant messaging tools, and mobile assistant applications.
The terminal devices 101, 102 and 103 may be various electronic devices that have a display screen and support a search function, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like.
The servers 105 and 106 may be servers providing various services, for example background servers providing support for the terminal devices 101, 102 and 103. A background server may analyze and process received data such as requests, and feed the processing results back to the terminal devices; for example, it may generate the expanded text according to the text to be expanded sent by a terminal.
It should be noted that the method for generating text based on artificial intelligence provided by the embodiments of the present application may be executed by the servers 105 and 106; accordingly, the apparatus for generating text based on artificial intelligence may be arranged in the servers 105 and 106.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating text based on artificial intelligence according to the present application is shown. The method for generating text based on artificial intelligence includes the following steps:
Step 201: acquire a text to be expanded.
In this embodiment, the electronic device on which the method for generating text based on artificial intelligence runs (for example, the server shown in Fig. 1) may acquire the text to be expanded locally or from other electronic devices. The text to be expanded may be any text with expansion value that the electronic device can obtain, for example: query information (a query) from the user's history input, stored locally in advance; query information included in a query request sent by a user through a terminal in real time; or text input by other users.
Step 202: segment the text to be expanded to obtain a word sequence of the text to be expanded.
In this embodiment, the electronic device may segment the text to be expanded obtained in step 201 to obtain the word sequence of the text to be expanded. Segmentation is a word-cutting operation that may be performed, for example, by a full-segmentation method, dividing the text to be expanded into words. For example, "what seafood should a pregnant woman eat to supplement zinc" may be segmented into "pregnant woman / eat / what / seafood / supplement / zinc".
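The segmentation step can be sketched as follows. The application names full segmentation as one possible method; the sketch below instead uses a simpler greedy longest-match cutter over a toy vocabulary (both the strategy and the vocabulary are illustrative assumptions, not the application's actual segmenter):

```python
def segment(text, vocab):
    """Greedy longest-match segmentation: at each position, cut the
    longest vocabulary entry that matches; fall back to one character."""
    words, i = [], 0
    while i < len(text):
        match = next(
            (text[i:i + n] for n in range(len(text) - i, 0, -1)
             if text[i:i + n] in vocab),
            text[i],  # out-of-vocabulary fallback: a single character
        )
        words.append(match)
        i += len(match)
    return words

vocab = {"pregnantwoman", "eat", "what", "seafood", "supplement", "zinc"}
print(segment("pregnantwomaneatwhatseafoodsupplementzinc", vocab))
```

The same idea applies to Chinese text, where there are no spaces between words and segmentation against a dictionary is the standard first step.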
Step 203: determine, according to a prestored correspondence between words and identification information, the identification information sequence corresponding to the word sequence.
In this embodiment, the electronic device may determine the identification information sequence corresponding to the word sequence obtained in step 202, according to the prestored correspondence between words and identification information. Identification information is another representation of a word and may be made up of letters and/or digits; for example, it may be the index of the word in a preset dictionary, and words not present in the dictionary may be represented by unified identification information such as "UNKNOWN". The preset dictionary may be obtained by performing word segmentation on a corpus, counting the frequency of occurrence of the resulting words, and storing the high-frequency words.
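Building the preset dictionary from corpus frequencies and mapping words to identification information, with a unified fallback for out-of-vocabulary words, can be sketched as follows; the corpus, vocabulary size, and the choice of integer ids are illustrative assumptions:

```python
from collections import Counter

def build_vocab(corpus_words, size):
    """Keep the `size` most frequent words; assign each an integer id
    starting at 1. Id 0 is reserved for out-of-vocabulary words
    (the role played by "UNKNOWN" in the text)."""
    common = Counter(corpus_words).most_common(size)
    return {word: idx for idx, (word, _) in enumerate(common, start=1)}

def to_ids(words, vocab):
    """Map a word sequence to its identification information sequence."""
    return [vocab.get(w, 0) for w in words]  # 0 stands for UNKNOWN

corpus = ["eat", "zinc", "eat", "seafood", "eat", "zinc", "rare"]
vocab = build_vocab(corpus, size=3)   # {'eat': 1, 'zinc': 2, 'seafood': 3}
print(to_ids(["eat", "rare", "zinc"], vocab))  # → [1, 0, 2]
```

The inverse table `{idx: word for word, idx in vocab.items()}` performs the id-to-word mapping used later when the expanded text is reconstructed.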
Step 204: input the determined identification information sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text.
In this embodiment, the electronic device may input the identification information sequence determined in step 203 into the pre-trained text expansion model to generate the identification information sequence of the expanded text. The text expansion model characterizes the correspondence between the identification information sequence of the text to be expanded and the identification information sequence of the expanded text.
As an example, the text expansion model may include one or more neural network models. A neural network model may use a recurrent neural network (RNN), in whose network structure the connections between hidden nodes form loops, so that the network not only learns the information of the current moment but also depends on the preceding sequence information; its special network structure solves the problem of preserving information over time. RNNs therefore have unique advantages in processing time series and language text sequences. Furthermore, the text expansion model may also be composed of one or more of the RNN variants long short-term memory (LSTM) networks and gated recurrent units (GRU). The text expansion model may also be one or more operational formulas, preset and stored in the electronic device by technicians based on statistics over massive amounts of data, for performing operations on the identification information sequence of the text to be expanded to obtain the identification information sequence of the expanded text.
In some optional implementations of this embodiment, the text expansion model includes an encoding model and a decoding model. The encoding model characterizes the correspondence between identification information sequences and encoded information sequences; the decoding model characterizes the correspondence between the identification information of a preset start word together with an encoded information sequence on the one hand, and an identification information sequence on the other. Inputting the determined identification information sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text includes: inputting the determined identification information sequence into the encoding model to generate the encoded information sequence of the text to be expanded; and inputting the generated encoded information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
In this implementation, encoding may be understood as converting the input sequence into a vector of fixed length, and decoding as converting that fixed vector into an output sequence; encoding-storage-decoding imitates the brain's process of reading, memorizing and outputting. In addition to the "encoding-decoding" mechanism, an attention model may also be used to complete the mapping between the identification information sequence of the text to be expanded and that of the expanded text. The attention model does not require the encoder to encode all of the input information into a single fixed-length vector, so that, when each output is produced, the information carried by the input sequence can be fully utilized. The start word may be configured as actually needed; for example, it may be "START".
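The per-step attention weighting can be sketched as a softmax over scores between the current decoder state and each encoder output; the dot-product (here scalar multiplication) scoring function and the numbers are illustrative assumptions, not the application's model:

```python
import math

def attention_context(decoder_state, encoder_outputs):
    """Weight each encoder output by softmax(decoder_state * output)
    and return the weights plus the weighted sum (this step's context)."""
    scores = [decoder_state * h for h in encoder_outputs]
    m = max(scores)                              # stabilize the softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = sum(w * h for w, h in zip(weights, encoder_outputs))
    return weights, context

weights, context = attention_context(1.0, [0.2, 0.9, 0.4])
print(weights)   # highest weight falls on the 0.9 encoder output
```

Because the weights are recomputed at every decoding step, each output word can draw on a different part of the input sequence instead of a single fixed-length summary vector.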
In some optional implementations of this embodiment, the text expansion model is trained by the following steps: forming, from the click logs of a search engine, sample groups of pairs of query statements corresponding to the same clicked link; segmenting the query statements included in each sample group to obtain the segmented words; selecting a preset number of words from the segmented words in descending order of occurrence count; allocating identification information to each selected word and storing the correspondence between words and identification information; determining, according to that correspondence, the identification information sequences corresponding to the query statements included in each sample group; and training the text expansion model by using the identification information sequences corresponding to the two query statements of each sample group as input and output, respectively.
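Grouping click-log queries by clicked link and pairing them two by two can be sketched as follows; the log records are invented examples, and representing the log as (query, clicked_link) tuples is an assumption about its shape:

```python
from collections import defaultdict
from itertools import permutations

def make_sample_groups(click_log):
    """click_log: iterable of (query, clicked_link) records.
    Queries that led to the same clicked link are paired two by two;
    each ordered pair is one (input, output) training sample."""
    by_link = defaultdict(list)
    for query, link in click_log:
        by_link[link].append(query)
    samples = []
    for queries in by_link.values():
        samples.extend(permutations(queries, 2))
    return samples

log = [
    ("what seafood supplements zinc", "page_a"),
    ("zinc rich foods for pregnancy", "page_a"),
    ("weather tomorrow", "page_b"),
]
print(make_sample_groups(log))
```

Links clicked for only one query contribute no pairs, so every training sample relates two queries that real users treated as leading to the same result.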
In this implementation, the loss function for model training may be determined according to the probabilities of word occurrence. The text expansion model may first be randomly initialized and then trained on the training data with mini-batch stochastic gradient descent so as to minimize the empirical risk. Because the click logs of a search engine are rich in content, and the relation between query statements corresponding to the same clicked link goes beyond mere semantic similarity, the generated expanded texts are more abundant and the model output is closer to real query statements; if searches are subsequently performed according to the expanded text, the search results are better. Besides query statements corresponding to the same clicked link, the corpus for training the text expansion model may also be associated texts submitted by other users or generated by machines.
Step 205: generate the expanded text according to the generated identification information sequence and the correspondence between words and identification information.
In this embodiment, the electronic device may determine, according to the correspondence between words and identification information, the word corresponding to each piece of identification information in the identification information sequence generated in step 204, thereby obtaining the expanded text.
In some optional implementations of this embodiment, the text to be expanded is generated according to query information input from a terminal; and after the expanded text is generated according to the generated identification information sequence and the correspondence between words and identification information, the method further includes: performing a search operation based on the generated text to obtain search result information; and pushing the search result information to the terminal.
In this implementation, a user may input query information in forms such as voice, picture or text through the terminal; the electronic device may convert it into text and use the converted text as the text to be expanded. The search result information pushed to the terminal can serve as a supplement to the search results of the text to be expanded, further saving the user time in obtaining information.
The method provided by the above embodiment of the present application acquires a text to be expanded and segments it to obtain its word sequence, then inputs the identification information sequence corresponding to the word sequence into a pre-trained text expansion model to generate the identification information sequence of the expanded text, and finally generates the expanded text according to the generated identification information sequence and the correspondence between words and identification information, improving the diversity of text generation.
With continued reference to Fig. 3, it illustrates a schematic diagram of an application scenario of the method for generating text based on artificial intelligence according to the application. In the application scenario of Fig. 3, the server 301 first obtains the text to be expanded 303, "what seafood should pregnant women eat to supplement zinc", uploaded by the user through the terminal 302; the server then performs segmentation and other processing on it, and inputs the processed information into the pre-trained text expansion model, finally obtaining the expanded text 304, which includes "what trace elements can pregnant women supplement by eating seafood" and "what food should pregnant women eat to supplement zinc".
Referring to Fig. 4, Fig. 4 is a schematic flowchart of another embodiment of the method for generating text based on artificial intelligence according to the present embodiment. In Fig. 4, the method 400 for generating text based on artificial intelligence includes the following steps:
Step 401: obtain the text to be expanded.
In the present embodiment, the electronic device on which the method for generating text based on artificial intelligence runs (for example, the server shown in Fig. 1) may obtain the text to be expanded locally or from other electronic devices.
Step 402: segment the text to be expanded to obtain its word sequence.
In the present embodiment, the above-described electronic device may segment the text to be expanded obtained in step 401 to obtain the word sequence of the text to be expanded.
Step 403: determine the identification information sequence corresponding to the word sequence according to the prestored correspondence between words and identification information.
In the present embodiment, the above-described electronic device may determine the identification information sequence corresponding to the word sequence obtained in step 402 according to the prestored correspondence between words and identification information.
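Steps 402-403 can be illustrated by a minimal sketch. The vocabulary, the `UNK_ID` convention, and the whitespace segmentation below are illustrative assumptions, not the patent's actual data or segmenter:

```python
# Segment the text to be expanded into a word sequence, then map each word to
# its identification (ID) number via a prestored word-to-ID correspondence.
UNK_ID = 0  # assumed ID reserved for out-of-vocabulary words

word_to_id = {"what": 1, "seafood": 2, "zinc": 3, "pregnant": 4, "women": 5, "eat": 6}

def segment(text):
    # Real Chinese text would need a proper word segmenter; whitespace
    # splitting stands in for it here.
    return text.split()

def to_id_sequence(text):
    words = segment(text)
    return [word_to_id.get(w, UNK_ID) for w in words]

print(to_id_sequence("what seafood zinc pregnant women eat"))  # [1, 2, 3, 4, 5, 6]
```

The resulting ID sequence is what the encoding model consumes in the following steps.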
Step 404: input each identification information in the determined identification information sequence, in forward order, into a forward-propagation recurrent neural network for encoding, to generate a first reference encoding information sequence.
In the present embodiment, the above-described electronic device may input each identification information in the identification information sequence determined in step 403, in forward order, into the forward-propagation recurrent neural network for encoding, to generate the first reference encoding information sequence. Taking an LSTM recurrent neural network as an example, the LSTM includes an input gate, a forget gate, and an output gate, and the first reference encoding information sequence may be calculated by the following equations:
i_{enc,t} = σ(W_{enc,i} x_{enc} + U_{enc,i} h_{t-1} + b_{enc,i})  (1)
f_{enc,t} = σ(W_{enc,f} x_{enc} + U_{enc,f} h_{t-1} + b_{enc,f})  (2)
o_{enc,t} = σ(W_{enc,o} x_{enc} + U_{enc,o} h_{t-1} + b_{enc,o})  (3)
c̃_{enc,t} = tanh(W_{enc,c} x_{enc} + U_{enc,c} h_{t-1} + b_{enc,c})  (4)
c_{enc,t} = f_{enc,t} ⊙ c_{enc,t-1} + i_{enc,t} ⊙ c̃_{enc,t}  (5)
h_t = o_{enc,t} ⊙ tanh(c_{enc,t})  (6)
By comprehensively modeling the information of the current word x_t and the information h_{t-1} of the preceding word sequence, the current word-sequence information h_t is generated.
Here, the symbol ⊙ denotes element-wise multiplication, tanh(·) denotes the hyperbolic tangent function, and σ denotes the sigmoid function. t denotes the current time step, and x_enc denotes the input of the encoding neural network at the current time step; the inputs at all time steps together constitute the identification information sequence determined in step 403. W_{enc,i}, W_{enc,f}, W_{enc,o}, W_{enc,c}, U_{enc,i}, U_{enc,f}, U_{enc,o}, U_{enc,c} denote the weight matrices of the encoding neural network, and b_{enc,i}, b_{enc,f}, b_{enc,o}, b_{enc,c} denote its bias terms. i_{enc,t}, f_{enc,t}, and o_{enc,t} denote the activations of the input gate, forget gate, and output gate of the encoding neural network at the current time step, respectively. h_t denotes the output state of the encoding neural network at the current time step, and h_{t-1} the output state at the previous time step. c_{enc,t} denotes the state information of the encoding neural network at the current time step, c_{enc,t-1} the state information at the previous time step, and c̃_{enc,t} the newly generated candidate state information in the encoding neural network.
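The encoder update above can be sketched in NumPy. This is a minimal illustration assuming a standard LSTM cell for the steps the excerpt elides; the random embeddings, dimensions, and parameter initialization are invented for demonstration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_encoder_step(x, h_prev, c_prev, W, U, b):
    """One forward step of the encoding LSTM. W, U, b are dicts holding the
    input/forget/output/candidate parameters (keys "i", "f", "o", "c")."""
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])      # input gate
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])      # forget gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])      # output gate
    c_new = np.tanh(W["c"] @ x + U["c"] @ h_prev + b["c"])  # candidate state
    c = f * c_prev + i * c_new                              # cell-state update
    h = o * np.tanh(c)                                      # output state h_t
    return h, c

rng = np.random.default_rng(0)
d, n = 4, 4  # embedding and hidden sizes (illustrative)
W = {k: rng.standard_normal((n, d)) * 0.1 for k in "ifoc"}
U = {k: rng.standard_normal((n, n)) * 0.1 for k in "ifoc"}
b = {k: np.zeros(n) for k in "ifoc"}
h, c = np.zeros(n), np.zeros(n)
encoded = []
for x in rng.standard_normal((3, d)):  # three stand-in word embeddings
    h, c = lstm_encoder_step(x, h, c, W, U, b)
    encoded.append(h)
# "encoded" plays the role of the first reference encoding information sequence
```

Running the same loop over the reversed input would yield the second reference encoding information sequence of step 405.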
Step 405: input each identification information in the determined identification information sequence, in reverse order, into a back-propagation recurrent neural network for encoding, to generate a second reference encoding information sequence.
In the present embodiment, the above-described electronic device may input each identification information in the identification information sequence determined in step 403, in reverse order, into the back-propagation recurrent neural network for encoding, to generate the second reference encoding information sequence. The back-propagation recurrent neural network may obtain suitable parameters through multiple rounds of iteration by gradient descent.
Step 406: generate the encoding information sequence of the sequence of the text to be expanded according to the first reference encoding information sequence and the second reference encoding information sequence.
In the present embodiment, the above-described electronic device may generate the encoding information sequence of the sequence of the text to be expanded according to the first reference encoding information sequence generated in step 404 and the second reference encoding information sequence generated in step 405. The above-described electronic device may set a weight matrix and weight the first and second reference encoding information sequences according to it, generating the encoding information sequence of the sequence of the text to be expanded. The weight matrix may be preset, or may be determined by a machine learning method.
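Step 406 leaves the exact weighting open; one common realization is position-wise concatenation of the two reference sequences followed by a learned projection. The sketch below assumes that choice, with invented shapes and random data:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 3, 4                        # sequence length and hidden size (illustrative)
fwd = rng.standard_normal((T, n))  # first reference encoding information sequence
bwd = rng.standard_normal((T, n))  # second, computed over the reversed input
bwd = bwd[::-1]                    # realign backward states with positions 1..T

W_mix = rng.standard_normal((n, 2 * n)) * 0.1   # the weight matrix (could be learned)
encoded = np.concatenate([fwd, bwd], axis=1) @ W_mix.T  # (T, n) coded sequence
```

A preset weight matrix (e.g. averaging the two halves) would be the non-learned alternative the text mentions.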
Step 407: predict the identification information sequences of the candidate word sequences following the starting word, based on a recurrent neural network for decoding and the generated encoding information sequence.
In the present embodiment, the above-described electronic device may predict the identification information sequences of the candidate word sequences following the starting word based on the recurrent neural network for decoding and the encoding information sequence generated in step 406. Unlike during encoding, the hidden-layer results output at each step of decoding need to predict the corresponding target word during the sequence traversal, and that target word serves as the input of the next iteration. In addition, the above-described electronic device may also determine, according to an attention model, the weight of the generated encoding information sequence at each prediction; weight the generated encoding information sequence according to that weight; and predict the identification information sequences of the candidate word sequences following the starting word based on the recurrent neural network for decoding and the weighted encoding information sequence. The encoder-side results are weighted and summed by the attention model to generate the context information.
Taking an LSTM recurrent neural network as an example, the identification information sequences of the candidate word sequences following the starting word may be predicted according to the following equations:
i_{dec,t} = σ(W_{dec,i} x_{dec} + U_{dec,i} s_{t-1} + A_i a_t + b_{dec,i})  (7)
f_{dec,t} = σ(W_{dec,f} x_{dec} + U_{dec,f} s_{t-1} + A_f a_t + b_{dec,f})  (8)
o_{dec,t} = σ(W_{dec,o} x_{dec} + U_{dec,o} s_{t-1} + A_o a_t + b_{dec,o})  (9)
c̃_{dec,t} = tanh(W_{dec,c} x_{dec} + U_{dec,c} s_{t-1} + A_c a_t + b_{dec,c})  (10)
c_{dec,t} = f_{dec,t} ⊙ c_{dec,t-1} + i_{dec,t} ⊙ c̃_{dec,t}  (11)
s_t = o_{dec,t} ⊙ tanh(c_{dec,t})  (12)
Here, t denotes the current time step, and x_dec denotes the input of the decoding neural network at the current time step. W_{dec,i}, W_{dec,f}, W_{dec,o}, W_{dec,c}, U_{dec,i}, U_{dec,f}, U_{dec,o}, U_{dec,c}, A_i, A_f, A_o, A_c denote the weight matrices of the decoding neural network, and b_{dec,i}, b_{dec,f}, b_{dec,o}, b_{dec,c} denote its bias terms. i_{dec,t}, f_{dec,t}, and o_{dec,t} denote the activations of the input gate, forget gate, and output gate of the decoding neural network at the current time step, respectively. c_{dec,t} denotes the state information of the decoding neural network at the current time step, c_{dec,t-1} the state information at the previous time step, and c̃_{dec,t} the newly generated candidate state information in the decoding neural network. s_t denotes the output state of the decoding neural network at the current time step, and s_{t-1} the output state at the previous time step. a_t denotes the attention allocation value.
a_t may be calculated as follows:
v_{it} = V_a tanh(W_a ch_i + U_a s_{t-1})  (13)
w_{it} = exp(v_{it}) / Σ_j exp(v_{jt})  (14)
a_t = Σ_i w_{it} ch_i  (15)
Here, i = 1, 2, 3, ... and j = 1, 2, 3, ... denote the time steps corresponding to each piece of encoding information in the encoding information sequence of the sequence of the text to be expanded. exp(·) denotes the exponential function with the natural constant e as its base. V_a, W_a, U_a denote weight matrices. v_{it} and v_{jt} are intermediate values used to determine which input the output of the recurrent neural network for decoding at the current time step should be aligned with. w_{it} denotes, at the current time step, the weight of the encoding information at time step i in the encoding information sequence of the sequence of the text to be expanded. ch_i denotes the encoding information at time step i in that encoding information sequence.
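The attention computation — a score v_{it} per encoder position, softmax weights w_{it}, and a weighted sum a_t — can be sketched as follows. Shapes and random parameters are illustrative assumptions:

```python
import numpy as np

def attention(ch, s_prev, V_a, W_a, U_a):
    """Additive attention over the coded sequence ch (T x n): score each
    position against the previous decoder state, softmax the scores, and
    return the weighted sum of the encoder states as the context a_t."""
    scores = np.array([V_a @ np.tanh(W_a @ ch_i + U_a @ s_prev) for ch_i in ch])
    w = np.exp(scores - scores.max())    # stabilized softmax numerator
    w = w / w.sum()                      # attention weights, sum to 1
    a_t = (w[:, None] * ch).sum(axis=0)  # context vector: sum_i w_it * ch_i
    return a_t, w

rng = np.random.default_rng(2)
T, n, m = 3, 4, 4  # encoder length, encoder size, decoder state size
ch = rng.standard_normal((T, n))
s_prev = rng.standard_normal(m)
V_a = rng.standard_normal(n)
W_a = rng.standard_normal((n, n))
U_a = rng.standard_normal((n, m))
a_t, w = attention(ch, s_prev, V_a, W_a, U_a)
```

The returned a_t is what the decoder gate equations consume through the A_i, A_f, A_o, A_c matrices.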
Step 408: calculate the probability of occurrence of each predicted identification information sequence according to the probabilities of occurrence of the identification information it includes.
In the present embodiment, the above-described electronic device may calculate the probability of occurrence of each identification information sequence predicted in step 407 according to the probabilities of occurrence of the identification information it includes. s_t may be projected to the space of the vocabulary size by a linear transformation, and the probability of the next word may then be predicted by a Softmax operation.
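The projection-plus-Softmax step, and the resulting sequence probability as a product of per-step word probabilities, can be sketched with assumed shapes and random decoder states:

```python
import numpy as np

rng = np.random.default_rng(3)
n, V = 4, 6                # hidden size and vocabulary size (illustrative)
W_proj = rng.standard_normal((V, n))  # linear projection to vocabulary space

def next_word_probs(s_t):
    logits = W_proj @ s_t
    e = np.exp(logits - logits.max())  # stabilized Softmax
    return e / e.sum()

def sequence_prob(states, id_seq):
    # Probability of an ID sequence = product of its per-step probabilities.
    p = 1.0
    for s_t, wid in zip(states, id_seq):
        p *= next_word_probs(s_t)[wid]
    return p

states = rng.standard_normal((3, n))  # stand-in decoder output states
p = sequence_prob(states, [1, 4, 2])
```

In practice log-probabilities are summed instead of multiplying raw probabilities, to avoid underflow on long sequences.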
Step 409: select a predetermined number of identification information sequences from the predicted identification information sequences in descending order of probability of occurrence, as the identification information sequences of the expanded text.
In the present embodiment, the above-described electronic device may select, from the predicted identification information sequences, a predetermined number of identification information sequences in descending order of the probabilities of occurrence calculated in step 408, as the identification information sequences of the expanded text. The predetermined number may be set according to actual needs.
Optionally, the above-described electronic device may also use a beam search algorithm to generate the predetermined number of identification information sequences with the highest probabilities. As an example, the sequence-start word "START" may first be taken as the input at time step 0, and the probability distribution of the next word is then generated by the decoder-side computation. The predetermined number of words with the highest probabilities are selected from that distribution, and each of them is taken as the next word of a decoding sequence and as an input at time step 1. Then, from the distributions produced by the predetermined number of branches, the words whose products of probability with the preceding sequence are largest, up to the predetermined number, are selected as candidate inputs at time step 2, and the above operations are repeated. If a beam search output sequence ends with the terminating word "END", the beam width is reduced by one and the search continues, until the beam width becomes 0 or the maximum sequence generation length is reached. In this way, the predetermined number of identification information sequences are obtained as the identification information sequences of the expanded text.
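The beam-search procedure described above can be sketched as follows. The toy `step` function is an assumption standing in for the decoder (it returns a fixed next-word distribution regardless of history), and finished "END" beams shrinking the effective width is a simplified rendering of the width-reduction rule:

```python
import math

START, END = "START", "END"
VOCAB_PROBS = {"a": 0.5, "b": 0.3, END: 0.2}  # invented distribution

def step(prefix):
    # Stand-in for the decoder: would normally depend on the prefix.
    return VOCAB_PROBS

def beam_search(width=2, max_len=3):
    beams = [([START], 0.0)]  # (sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, lp in beams:
            for word, p in step(seq).items():
                candidates.append((seq + [word], lp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, lp in candidates:
            if seq[-1] == END:
                finished.append((seq, lp))   # beam terminated: width shrinks
            elif len(beams) < width - len(finished):
                beams.append((seq, lp))
            if len(beams) + len(finished) >= width:
                break
        if not beams:  # width exhausted
            break
    finished.extend(beams)  # sequences cut off at the maximum length
    return sorted(finished, key=lambda c: c[1], reverse=True)

results = beam_search()
best_seq, best_lp = results[0]
```

With the toy distribution the highest-probability beam simply repeats the most probable word until the length limit; a real decoder distribution would diversify the beams.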
Step 410: generate the expanded text according to the generated identification information sequence and the correspondence between words and identification information.
In the present embodiment, the above-described electronic device may determine, according to the correspondence between words and identification information, the word corresponding to each identification information in the identification information sequences obtained in step 409, thereby obtaining the expanded text.
For the implementation details and technical effects of step 401, step 402, step 403, and step 410, reference may be made to the explanations of step 201, step 202, step 203, and step 205, which will not be repeated here.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the method provided by the above embodiment of the application encodes based on the outputs of the forward-propagation and back-propagation recurrent neural networks and then decodes through a recurrent neural network, so that the text expansion model composed of recurrent neural networks expresses meaning more richly and accurately.
With further reference to Fig. 5, as an implementation of the above method, the application provides an embodiment of a device for generating text based on artificial intelligence. The device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may specifically be applied to various electronic devices.
As shown in Fig. 5, the device 500 for generating text based on artificial intelligence of the present embodiment includes: an acquisition unit 510, a segmentation unit 520, a determination unit 530, a first generation unit 540, and a second generation unit 550. The acquisition unit 510 is configured to obtain the text to be expanded; the segmentation unit 520 is configured to segment the text to be expanded to obtain its word sequence; the determination unit 530 is configured to determine the identification information sequence corresponding to the word sequence according to the prestored correspondence between words and identification information; the first generation unit 540 is configured to input the determined identification information sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text, where the text expansion model is used to characterize the correspondence between the identification information sequence of the text to be expanded and the identification information sequence of the expanded text; and the second generation unit 550 is configured to generate the expanded text according to the generated identification information sequence and the correspondence between words and identification information.
In the present embodiment, for the specific processing of the acquisition unit 510, the segmentation unit 520, the determination unit 530, the first generation unit 540, and the second generation unit 550, reference may be made to the detailed descriptions of step 201, step 202, step 203, step 204, and step 205 in the embodiment corresponding to Fig. 2, which will not be repeated here.
In some optional implementations of the present embodiment, the text expansion model includes an encoding model and a decoding model. The encoding model is used to characterize the correspondence between identification information sequences and encoding information sequences, and the decoding model is used to characterize the correspondence between the identification information of a preset starting word together with an encoding information sequence, and an identification information sequence. The first generation unit 540 includes: an encoding subunit 541, configured to input the determined identification information sequence into the encoding model to generate the encoding information sequence of the text to be expanded; and a decoding subunit 542, configured to input the generated encoding information sequence and the identification information of the starting word into the decoding model to generate the identification information sequence of the expanded text.
In some optional implementations of the present embodiment, the encoding subunit 541 is further configured to: input each identification information in the determined identification information sequence, in forward order, into the forward-propagation recurrent neural network for encoding, to generate the first reference encoding information sequence; input each identification information in the determined identification information sequence, in reverse order, into the back-propagation recurrent neural network for encoding, to generate the second reference encoding information sequence; and generate the encoding information sequence of the sequence of the text to be expanded according to the first and second reference encoding information sequences.
In some optional implementations of the present embodiment, the decoding subunit 542 is further configured to: predict the identification information sequences of the candidate word sequences following the starting word based on the recurrent neural network for decoding and the generated encoding information sequence; calculate the probability of occurrence of each predicted identification information sequence according to the probabilities of occurrence of the identification information it includes; and select a predetermined number of identification information sequences from the predicted identification information sequences in descending order of probability of occurrence, as the identification information sequences of the expanded text.
In some optional implementations of the present embodiment, the decoding subunit 542 is further configured to: determine, according to the attention model, the weight of the generated encoding information sequence at each prediction; weight the generated encoding information sequence according to that weight; and predict the identification information sequences of the candidate word sequences following the starting word based on the recurrent neural network for decoding and the weighted encoding information sequence.
In some optional implementations of the present embodiment, the device also includes a training unit 560, configured to: form sample groups from pairs of query statements in the click logs of a search engine that correspond to the same clicked link; segment the query statements included in each sample group to obtain the segmented words; select a preset number of words from the segmented words in descending order of frequency of occurrence; allocate identification information to each selected word and store the correspondence between words and identification information; determine, according to the correspondence between words and identification information, the identification information sequence corresponding to each query statement included in each sample group; and train the text expansion model by taking the identification information sequences corresponding to the two query statements included in each sample group as the input and the output, respectively.
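The training-corpus construction from click logs can be sketched with invented log records: queries that led to the same clicked link are paired pairwise, and each ordered pair serves as one (input, output) sample for the text expansion model:

```python
from collections import defaultdict
from itertools import permutations

# Invented click-log records: (query, clicked link).
click_logs = [
    ("what seafood helps pregnant women get zinc", "link_1"),
    ("zinc-rich foods for pregnant women", "link_1"),
    ("trace elements in seafood", "link_2"),
    ("which seafood supplements zinc", "link_1"),
]

by_link = defaultdict(list)
for query, link in click_logs:
    by_link[link].append(query)

samples = []
for queries in by_link.values():
    # Every ordered pair of queries sharing a clicked link forms a sample group;
    # using ordered pairs lets each query serve as both input and output.
    samples.extend(permutations(queries, 2))
# link_1 has 3 queries -> 3 * 2 = 6 ordered pairs; link_2 contributes none
```

Each pair would then be converted to ID sequences via the stored word-to-ID correspondence before training.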
In some optional implementations of the present embodiment, the text to be expanded is generated according to query information input by a terminal; and the device also includes a push unit 570, configured to: perform a search operation based on the generated text to obtain search result information; and push the search result information to the terminal.
As can be seen from Fig. 5, the device 500 for generating text based on artificial intelligence in the present embodiment obtains the text to be expanded, segments it to obtain its word sequence, determines the corresponding identification information sequence, inputs that sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text, and generates the expanded text according to the generated identification information sequence and the correspondence between words and identification information, thereby improving the diversity of text generation. Those skilled in the art will understand that the above-mentioned first generation unit and second generation unit merely denote two different units: the first generation unit inputs the determined identification information sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text, and the second generation unit generates the expanded text according to the generated identification information sequence and the correspondence between words and identification information; "first" and "second" do not constitute a particular limitation on the generation units.
Referring now to Fig. 6, it illustrates a structural schematic diagram of a computer system 600 suitable for implementing the server of the embodiments of the application. The server shown in Fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a loudspeaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card, a modem, and the like. The communication portion 609 performs communication processing via a network such as the Internet. A driver 610 is also connected to the I/O interface 605 as needed. A detachable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 610 as needed, so that the computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium; the computer program includes program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the detachable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-mentioned functions defined in the methods of the application are performed.
It should be noted that the computer-readable medium described herein may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic memory device, or any appropriate combination of the above. In the application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. In the application, a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to: wireless, electric wire, optical cable, RF, or any appropriate combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the application. In this regard, each block in a flowchart or block diagram may represent a unit, a program segment, or a part of code, which contains one or more executable instructions for realizing the specified logic function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the accompanying drawings. For example, two successively represented blocks may in fact be executed substantially in parallel, or sometimes in the opposite order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks therein, may be realized by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the application may be realized by software or by hardware. The described units may also be set in a processor; for example, a processor may be described as including an acquisition unit, a segmentation unit, a determination unit, a first generation unit, and a second generation unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the acquisition unit may also be described as "a unit for obtaining the text to be expanded".
As another aspect, the application also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The above computer-readable medium carries one or more programs; when the one or more programs are executed by the device, the device: obtains the text to be expanded; segments the text to be expanded to obtain its word sequence; determines the identification information sequence corresponding to the word sequence according to the prestored correspondence between words and identification information; inputs the determined identification information sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text, where the text expansion model is used to characterize the correspondence between the identification information sequence of the text to be expanded and the identification information sequence of the expanded text; and generates the expanded text according to the generated identification information sequence and the correspondence between words and identification information.
The above description is only the preferred embodiments of the application and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the application is not limited to technical schemes formed by the particular combination of the above technical features; it should also cover other technical schemes formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical schemes formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the application.
Claims (16)
- 1. A method for generating text based on artificial intelligence, characterized in that the method includes:
obtaining the text to be expanded;
segmenting the text to be expanded to obtain the word sequence of the text to be expanded;
determining the identification information sequence corresponding to the word sequence according to the prestored correspondence between words and identification information;
inputting the determined identification information sequence into a pre-trained text expansion model to generate the identification information sequence of the expanded text, wherein the text expansion model is used to characterize the correspondence between the identification information sequence of the text to be expanded and the identification information sequence of the expanded text;
generating the expanded text according to the generated identification information sequence and the correspondence between the words and identification information.
- 2. The method according to claim 1, characterized in that the text expansion model includes an encoding model and a decoding model, the encoding model being used to characterize the correspondence between identification information sequences and encoding information sequences, and the decoding model being used to characterize the correspondence between the identification information of a preset starting word together with an encoding information sequence, and an identification information sequence; and
inputting the determined identification information sequence into the pre-trained text expansion model to generate the identification information sequence of the expanded text includes:
inputting the determined identification information sequence into the encoding model to generate the encoding information sequence of the text to be expanded;
inputting the generated encoding information sequence and the identification information of the starting word into the decoding model to generate the identification information sequence of the expanded text.
- 3. The method according to claim 2, characterized in that the inputting of the determined identification information sequence into the encoding model to generate the encoded information sequence of the text to be expanded comprises: inputting each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagating recurrent neural network for encoding, to generate a first reference encoded information sequence; inputting each piece of identification information in the determined identification information sequence, in reverse order, into a backward-propagating recurrent neural network for encoding, to generate a second reference encoded information sequence; and generating the encoded information sequence of the text to be expanded according to the first reference encoded information sequence and the second reference encoded information sequence.
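The bidirectional encoding of claim 3 can be sketched as two passes of a simple recurrent cell over the embedded ID sequence, one forward and one backward, with the two state sequences concatenated. The tanh cell, the shared weights, and all dimensions below are illustrative assumptions, not the patent's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
emb_dim, hid_dim, vocab = 4, 3, 10
E = rng.normal(size=(vocab, emb_dim))    # embedding table for IDs
W = rng.normal(size=(hid_dim, emb_dim))  # input weights (shared for brevity)
U = rng.normal(size=(hid_dim, hid_dim))  # recurrent weights

def rnn(ids):
    # One pass of a vanilla RNN; returns the hidden state at each step.
    h = np.zeros(hid_dim)
    states = []
    for i in ids:
        h = np.tanh(W @ E[i] + U @ h)
        states.append(h)
    return states

def encode(ids):
    fwd = rnn(ids)              # first reference encoded information sequence
    bwd = rnn(ids[::-1])[::-1]  # second one, re-aligned to the input order
    # Combine the two reference sequences into the encoded information sequence.
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

codes = encode([1, 5, 2])
print(len(codes), codes[0].shape)  # one vector per input ID, size 2*hid_dim
```

Concatenation is one common way to merge the two directions; the claim only requires that the final encoded sequence be generated "according to" both reference sequences.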
- 4. The method according to claim 2, characterized in that the inputting of the generated encoded information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text comprises: predicting, based on a recurrent neural network for decoding and the generated encoded information sequence, identification information sequences of candidate word sequences following the start word; calculating, for each predicted identification information sequence, the probability of that sequence occurring from the occurrence probabilities of the pieces of identification information it contains; and selecting a predetermined number of identification information sequences from the predicted identification information sequences, in descending order of occurrence probability, as identification information sequences of the expanded text.
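The scoring-and-selection step of claim 4 is straightforward to illustrate: each candidate ID sequence is scored by the product of the occurrence probabilities of the IDs it contains, and the top `k` sequences are kept. The candidate sequences and per-ID probabilities below are made-up examples; in practice they would come from the decoder RNN.

```python
from math import prod

# (ID sequence, per-ID occurrence probabilities) pairs, as a decoder might
# produce them for the words following the start word.
candidates = [
    ([3, 7, 1], [0.6, 0.5, 0.9]),
    ([3, 2, 4], [0.6, 0.3, 0.8]),
    ([5, 7, 1], [0.2, 0.5, 0.9]),
]

def top_sequences(cands, k):
    # Sequence probability = product of the contained IDs' probabilities.
    scored = [(seq, prod(probs)) for seq, probs in cands]
    scored.sort(key=lambda x: x[1], reverse=True)  # descending probability
    return [seq for seq, _ in scored[:k]]

print(top_sequences(candidates, 2))  # -> [[3, 7, 1], [3, 2, 4]]
```

Keeping the `k` highest-probability partial sequences at each step is the essence of beam search, which this claim describes without naming.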
- 5. The method according to claim 4, characterized in that the predicting, based on the recurrent neural network for decoding and the generated encoded information sequence, of the identification information sequences of the candidate word sequences following the start word comprises: determining, according to an attention model, a weight for the generated encoded information sequence at each prediction step; weighting the generated encoded information sequence according to the weight; and predicting, based on the recurrent neural network for decoding and the weighted encoded information sequence, the identification information sequences of the candidate word sequences following the start word.
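The per-step weighting of claim 5 can be sketched as standard attention: score each encoder state against the current decoder state, normalize the scores into weights, and form a weighted sum. The dot-product scoring function below is an illustrative assumption; the claim does not fix a particular attention model.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(decoder_state, encoder_states):
    # One weight per encoded vector, recomputed at every prediction step.
    scores = np.array([decoder_state @ h for h in encoder_states])
    weights = softmax(scores)
    # Weighted encoded information: the context fed to the decoder RNN.
    context = sum(w * h for w, h in zip(weights, encoder_states))
    return weights, context

enc = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
w, ctx = attend(np.array([2.0, 0.0]), enc)
print(w.round(3), ctx.round(3))  # weights sum to 1; context leans toward enc[0]
```

Because the weights depend on the decoder state, a different mixture of the encoded sequence is used at every prediction step, which is exactly what "determining the weight during each prediction" requires.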
- 6. The method according to claim 1, characterized in that the text expansion model is trained via the following steps: forming, pairwise, sample groups from query statements in the click logs of a search engine that correspond to the same clicked link; segmenting the query statements contained in each sample group to obtain segmented words; selecting a preset number of words from the segmented words in descending order of occurrence count; assigning identification information to each selected word, and storing the correspondence between words and identification information; determining, according to the correspondence between words and identification information, the identification information sequences corresponding to the query statements contained in each sample group; and training the text expansion model by using the identification information sequences corresponding to the two query statements contained in each sample group as input and output, respectively.
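The training-data construction of claim 6 can be sketched end to end: pair queries that led to the same clicked link, build a frequency-capped vocabulary, assign IDs, and encode each pair. The click log, the vocabulary size, and the whitespace segmenter below are toy assumptions.

```python
from collections import Counter
from itertools import combinations

# Made-up click log: (query statement, clicked link).
click_log = [
    ("cheap hotel", "example.com/a"),
    ("budget accommodation", "example.com/a"),
    ("best laptop", "example.com/b"),
    ("top notebook", "example.com/b"),
]

# 1. Group queries by clicked link, then pair them up into sample groups.
by_link = {}
for query, link in click_log:
    by_link.setdefault(link, []).append(query)
sample_groups = [pair for qs in by_link.values() for pair in combinations(qs, 2)]

# 2. Segment all queries, count words, keep the most frequent preset number.
counts = Counter(w for q, _ in click_log for w in q.split())
vocab = [w for w, _ in counts.most_common(6)]

# 3. Assign identification information (IDs) and encode each query pair;
#    one side of each pair serves as input, the other as output.
word_to_id = {w: i for i, w in enumerate(vocab)}
pairs = [([word_to_id[w] for w in src.split() if w in word_to_id],
          [word_to_id[w] for w in dst.split() if w in word_to_id])
         for src, dst in sample_groups]

print(sample_groups[0], pairs[0])
```

The intuition behind the pairing: two queries that users resolve with the same click are near-paraphrases, so they make natural (input, output) pairs for a text expansion model, with no manual labeling.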
- 7. The method according to any one of claims 1-6, characterized in that the text to be expanded is generated according to query information input by a terminal; and after generating the expanded text according to the generated identification information sequence and the correspondence between words and identification information, the method further comprises: performing a search operation based on the generated text to obtain search result information; and pushing the search result information to the terminal.
- 8. An apparatus for generating text based on artificial intelligence, characterized in that the apparatus comprises: an acquiring unit, configured to acquire a text to be expanded; a segmenting unit, configured to segment the text to be expanded to obtain a word sequence of the text to be expanded; a determining unit, configured to determine, according to a pre-stored correspondence between words and identification information, an identification information sequence corresponding to the word sequence; a first generating unit, configured to input the determined identification information sequence into a pre-trained text expansion model to generate an identification information sequence of the expanded text, wherein the text expansion model is used to characterize the correspondence between identification information sequences of texts to be expanded and identification information sequences of expanded texts; and a second generating unit, configured to generate the expanded text according to the generated identification information sequence and the correspondence between words and identification information.
- 9. The apparatus according to claim 8, characterized in that the text expansion model comprises an encoding model and a decoding model, the encoding model being used to characterize the correspondence between identification information sequences and encoded information sequences, and the decoding model being used to characterize the correspondence between, on one side, the identification information of a preset start word together with an encoded information sequence and, on the other side, an identification information sequence; and the first generating unit comprises: an encoding subunit, configured to input the determined identification information sequence into the encoding model to generate an encoded information sequence of the text to be expanded; and a decoding subunit, configured to input the generated encoded information sequence and the identification information of the start word into the decoding model to generate the identification information sequence of the expanded text.
- 10. The apparatus according to claim 9, characterized in that the encoding subunit is further configured to: input each piece of identification information in the determined identification information sequence, in forward order, into a forward-propagating recurrent neural network for encoding, to generate a first reference encoded information sequence; input each piece of identification information in the determined identification information sequence, in reverse order, into a backward-propagating recurrent neural network for encoding, to generate a second reference encoded information sequence; and generate the encoded information sequence of the text to be expanded according to the first reference encoded information sequence and the second reference encoded information sequence.
- 11. The apparatus according to claim 9, characterized in that the decoding subunit is further configured to: predict, based on a recurrent neural network for decoding and the generated encoded information sequence, identification information sequences of candidate word sequences following the start word; calculate, for each predicted identification information sequence, the probability of that sequence occurring from the occurrence probabilities of the pieces of identification information it contains; and select a predetermined number of identification information sequences from the predicted identification information sequences, in descending order of occurrence probability, as identification information sequences of the expanded text.
- 12. The apparatus according to claim 11, characterized in that the decoding subunit is further configured to: determine, according to an attention model, a weight for the generated encoded information sequence at each prediction step; weight the generated encoded information sequence according to the weight; and predict, based on the recurrent neural network for decoding and the weighted encoded information sequence, the identification information sequences of the candidate word sequences following the start word.
- 13. The apparatus according to claim 8, characterized in that the apparatus further comprises a training unit configured to: form, pairwise, sample groups from query statements in the click logs of a search engine that correspond to the same clicked link; segment the query statements contained in each sample group to obtain segmented words; select a preset number of words from the segmented words in descending order of occurrence count; assign identification information to each selected word, and store the correspondence between words and identification information; determine, according to the correspondence between words and identification information, the identification information sequences corresponding to the query statements contained in each sample group; and train the text expansion model by using the identification information sequences corresponding to the two query statements contained in each sample group as input and output, respectively.
- 14. The apparatus according to any one of claims 8-13, characterized in that the text to be expanded is generated according to query information input by a terminal; and the apparatus further comprises a pushing unit configured to: perform a search operation based on the generated text to obtain search result information; and push the search result information to the terminal.
- 15. A device, characterized by comprising: one or more processors; and a storage apparatus for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.
- 16. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710787262.0A CN107526725B (en) | 2017-09-04 | 2017-09-04 | Method and device for generating text based on artificial intelligence |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710787262.0A CN107526725B (en) | 2017-09-04 | 2017-09-04 | Method and device for generating text based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107526725A true CN107526725A (en) | 2017-12-29 |
CN107526725B CN107526725B (en) | 2021-08-24 |
Family
ID=60683533
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710787262.0A Active CN107526725B (en) | 2017-09-04 | 2017-09-04 | Method and device for generating text based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107526725B (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108509413A (en) * | 2018-03-08 | 2018-09-07 | 平安科技(深圳)有限公司 | Digest extraction method, device, computer equipment and storage medium |
CN108932326A (en) * | 2018-06-29 | 2018-12-04 | 北京百度网讯科技有限公司 | Instance extension method, device, equipment and medium |
CN109284367A (en) * | 2018-11-30 | 2019-01-29 | 北京字节跳动网络技术有限公司 | Method and apparatus for handling text |
CN109800421A (en) * | 2018-12-19 | 2019-05-24 | 武汉西山艺创文化有限公司 | Game scenario generation method and apparatus, device, and storage medium |
CN109858004A (en) * | 2019-02-12 | 2019-06-07 | 四川无声信息技术有限公司 | Text rewriting method and device and electronic equipment |
CN110162751A (en) * | 2019-05-13 | 2019-08-23 | 百度在线网络技术(北京)有限公司 | Text generator training method and text generator training system |
CN110188204A (en) * | 2019-06-11 | 2019-08-30 | 腾讯科技(深圳)有限公司 | Extended corpus mining method and device, server and storage medium |
CN110309407A (en) * | 2018-03-13 | 2019-10-08 | 优酷网络技术(北京)有限公司 | Viewpoint extracting method and device |
CN110362810A (en) * | 2018-03-26 | 2019-10-22 | 优酷网络技术(北京)有限公司 | Text analyzing method and device |
CN110362809A (en) * | 2018-03-26 | 2019-10-22 | 优酷网络技术(北京)有限公司 | Text analyzing method and device |
CN110362808A (en) * | 2018-03-26 | 2019-10-22 | 优酷网络技术(北京)有限公司 | Text analyzing method and device |
CN110555104A (en) * | 2018-03-26 | 2019-12-10 | 优酷网络技术(北京)有限公司 | text analysis method and device |
CN110851673A (en) * | 2019-11-12 | 2020-02-28 | 西南科技大学 | Improved cluster searching strategy and question-answering system |
CN110852093A (en) * | 2018-07-26 | 2020-02-28 | 腾讯科技(深圳)有限公司 | Text information generation method and device, computer equipment and storage medium |
CN110874771A (en) * | 2018-08-29 | 2020-03-10 | 北京京东尚科信息技术有限公司 | Method and device for matching commodities |
CN111209725A (en) * | 2018-11-19 | 2020-05-29 | 阿里巴巴集团控股有限公司 | Text information generation method and device and computing equipment |
CN111783422A (en) * | 2020-06-24 | 2020-10-16 | 北京字节跳动网络技术有限公司 | Text sequence generation method, device, equipment and medium |
CN111859888A (en) * | 2020-07-22 | 2020-10-30 | 北京致医健康信息技术有限公司 | Diagnosis assisting method and device, electronic equipment and storage medium |
US11069346B2 (en) | 2019-04-22 | 2021-07-20 | International Business Machines Corporation | Intent recognition model creation from randomized intent vector proximities |
CN113392639A (en) * | 2020-09-30 | 2021-09-14 | 腾讯科技(深圳)有限公司 | Title generation method and device based on artificial intelligence and server |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090292693A1 (en) * | 2008-05-26 | 2009-11-26 | International Business Machines Corporation | Text searching method and device and text processor |
CN106407381A (en) * | 2016-09-13 | 2017-02-15 | 北京百度网讯科技有限公司 | Method and device for pushing information based on artificial intelligence |
CN106503255A (en) * | 2016-11-15 | 2017-03-15 | 科大讯飞股份有限公司 | Method and system for automatically generating an article based on description text |
CN106919702A (en) * | 2017-02-14 | 2017-07-04 | 北京时间股份有限公司 | Keyword method for pushing and device based on document |
CN106980683A (en) * | 2017-03-30 | 2017-07-25 | 中国科学技术大学苏州研究院 | Blog text snippet generation method based on deep learning |
- 2017-09-04: CN201710787262.0A patent/CN107526725B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090292693A1 (en) * | 2008-05-26 | 2009-11-26 | International Business Machines Corporation | Text searching method and device and text processor |
CN101593179A (en) * | 2008-05-26 | 2009-12-02 | 国际商业机器公司 | Document search method and device and document processor |
CN106407381A (en) * | 2016-09-13 | 2017-02-15 | 北京百度网讯科技有限公司 | Method and device for pushing information based on artificial intelligence |
CN106503255A (en) * | 2016-11-15 | 2017-03-15 | 科大讯飞股份有限公司 | Method and system for automatically generating an article based on description text |
CN106919702A (en) * | 2017-02-14 | 2017-07-04 | 北京时间股份有限公司 | Keyword method for pushing and device based on document |
CN106980683A (en) * | 2017-03-30 | 2017-07-25 | 中国科学技术大学苏州研究院 | Blog text snippet generation method based on deep learning |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2020520492A (en) * | 2018-03-08 | 2020-07-09 | 平安科技(深▲せん▼)有限公司Ping An Technology (Shenzhen) Co.,Ltd. | Document abstract automatic extraction method, device, computer device and storage medium |
CN108509413A (en) * | 2018-03-08 | 2018-09-07 | 平安科技(深圳)有限公司 | Digest extraction method, device, computer equipment and storage medium |
CN110309407A (en) * | 2018-03-13 | 2019-10-08 | 优酷网络技术(北京)有限公司 | Viewpoint extracting method and device |
CN110555104A (en) * | 2018-03-26 | 2019-12-10 | 优酷网络技术(北京)有限公司 | text analysis method and device |
CN110362810A (en) * | 2018-03-26 | 2019-10-22 | 优酷网络技术(北京)有限公司 | Text analyzing method and device |
CN110362809A (en) * | 2018-03-26 | 2019-10-22 | 优酷网络技术(北京)有限公司 | Text analyzing method and device |
CN110362808A (en) * | 2018-03-26 | 2019-10-22 | 优酷网络技术(北京)有限公司 | Text analyzing method and device |
CN108932326A (en) * | 2018-06-29 | 2018-12-04 | 北京百度网讯科技有限公司 | Instance extension method, device, equipment and medium |
CN108932326B (en) * | 2018-06-29 | 2021-02-19 | 北京百度网讯科技有限公司 | Instance extension method, device, equipment and medium |
CN110852093B (en) * | 2018-07-26 | 2023-05-16 | 腾讯科技(深圳)有限公司 | Poem generation method, device, computer equipment and storage medium |
CN110852093A (en) * | 2018-07-26 | 2020-02-28 | 腾讯科技(深圳)有限公司 | Text information generation method and device, computer equipment and storage medium |
CN110874771A (en) * | 2018-08-29 | 2020-03-10 | 北京京东尚科信息技术有限公司 | Method and device for matching commodities |
CN111209725A (en) * | 2018-11-19 | 2020-05-29 | 阿里巴巴集团控股有限公司 | Text information generation method and device and computing equipment |
CN111209725B (en) * | 2018-11-19 | 2023-04-25 | 阿里巴巴集团控股有限公司 | Text information generation method and device and computing equipment |
CN109284367A (en) * | 2018-11-30 | 2019-01-29 | 北京字节跳动网络技术有限公司 | Method and apparatus for handling text |
CN109800421A (en) * | 2018-12-19 | 2019-05-24 | 武汉西山艺创文化有限公司 | Game scenario generation method and apparatus, device, and storage medium |
CN109858004B (en) * | 2019-02-12 | 2023-08-01 | 四川无声信息技术有限公司 | Text rewriting method and device and electronic equipment |
CN109858004A (en) * | 2019-02-12 | 2019-06-07 | 四川无声信息技术有限公司 | Text Improvement, device and electronic equipment |
US11069346B2 (en) | 2019-04-22 | 2021-07-20 | International Business Machines Corporation | Intent recognition model creation from randomized intent vector proximities |
US11521602B2 (en) | 2019-04-22 | 2022-12-06 | International Business Machines Corporation | Intent recognition model creation from randomized intent vector proximities |
CN110162751A (en) * | 2019-05-13 | 2019-08-23 | 百度在线网络技术(北京)有限公司 | Text generator training method and text generator training system |
CN110188204B (en) * | 2019-06-11 | 2022-10-04 | 腾讯科技(深圳)有限公司 | Extended corpus mining method and device, server and storage medium |
CN110188204A (en) * | 2019-06-11 | 2019-08-30 | 腾讯科技(深圳)有限公司 | Extended corpus mining method and device, server and storage medium |
CN110851673A (en) * | 2019-11-12 | 2020-02-28 | 西南科技大学 | Improved cluster searching strategy and question-answering system |
CN111783422B (en) * | 2020-06-24 | 2022-03-04 | 北京字节跳动网络技术有限公司 | Text sequence generation method, device, equipment and medium |
CN111783422A (en) * | 2020-06-24 | 2020-10-16 | 北京字节跳动网络技术有限公司 | Text sequence generation method, device, equipment and medium |
US11669679B2 (en) | 2020-06-24 | 2023-06-06 | Beijing Byledance Network Technology Co., Ltd. | Text sequence generating method and apparatus, device and medium |
CN111859888A (en) * | 2020-07-22 | 2020-10-30 | 北京致医健康信息技术有限公司 | Diagnosis assisting method and device, electronic equipment and storage medium |
CN111859888B (en) * | 2020-07-22 | 2024-04-02 | 北京致医健康信息技术有限公司 | Diagnosis assisting method, diagnosis assisting device, electronic equipment and storage medium |
CN113392639A (en) * | 2020-09-30 | 2021-09-14 | 腾讯科技(深圳)有限公司 | Title generation method and device based on artificial intelligence and server |
CN113392639B (en) * | 2020-09-30 | 2023-09-26 | 腾讯科技(深圳)有限公司 | Title generation method, device and server based on artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
CN107526725B (en) | 2021-08-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107526725A (en) | Method and apparatus for generating text based on artificial intelligence | |
CN110796190B (en) | Exponential modeling with deep learning features | |
CN110083705B (en) | Multi-hop attention depth model, method, storage medium and terminal for target emotion classification | |
CN107705784B (en) | Text regularization model training method and device, and text regularization method and device | |
CN107273503A (en) | Method and apparatus for generating parallel text in the same language | |
CN109906460A (en) | Dynamic cooperation attention network for question and answer | |
CN107577737A (en) | Method and apparatus for pushing information | |
CN109885756B (en) | CNN and RNN-based serialization recommendation method | |
CN110766142A (en) | Model generation method and device | |
CN107680580A (en) | Text transformation model training method and device, text conversion method and device | |
CN116415654A (en) | Data processing method and related equipment | |
CN110348535A (en) | Visual question answering model training method and device | |
CN114358203B (en) | Training method and device for image description sentence generation module and electronic equipment | |
CN110162766B (en) | Word vector updating method and device | |
CN106682387A (en) | Method and device used for outputting information | |
US11423307B2 (en) | Taxonomy construction via graph-based cross-domain knowledge transfer | |
CN107943895A (en) | Information-pushing method and device | |
CN107832300A (en) | Text summary generation method and device for the minimally invasive medical field | |
CN109710953A (en) | Translation method and device, computing device, storage medium and chip | |
CN106407381A (en) | Method and device for pushing information based on artificial intelligence | |
CN109710760A (en) | Short text clustering method, device, medium and electronic equipment | |
CN113377914A (en) | Recommended text generation method and device, electronic equipment and computer readable medium | |
CN111046757A (en) | Training method and device for face portrait generation model and related equipment | |
CN113850012B (en) | Data processing model generation method, device, medium and electronic equipment | |
Xu et al. | CNN-based skip-gram method for improving classification accuracy of chinese text |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||