CN107273503A - Method and apparatus for generating same-language parallel text - Google Patents
- Publication number
- CN107273503A CN107273503A CN201710464118.3A CN201710464118A CN107273503A CN 107273503 A CN107273503 A CN 107273503A CN 201710464118 A CN201710464118 A CN 201710464118A CN 107273503 A CN107273503 A CN 107273503A
- Authority
- CN
- China
- Prior art keywords
- word sequence
- sequence
- term vector
- vector
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3322—Query formulation using system suggestions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/55—Rule-based translation
- G06F40/56—Natural language generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a method and apparatus for generating same-language parallel text. One embodiment of the method includes: obtaining a source word-segment sequence and a pre-trained word vector table; determining, according to the word vector table, a source word vector sequence corresponding to the source word-segment sequence; importing the source word vector sequence into a pre-trained first recurrent neural network model to generate an intermediate vector of a preset dimension that characterizes the semantics of the source word-segment sequence; importing the intermediate vector into a pre-trained second recurrent neural network model to generate a target word vector sequence corresponding to the intermediate vector; and determining, according to the word vector table, a target word-segment sequence corresponding to the target word vector sequence, and determining the target word-segment sequence as the same-language parallel text corresponding to the source word-segment sequence. This embodiment reduces the algorithmic complexity of generating same-language parallel text and reduces the required storage space.
Description
Technical field
The present application relates to the field of computer technology, specifically to the field of Internet technology, and more particularly to a method and apparatus for generating same-language parallel text.
Background

Artificial Intelligence (AI) is a new technological science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. As a branch of computer science, it attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Natural language processing is an important direction within computer science and the field of artificial intelligence. It studies theories and methods that enable effective communication between humans and computers using natural language. Generating, for a given text, a similar same-language parallel text, that is, a text in the same language with the same meaning, is an important component of natural language processing. Same-language parallel text has many application scenarios. As an example, when a search engine retrieves results for a query statement (query) entered by a user, the randomness of user input often makes retrieval with the raw query ineffective; to obtain better retrieval results, a same-language parallel text is usually generated for the query statement, and the generated same-language parallel text is used for retrieval instead.

However, at present, when generating the same-language parallel text of a text, a replacement dictionary is typically generated in advance from a parallel corpus using a statistical alignment algorithm or a rule-based alignment algorithm; then, according to prior knowledge and the replacement dictionary, the same-language parallel text after replacement is generated. In this existing method of generating same-language parallel text, the alignment algorithm is complex, considerable manual intervention is required, the accuracy of the generated replacement dictionary is low, and the replacement dictionary must be stored, typically occupying several gigabytes, so the required storage space is large.
Summary of the invention

The purpose of the present application is to propose an improved method and apparatus for generating same-language parallel text, to solve the technical problems mentioned in the background section above.
In a first aspect, an embodiment of the present application provides a method for generating same-language parallel text, the method including: obtaining a source word-segment sequence and a pre-trained word vector table, where the word vector table is used to characterize the correspondence between words and word vectors; determining, according to the word vector table, a source word vector sequence corresponding to the source word-segment sequence; importing the source word vector sequence into a pre-trained first recurrent neural network model to generate an intermediate vector of a preset dimension that characterizes the semantics of the source word-segment sequence, where the first recurrent neural network model is used to characterize the correspondence between word vector sequences and vectors of the preset dimension; importing the intermediate vector into a pre-trained second recurrent neural network model to generate a target word vector sequence corresponding to the intermediate vector, where the second recurrent neural network model is used to characterize the correspondence between vectors of the preset dimension and word vector sequences; and determining, according to the word vector table, a target word-segment sequence corresponding to the target word vector sequence, and determining the target word-segment sequence as the same-language parallel text corresponding to the source word-segment sequence.
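The five operations above form an encoder-decoder (sequence-to-sequence) pipeline. The following is a minimal sketch of that pipeline; the Elman-style recurrent cells, the four-word vocabulary, and all weights are toy assumptions standing in for the patent's pre-trained models, not the actual trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["book", "ticket", "train", "through"]           # toy word vector table
emb = {w: rng.standard_normal(5) for w in vocab}         # word -> 5-dim word vector
D = 8                                                    # the "preset dimension"

# Assumed toy weights standing in for the two pre-trained RNN models.
W_enc_x, W_enc_h = rng.standard_normal((D, 5)), rng.standard_normal((D, D))
W_dec_h, W_dec_o = rng.standard_normal((D, D)), rng.standard_normal((5, D))

def encode(word_vecs):
    """First RNN: word vector sequence -> intermediate vector of dimension D."""
    h = np.zeros(D)
    for x in word_vecs:
        h = np.tanh(W_enc_x @ x + W_enc_h @ h)
    return h

def decode(h, steps):
    """Second RNN: intermediate vector -> target word vector sequence."""
    out = []
    for _ in range(steps):
        h = np.tanh(W_dec_h @ h)
        out.append(W_dec_o @ h)
    return out

def nearest_word(v):
    """Map a target word vector back to the most similar vocabulary word."""
    sim = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(vocab, key=lambda w: sim(emb[w], v))

source = ["book", "through", "train"]                    # source word-segment sequence
src_vecs = [emb[w] for w in source]                      # lookup in word vector table
target = [nearest_word(v) for v in decode(encode(src_vecs), steps=3)]
print(target)                                            # the generated word-segment sequence
```

With untrained random weights the output sequence is arbitrary; the point is the data flow (lookup, encode, decode, nearest-word mapping), which matches the five steps of the first aspect.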
In some embodiments, before obtaining the source word-segment sequence and the pre-trained word vector table, the method further includes: receiving a query request sent by a user using a terminal, the query request including a query statement; preprocessing the query statement to obtain the word-segment sequence of the query statement, the preprocessing including word segmentation and removal of special characters; and determining the resulting word-segment sequence as the source word-segment sequence.
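The preprocessing in this embodiment (segmentation plus removal of special characters) might look like the sketch below. The whitespace tokenizer is an assumption made for illustration; the patent targets text where a dedicated word segmenter would normally be used.

```python
import re

def preprocess(query: str) -> list[str]:
    """Split a query statement into word segments, dropping special characters."""
    cleaned = re.sub(r"[^\w\s]", " ", query)   # remove punctuation/special characters
    return cleaned.split()                     # naive whitespace word segmentation

# The resulting list would be used as the source word-segment sequence.
print(preprocess("through-train, ticket booking!"))
# → ['through', 'train', 'ticket', 'booking']
```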
In some embodiments, after determining the target word-segment sequence as the same-language parallel text corresponding to the source word-segment sequence, the method further includes: performing a search according to the same-language parallel text to obtain search results; and sending the search results to the terminal.
In some embodiments, before obtaining the source word-segment sequence and the pre-trained word vector table, the method further includes a training step, the training step including: obtaining at least one pair of same-language parallel word-segment sequences, where each pair includes a first word-segment sequence and a second word-segment sequence that are in the same language and have the same meaning; obtaining a preset word vector table, a preset first recurrent neural network model, and a preset second recurrent neural network model; for each pair of same-language parallel word-segment sequences in the at least one pair: determining, according to the preset word vector table, the first word vector sequence corresponding to the first word-segment sequence of the pair; importing the first word vector sequence into the preset first recurrent neural network model to obtain a vector of the preset dimension corresponding to the first word vector sequence; importing the obtained vector into the preset second recurrent neural network model to obtain a second word vector sequence corresponding to the obtained vector; determining, according to the preset word vector table, the word sequence corresponding to the second word vector sequence; and adjusting the preset word vector table, the preset first recurrent neural network model, and the preset second recurrent neural network model according to the difference information between the obtained word sequence and the second word-segment sequence of the pair; and determining the adjusted preset word vector table, preset first recurrent neural network model, and preset second recurrent neural network model as the trained word vector table, first recurrent neural network model, and second recurrent neural network model.
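The adjustment from "difference information" is, in modern terms, a gradient step on a reconstruction loss. The sketch below uses linear maps as stand-ins for the two recurrent models so the gradient stays short and exact; this is an assumed simplification, since a real implementation would backpropagate through the RNNs, and all data and dimensions here are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
d_emb, d_mid, T = 4, 6, 3                 # embedding dim, preset dim, sequence length

# Assumed toy training pair: vectors of the first sequence and of the
# reference second sequence of one same-language parallel pair.
first = rng.standard_normal((T, d_emb))
second_ref = rng.standard_normal((T, d_emb))

# Linear stand-ins for the preset first/second models (real ones are RNNs).
W1 = rng.standard_normal((d_mid, d_emb)) * 0.1       # encoder: pooled input -> mid
W2 = rng.standard_normal((T * d_emb, d_mid)) * 0.1   # decoder: mid -> output sequence

def forward():
    mid = W1 @ first.mean(axis=0)                    # vector of the preset dimension
    return mid, (W2 @ mid).reshape(T, d_emb)         # second word vector sequence

def loss(out):
    return 0.5 * np.sum((out - second_ref) ** 2)     # "difference information"

mid, out = forward()
before = loss(out)
# Adjust both preset models from the difference (one exact gradient step).
g_out = (out - second_ref).reshape(-1)               # dL/d(output)
g_mid = W2.T @ g_out                                 # dL/d(mid), before updating W2
W2 -= 0.01 * np.outer(g_out, mid)
W1 -= 0.01 * np.outer(g_mid, first.mean(axis=0))
after = loss(forward()[1])
print(before > after)                                # the adjustment reduced the loss
```

Iterating this adjustment over all pairs, and including the word vector table among the trained variables, gives the training step described above.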
In some embodiments, the first recurrent neural network model and the second recurrent neural network model are both time-recurrent neural network models.
In some embodiments, determining the source word vector sequence corresponding to the source word-segment sequence according to the word vector table includes: for each word segment in the source word-segment sequence, looking up in the word vector table the word vector matching that word segment, and determining the found word vector as the source word vector at the position in the source word vector sequence identical to the position of that word segment in the source word-segment sequence.
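This position-preserving lookup is a direct table access; the three-entry table and its 3-dimensional vectors below are made-up illustrative values.

```python
# Word vector table: word -> word vector (values invented for illustration).
table = {
    "rice":    [0.2, -0.1, 0.7],
    "protein": [0.5,  0.3, -0.2],
    "contain": [-0.4, 0.6, 0.1],
}

source_segments = ["rice", "contain", "protein"]
# The source word vector at position i matches the word segment at position i.
source_vectors = [table[w] for w in source_segments]
print(source_vectors[1])  # → [-0.4, 0.6, 0.1], the vector for "contain"
```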
In some embodiments, determining the target word-segment sequence corresponding to the target word vector sequence according to the word vector table includes: for each target word vector in the target word vector sequence, selecting from the word vector table the word corresponding to the word vector with the highest similarity to that target word vector, and determining the selected word as the target word segment at the position in the target word-segment sequence identical to the position of that target word vector in the target word vector sequence.
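The highest-similarity selection can be sketched with cosine similarity; the patent does not fix a similarity measure, so cosine, the tiny table, and the decoder outputs below are all assumptions for illustration.

```python
import math

table = {                         # illustrative word vector table
    "ticket": [1.0, 0.0],
    "train":  [0.0, 1.0],
    "book":   [0.7, 0.7],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest_word(target_vec):
    """Word whose table vector has the highest similarity to target_vec."""
    return max(table, key=lambda w: cosine(table[w], target_vec))

target_vectors = [[0.9, 0.1], [0.1, 0.95]]          # assumed second-RNN outputs
print([nearest_word(v) for v in target_vectors])    # → ['ticket', 'train']
```

Each selected word keeps the position of the vector that produced it, yielding the target word-segment sequence in order.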
In a second aspect, an embodiment of the present application provides an apparatus for generating same-language parallel text, the apparatus including: an acquisition unit configured to obtain a source word-segment sequence and a pre-trained word vector table, where the word vector table is used to characterize the correspondence between words and word vectors; a first determining unit configured to determine, according to the word vector table, a source word vector sequence corresponding to the source word-segment sequence; a first generation unit configured to import the source word vector sequence into a pre-trained first recurrent neural network model and generate an intermediate vector of a preset dimension that characterizes the semantics of the source word-segment sequence, where the first recurrent neural network model is used to characterize the correspondence between word vector sequences and vectors of the preset dimension; a second generation unit configured to import the intermediate vector into a pre-trained second recurrent neural network model and generate a target word vector sequence corresponding to the intermediate vector, where the second recurrent neural network model is used to characterize the correspondence between vectors of the preset dimension and word vector sequences; and a second determining unit configured to determine, according to the word vector table, a target word-segment sequence corresponding to the target word vector sequence, and to determine the target word-segment sequence as the same-language parallel text corresponding to the source word-segment sequence.
In some embodiments, the apparatus further includes: a receiving unit configured to receive a query request sent by a user using a terminal, the query request including a query statement; a preprocessing unit configured to preprocess the query statement to obtain the word-segment sequence of the query statement, the preprocessing including word segmentation and removal of special characters; and a third determining unit configured to determine the resulting word-segment sequence as the source word-segment sequence.
In some embodiments, the apparatus further includes: a search unit configured to perform a search according to the same-language parallel text and obtain search results; and a sending unit configured to send the search results to the terminal.
In some embodiments, the apparatus further includes a training unit, the training unit including: a first acquisition module configured to obtain at least one pair of same-language parallel word-segment sequences, where each pair includes a first word-segment sequence and a second word-segment sequence that are in the same language and have the same meaning; a second acquisition module configured to obtain a preset word vector table, a preset first recurrent neural network model, and a preset second recurrent neural network model; an adjustment module configured to, for each pair of same-language parallel word-segment sequences in the at least one pair: determine, according to the preset word vector table, the first word vector sequence corresponding to the first word-segment sequence of the pair; import the first word vector sequence into the preset first recurrent neural network model to obtain a vector of the preset dimension corresponding to the first word vector sequence; import the obtained vector into the preset second recurrent neural network model to obtain a second word vector sequence corresponding to the obtained vector; determine, according to the preset word vector table, the word sequence corresponding to the second word vector sequence; and adjust the preset word vector table, the preset first recurrent neural network model, and the preset second recurrent neural network model according to the difference information between the obtained word sequence and the second word-segment sequence of the pair; and a determining module configured to determine the adjusted preset word vector table, preset first recurrent neural network model, and preset second recurrent neural network model as the trained word vector table, first recurrent neural network model, and second recurrent neural network model.
In some embodiments, the first recurrent neural network model and the second recurrent neural network model are both time-recurrent neural network models.
In some embodiments, the first determining unit is further configured to: for each word segment in the source word-segment sequence, look up in the word vector table the word vector matching that word segment, and determine the found word vector as the source word vector at the position in the source word vector sequence identical to the position of that word segment in the source word-segment sequence.
In some embodiments, the second determining unit is further configured to: for each target word vector in the target word vector sequence, select from the word vector table the word corresponding to the word vector with the highest similarity to that target word vector, and determine the selected word as the target word segment at the position in the target word-segment sequence identical to the position of that target word vector in the target word vector sequence.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method described in any implementation of the first aspect.
The method and apparatus for generating same-language parallel text provided by the embodiments of the present application first determine, according to a word vector table, the source word vector sequence corresponding to the source word-segment sequence; then import the source word vector sequence into a pre-trained first recurrent neural network model to generate an intermediate vector of a preset dimension that characterizes the semantics of the source word-segment sequence; then import the intermediate vector into a pre-trained second recurrent neural network model to generate a target word vector sequence corresponding to the intermediate vector; and finally determine, according to the word vector table, the target word-segment sequence corresponding to the target word vector sequence and determine it as the same-language parallel text corresponding to the source word-segment sequence. The generation process thus requires no manual intervention, which reduces the algorithmic complexity of generating same-language parallel text. Moreover, there is no need to store a replacement dictionary occupying considerable space (usually several gigabytes); only the word vector table, the parameters of the first recurrent neural network model, and the parameters of the second recurrent neural network model need to be stored (together occupying roughly tens of megabytes), so the required storage space is reduced.
Brief description of the drawings

Other features, objects, and advantages of the present application will become more apparent from the detailed description of non-limiting embodiments made with reference to the following drawings:

Fig. 1 is an exemplary system architecture diagram to which the present application can be applied;

Fig. 2 is a flowchart of one embodiment of the method for generating same-language parallel text according to the present application;

Fig. 3 is a schematic diagram of an application scenario of the method for generating same-language parallel text according to the present application;

Fig. 4 is a flowchart of another embodiment of the method for generating same-language parallel text according to the present application;

Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for generating same-language parallel text according to the present application;

Fig. 6 is a structural schematic diagram of a computer system adapted to implement the electronic device of the embodiments of the present application.
Detailed description of the embodiments

The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention, not to limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.

It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method or apparatus for generating same-language parallel text of the present application can be applied.

As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, shopping applications, search applications, instant messaging tools, mailbox clients, and social platform software.

The terminal devices 101, 102, 103 may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, laptop portable computers, and desktop computers.

The server 105 may be a server providing various services, for example, a backend search server providing support for search websites displayed on the terminal devices 101, 102, 103. The backend search server may analyze and otherwise process received data such as search requests, and feed the processing results (such as web page link data) back to the terminal devices.

It should be noted that the method for generating same-language parallel text provided by the embodiments of the present application is generally performed by the server 105, and correspondingly, the apparatus for generating same-language parallel text is generally arranged in the server 105. In some cases, the method for generating same-language parallel text provided by the embodiments of the present application may also dispense with the terminal devices 101, 102, 103 and be performed by the server 105 alone; in this case, the server 105 may be a server with server functions, or a general electronic device without server functions but with computing capability.

It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating same-language parallel text according to the present application is shown. The method for generating same-language parallel text includes the following steps:

Step 201: obtain a source word-segment sequence and a pre-trained word vector table.

In this embodiment, the electronic device on which the method for generating same-language parallel text runs (such as the server shown in Fig. 1) may obtain the source word-segment sequence and the pre-trained word vector table locally, or remotely from other electronic devices connected to it via a network.

In this embodiment, a word segment refers to a word or phrase containing no special characters or punctuation marks. A word-segment sequence is a sequence composed of at least one word segment arranged in order. The source word-segment sequence may be a word-segment sequence, stored in advance locally on the electronic device, of the text for which same-language parallel text is to be generated. The source word-segment sequence may also be the word-segment sequence of a text specified by a user for which same-language parallel text is to be generated. The source word-segment sequence may also be the word-segment sequence of such a text received by the electronic device from other electronic devices connected to it via a network (for example, the terminal devices shown in Fig. 1). A parallel text of a text is a text semantically similar to that text; a same-language parallel text of a text is a text that is in the same language as that text and semantically similar to it. For example, "ordering a through train" is a same-language parallel text of "through-train ticket booking", and "does rice contain protein" is a same-language parallel text of "rice, with or without protein".
In this embodiment, the word vector table is used to map words or phrases to real-valued vectors; the mapped real-valued vector is the word vector. Using a word vector table, features of natural language can be reduced from a high-dimensional space the size of the vocabulary to a relatively low-dimensional space. The principle for evaluating a word vector table is that the similarity between the word vectors of two semantically similar words should be high, and otherwise lower. As an example, a word vector may be a real-valued vector expressed using a distributed representation (or distributional representation). The word vector table here may be trained in advance. For example, one record in the word vector table may be the word "Beijing" with the corresponding word vector "-0.1654, 0.8764, 0.5364, -0.6354, 0.1645"; the word vector here has 5 dimensions, but in practical applications it may have any number of dimensions, which the present application does not specifically limit.

It should be noted that how to train a word vector table is a widely studied and applied prior art and will not be repeated here.
As an example, a sentence library containing a large number of sentences, together with the words contained in each sentence, may be obtained first. Then, for each word in the word library, the sentences in the sentence library that contain the word are obtained, and from those sentences the context words adjacent to the word are obtained; the word vector of each word is then computed on the principle of maximizing the sum of the degrees of association between each word and its context words.
As an example, the preset type of each sentence in the sentence library to which each word to be analyzed belongs may also be obtained, yielding a type set corresponding to each word to be analyzed; the word vector of each word to be analyzed is set as a training variable; a model for computing the sum of the degrees of association between the words to be analyzed is built from the type sets and word vectors of the words to be analyzed and used as the training model; and according to the training model, on the principle of maximizing the sum of the degrees of association, the training variables are trained to obtain the word vector of each word to be analyzed.
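The first training approach above, maximizing the sum of the degrees of association between each word and its adjacent context words, can be sketched as a minimal gradient-ascent loop. This is an illustrative assumption, not the patent's exact procedure: the tiny corpus, the sigmoid-of-dot-product association measure, and all hyperparameters are invented for the sketch, and a real system would use word2vec-style training with negative sampling over a large sentence library.

```python
import math
import random

def train_word_vectors(sentences, dim=5, lr=0.1, epochs=50, seed=0):
    """Sketch: nudge each word's vector toward the vectors of its adjacent
    context words, so the summed degree of association (here, the sigmoid
    of the dot product) grows over training."""
    rng = random.Random(seed)
    vocab = sorted({w for s in sentences for w in s})
    vec = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in vocab}
    for _ in range(epochs):
        for s in sentences:
            for i, w in enumerate(s):
                for j in (i - 1, i + 1):          # adjacent context words only
                    if 0 <= j < len(s):
                        c = s[j]
                        dot = sum(a * b for a, b in zip(vec[w], vec[c]))
                        # gradient ascent on log sigmoid(dot(vec[w], vec[c]))
                        g = lr * (1.0 - 1.0 / (1.0 + math.exp(-dot)))
                        vec[w] = [a + g * b for a, b in zip(vec[w], vec[c])]
    return vec

corpus = [["ordering", "through", "train"],
          ["through", "train", "ticket", "booking"]]
vectors = train_word_vectors(corpus)
```

The result is a word vector table in the sense used above: one trained real-valued vector of a fixed (here 5-dimensional) size per vocabulary word.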
Step 202: determine, according to the word vector table, a source word vector sequence corresponding to the source segmented word sequence.
In the present embodiment, according to the word vector table obtained in step 201, the above electronic device (for example, the server shown in Fig. 1) may determine the source word vector sequence corresponding to the source segmented word sequence acquired in step 201. Here, the source word vector sequence is the word vector sequence used to generate the same-language parallel text of the source segmented word sequence. The source word vector sequence consists of at least one source word vector arranged in order. Each source word vector in the source word vector sequence corresponds one-to-one to a source word segment in the source segmented word sequence, and each source word vector is obtained by looking up the corresponding source word segment in the word vector table.
In some optional implementations of the present embodiment, step 202 may be carried out as follows: for each word segment in the source segmented word sequence, the word vector matching the word segment is looked up in the word vector table, and the word vector found is taken as the source word vector at the position in the source word vector sequence matching the position of the word segment in the source segmented word sequence.
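A minimal sketch of this position-preserving lookup. The table entries below are invented for illustration (in the style of the "Beijing" example above); a real table is trained in advance and covers the full vocabulary, typically with an out-of-vocabulary fallback.

```python
# Hypothetical pre-trained 5-dimensional word vector table.
WORD_VECTOR_TABLE = {
    "ordering":      [-0.1654, 0.8764, 0.5364, -0.6354, 0.1645],
    "through train": [0.2231, -0.4412, 0.7150, 0.0198, -0.3307],
}

def to_vector_sequence(segmented_words, table):
    """Step 202 sketch: map each word segment to its word vector, keeping
    the one-to-one positional correspondence between the two sequences."""
    return [table[w] for w in segmented_words]

source_vectors = to_vector_sequence(["ordering", "through train"],
                                    WORD_VECTOR_TABLE)
```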
Step 203: import the source word vector sequence into a first recurrent neural network model trained in advance, and generate an intermediate vector of a preset dimensionality characterizing the semantics of the source segmented word sequence.
In the present embodiment, the electronic device on which the method for generating same-language parallel text runs may import the source word vector sequence into the first recurrent neural network model trained in advance, generating an intermediate vector of the preset dimensionality characterizing the semantics of the source segmented word sequence. Here, the first recurrent neural network model characterizes the correspondence between word vector sequences and vectors of the preset dimensionality.
In practice, recurrent neural network (RNN) models differ from traditional feed-forward neural networks (FNNs): RNNs introduce directed cycles and can handle problems in which the inputs are related to one another over time. In a traditional neural network model, signals flow from the input layer to the hidden layer and then to the output layer; adjacent layers are fully connected, but the nodes within each layer are unconnected. Such an ordinary neural network is helpless for many sequence-related problems. In a recurrent neural network model, by contrast, the current output of a sequence also depends on the outputs that came before. Concretely, the network memorizes earlier information and applies it to the computation of the current output: the nodes of the hidden layer are now connected to one another, and the input to the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step. In theory, a recurrent neural network can process sequence data of any length; in practice, however, to reduce complexity it is often assumed that the current state is related only to the several preceding states.
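The hidden-layer recurrence described above can be sketched as a single vanilla-RNN step. The tanh activation, the 2-dimensional shapes, and the fixed weights below are illustrative assumptions; the point is only that the new hidden state mixes the current input with the previous hidden state, so earlier inputs influence the current output.

```python
import math

def rnn_step(x_t, h_prev, W_in, W_hid, b):
    """One step of a vanilla RNN: h_t = tanh(W_in·x_t + W_hid·h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_in[i][j] * x_t[j] for j in range(len(x_t)))
        s += sum(W_hid[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(s))
    return h_t

# Toy 2-dimensional weights, for illustration only.
W_IN = [[0.1, 0.2], [0.3, 0.4]]
W_HID = [[0.5, 0.0], [0.0, 0.5]]
B = [0.0, 0.0]

h1 = rnn_step([1.0, 1.0], [0.0, 0.0], W_IN, W_HID, B)
h2 = rnn_step([1.0, 1.0], h1, W_IN, W_HID, B)   # same input, new state
```

Feeding the same input twice yields different hidden states precisely because `h_prev` carries the memory of the earlier step.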
As an example, the first recurrent neural network model may be obtained by training as follows: a large number of word vector sequences and the corresponding vectors of the preset dimensionality are used as training data; an arbitrary nonlinear activation function (for example, the sigmoid function, the softplus function or the bipolar sigmoid function) is used as the neuron activation function of a preset first recurrent neural network model; the input word vector sequences are computed on, with the vector of the preset dimensionality corresponding to each input word vector sequence as the output; and the initial first recurrent neural network model is trained in this way.
Step 204: import the intermediate vector into a second recurrent neural network model trained in advance, and generate a target word vector sequence corresponding to the intermediate vector.
In the present embodiment, the above electronic device may import the intermediate vector generated in step 203 into the second recurrent neural network model trained in advance, generating the target word vector sequence corresponding to the intermediate vector. Here, the second recurrent neural network model characterizes the correspondence between vectors of the preset dimensionality and word vector sequences. The target word vector sequence consists of at least one target word vector arranged in order; the number of target word vectors in the target word vector sequence may be the same as, or different from, the number of source word vectors in the source word vector sequence. That is, the number of target word vectors in the target word vector sequence is not fixed.
As an example, the second recurrent neural network model may be obtained by training as follows: a large number of vectors of the preset dimensionality and the corresponding word vector sequences are used as training data; an arbitrary nonlinear activation function (for example, the sigmoid function, the softplus function or the bipolar sigmoid function) is used as the neuron activation function of a preset second recurrent neural network model; the input vectors of the preset dimensionality are computed on, with the word vector sequence corresponding to each input vector as the output; and the initial second recurrent neural network model is trained in this way.
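Steps 203 and 204 together form an encoder-decoder pair, which can be sketched as follows. Everything here is a simplifying assumption: the weights are untrained toy values shared between encoder and decoder, all dimensions are made equal, and a fixed target length `n_target` stands in for a real decoder's end-of-sequence criterion (which is why the output sequence need not be the same length as the input).

```python
import math

def step(x, h, W_in, W_hid, b):
    # h_t = tanh(W_in·x + W_hid·h + b)
    return [math.tanh(b[i]
                      + sum(W_in[i][j] * x[j] for j in range(len(x)))
                      + sum(W_hid[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def encode_decode(source_vectors, n_target, params):
    """Encoder (step 203): fold the whole source word vector sequence into
    one fixed-size intermediate vector (the final hidden state).
    Decoder (step 204): unroll that intermediate vector into a sequence
    of target word vectors."""
    W_in, W_hid, b = params
    h = [0.0] * len(b)
    for x in source_vectors:           # encoder pass over the sequence
        h = step(x, h, W_in, W_hid, b)
    intermediate = h                   # vector of the preset dimensionality
    out, h = [], intermediate
    for _ in range(n_target):          # decoder unrolling
        h = step(intermediate, h, W_in, W_hid, b)
        out.append(h)
    return intermediate, out

PARAMS = ([[0.2, -0.1], [0.05, 0.3]],   # W_in
          [[0.1, 0.0], [0.0, 0.1]],     # W_hid
          [0.0, 0.0])                   # b
intermediate, target_vectors = encode_decode(
    [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]], 4, PARAMS)
```

Note that three source vectors here produce four target vectors, mirroring the point above that the target sequence length is not fixed by the source sequence length.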
Step 205: determine, according to the word vector table, a target segmented word sequence corresponding to the target word vector sequence, and take the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence.
In the present embodiment, the above electronic device may determine, according to the word vector table obtained in step 201, the target segmented word sequence corresponding to the target word vector sequence generated in step 204, and take the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence. Here, the target segmented word sequence consists of at least one target word segment arranged in order. Each target word segment in the target segmented word sequence corresponds one-to-one to a target word vector in the target word vector sequence, and each target word segment is obtained by looking up the corresponding target word vector in the word vector table.
In some optional implementations of the present embodiment, step 205 may be carried out as follows: for each target word vector in the target word vector sequence, the word whose word vector in the word vector table has the highest similarity to the target word vector is selected, and the selected word is taken as the target word segment at the position in the target segmented word sequence matching the position of the target word vector in the target word vector sequence.
As an example, the cosine similarity between two word vectors may be computed as the similarity between the two word vectors.
As an example, the Euclidean distance between two word vectors may also be computed: the smaller the Euclidean distance, the higher the similarity between the two word vectors, and conversely the lower.
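The nearest-word selection of step 205 can be sketched with cosine similarity as follows. The two-word table is an invented toy; a real table holds the full vocabulary, and a Euclidean-distance variant would instead pick the word minimizing the distance rather than maximizing the cosine.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors (higher = more similar)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest_word(target_vector, table):
    """Step 205 sketch: pick the word whose table vector is most similar
    to the given target word vector."""
    return max(table, key=lambda w: cosine_similarity(target_vector, table[w]))

TOY_TABLE = {"through train": [0.8, 0.1], "booking": [0.1, 0.9]}
word = nearest_word([0.7, 0.2], TOY_TABLE)
```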
Because the word vector table acquired in step 201 maps source word segments to source word vectors, and the word vector table used in step 205 to map target word vectors to target word segments is the same table acquired in step 201, mapping the target word vector sequence to the target segmented word sequence according to the table acquired in step 201 yields a target segmented word sequence in the same language as, and semantically similar to, the source segmented word sequence. That is, the resulting target segmented word sequence is the same-language parallel text corresponding to the source segmented word sequence.
In some optional implementations of the present embodiment, the word vector table, the first recurrent neural network model and the second recurrent neural network model may be obtained by the following training steps.
First, at least one pair of same-language parallel segmented word sequences is obtained.
Here, each pair of same-language parallel segmented word sequences includes a first segmented word sequence and a second segmented word sequence that are in the same language and have the same semantics. As an example, each acquired pair may be a first segmented word sequence and a second segmented word sequence, identical in language and semantics, manually annotated by a technician.
Then, a preset word vector table, a preset first recurrent neural network model and a preset second recurrent neural network model are obtained.
Then, for each pair of same-language parallel segmented word sequences among the at least one pair: according to the preset word vector table, the first word vector sequence corresponding to the first segmented word sequence of the pair is determined; the first word vector sequence is imported into the preset first recurrent neural network model, yielding a vector of the preset dimensionality corresponding to the first word vector sequence; the resulting vector is imported into the preset second recurrent neural network model, yielding a second word vector sequence corresponding to that vector; according to the preset word vector table, the word sequence corresponding to the second word vector sequence is determined; and according to the difference information between the resulting word sequence and the second segmented word sequence of the pair, the preset word vector table, the preset first recurrent neural network model and the preset second recurrent neural network model are adjusted. As an example, adjusting the word vector table may mean adjusting the value of each dimension of the word vector corresponding to each word in the table; adjusting the first recurrent neural network model may mean adjusting its input matrix, hidden-layer matrix and output matrix; and adjusting the second recurrent neural network model may mean adjusting its input matrix, hidden-layer matrix and output matrix.
Finally, the preset word vector table, the preset first recurrent neural network model and the preset second recurrent neural network model are taken as the trained word vector table, first recurrent neural network model and second recurrent neural network model. Here, the parameters in the preset word vector table, the preset first recurrent neural network model and the preset second recurrent neural network model have been adjusted and optimized during training, and can achieve a better effect when used.
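The "difference information" that drives these adjustments can be sketched as a simple aligned squared error between the word vector sequence decoded from a first sequence and the reference vectors of the paired second sequence. This is an illustrative stand-in of my own choosing, not the patent's specified loss: a real trainer would backpropagate such a loss into the word vector table and into the input, hidden-layer and output matrices of both recurrent models, and would handle length mismatches more carefully than zero-padding.

```python
def sequence_difference(produced, reference):
    """Summed squared error over aligned positions of two word vector
    sequences; the shorter sequence is padded with zero vectors so the
    lengths match. Here we only measure the difference; the adjustment
    step (backpropagation) is omitted."""
    dim = len(reference[0]) if reference else len(produced[0])
    pad = [0.0] * dim
    n = max(len(produced), len(reference))
    a = produced + [pad] * (n - len(produced))
    b = reference + [pad] * (n - len(reference))
    return sum((x - y) ** 2 for u, v in zip(a, b) for x, y in zip(u, v))
```

A difference of zero would mean the decoded sequence already reproduces the reference second sequence exactly, so no adjustment is needed for that pair.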
In some optional implementations of the present embodiment, the first recurrent neural network model and the second recurrent neural network model may be temporal recurrent neural network models, such as LSTM (Long Short-Term Memory) temporal recurrent neural network models.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating same-language parallel text according to the present embodiment. In the application scenario of Fig. 3, the electronic device first obtains the source segmented word sequence 301 "ordering through train" and the word vector table 302, then determines the source word vector sequence 303 of the source segmented word sequence 301 through the word vector table 302, then imports the source word vector sequence 303 into the first recurrent neural network model 304 trained in advance to generate the intermediate vector 305, then imports the intermediate vector 305 into the second recurrent neural network model 306 trained in advance to generate the target word vector sequence 307, and finally determines the target segmented word sequence 308 "through train ticket booking" from the target word vector sequence 307 through the word vector table 302, thereby generating the same-language parallel text "through train ticket booking" corresponding to the source segmented word sequence 301 "ordering through train".
The method provided by the above embodiment of the application determines, according to the word vector table, the source word vector sequence corresponding to the source segmented word sequence; imports the source word vector sequence into the first recurrent neural network model trained in advance to generate the intermediate vector of the preset dimensionality characterizing the semantics of the source segmented word sequence; imports the intermediate vector into the second recurrent neural network model trained in advance to generate the target word vector sequence corresponding to the intermediate vector; determines, according to the word vector table, the target segmented word sequence corresponding to the target word vector sequence; and finally takes the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence. This reduces the algorithmic complexity of generating same-language parallel text and reduces the required storage space.
With further reference to Fig. 4, it illustrates the flow 400 of another embodiment of the method for generating same-language parallel text. The flow 400 of the method for generating same-language parallel text comprises the following steps.
Step 401: receive a query request sent by a user using a terminal.
In the present embodiment, the electronic device on which the method for generating same-language parallel text runs (for example, the server shown in Fig. 1) may receive, by a wired or wireless connection, the query request sent by the user using the terminal. Here, the query request may include a query sentence. As an example, the user may use a browser installed on the terminal to access a search website and enter a query sentence, after which a query request including the query sentence is sent to the above electronic device that provides support for the search website, so that the above electronic device can receive the query request.
Step 402: preprocess the query sentence to obtain the segmented word sequence of the query sentence.
In the present embodiment, the above electronic device may preprocess the query sentence to obtain the segmented word sequence of the query sentence. Here, preprocessing may include word segmentation and removal of special characters.
Word segmentation is the process of recombining a continuous character sequence into a word sequence according to certain specifications. It should be noted that word segmentation is prior art widely studied and applied by those skilled in the art, and is not described again here. As an example, word segmentation may use string-matching-based segmentation methods, understanding-based segmentation methods or statistics-based segmentation methods.
Here, special characters are symbols that, beyond traditional or conventional symbols, are used infrequently and are difficult to input directly, such as mathematical symbols, unit symbols and tab characters. Removing special characters is the process of taking the text from which special characters are to be removed, deleting the special characters contained in it, and retaining the portion without special characters.
After the above preprocessing, the segmented word sequence of the query sentence can be obtained. Here, the segmented word sequence consists of at least one word segment arranged in order.
Step 403: take the resulting segmented word sequence as the source segmented word sequence.
In the present embodiment, the above electronic device may take the segmented word sequence obtained in step 402 as the source segmented word sequence, for use by subsequent steps.
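Steps 402 and 403 can be sketched for space-delimited text as follows. The regex-based character cleanup and the whitespace split are simplifying assumptions: whitespace splitting stands in for a real segmenter (string-matching-based, understanding-based or statistical), which would be needed for text without word boundaries.

```python
import re

def preprocess(query):
    """Step 402 sketch: strip special characters (math/unit symbols,
    punctuation, tabs collapse as whitespace), then split into word
    segments; the result serves as the source segmented word sequence
    of step 403."""
    cleaned = re.sub(r"[^\w\s]", " ", query)     # drop hard-to-type symbols
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned.split(" ") if cleaned else []

source_sequence = preprocess("ordering\tthrough-train!")
```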
Step 404: obtain the source segmented word sequence and the word vector table trained in advance.
Step 405: determine, according to the word vector table, the source word vector sequence corresponding to the source segmented word sequence.
Step 406: import the source word vector sequence into the first recurrent neural network model trained in advance, and generate the intermediate vector of the preset dimensionality characterizing the semantics of the source segmented word sequence.
Step 407: import the intermediate vector into the second recurrent neural network model trained in advance, and generate the target word vector sequence corresponding to the intermediate vector.
Step 408: determine, according to the word vector table, the target segmented word sequence corresponding to the target word vector sequence, and take the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence.
In the present embodiment, the concrete operations of steps 404, 405, 406, 407 and 408 are essentially the same as those of steps 201, 202, 203, 204 and 205 in the embodiment shown in Fig. 2, and are not repeated here.
Step 409: search according to the same-language parallel text to obtain a search result.
In the present embodiment, after determining in step 408 the same-language parallel text corresponding to the source segmented word sequence, the above electronic device may search according to the same-language parallel text to obtain a search result. As an example, the search result may include links to web pages related to the same-language parallel text. Because users enter query sentences at the terminal rather casually, searching directly on the content entered by the user yields a relatively low recall rate. With the operations of steps 404 to 408, a same-language parallel text corresponding to the segmented word sequence of the query sentence is generated; the generated same-language parallel text is semantically close to the query sentence but better suited to search, thereby improving the recall rate of the search.
Step 410: send the search result to the terminal.
In the present embodiment, the above electronic device may send the search result obtained in step 409 to the terminal from which the query request was received in step 401.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the method for generating same-language parallel text in the present embodiment adds the steps of receiving a query request from the terminal, preprocessing the query sentence in the query request, searching according to the determined same-language parallel text, and returning the search result to the terminal. The scheme described in the present embodiment can thereby improve the recall rate of search-engine searches.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the application provides an embodiment of a device for generating same-language parallel text. This device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may specifically be applied in various electronic devices.
As shown in Fig. 5, the device 500 for generating same-language parallel text of the present embodiment includes: an acquiring unit 501, a first determining unit 502, a first generation unit 503, a second generation unit 504 and a second determining unit 505. The acquiring unit 501 is configured to acquire a source segmented word sequence and a word vector table trained in advance, where the above word vector table characterizes the correspondence between words and word vectors. The first determining unit 502 is configured to determine, according to the above word vector table, the source word vector sequence corresponding to the above source segmented word sequence. The first generation unit 503 is configured to import the above source word vector sequence into a first recurrent neural network model trained in advance and generate an intermediate vector of a preset dimensionality characterizing the semantics of the above source segmented word sequence, where the above first recurrent neural network model characterizes the correspondence between word vector sequences and vectors of the above preset dimensionality. The second generation unit 504 is configured to import the above intermediate vector into a second recurrent neural network model trained in advance and generate a target word vector sequence corresponding to the above intermediate vector, where the above second recurrent neural network model characterizes the correspondence between vectors of the above preset dimensionality and word vector sequences. The second determining unit 505 is configured to determine, according to the above word vector table, the target segmented word sequence corresponding to the above target word vector sequence, and to take the above target segmented word sequence as the same-language parallel text corresponding to the above source segmented word sequence.
In the present embodiment, the specific processing of the acquiring unit 501, the first determining unit 502, the first generation unit 503, the second generation unit 504 and the second determining unit 505 of the device 500 for generating same-language parallel text, and the technical effects brought by them, may refer respectively to the related descriptions of steps 201, 202, 203, 204 and 205 in the embodiment corresponding to Fig. 2, and are not repeated here.
In some optional implementations of the present embodiment, the above device 500 for generating same-language parallel text may further include: a receiving unit 506, configured to receive a query request sent by a user using a terminal, the above query request including a query sentence; a preprocessing unit 507, configured to preprocess the above query sentence to obtain the segmented word sequence of the above query sentence, the above preprocessing including word segmentation and removal of special characters; and a third determining unit 508, configured to take the resulting segmented word sequence as the source segmented word sequence. The specific processing of the receiving unit 506, the preprocessing unit 507 and the third determining unit 508, and the technical effects brought by them, may refer respectively to the related descriptions of steps 401, 402 and 403 in the embodiment corresponding to Fig. 4, and are not repeated here.
In some optional implementations of the present embodiment, the above device 500 for generating same-language parallel text may further include: a search unit 509, configured to search according to the above same-language parallel text and obtain a search result; and a sending unit 510, configured to send the above search result to the above terminal. The specific processing of the search unit 509 and the sending unit 510, and the technical effects brought by them, may refer respectively to the related descriptions of steps 409 and 410 in the embodiment corresponding to Fig. 4, and are not repeated here.
In some optional implementations of the present embodiment, the above device 500 for generating same-language parallel text may further include a training unit 511, and the above training unit 511 may include: a first acquisition module 5111, configured to obtain at least one pair of same-language parallel segmented word sequences, where each pair of same-language parallel segmented word sequences includes a first segmented word sequence and a second segmented word sequence identical in language and semantics; a second acquisition module 5112, configured to obtain a preset word vector table, a preset first recurrent neural network model and a preset second recurrent neural network model; an adjusting module 5113, configured to, for each pair of same-language parallel segmented word sequences among the above at least one pair: determine, according to the above preset word vector table, the first word vector sequence corresponding to the first segmented word sequence of the pair; import the above first word vector sequence into the above preset first recurrent neural network model to obtain a vector of the above preset dimensionality corresponding to the above first word vector sequence; import the resulting vector into the above preset second recurrent neural network model to obtain a second word vector sequence corresponding to the resulting vector; determine, according to the above preset word vector table, the word sequence corresponding to the above second word vector sequence; and adjust the above preset word vector table, the above preset first recurrent neural network model and the above preset second recurrent neural network model according to the difference information between the resulting word sequence and the second segmented word sequence of the pair; and a determining module 5114, configured to take the above preset word vector table, the above preset first recurrent neural network model and the above preset second recurrent neural network model as the trained word vector table, first recurrent neural network model and second recurrent neural network model. The specific processing of the training unit 511 and the technical effects brought by it may refer to the related descriptions in the embodiment corresponding to Fig. 2, and are not repeated here.
In some optional implementations of the present embodiment, the above first recurrent neural network model and the above second recurrent neural network model may be temporal recurrent neural network models.
In some optional implementations of the present embodiment, the above first determining unit 502 may be further configured to: for each word segment in the above source segmented word sequence, look up in the above word vector table the word vector matching the word segment, and take the word vector found as the source word vector at the position in the above source word vector sequence matching the position of the word segment in the above source segmented word sequence. The specific processing of the first determining unit 502 and the technical effects brought by it may refer to the related description of step 202 in the embodiment corresponding to Fig. 2, and are not repeated here.
In some optional implementations of the present embodiment, the above second determining unit 505 may be further configured to: for each target word vector in the above target word vector sequence, select from the above word vector table the word whose word vector has the highest similarity to the target word vector, and take the selected word as the target word segment at the position in the above target segmented word sequence matching the position of the target word vector in the above target word vector sequence. The specific processing of the second determining unit 505 and the technical effects brought by it may refer to the related description of step 205 in the embodiment corresponding to Fig. 2, and are not repeated here.
Referring now to Fig. 6, it illustrates a structural schematic diagram of a computer system 600 suitable for implementing the electronic device of the embodiments of the present application. The electronic device shown in Fig. 6 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 606 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602 and the RAM 603 are connected to one another by a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: a storage section 606 including a hard disk and the like, and a communication section 607 including a network interface card such as a LAN (local area network) card or a modem. The communication section 607 performs communication processing via a network such as the Internet. A drive 608 is also connected to the I/O interface 605 as needed. A removable medium 609, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the drive 608 as needed, so that a computer program read from it can be installed into the storage section 606 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which comprises a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 607, and/or installed from the removable medium 609. When the computer program is executed by the central processing unit (CPU) 601, the above-mentioned functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described herein may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by, or in connection with, an instruction execution system, apparatus or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: wireless, wire, optical cable, RF, or any appropriate combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to the various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in a block diagram and/or flowchart, and a combination of blocks in a block diagram and/or flowchart, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by means of software, or by means of hardware. The described units may also be arranged in a processor, which may, for example, be described as: a processor comprising an acquiring unit, a first determining unit, a first generation unit, a second generation unit and a second determining unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the first determining unit may also be described as "a unit for determining a source word vector sequence".
As another aspect, the present application further provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The above computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquire a source segmented word sequence and a pre-trained word vector table, wherein the word vector table is used to represent the correspondence between words and word vectors; determine, according to the word vector table, a source word vector sequence corresponding to the source segmented word sequence; import the source word vector sequence into a pre-trained first recurrent neural network model to generate an intermediate vector of a preset dimension representing the semantics of the source segmented word sequence, wherein the first recurrent neural network model is used to represent the correspondence between word vector sequences and vectors of the preset dimension; import the intermediate vector into a pre-trained second recurrent neural network model to generate a target word vector sequence corresponding to the intermediate vector, wherein the second recurrent neural network model is used to represent the correspondence between vectors of the preset dimension and word vector sequences; and determine, according to the word vector table, a target segmented word sequence corresponding to the target word vector sequence, and determine the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence.
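The pipeline just summarized (table lookup → first RNN → fixed-dimension intermediate vector → second RNN → table-based decoding) can be sketched in miniature. The word-vector table, the 2-dimensional vectors, and the identity-weight recurrences below are purely illustrative assumptions, not trained values or the patent's actual models:

```python
import math

# Hypothetical toy word-vector table (2-dimensional vectors for illustration).
vec_table = {
    "cheap":  [1.0, 0.0],
    "hotel":  [0.0, 1.0],
    "budget": [0.9, 0.1],
    "inn":    [0.1, 0.9],
}

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def encode(word_vecs, dim=2):
    """First RNN sketch: fold the word-vector sequence into a single
    vector of a preset dimension (the final hidden state)."""
    h = [0.0] * dim
    for v in word_vecs:
        h = tanh_vec(add(h, v))   # identity weights, for the sketch only
    return h

def decode(intermediate, steps):
    """Second RNN sketch: emit one word vector per step starting from
    the intermediate vector (again with identity weights)."""
    h, outputs = intermediate, []
    for _ in range(steps):
        h = tanh_vec(h)
        outputs.append(h)
    return outputs

def nearest_word(v):
    """Map a generated vector back to a word via cosine similarity."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a)) or 1.0
        nb = math.sqrt(sum(x * x for x in b)) or 1.0
        return dot / (na * nb)
    return max(vec_table, key=lambda w: cos(vec_table[w], v))

source = ["cheap", "hotel"]
src_vecs = [vec_table[w] for w in source]      # word vector table lookup
middle = encode(src_vecs)                      # intermediate vector
target = [nearest_word(v) for v in decode(middle, len(source))]
print(target)
```

With trained embeddings and real recurrent weights, the decoded sequence would be a same-language paraphrase of the source; here the toy weights only demonstrate the data flow.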
The above description is only a preferred embodiment of the present application and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the particular combination of the above technical features, and should also cover, without departing from the above inventive concept, other technical solutions formed by any combination of the above technical features or their equivalent features, for example, technical solutions formed by mutually replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.
Claims (16)
1. A method for generating parallel text in the same language, characterized in that the method comprises:
acquiring a source segmented word sequence and a pre-trained word vector table, wherein the word vector table is used to represent the correspondence between words and word vectors;
determining, according to the word vector table, a source word vector sequence corresponding to the source segmented word sequence;
importing the source word vector sequence into a pre-trained first recurrent neural network model to generate an intermediate vector of a preset dimension representing the semantics of the source segmented word sequence, wherein the first recurrent neural network model is used to represent the correspondence between word vector sequences and vectors of the preset dimension;
importing the intermediate vector into a pre-trained second recurrent neural network model to generate a target word vector sequence corresponding to the intermediate vector, wherein the second recurrent neural network model is used to represent the correspondence between vectors of the preset dimension and word vector sequences; and
determining, according to the word vector table, a target segmented word sequence corresponding to the target word vector sequence, and determining the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence.
2. The method according to claim 1, characterized in that, before the acquiring a source segmented word sequence and a pre-trained word vector table, the method further comprises:
receiving a query request sent by a user using a terminal, the query request comprising a query statement;
preprocessing the query statement to obtain a segmented word sequence of the query statement, the preprocessing comprising word segmentation and removal of special characters; and
determining the obtained segmented word sequence as the source segmented word sequence.
3. The method according to claim 2, characterized in that, after the determining the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence, the method further comprises:
searching according to the same-language parallel text to obtain a search result; and
sending the search result to the terminal.
4. The method according to any one of claims 1-3, characterized in that, before the acquiring a source segmented word sequence and a pre-trained word vector table, the method further comprises a training step, the training step comprising:
acquiring at least one pair of same-language parallel segmented word sequences, wherein each pair of same-language parallel segmented word sequences comprises a first segmented word sequence and a second segmented word sequence that are in the same language and semantically identical;
acquiring a preset word vector table, a preset first recurrent neural network model and a preset second recurrent neural network model;
for each pair of same-language parallel segmented word sequences in the at least one pair: determining, according to the preset word vector table, a first segmented word vector sequence corresponding to the first segmented word sequence of the pair; importing the first segmented word vector sequence into the preset first recurrent neural network model to obtain a vector of the preset dimension corresponding to the first segmented word vector sequence; importing the obtained vector into the preset second recurrent neural network model to obtain a second segmented word vector sequence corresponding to the obtained vector; determining, according to the preset word vector table, a word sequence corresponding to the second segmented word vector sequence; and adjusting the preset word vector table, the preset first recurrent neural network model and the preset second recurrent neural network model according to the difference information between the obtained word sequence and the second segmented word sequence of the pair; and
determining the preset word vector table, the preset first recurrent neural network model and the preset second recurrent neural network model as the trained word vector table, first recurrent neural network model and second recurrent neural network model.
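The adjustment driven by "difference information" in the training step above can be illustrated with a toy example. Collapsing both networks into a single scalar weight and using a finite-difference gradient are assumptions made purely for this sketch; the claim does not specify an optimizer:

```python
def run_model(scale, src_vecs, steps):
    """Toy encoder/decoder collapsed into one scalar weight `scale`
    (a stand-in for the word vector table and both RNN models)."""
    h = sum(sum(v) for v in src_vecs) * scale   # "intermediate vector"
    return [[h] for _ in range(steps)]          # decoder output vectors

def loss(scale, src_vecs, ref_vecs):
    """Difference information between the generated sequence and the
    reference second sequence, as a summed squared error."""
    out = run_model(scale, src_vecs, len(ref_vecs))
    return sum((o[0] - r[0]) ** 2 for o, r in zip(out, ref_vecs))

src = [[1.0], [2.0]]   # first segmented word sequence, as 1-d vectors
ref = [[1.5], [1.5]]   # reference second sequence, as 1-d vectors

scale, lr, eps = 0.1, 0.01, 1e-6
for _ in range(200):   # gradient descent on the difference signal
    g = (loss(scale + eps, src, ref) - loss(scale - eps, src, ref)) / (2 * eps)
    scale -= lr * g

print(round(scale, 3))   # converges toward the loss-minimizing weight 0.5
```

In a real system, the same difference signal would be backpropagated through the decoder, the encoder, and the embedding table jointly rather than through one scalar.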
5. The method according to claim 4, characterized in that the first recurrent neural network model and the second recurrent neural network model are time recurrent neural network models.
6. The method according to claim 5, characterized in that the determining, according to the word vector table, a source word vector sequence corresponding to the source segmented word sequence comprises:
for each segmented word in the source segmented word sequence, querying the word vector table for a word vector matching the segmented word, and determining the found word vector as the source word vector at the position in the source word vector sequence corresponding to the position of the segmented word in the source segmented word sequence.
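The position-preserving lookup in this claim can be sketched as follows. The table entries and the zero-vector fallback for out-of-table words are illustrative assumptions:

```python
# Hypothetical word-vector table; a real system would use trained entries.
word_vectors = {"cheap": [0.9, 0.1], "hotel": [0.2, 0.8]}

def to_vector_sequence(segmented_words, table, unk=(0.0, 0.0)):
    """Look up each segmented word in the table; the i-th vector in the
    result corresponds to the i-th word, preserving positions as the
    claim requires.  The `unk` fallback is an added assumption."""
    return [list(table.get(w, unk)) for w in segmented_words]

seq = to_vector_sequence(["cheap", "hotel"], word_vectors)
print(seq)   # [[0.9, 0.1], [0.2, 0.8]]
```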
7. The method according to claim 6, characterized in that the determining, according to the word vector table, a target segmented word sequence corresponding to the target word vector sequence comprises:
for each target word vector in the target word vector sequence, selecting from the word vector table the word corresponding to the word vector with the highest similarity to the target word vector, and determining the selected word as the target segmented word at the position in the target segmented word sequence corresponding to the position of the target word vector in the target word vector sequence.
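The "highest similarity" selection in this claim can be sketched with cosine similarity (the claim does not fix the similarity measure, so cosine is an assumed choice), over a hypothetical three-word table:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar_word(target_vec, table):
    """Return the table word whose vector is most similar to `target_vec`."""
    return max(table, key=lambda w: cosine(table[w], target_vec))

table = {"cheap": [1.0, 0.0], "hotel": [0.0, 1.0], "budget": [0.9, 0.2]}
print(most_similar_word([0.8, 0.1], table))   # -> "budget"
```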
8. An apparatus for generating parallel text in the same language, characterized in that the apparatus comprises:
an acquiring unit, configured to acquire a source segmented word sequence and a pre-trained word vector table, wherein the word vector table is used to represent the correspondence between words and word vectors;
a first determining unit, configured to determine, according to the word vector table, a source word vector sequence corresponding to the source segmented word sequence;
a first generation unit, configured to import the source word vector sequence into a pre-trained first recurrent neural network model to generate an intermediate vector of a preset dimension representing the semantics of the source segmented word sequence, wherein the first recurrent neural network model is used to represent the correspondence between word vector sequences and vectors of the preset dimension;
a second generation unit, configured to import the intermediate vector into a pre-trained second recurrent neural network model to generate a target word vector sequence corresponding to the intermediate vector, wherein the second recurrent neural network model is used to represent the correspondence between vectors of the preset dimension and word vector sequences; and
a second determining unit, configured to determine, according to the word vector table, a target segmented word sequence corresponding to the target word vector sequence, and to determine the target segmented word sequence as the same-language parallel text corresponding to the source segmented word sequence.
9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
a receiving unit, configured to receive a query request sent by a user using a terminal, the query request comprising a query statement;
a preprocessing unit, configured to preprocess the query statement to obtain a segmented word sequence of the query statement, the preprocessing comprising word segmentation and removal of special characters; and
a third determining unit, configured to determine the obtained segmented word sequence as the source segmented word sequence.
10. The apparatus according to claim 9, characterized in that the apparatus further comprises:
a search unit, configured to search according to the same-language parallel text to obtain a search result; and
a transmitting unit, configured to send the search result to the terminal.
11. The apparatus according to any one of claims 8-10, characterized in that the apparatus further comprises a training unit, the training unit comprising:
a first acquisition module, configured to acquire at least one pair of same-language parallel segmented word sequences, wherein each pair of same-language parallel segmented word sequences comprises a first segmented word sequence and a second segmented word sequence that are in the same language and semantically identical;
a second acquisition module, configured to acquire a preset word vector table, a preset first recurrent neural network model and a preset second recurrent neural network model;
an adjusting module, configured to, for each pair of same-language parallel segmented word sequences in the at least one pair: determine, according to the preset word vector table, a first segmented word vector sequence corresponding to the first segmented word sequence of the pair; import the first segmented word vector sequence into the preset first recurrent neural network model to obtain a vector of the preset dimension corresponding to the first segmented word vector sequence; import the obtained vector into the preset second recurrent neural network model to obtain a second segmented word vector sequence corresponding to the obtained vector; determine, according to the preset word vector table, a word sequence corresponding to the second segmented word vector sequence; and adjust the preset word vector table, the preset first recurrent neural network model and the preset second recurrent neural network model according to the difference information between the obtained word sequence and the second segmented word sequence of the pair; and
a determining module, configured to determine the preset word vector table, the preset first recurrent neural network model and the preset second recurrent neural network model as the trained word vector table, first recurrent neural network model and second recurrent neural network model.
12. The apparatus according to claim 11, characterized in that the first recurrent neural network model and the second recurrent neural network model are time recurrent neural network models.
13. The apparatus according to claim 12, characterized in that the first determining unit is further configured to:
for each segmented word in the source segmented word sequence, query the word vector table for a word vector matching the segmented word, and determine the found word vector as the source word vector at the position in the source word vector sequence corresponding to the position of the segmented word in the source segmented word sequence.
14. The apparatus according to claim 13, characterized in that the second determining unit is further configured to:
for each target word vector in the target word vector sequence, select from the word vector table the word corresponding to the word vector with the highest similarity to the target word vector, and determine the selected word as the target segmented word at the position in the target segmented word sequence corresponding to the position of the target word vector in the target word vector sequence.
15. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.
16. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710464118.3A CN107273503B (en) | 2017-06-19 | 2017-06-19 | Method and device for generating parallel text in same language |
US15/900,166 US10650102B2 (en) | 2017-06-19 | 2018-02-20 | Method and apparatus for generating parallel text in same language |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107273503A true CN107273503A (en) | 2017-10-20 |
CN107273503B CN107273503B (en) | 2020-07-10 |
Family
ID=60068971
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710464118.3A Active CN107273503B (en) | 2017-06-19 | 2017-06-19 | Method and device for generating parallel text in same language |
Country Status (2)
Country | Link |
---|---|
US (1) | US10650102B2 (en) |
CN (1) | CN107273503B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107133202A (en) * | 2017-06-01 | 2017-09-05 | 北京百度网讯科技有限公司 | Text method of calibration and device based on artificial intelligence |
CN109614492B (en) * | 2018-12-29 | 2024-06-18 | 平安科技(深圳)有限公司 | Text data enhancement method, device, equipment and storage medium based on artificial intelligence |
CN110321537B (en) * | 2019-06-11 | 2023-04-07 | 创新先进技术有限公司 | Method and device for generating file |
CN111797622B (en) * | 2019-06-20 | 2024-04-09 | 北京沃东天骏信息技术有限公司 | Method and device for generating attribute information |
CN110442874B (en) * | 2019-08-09 | 2023-06-13 | 南京邮电大学 | Chinese word sense prediction method based on word vector |
CN110866395B (en) * | 2019-10-30 | 2023-05-05 | 语联网(武汉)信息技术有限公司 | Word vector generation method and device based on translator editing behaviors |
CN110866404B (en) * | 2019-10-30 | 2023-05-05 | 语联网(武汉)信息技术有限公司 | Word vector generation method and device based on LSTM neural network |
CN113627135B (en) * | 2020-05-08 | 2023-09-29 | 百度在线网络技术(北京)有限公司 | Recruitment post description text generation method, device, equipment and medium |
CN111753551B (en) * | 2020-06-29 | 2022-06-14 | 北京字节跳动网络技术有限公司 | Information generation method and device based on word vector generation model |
CN113836950B (en) * | 2021-09-22 | 2024-04-02 | 广州华多网络科技有限公司 | Commodity title text translation method and device, equipment and medium thereof |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1672149A (en) * | 2002-05-31 | 2005-09-21 | 埃里·阿博 | Word association method and apparatus |
CN1720524A (en) * | 2002-10-29 | 2006-01-11 | 埃里·阿博 | Knowledge system method and apparatus |
US20110202512A1 (en) * | 2010-02-14 | 2011-08-18 | Georges Pierre Pantanelli | Method to obtain a better understanding and/or translation of texts by using semantic analysis and/or artificial intelligence and/or connotations and/or rating |
CN104598611A (en) * | 2015-01-29 | 2015-05-06 | 百度在线网络技术(北京)有限公司 | Method and system for sequencing search entries |
CN104699763A (en) * | 2015-02-11 | 2015-06-10 | 中国科学院新疆理化技术研究所 | Text similarity measuring system based on multi-feature fusion |
CN105930314A (en) * | 2016-04-14 | 2016-09-07 | 清华大学 | Text summarization generation system and method based on coding-decoding deep neural networks |
CN106407381A (en) * | 2016-09-13 | 2017-02-15 | 北京百度网讯科技有限公司 | Method and device for pushing information based on artificial intelligence |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10360901B2 (en) * | 2013-12-06 | 2019-07-23 | Nuance Communications, Inc. | Learning front-end speech recognition parameters within neural network training |
CN105701120B (en) * | 2014-11-28 | 2019-05-03 | 华为技术有限公司 | The method and apparatus for determining semantic matching degree |
KR102167719B1 (en) * | 2014-12-08 | 2020-10-19 | 삼성전자주식회사 | Method and apparatus for training language model, method and apparatus for recognizing speech |
CN106610972A (en) * | 2015-10-21 | 2017-05-03 | 阿里巴巴集团控股有限公司 | Query rewriting method and apparatus |
CN106844368B (en) * | 2015-12-03 | 2020-06-16 | 华为技术有限公司 | Method for man-machine conversation, neural network system and user equipment |
US10453074B2 (en) * | 2016-07-08 | 2019-10-22 | Asapp, Inc. | Automatically suggesting resources for responding to a request |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109710915A (en) * | 2017-10-26 | 2019-05-03 | 华为技术有限公司 | Repeat sentence generation method and device |
US11586814B2 (en) | 2017-10-26 | 2023-02-21 | Huawei Technologies Co., Ltd. | Paraphrase sentence generation method and apparatus |
CN109710915B (en) * | 2017-10-26 | 2021-02-23 | 华为技术有限公司 | Method and device for generating repeated statement |
WO2019080648A1 (en) * | 2017-10-26 | 2019-05-02 | 华为技术有限公司 | Retelling sentence generation method and apparatus |
CN108268442A (en) * | 2017-12-19 | 2018-07-10 | 芋头科技(杭州)有限公司 | A kind of sentence Intention Anticipation method and system |
CN108170676B (en) * | 2017-12-27 | 2019-05-10 | 百度在线网络技术(北京)有限公司 | Method, system and the terminal of story creation |
CN108170676A (en) * | 2017-12-27 | 2018-06-15 | 百度在线网络技术(北京)有限公司 | Method, system and the terminal of story creation |
WO2019149076A1 (en) * | 2018-02-05 | 2019-08-08 | 阿里巴巴集团控股有限公司 | Word vector generation method, apparatus and device |
US10824819B2 (en) | 2018-02-05 | 2020-11-03 | Alibaba Group Holding Limited | Generating word vectors by recurrent neural networks based on n-ary characters |
CN108763277B (en) * | 2018-04-10 | 2023-04-18 | 平安科技(深圳)有限公司 | Data analysis method, computer readable storage medium and terminal device |
CN108763277A (en) * | 2018-04-10 | 2018-11-06 | 平安科技(深圳)有限公司 | A kind of data analysing method, computer readable storage medium and terminal device |
CN110472251B (en) * | 2018-05-10 | 2023-05-30 | 腾讯科技(深圳)有限公司 | Translation model training method, sentence translation equipment and storage medium |
CN110472251A (en) * | 2018-05-10 | 2019-11-19 | 腾讯科技(深圳)有限公司 | Method, the method for statement translation, equipment and the storage medium of translation model training |
CN108959467A (en) * | 2018-06-20 | 2018-12-07 | 华东师范大学 | A kind of calculation method of question sentence and the Answer Sentence degree of correlation based on intensified learning |
CN108959467B (en) * | 2018-06-20 | 2021-10-15 | 华东师范大学 | Method for calculating correlation degree of question sentences and answer sentences based on reinforcement learning |
CN111353039A (en) * | 2018-12-05 | 2020-06-30 | 北京京东尚科信息技术有限公司 | File class detection method and device |
CN111353039B (en) * | 2018-12-05 | 2024-05-17 | 北京京东尚科信息技术有限公司 | File category detection method and device |
CN109858004A (en) * | 2019-02-12 | 2019-06-07 | 四川无声信息技术有限公司 | Text Improvement, device and electronic equipment |
CN112883295A (en) * | 2019-11-29 | 2021-06-01 | 北京搜狗科技发展有限公司 | Data processing method, device and medium |
CN112883295B (en) * | 2019-11-29 | 2024-02-23 | 北京搜狗科技发展有限公司 | Data processing method, device and medium |
CN111291563A (en) * | 2020-01-20 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Word vector alignment method and training method of word vector alignment model |
CN111291563B (en) * | 2020-01-20 | 2023-09-01 | 腾讯科技(深圳)有限公司 | Word vector alignment method and word vector alignment model training method |
CN111950272A (en) * | 2020-06-23 | 2020-11-17 | 北京百度网讯科技有限公司 | Text similarity generation method and device and electronic equipment |
CN111950272B (en) * | 2020-06-23 | 2023-06-27 | 北京百度网讯科技有限公司 | Text similarity generation method and device and electronic equipment |
CN113449515A (en) * | 2021-01-27 | 2021-09-28 | 心医国际数字医疗系统(大连)有限公司 | Medical text prediction method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN107273503B (en) | 2020-07-10 |
US20180365231A1 (en) | 2018-12-20 |
US10650102B2 (en) | 2020-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107273503A (en) | Method and apparatus for generating the parallel text of same language | |
CN107729319B (en) | Method and apparatus for outputting information | |
US11151177B2 (en) | Search method and apparatus based on artificial intelligence | |
CN107168952A (en) | Information generating method and device based on artificial intelligence | |
CN107491534B (en) | Information processing method and device | |
US11501182B2 (en) | Method and apparatus for generating model | |
CN107783960A (en) | Method, apparatus and equipment for Extracting Information | |
CN107066449A (en) | Information-pushing method and device | |
CN107679039A (en) | The method and apparatus being intended to for determining sentence | |
CN105677931B (en) | Information search method and device | |
US10755048B2 (en) | Artificial intelligence based method and apparatus for segmenting sentence | |
CN107832305A (en) | Method and apparatus for generating information | |
CN108038469A (en) | Method and apparatus for detecting human body | |
CN107680579A (en) | Text regularization model training method and device, text regularization method and device | |
CN109933662A (en) | Model training method, information generating method, device, electronic equipment and computer-readable medium | |
CN107908789A (en) | Method and apparatus for generating information | |
CN110555714A (en) | method and apparatus for outputting information | |
CN108804450A (en) | The method and apparatus of information push | |
CN109766418B (en) | Method and apparatus for outputting information | |
CN107832468A (en) | Demand recognition methods and device | |
CN109190124A (en) | Method and apparatus for participle | |
CN107958247A (en) | Method and apparatus for facial image identification | |
CN107506434A (en) | Method and apparatus based on artificial intelligence classification phonetic entry text | |
CN109740167A (en) | Method and apparatus for generating information | |
CN107742128A (en) | Method and apparatus for output information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||