CN109359308A - Machine translation method, device and readable storage medium - Google Patents

Machine translation method, device and readable storage medium

Info

Publication number
CN109359308A
Authority
CN
China
Prior art keywords
vocabulary
source
target
target side
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811286094.8A
Other languages
Chinese (zh)
Other versions
CN109359308B (en)
Inventor
黄江泉
谢军
王明轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Wuhan Co Ltd
Original Assignee
Tencent Technology Wuhan Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Wuhan Co Ltd
Priority to CN201811286094.8A
Publication of CN109359308A
Application granted
Publication of CN109359308B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/40 Processing or translation of natural language
    • G06F 40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses a machine translation method, a machine translation apparatus, and a readable storage medium, and relates to the field of machine translation. The method includes: receiving a source sentence to be translated, the source sentence belonging to a target domain; determining a phrase table corresponding to the target domain; and translating, by a machine learning model and according to the phrase table, the source sentence in a first natural language into a target sentence in a second natural language. By determining the phrase table corresponding to the target domain and letting that phrase table participate in the translation performed by the machine learning model, translation accuracy is improved when source sentences of the target domain are translated. There is no need to train a separate machine learning model for each domain: a single general machine learning model combined with the phrase table of the target domain is sufficient to translate source sentences of that domain, so translation efficiency is higher.

Description

Machine translation method, device and readable storage medium
Technical field
Embodiments of the present application relate to the field of machine translation, and in particular to a machine translation method, a machine translation apparatus, and a readable storage medium.
Background technique
Machine translation is a way of translating, by a computer, a sentence in one natural language into a sentence in another natural language. In general, machine translation is performed by a trained machine learning model. Schematically, after the machine learning model has been trained on a large number of parallel corpus samples, a user inputs a Chinese sentence meaning "housing prices keep rising" into the machine learning model and obtains the English translation "The housing prices continued to rise". For some vocabulary in specific domains, however, the translation differs from that of general-purpose machine translation. For example, in relatively formal reports the place name "Beijing" is translated as "Peking", while in textbooks the place name "Beijing" is translated as "Beijing".
In the related art, machine translation for a specific domain requires a dedicated machine learning model: the machine learning model is trained on parallel corpus samples of that specific domain to obtain a domain-specific machine learning model, and sentences to be translated in that domain are then translated with this domain-specific model.
However, when machine translation involves many domains, a dedicated machine learning model has to be set up for each domain, that is, a machine learning model has to be trained separately for every domain. The training process consumes a great deal of time and manpower, so for machine translation in each domain the training process of the machine learning models is relatively cumbersome.
Summary of the invention
The embodiments of the present application provide a machine translation method, a machine translation apparatus, and a readable storage medium, which can solve the problem that machine translation for each separate domain makes the training process of the machine learning model relatively cumbersome. The technical solution is as follows:
In one aspect, a machine translation method is provided, the method comprising:
receiving a source sentence to be translated, the source sentence being a sentence of a target domain;
determining a phrase table corresponding to the target domain, the phrase table including correspondences between source-side vocabulary of the target domain and target-side vocabulary, each source-side word corresponding to at least one target-side word, the source-side vocabulary corresponding to a first natural language, which is the language of the source sentence, and the target-side vocabulary corresponding to a second natural language;
translating, by a machine learning model and according to the phrase table, the source sentence in the first natural language into a target sentence in the second natural language.
In another aspect, a machine translation apparatus is provided, the apparatus comprising:
a receiving module, configured to receive a source sentence to be translated, the source sentence being a sentence of a target domain;
a determining module, configured to determine a phrase table corresponding to the target domain, the phrase table including correspondences between source-side vocabulary of the target domain and target-side vocabulary, each source-side word corresponding to at least one target-side word, the source-side vocabulary corresponding to a first natural language, which is the language of the source sentence, and the target-side vocabulary corresponding to a second natural language;
a translation module, configured to translate, by a machine learning model and according to the phrase table, the source sentence in the first natural language into a target sentence in the second natural language.
In another aspect, a server is provided. The server includes a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the machine translation method described in the embodiments of the present application.
In another aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the machine translation method described in the embodiments of the present application.
In another aspect, a computer program product is provided. When the computer program product runs on a computer, the computer is caused to execute the machine translation method described in the embodiments of the present application.
The beneficial effects brought by the technical solutions provided in the embodiments of the present application include at least the following:
A phrase table corresponding to the target domain is determined, and when the source sentence is translated by the machine learning model, the phrase table participates in the translation of the source sentence to obtain the target sentence. Translation accuracy is thereby improved when source sentences of the target domain are translated, and there is no need to train different machine learning models for different domains: a single general machine learning model combined with the phrase table of the target domain is sufficient to translate source sentences of the target domain, so translation efficiency is higher.
Detailed description of the invention
In order to describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of correspondences between source-side vocabulary and target-side vocabulary provided by an exemplary embodiment of the present application;
Fig. 2 is a schematic diagram of a machine translation system provided by an exemplary embodiment of the present application;
Fig. 3 is a flowchart of a machine translation method provided by an exemplary embodiment of the present application;
Fig. 4 is a schematic structural diagram of a neural network model used in a machine translation method provided by an exemplary embodiment of the present application;
Fig. 5 is a schematic structural diagram of a neural network model used in a machine translation method provided by another exemplary embodiment of the present application;
Fig. 6 is a flowchart of a machine translation method provided by another exemplary embodiment of the present application;
Fig. 7 is a schematic structural diagram of a neural network model used in a machine translation method provided by another exemplary embodiment of the present application;
Fig. 8 is a schematic structural diagram of a neural network model used in a machine translation method provided by another exemplary embodiment of the present application;
Fig. 9 is a flowchart of a machine translation method provided by another exemplary embodiment of the present application;
Fig. 10 is a schematic diagram of a terminal interface for a machine translation method provided by an exemplary embodiment of the present application;
Fig. 11 is a structural block diagram of a machine translation apparatus provided by an exemplary embodiment of the present application;
Fig. 12 is a structural block diagram of a machine translation apparatus provided by another exemplary embodiment of the present application;
Fig. 13 is a structural block diagram of a server provided by an exemplary embodiment of the present application.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
First, the terms involved in the present application are briefly introduced:
Machine translation: a way of translating, by a computer, a sentence in one natural language into a sentence in another natural language. In general, machine translation is performed by a trained machine learning model. Schematically, the machine learning model is trained on a large number of parallel corpus samples, the samples containing multiple groups of correspondences between Chinese corpora and English corpora, each Chinese corpus corresponding to one English corpus as its translation result. After training is completed, a user inputs a Chinese sentence meaning "housing prices keep rising" into the machine learning model, and the output is the English translation "The housing prices continued to rise".
Optionally, the above machine learning model can be implemented as a neural network model, a support vector machine (SVM), a decision tree (DT) model, or the like; the embodiments of the present application do not limit this. In the embodiments of the present application, the machine learning model is described by taking a neural network model as an example.
Phrase table: a correspondence table containing correspondences between source-side vocabulary and target-side vocabulary. Optionally, during machine translation, a machine learning model can translate a source sentence in a first natural language into a target-side sentence in a second natural language according to the phrase table. Optionally, each source-side word in the phrase table corresponds to at least one target-side word, where the source-side vocabulary corresponds to the first natural language, the target-side vocabulary corresponds to the second natural language, and a source-side and/or target-side entry can also be a phrase. For example, if the first natural language is Chinese and the second natural language is English, the target-side entries corresponding to the Chinese source-side word for "Beijing" include "Beijing", "Peking", and "capital of China". Optionally, the phrase table can also be called a large-vocabulary (LV) phrase table. Optionally, source-side and target-side vocabulary can be defined separately for different domains, generating a different phrase table for each domain, for example: phrase table 1 for the patent domain, phrase table 2 for the tourism domain, and phrase table 3 for the teaching domain.
Optionally, the source-side vocabulary and target-side vocabulary in the phrase table are obtained by segmenting source sentences and their corresponding target sentences. Schematically, referring to Fig. 1, the source sentence means "housing prices in city C keep rising" and the corresponding target-side sentence is "C city housing prices continued to rise". As can be seen from correspondence table 11, the source word for "city C" corresponds to "C city", "housing prices" corresponds to "housing prices", "continue" corresponds to "continued", and "rise" corresponds to "rise"; further, "city C housing prices" corresponds to "C city housing prices", "housing prices continue" corresponds to "housing prices continued", "continue to rise" corresponds to "continued to rise", "city C housing prices continue" corresponds to "C city housing prices continued", "housing prices continue to rise" corresponds to "housing prices continued to rise", and the whole sentence corresponds to "C city housing prices continued to rise". The word "to", being a preposition with no substantive meaning of its own, has no corresponding entry.
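As a minimal illustration only (the entries below are illustrative and not taken from the patent's actual tables), a domain phrase table of the kind described above can be pictured as a mapping from each Chinese source-side word or phrase to one or more English target-side candidates, with one such table kept per domain:
    # Illustrative sketch of a domain phrase table (Chinese -> English candidates).
    phrase_table_news = {
        "北京": ["Beijing", "Peking", "capital of China"],   # place name with several renderings
        "房价": ["housing prices"],
        "持续": ["continued"],
        "上涨": ["rise"],
        "房价持续": ["housing prices continued"],
        "持续上涨": ["continued to rise"],
        "房价持续上涨": ["housing prices continued to rise"],
    }
    # One table per domain, e.g.:
    phrase_tables = {
        "patent": {},    # phrase table 1
        "tourism": {},   # phrase table 2
        "teaching": {},  # phrase table 3
    }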
Next, schematically, the application scenarios involved in the present application include at least the following:
First scenario: a machine translation application includes function modules for several domains, for example "primary school material translation", "patent material translation", and "tourism material translation". When the user needs to translate content from a primary school textbook, the user selects the primary school material translation function and inputs a source sentence A. When the machine translation application translates the source sentence A, it translates it using the phrase table corresponding to primary school material.
Second scenario: a machine learning model provider terminal provides users with a general-purpose machine learning model for machine translation. A user working in the tourism industry is provided with the general-purpose machine learning model and the phrase table corresponding to the tourism industry; a user working in primary school education is provided with the general-purpose machine learning model and the phrase table corresponding to primary school education; a user working in the patent industry is provided with the general-purpose machine learning model and the phrase table corresponding to the patent industry.
It should be noted that the above application scenarios are only schematic examples. In actual operation, any application scenario in which machine translation is realized by a machine learning model according to a phrase table can use the machine translation method provided in the embodiments of the present application, and the embodiments of the present application do not limit this.
It should be noted that the embodiments of the present application may be implemented in a terminal, in a server, or jointly by a terminal and a server. As shown in Fig. 2, the terminal 21 generates the source sentence to be translated and sends the source sentence to the server 22; after the server 22 translates the source sentence, the translation result is sent back to the terminal 21 for display. Optionally, the terminal 21 and the server 22 are connected via a communication network, which may be a wired network or a wireless network; the embodiments of the present application do not limit this.
Schematically, the machine learning model for machine translation and at least one phrase table are stored in the server 22. After the user inputs the source sentence to be translated, meaning "housing prices in city C keep rising", in the terminal 21, the terminal 21 sends the source sentence to the server 22; the server 22 translates the source sentence by means of the machine learning model and the phrase table to obtain the target sentence, and sends the target sentence to the terminal 21 for display.
The machine translation method involved in the present application is described below in combination with the above application scenarios. Fig. 3 is a flowchart of a machine translation method provided by an exemplary embodiment of the present application. The method is described here as applied in the server 22 shown in Fig. 2. As shown in Fig. 3, the machine translation method includes:
Step 301: receive a source sentence to be translated.
The source sentence is a sentence of a target domain. Optionally, the source sentence is a sentence to be translated that is input by the user. Optionally, the source sentence may also be generated when the user selects text while browsing: for example, after the user selects the text meaning "housing prices in city C keep rising" while browsing an article and chooses the translate-selection option, the selected text becomes the source sentence. Optionally, the manner of determining that the source sentence belongs to the target domain includes either of the following:
First, the user selects the machine translation function corresponding to the target domain in a machine translation application or web page, and the target domain of the source sentence input by the user is determined according to the target domain selected by the user;
Second, only the phrase table corresponding to the target domain is stored in the server, in which case all source sentences to be translated that the server receives are regarded as sentences of the target domain.
Optionally, the domain of a source sentence can be divided according to the application scenario of the source sentence, or according to the degree of formality of the wording of the source sentence.
Step 302: determine a phrase table corresponding to the target domain.
Optionally, the phrase table includes correspondences between source-side vocabulary of the target domain and target-side vocabulary; each source-side word corresponds to at least one target-side word, the source-side vocabulary corresponds to the first natural language of the source sentence, and the target-side vocabulary corresponds to the second natural language.
Optionally, the phrase table is a phrase table corresponding to the target domain that is prestored in the server.
Step 303: translate, by a neural network model and according to the phrase table, the source sentence in the first natural language into a target sentence in the second natural language.
Optionally, translating the source sentence in the first natural language into the target sentence in the second natural language according to the phrase table includes either of the following:
First, the neural network model includes a target classification matrix, and the probability of each target-side word in the phrase table generating the target sentence is determined by the target classification matrix until a complete target sentence is obtained;
Second, n source-side words corresponding to the source sentence are looked up in the phrase table, m target-side words corresponding to the n source-side words are determined from the correspondences between source-side vocabulary and target-side vocabulary, and the m target-side words are combined into a target-side vocabulary set, m and n being positive integers; the neural network model then translates the source sentence in the first natural language into the target sentence in the second natural language according to the target-side vocabulary set.
Optionally, the above neural network model can be a deep learning model. Optionally, the above neural network model is a neural network model based on an attention mechanism; it can be a recurrent neural network (RNN) model, a convolutional neural network (CNN) model, or a self-attention-based neural machine translation (NMT) model, and RNN, CNN, and NMT models can also be used in combination. The embodiments of the present application do not limit this.
Optionally, the above target classification matrix is a softmax matrix, which can be implemented as a functional layer in the above neural network model.
Optionally, this embodiment is described by taking a neural network model as an example; the model can also be implemented as another machine learning model, and the embodiments of the present application do not limit this.
In conclusion machine translation method provided in this embodiment, by determining phrase table corresponding with target domain, and When being translated by neural network model to source sentence, the translation of the source sentence is obtained using phrase table participation Object statement improves translation accuracy when the source sentence of target domain is translated in realization, and without being directed to different necks Domain is trained different neural network models, it is only necessary to pass through general neural network model combining target field Phrase table, which can be realized, translates the source sentence of target domain, and translation efficiency is higher.
Please refer to Fig. 4 and Fig. 5, which illustrate how, in the embodiments of the present application, the neural network model based on the attention mechanism translates the source sentence according to the target-side vocabulary set. The case where the target-side vocabulary set contains all target-side words in the phrase table is described first. Assume the hidden layer size is H and the number of target-side words in the target-side vocabulary set is n (Y1 to Yn). The hidden state ht is then a vector of length H, and softmax is a matrix of dimension H x n; this softmax matrix is the target classification matrix. ht is jointly determined by h1, h2, h3, ..., ht-1. After the hidden state ht is multiplied by the softmax matrix, a vector of length n is obtained, giving the probability value of each target-side word in the target-side vocabulary set for ht: probability d1 for Y1, probability d2 for Y2, and so on. The target-side word with the highest probability is determined and participates in the generation of the next-moment hidden state ht+1.
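A minimal numeric sketch of this single scoring step, with assumed shapes and random values standing in for the trained parameters (not the patented implementation itself), is:
    import numpy as np

    H, n = 512, 30000                       # hidden size, number of target-side words
    softmax_matrix = np.random.randn(H, n)  # stands in for the trained target classification matrix
    h_t = np.random.randn(H)                # hidden state at decoding step t

    logits = h_t @ softmax_matrix           # length-n vector: one score per Y1..Yn
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                    # d1..dn, probability of each target-side word
    best = int(np.argmax(probs))            # the highest-probability word feeds h_{t+1}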
In an alternative embodiment, the target-side vocabulary set consists of the words in the phrase table that correspond to the source-side vocabulary of the sentence. Fig. 6 is a flowchart of a machine translation method provided by another exemplary embodiment of the present application. The method is described here as applied in the server 22 shown in Fig. 2. As shown in Fig. 6, the machine translation method includes:
Step 601: receive a source sentence to be translated.
The source sentence is a sentence of a target domain. Optionally, the source sentence is a sentence to be translated that is input by the user. Optionally, the source sentence may also be generated when the user selects text while browsing: for example, after the user selects the text meaning "housing prices in city C keep rising" while browsing an article and chooses the translate-selection option, the selected text becomes the source sentence.
Step 602: determine a phrase table corresponding to the target domain.
Optionally, the phrase table includes correspondences between source-side vocabulary of the target domain and target-side vocabulary; each source-side word corresponds to at least one target-side word, the source-side vocabulary corresponds to the first natural language of the source sentence, and the target-side vocabulary corresponds to the second natural language. That is, the language of the source-side vocabulary and of the source sentence is the first natural language, and the language of the target-side vocabulary is the second natural language.
Step 603: look up, in the phrase table, n source-side words corresponding to the source sentence.
Optionally, a source-side word can also take the form of a phrase.
Schematically, if the source sentence means "housing prices in city C keep rising", the n corresponding source-side entries in the phrase table are: "city C", "housing prices", "continue", "rise", "city C housing prices", "housing prices continue", "continue to rise", "city C housing prices continue", "housing prices continue to rise", and "housing prices in city C keep rising"; that is, the source sentence corresponds to 10 source-side entries in the phrase table.
Optionally, when the n source-side words corresponding to the source sentence are looked up in the phrase table, the source sentence is first segmented to obtain at least one segmented word, and the n source-side entries containing these segmented words are then looked up in the phrase table. For the example source sentence above, segmentation yields the words "city C", "housing prices", "continue", and "rise", and the source-side entries containing these four segmented words are then looked up in the phrase table.
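One possible sketch of this lookup, under the assumption that the sentence has already been cut into words by a segmenter and that a source-side entry matches when it is the concatenation of a contiguous span of those words, is:
    def lookup_source_entries(segmented_words, phrase_table):
        # Collect every phrase-table source entry formed by a contiguous span of the sentence.
        hits = []
        for i in range(len(segmented_words)):
            for j in range(i + 1, len(segmented_words) + 1):
                candidate = "".join(segmented_words[i:j])
                if candidate in phrase_table:
                    hits.append(candidate)
        return hits

    # e.g. a segmentation like ["C市", "房价", "持续", "上涨"] could yield up to the
    # 10 source-side entries listed above.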
Step 604: determine, from the correspondences, m target-side words corresponding to the n source-side words.
Optionally, the m target-side words become the target-side vocabulary set.
Optionally, since each source-side word corresponds to at least one target-side word, the number of target-side words determined from the correspondences may differ from the number of source-side words. Optionally, when the m target-side words are determined from the correspondences according to the n source-side words, k target-side words corresponding to the n source-side words can first be determined from the correspondences, where the k target-side words may include words that occur more than once; the k target-side words are then de-duplicated to obtain the m target-side words.
Schematically, taking a source sentence meaning "housing prices keep rising" as an example, the phrase table includes the following entries:
After the sentence meaning "housing prices keep rising" is segmented, the source-side entries corresponding to the source sentence include "housing prices", "continue", "rise", "housing prices continue", "continue to rise", and "housing prices keep rising". The target-side words determined according to these source-side entries are "continued cost go going growth house houses housing increasing keep last move over persist prices pricing rents rise risen rises rising seen shot soaring to up years", i.e. the target-side vocabulary set contains 27 target-side words, these 27 words being the result of de-duplication.
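A sketch of this collection and de-duplication, under the assumption (consistent with the 27-word example above) that target-side entries contribute their individual words before duplicates are removed, is:
    def build_target_vocabulary(source_hits, phrase_table):
        # Gather the k candidate words of all matched source entries (with repetitions).
        k_candidates = []
        for src in source_hits:
            for tgt_phrase in phrase_table[src]:
                k_candidates.extend(tgt_phrase.split())
        # Order-preserving de-duplication into the m-word target-side vocabulary set.
        seen, m_vocab = set(), []
        for word in k_candidates:
            if word not in seen:
                seen.add(word)
                m_vocab.append(word)
        return m_vocab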
Step 605: determine, according to the target-side vocabulary set, the target classification matrix in the neural network model.
Optionally, a preliminary classification matrix is filtered according to the target-side vocabulary set to obtain the target classification matrix. The preliminary classification matrix contains the target-side vocabulary corresponding to at least two domains, and the at least two domains include the above target domain.
That is, when the source sentence is translated with the preliminary classification matrix, the probabilities of all target-side words of the at least two domains composing the target sentence need to be determined, whereas after the preliminary classification matrix is filtered into the target classification matrix, only the probabilities of the words in the target-side vocabulary set composing the target sentence need to be determined.
Optionally, when the preliminary classification matrix is filtered according to the target-side vocabulary set, the dimension of the preliminary classification matrix is reduced according to the target-side vocabulary set: each column of the target classification matrix corresponds to one word in the target-side vocabulary set. The column dimension of the preliminary classification matrix corresponds to the number of all target-side words of the at least two domains, while the column dimension of the filtered target classification matrix corresponds to the number of words in the target-side vocabulary set.
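A sketch of this filtering, under the assumed layout in which the preliminary softmax matrix has one column per word of the full multi-domain target vocabulary, is:
    import numpy as np

    def filter_classification_matrix(initial_softmax, full_vocab, target_vocab_set):
        # initial_softmax: H x N array; full_vocab: list of the N words in column order.
        col_index = {word: i for i, word in enumerate(full_vocab)}
        keep = [col_index[w] for w in target_vocab_set if w in col_index]
        # Keep only the columns of the filtered words: an H x m target classification matrix.
        return initial_softmax[:, keep], [full_vocab[i] for i in keep]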
Step 606: input the source sentence into the neural network model, and obtain the target sentence as output.
Optionally, the neural network model includes the above filtered target classification matrix.
Schematically, in combination with the above description of Fig. 4, translation of the source sentence according to its corresponding target-side vocabulary set is described using the example above, a source sentence meaning "housing prices keep rising" whose target-side vocabulary set contains 27 words. Referring to Fig. 7, the softmax' classification matrix is determined according to the filtered target-side vocabulary set (Y1 to Yn' in Fig. 7). That is, whereas the original softmax target classification matrix determines probabilities over all target-side words in the phrase table, the softmax' classification matrix determines probabilities only over the target-side words in the filtered target-side vocabulary set; for the source sentence above, the softmax' classification matrix determines the probabilities of the 27 filtered target-side words. In other words, Y1 to Yn in Fig. 4 are replaced by Y1 to Yn', the softmax in Fig. 4 is replaced by softmax', and the hidden layer size need not change. Assume the hidden layer size is H and the number of target-side words in the target-side vocabulary set is n' (Y1 to Yn'). The hidden state ht is a vector of length H, and softmax' is the filtered matrix of dimension H x n'; this softmax' is the classification matrix. ht is jointly determined by h1, h2, h3, ..., ht-1. After the hidden state ht is multiplied by the softmax' matrix, a vector of length n' is obtained, giving the probability value of each target-side word in the target-side vocabulary set for ht: probability d1 for Y1, probability d2 for Y2, and so on. The target-side word with the highest probability is determined and participates in the generation of the next-moment hidden state ht+1. It should be noted that Y1 to Yn and Y1 to Yn' above only express the number of target-side words in the target-side vocabulary set and do not refer to any particular word or words.
Optionally, when the above Y1 to Yn and Y1 to Yn' are taken to denote particular target words in the target-side vocabulary set, schematically, let Y1 be Beijing, Y2 be Peking, Y3 be capital, Y4 be house, Y5 be prices, and Y6 be persist. If the filtered set Y1 to Yn' contains Peking and capital, then the filtered Y1 to Yn' do not correspond to the original Y1 and Y2 but to the original Y2 and Y3.
As shown in Fig. 8, the softmax' matrix is an H x n' matrix. The hidden layer ht is multiplied by the softmax' matrix to obtain the probabilities d1 to dn', where d1 corresponds to Y1, d2 corresponds to Y2, and so on. The hidden layer ht encodes features such as the part of speech and lexical features of the target-side word it corresponds to; for example, if ht corresponds to "I", the hidden layer ht indicates that this target-side word is a first-person word and is the subject.
Schematically, when the target word with the highest probability is determined for each hidden layer, assume the target vocabulary has size 3 and contains the words a, b, and c. When the first word is generated, the target classification matrix determines that the word with the highest probability is a; a is then used as an input parameter to generate hidden layer h2, and the target classification matrix determines for h2 that the next word with the highest probability is c; c is used as an input parameter to generate hidden layer h3, and the target classification matrix determines for h3 that the next word with the highest probability is b. This continues in turn until the target sentence acbc is obtained.
Schematically, when the source sentence meaning "housing prices keep rising" is translated by the neural network model, the target-side word with the highest probability obtained for hidden layer h1 is housing; housing is used as an input parameter to generate hidden layer h2, and the target classification matrix continues for h2 to determine the next word with the highest probability, prices; prices is used as an input parameter to generate hidden layer h3, and the target classification matrix continues for h3 to determine that the word with the highest probability after prices is continued; continued is then used as an input parameter to generate hidden layer h4, and the next target word in the target sentence is determined, until the target sentence "Housing prices continued to rise" is finally obtained.
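A greatly simplified sketch of the decoding loop just described is given below; step_fn stands in for the recurrent/attention update, and the end-of-sentence marker "</s>" is an assumption rather than part of the original description:
    import numpy as np

    def greedy_decode(step_fn, h0, target_softmax, target_vocab, max_len=50):
        # At each step, score the hidden state against the filtered classification matrix,
        # emit the highest-probability word, and let it condition the next hidden state.
        h, output = h0, []
        for _ in range(max_len):
            scores = h @ target_softmax            # one score per word in the filtered set
            word = target_vocab[int(np.argmax(scores))]
            if word == "</s>":
                break
            output.append(word)
            h = step_fn(h, word)                   # h_{t+1} from h_t and the chosen word
        return " ".join(output)                    # e.g. "Housing prices continued to rise"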
Optionally, this embodiment is described by taking a neural network model as an example; the model can also be implemented as another machine learning model, and the embodiments of the present application do not limit this.
In summary, in the machine translation method provided by this embodiment, a phrase table corresponding to the target domain is determined, and when the source sentence is translated by the neural network model, the phrase table participates in the translation of the source sentence to obtain the target sentence. Translation accuracy is improved when source sentences of the target domain are translated, and there is no need to train different neural network models for different domains: a single general neural network model combined with the phrase table of the target domain is sufficient to translate source sentences of the target domain, so translation efficiency is higher.
In the method provided by this embodiment, after the source sentence is segmented, the n source-side words corresponding to the source sentence are looked up in the phrase table, and the m target-side words corresponding to the n source-side words are determined as the target-side vocabulary set, which reduces the number of target-side words drawn from the phrase table. The source sentence is translated according to this filtered target-side vocabulary set, so the translation speed is fast and the translation accuracy is high.
In an alternative embodiment, the phrase table is generated, or obtained by filtering, according to reference content. Fig. 9 is a flowchart of a machine translation method provided by another exemplary embodiment of the present application. The method is described here as applied in the server 22 shown in Fig. 2. As shown in Fig. 9, the machine translation method includes:
Step 901: receive reference content.
Optionally, the reference content is content corresponding to the target domain and contains a corpus belonging to the target domain; the corpus includes a source corpus and a translated corpus corresponding to the source corpus.
Optionally, the reference content can be books, papers, reports, and the like of the target domain. Schematically, if the target domain is the news domain, the reference content is collated text of news reports, and this collated content includes the source corpus and the translated corpus.
Step 902: extract corresponding phrases from the source corpus and the translated corpus, and generate the phrase table.
Step 903: filter an initial phrase table according to the reference content to obtain the phrase table.
Optionally, the initial phrase table is obtained by extraction from corpora belonging to at least two domains. When the initial phrase table is filtered according to the reference content, phrases in the initial phrase table that occur in the reference content can be retained, and phrases that do not occur in the reference content are discarded.
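A sketch of steps 902 and 903 follows. Producing aligned (source phrase, target phrase) pairs from the reference corpus (word alignment plus phrase extraction) is outside this sketch and is assumed to be available:
    def build_phrase_table(aligned_phrase_pairs):
        # aligned_phrase_pairs: iterable of (source_phrase, target_phrase) tuples.
        table = {}
        for src, tgt in aligned_phrase_pairs:
            table.setdefault(src, [])
            if tgt not in table[src]:
                table[src].append(tgt)
        return table

    def filter_initial_table(initial_table, reference_text):
        # Step 903: keep only entries whose source phrase occurs in the domain
        # reference content; entries never seen in the reference content are dropped.
        return {src: tgts for src, tgts in initial_table.items() if src in reference_text}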
It should be noted that the above steps 901 to 903 and steps 904 to 909 can be executed on different servers or on the same server, and can also be executed on the same or different terminals; the embodiments of the present application do not limit this.
It should be noted that the above steps 901 to 903 do not need to be performed every time before a phrase table corresponding to the target domain is determined; the phrase table can be obtained by executing them in advance and then directly acquired and applied when machine translation is performed on a source sentence of the target domain.
Step 904: receive a source sentence to be translated.
The source sentence is a sentence of a target domain. Optionally, the source sentence is a sentence to be translated that is input by the user. Optionally, the source sentence may also be generated when the user selects text while browsing: for example, after the user selects the text meaning "housing prices in city C keep rising" while browsing an article and chooses the translate-selection option, the selected text becomes the source sentence.
Step 905: determine a phrase table corresponding to the target domain.
Optionally, the phrase table includes correspondences between source-side vocabulary of the target domain and target-side vocabulary; each source-side word corresponds to at least one target-side word, the source-side vocabulary corresponds to the first natural language of the source sentence, and the target-side vocabulary corresponds to the second natural language.
Optionally, the phrase table is a phrase table corresponding to the target domain that is prestored in the server.
Step 906: look up, in the phrase table, n source-side words corresponding to the source sentence.
Optionally, when the n source-side words corresponding to the source sentence are looked up in the phrase table, the source sentence is first segmented to obtain at least one segmented word, and the n source-side entries containing these segmented words are then looked up in the phrase table.
Step 907: determine, from the correspondences, m target-side words corresponding to the n source-side words.
Optionally, since each source-side word corresponds to at least one target-side word, the number of target-side words determined from the correspondences may differ from the number of source-side words; for example, the target-side entries corresponding to the source-side word for "Beijing" include "Beijing", "Peking", and "capital of China". Optionally, when the m target-side words are determined from the correspondences according to the n source-side words, k target-side words corresponding to the n source-side words can first be determined from the correspondences, where the k target-side words may include words that occur more than once; the k target-side words are then de-duplicated to obtain the m target-side words. In general, the number of target-side words can thus be smaller than the raw number of correspondences of the source-side words of the source sentence.
Step 908: determine, according to the target-side vocabulary set, the target classification matrix in the neural network model.
Optionally, a preliminary classification matrix is filtered according to the target-side vocabulary set to obtain the target classification matrix. The preliminary classification matrix contains the target-side vocabulary corresponding to at least two domains, and the at least two domains include the above target domain.
Step 909: input the source sentence into the neural network model, and obtain the target sentence as output.
Optionally, the neural network model includes the above filtered target classification matrix.
Optionally, this embodiment is described by taking a neural network model as an example; the model can also be implemented as another machine learning model, and the embodiments of the present application do not limit this.
In summary, in the machine translation method provided by this embodiment, a phrase table corresponding to the target domain is determined, and when the source sentence is translated by the neural network model, the phrase table participates in the translation of the source sentence to obtain the target sentence. Translation accuracy is improved when source sentences of the target domain are translated, and there is no need to train different neural network models for different domains: a single general neural network model combined with the phrase table is sufficient to translate source sentences of the target domain, so translation efficiency is higher.
In the method provided by this embodiment, the phrase table is determined from the reference content of the target domain, the target vocabulary corresponding to the source sentence is then determined in this phrase table, and the source sentence is translated according to this target vocabulary. This improves the accuracy of the target vocabulary in the phrase table and reduces the number of words in the target-side vocabulary set; the source sentence is translated according to the filtered target-side vocabulary set, so the translation speed is fast and the translation accuracy is high.
In a schematic embodiment, referring to Fig. 10, three domain-specific translation functions are displayed in the user interface 1010 of a translation application, the specific domains being the news domain 1011, the patent domain 1012, and the legal translation domain 1013. The news domain 1011 is used to translate source sentences in the style of news and is associated with a news phrase table; the patent domain 1012 is used to translate source sentences in the style of patents and is associated with a patent phrase table; the legal translation domain 1013 is used to translate source sentences in the style of legal documents and is associated with a legal phrase table. After the user selects the legal translation domain 1013, the legal translation interface 1020 is displayed. In the legal translation interface 1020, the user inputs the source sentence to be translated, meaning "housing prices keep rising", in the input box 1021 and clicks the translate control 1022. The terminal sends the source sentence to the server 1030, which includes the above legal phrase table 1031 and the general neural network model 1032. After the server 1030 translates the source sentence, the resulting target sentence is returned to the terminal and presented; the presentation methods include text display and/or voice output. For example, the server returns the target sentence "Housing prices keep rising" to the terminal, and the terminal presents the target sentence by playing it as speech.
It should be noted that the three specific domains shown in the user interface 1010 all belong to domains whose translated content is relatively formal, and they are therefore displayed in the same user interface 1010; similarly, a "primary school translation domain", a "middle school translation domain", and a "university translation domain" could be grouped and displayed together in one user interface.
Fig. 11 shows a machine translation apparatus provided by an exemplary embodiment of the present application. The apparatus may be implemented in the server 22 shown in Fig. 2 and includes: a receiving module 1101, a determining module 1102, and a translation module 1103.
The receiving module 1101 is configured to receive a source sentence to be translated, the source sentence being a sentence of a target domain.
The determining module 1102 is configured to determine a phrase table corresponding to the target domain, the phrase table including correspondences between source-side vocabulary of the target domain and target-side vocabulary, each source-side word corresponding to at least one target-side word, the source-side vocabulary corresponding to a first natural language, which is the language of the source sentence, and the target-side vocabulary corresponding to a second natural language.
The translation module 1103 is configured to translate, by a machine learning model and according to the phrase table, the source sentence in the first natural language into a target sentence in the second natural language.
In an alternative embodiment, as shown in Fig. 12, the apparatus further includes:
a searching module 1104, configured to look up, in the phrase table, n source-side words corresponding to the source sentence;
the determining module 1102 is further configured to determine, from the correspondences, m target-side words corresponding to the n source-side words, the m target-side words being combined into a target-side vocabulary set, m and n being positive integers;
the translation module 1103 is further configured to translate, by the machine learning model and according to the target-side vocabulary set, the source sentence in the first natural language into the target sentence in the second natural language.
In an alternative embodiment, the determining module 1102 is further configured to determine, according to the target-side vocabulary set, a target classification matrix in the machine learning model, the target classification matrix being used to determine, according to the source sentence, the probability of each target-side word in the target-side vocabulary set generating the target sentence;
the translation module 1103 is further configured to input the source sentence into the machine learning model and obtain the target sentence as output.
In an alternative embodiment, the determining module 1102 is further configured to filter a preliminary classification matrix by means of the target-side vocabulary set to obtain the target classification matrix, the preliminary classification matrix containing the target-side vocabulary corresponding to at least two domains, the at least two domains including the target domain.
In an alternative embodiment, the searching module 1104 is further configured to segment the source sentence to obtain at least one segmented word;
the searching module 1104 is further configured to look up, in the phrase table, the n source-side words containing the at least one segmented word.
In an alternative embodiment, the determining module 1102 is further configured to determine, from the correspondences, k target-side words corresponding to the n source-side words, the k target-side words including words that occur more than once, and to de-duplicate the k target-side words to obtain the m target-side words.
In an alternative embodiment, the receiving module 1101 is further configured to receive reference content, the reference content being content corresponding to the target domain and containing a corpus belonging to the target domain, the corpus including a source corpus and a translated corpus corresponding to the source corpus.
The apparatus further includes:
an extraction module 1105, configured to extract corresponding phrases from the source corpus and the translated corpus and generate the phrase table; or to filter an initial phrase table according to the reference content to obtain the phrase table, the initial phrase table being extracted from corpora belonging to at least two domains.
It should be noted that the receiving module 1101, the determining module 1102, the translation module 1103, the searching module 1104, and the extraction module 1105 in the above embodiments can be implemented by a processor, or jointly by a processor and a memory.
The present application further provides a server. The server includes a processor and a memory, the memory storing at least one instruction, which is loaded and executed by the processor to implement the machine translation method provided by the above method embodiments. It should be noted that the server can be the server provided in Fig. 13 below.
Please refer to Fig. 13, which shows a schematic structural diagram of a server provided by an exemplary embodiment of the present application. Specifically, the server 1300 includes a central processing unit (CPU) 1301, a system memory 1304 including a random access memory (RAM) 1302 and a read-only memory (ROM) 1303, and a system bus 1305 connecting the system memory 1304 and the central processing unit 1301. The server 1300 further includes a basic input/output system (I/O system) 1306 that helps transfer information between devices within the computer, and a mass storage device 1307 for storing an operating system 1313, application programs 1314, and other program modules 1315.
The basic input/output system 1306 includes a display 1308 for displaying information and an input device 1309, such as a mouse or keyboard, for the user to input information. Both the display 1308 and the input device 1309 are connected to the central processing unit 1301 through an input/output controller 1310 connected to the system bus 1305. The basic input/output system 1306 can also include the input/output controller 1310 for receiving and processing input from a number of other devices such as a keyboard, a mouse, or an electronic stylus. Similarly, the input/output controller 1310 also provides output to a display screen, a printer, or other types of output devices.
The mass storage device 1307 is connected to the central processing unit 1301 through a mass storage controller (not shown) connected to the system bus 1305. The mass storage device 1307 and its associated computer-readable storage medium provide non-volatile storage for the server 1300. That is, the mass storage device 1307 can include a computer-readable storage medium (not shown), such as a hard disk or a CD-ROM drive.
Without loss of generality, the computer-readable storage medium can include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, a person skilled in the art will appreciate that computer storage media are not limited to the above. The above system memory 1304 and mass storage device 1307 can be collectively referred to as memory.
The memory stores one or more programs configured to be executed by the one or more central processing units 1301, the one or more programs containing instructions for implementing the above machine translation method; the central processing unit 1301 executes the one or more programs to implement the machine translation method provided by the above method embodiments.
According to the various embodiments of the present invention, the server 1300 can also run through a remote computer connected to a network such as the Internet. That is, the server 1300 can be connected to a network 1312 through a network interface unit 1311 connected to the system bus 1305; in other words, the network interface unit 1311 can also be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs stored in the memory, the one or more programs containing the steps, performed by the server, of the machine translation method provided by the embodiments of the present invention.
The embodiments of the present application further provide a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the machine translation method described in any one of Fig. 3, Fig. 6, and Fig. 9.
The present application also provides a computer program product which, when run on a computer, causes the computer to execute the machine translation method provided by the above method embodiments.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
A person of ordinary skill in the art can understand that all or part of the steps of the above embodiments can be implemented by hardware, or by a program instructing relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (15)

1. A machine translation method, characterized in that the method comprises:
receiving a source sentence to be translated, the source sentence being a sentence of a target domain;
determining a phrase table corresponding to the target domain, the phrase table comprising correspondences between source-side vocabulary and target-side vocabulary of the target domain, each source-side word corresponding to at least one target-side word, the source-side vocabulary corresponding to a first natural language of the source sentence, and the target-side vocabulary corresponding to a second natural language;
translating, by a machine learning model and according to the phrase table, the source sentence in the first natural language into a target sentence in the second natural language.
2. The method according to claim 1, characterized in that translating, by the machine learning model and according to the phrase table, the source sentence in the first natural language into the target sentence in the second natural language comprises:
searching the phrase table for n source-side words corresponding to the source sentence;
determining, in the correspondences, m target-side words corresponding to the n source-side words, the m target-side words being combined into a target-side vocabulary table, m and n being positive integers;
translating, by the machine learning model and according to the target-side vocabulary table, the source sentence in the first natural language into the target sentence in the second natural language.
3. The method according to claim 2, characterized in that translating, by the machine learning model and according to the target-side vocabulary table, the source sentence in the first natural language into the target sentence in the second natural language comprises:
determining a target classification matrix in the machine learning model according to the target-side vocabulary table, the target classification matrix being used to determine, according to the source sentence, the probability of generating the target sentence for each target-side word in the target-side vocabulary table;
inputting the source sentence into the machine learning model, and outputting the target sentence.
4. The method according to claim 3, characterized in that determining the target classification matrix in the machine learning model according to the target-side vocabulary table comprises:
filtering an initial classification matrix by the target-side vocabulary table to obtain the target classification matrix, the initial classification matrix comprising target-side vocabulary corresponding to at least two domains, the at least two domains including the target domain.
5. The method according to any one of claims 2 to 4, characterized in that searching the phrase table for the n source-side words corresponding to the source sentence comprises:
performing word segmentation on the source sentence to obtain at least one segmented word;
searching the phrase table for the n source-side words that comprise the at least one segmented word.
6. The method according to any one of claims 2 to 4, characterized in that determining, in the correspondences, the m target-side words corresponding to the n source-side words comprises:
determining, in the correspondences, k target-side words corresponding to the n source-side words, the k target-side words including words that occur at least twice;
de-duplicating the k target-side words to obtain the m target-side words.
7. The method according to any one of claims 1 to 4, characterized in that before determining the phrase table corresponding to the target domain, the method further comprises:
receiving reference content, the reference content being content corresponding to the target domain and comprising corpora belonging to the target domain, the corpora comprising source-side corpora and translated corpora corresponding to the source-side corpora;
performing corresponding extraction on phrases in the source-side corpora and the translated corpora to generate the phrase table; or filtering an initial phrase table according to the reference content to obtain the phrase table, the initial phrase table being extracted from corpora belonging to at least two domains.
8. A machine translation apparatus, characterized in that the apparatus comprises:
a receiving module, configured to receive a source sentence to be translated, the source sentence being a sentence of a target domain;
a determining module, configured to determine a phrase table corresponding to the target domain, the phrase table comprising correspondences between source-side vocabulary and target-side vocabulary of the target domain, each source-side word corresponding to at least one target-side word, the source-side vocabulary corresponding to a first natural language of the source sentence, and the target-side vocabulary corresponding to a second natural language;
a translation module, configured to translate, by a machine learning model and according to the phrase table, the source sentence in the first natural language into a target sentence in the second natural language.
9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
a searching module, configured to search the phrase table for n source-side words corresponding to the source sentence;
the determining module being further configured to determine, in the correspondences, m target-side words corresponding to the n source-side words, the m target-side words being combined into a target-side vocabulary table, m and n being positive integers;
the translation module being further configured to translate, by the machine learning model and according to the target-side vocabulary table, the source sentence in the first natural language into the target sentence in the second natural language.
10. The apparatus according to claim 9, characterized in that the determining module is further configured to determine a target classification matrix in the machine learning model according to the target-side vocabulary table, the target classification matrix being used to determine, according to the source sentence, the probability of generating the target sentence for each target-side word in the target-side vocabulary table;
the translation module being further configured to input the source sentence into the machine learning model and output the target sentence.
11. The apparatus according to claim 9, characterized in that the determining module is further configured to filter an initial classification matrix by the target-side vocabulary table to obtain the target classification matrix, the initial classification matrix comprising target-side vocabulary corresponding to at least two domains, the at least two domains including the target domain.
12. The apparatus according to any one of claims 9 to 11, characterized in that the searching module is further configured to perform word segmentation on the source sentence to obtain at least one segmented word;
the searching module being further configured to search the phrase table for the n source-side words that comprise the at least one segmented word.
13. The apparatus according to any one of claims 9 to 11, characterized in that the determining module is further configured to determine, in the correspondences, k target-side words corresponding to the n source-side words, the k target-side words including words that occur at least twice; and to de-duplicate the k target-side words to obtain the m target-side words.
14. A server, characterized in that the server comprises a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the machine translation method according to any one of claims 1 to 7.
15. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the machine translation method according to any one of claims 1 to 7.
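
Claims 1, 2, 5, and 6 above describe the phrase-table lookup: the source sentence is segmented, the segmented words are matched against the source-side entries of the domain phrase table, and the matched target-side words are de-duplicated into the target-side vocabulary table. The following Python fragment is only a minimal editorial sketch of that pipeline; the sample phrase table, the whitespace tokenizer, and all names are illustrative assumptions, not code taken from the patent.

```python
from typing import Dict, List

# Hypothetical domain phrase table: each source-side word of the target domain
# maps to at least one target-side word (claims 1 and 7).
PHRASE_TABLE: Dict[str, List[str]] = {
    "neural": ["neuronal"],
    "network": ["red", "malla"],
    "net": ["red"],
}

def segment(source_sentence: str) -> List[str]:
    # Claim 5: word segmentation of the source sentence. A whitespace split
    # stands in for a real segmenter here.
    return source_sentence.lower().split()

def target_side_vocabulary(source_sentence: str,
                           phrase_table: Dict[str, List[str]]) -> List[str]:
    # Claims 2 and 6: find the n source-side words present in the phrase table,
    # collect the k corresponding target-side words (duplicates allowed), and
    # de-duplicate them into the m-word target-side vocabulary table.
    source_words = [w for w in segment(source_sentence) if w in phrase_table]  # n words
    k_words = [t for w in source_words for t in phrase_table[w]]               # k words
    seen, m_words = set(), []
    for t in k_words:                                                          # de-duplication
        if t not in seen:
            seen.add(t)
            m_words.append(t)
    return m_words

print(target_side_vocabulary("Neural network net", PHRASE_TABLE))  # ['neuronal', 'red', 'malla']
```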
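Claims 3, 4, 10, and 11 restrict the model's output classification matrix to the target-side vocabulary table: rows of an initial classification matrix covering several domains are filtered down to the rows for the in-domain target-side words, so output probabilities are computed only over those words. The NumPy sketch below illustrates one way such filtering could look; the vocabulary, matrix shapes, and the softmax-style scoring are assumptions for illustration only.

```python
import numpy as np

# Hypothetical full target-side vocabulary of the general model (several domains).
FULL_VOCAB = ["traducción", "neuronal", "red", "malla", "bolsa", "acción"]

def filter_classification_matrix(initial_matrix: np.ndarray,
                                 full_vocab: list,
                                 target_side_vocab: list):
    # Claim 4: keep only the rows of the initial classification matrix that
    # correspond to the target-side vocabulary table of the target domain.
    keep = [full_vocab.index(w) for w in target_side_vocab if w in full_vocab]
    return initial_matrix[keep], [full_vocab[i] for i in keep]

def output_probabilities(decoder_state: np.ndarray,
                         target_matrix: np.ndarray) -> np.ndarray:
    # Claim 3: the filtered classification matrix turns a decoder state into a
    # probability for each remaining target-side word (a softmax here).
    logits = target_matrix @ decoder_state
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

hidden_size = 8
initial_matrix = np.random.randn(len(FULL_VOCAB), hidden_size)  # one row per word
target_matrix, kept_vocab = filter_classification_matrix(
    initial_matrix, FULL_VOCAB, ["neuronal", "red", "malla"])
probs = output_probabilities(np.random.randn(hidden_size), target_matrix)
print(dict(zip(kept_vocab, probs.round(3))))
```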
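Claim 7 allows the domain phrase table either to be extracted from an in-domain parallel corpus or to be obtained by filtering a multi-domain initial phrase table against the reference content. The fragment below sketches only the second, filtering variant; the word-overlap criterion is an assumed stand-in for whatever extraction or alignment procedure an implementation would actually use.

```python
from typing import Dict, List

def filter_initial_phrase_table(initial_table: Dict[str, List[str]],
                                reference_content: List[str]) -> Dict[str, List[str]]:
    # Keep only entries whose source-side word occurs in the in-domain
    # reference content (an assumed, simplistic relevance criterion).
    reference_words = {w for sent in reference_content for w in sent.lower().split()}
    return {src: tgts for src, tgts in initial_table.items() if src in reference_words}

initial_table = {"network": ["red", "malla"], "stock": ["acción"], "neural": ["neuronal"]}
reference = ["Neural network training data", "A network of neurons"]
print(filter_initial_phrase_table(initial_table, reference))  # only in-domain entries remain
```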
CN201811286094.8A 2018-10-31 2018-10-31 Machine translation method, device and readable storage medium Active CN109359308B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811286094.8A CN109359308B (en) 2018-10-31 2018-10-31 Machine translation method, device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811286094.8A CN109359308B (en) 2018-10-31 2018-10-31 Machine translation method, device and readable storage medium

Publications (2)

Publication Number Publication Date
CN109359308A true CN109359308A (en) 2019-02-19
CN109359308B CN109359308B (en) 2023-01-10

Family

ID=65347516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811286094.8A Active CN109359308B (en) 2018-10-31 2018-10-31 Machine translation method, device and readable storage medium

Country Status (1)

Country Link
CN (1) CN109359308B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442878A (en) * 2019-06-19 2019-11-12 腾讯科技(深圳)有限公司 Interpretation method, the training method of Machine Translation Model, device and storage medium
WO2021077559A1 (en) * 2019-10-25 2021-04-29 北京小米智能科技有限公司 Information processing method and apparatus, and storage medium
CN114139560A (en) * 2021-12-03 2022-03-04 山东诗语翻译有限公司 Translation system based on artificial intelligence
WO2023005763A1 (en) * 2021-07-29 2023-02-02 北京有竹居网络技术有限公司 Information processing method and apparatus, and electronic device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140200878A1 (en) * 2013-01-14 2014-07-17 Xerox Corporation Multi-domain machine translation model adaptation
CN108132932A (en) * 2017-12-27 2018-06-08 苏州大学 Neural machine translation method with replicanism
CN108647214A (en) * 2018-03-29 2018-10-12 中国科学院自动化研究所 Coding/decoding method based on deep-neural-network translation model

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140200878A1 (en) * 2013-01-14 2014-07-17 Xerox Corporation Multi-domain machine translation model adaptation
CN108132932A (en) * 2017-12-27 2018-06-08 苏州大学 Neural machine translation method with replicanism
CN108647214A (en) * 2018-03-29 2018-10-12 中国科学院自动化研究所 Coding/decoding method based on deep-neural-network translation model

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442878A (en) * 2019-06-19 2019-11-12 腾讯科技(深圳)有限公司 Interpretation method, the training method of Machine Translation Model, device and storage medium
CN110442878B (en) * 2019-06-19 2023-07-21 腾讯科技(深圳)有限公司 Translation method, training method and device of machine translation model and storage medium
WO2021077559A1 (en) * 2019-10-25 2021-04-29 北京小米智能科技有限公司 Information processing method and apparatus, and storage medium
US11461561B2 (en) 2019-10-25 2022-10-04 Beijing Xiaomi Intelligent Technology Co., Ltd. Method and device for information processing, and storage medium
WO2023005763A1 (en) * 2021-07-29 2023-02-02 北京有竹居网络技术有限公司 Information processing method and apparatus, and electronic device
CN114139560A (en) * 2021-12-03 2022-03-04 山东诗语翻译有限公司 Translation system based on artificial intelligence

Also Published As

Publication number Publication date
CN109359308B (en) 2023-01-10

Similar Documents

Publication Publication Date Title
US20240078386A1 (en) Methods and systems for language-agnostic machine learning in natural language processing using feature extraction
US11386271B2 (en) Mathematical processing method, apparatus and device for text problem, and storage medium
US10650102B2 (en) Method and apparatus for generating parallel text in same language
CN110019701B (en) Method for question answering service, question answering service system and storage medium
CN109359308A (en) Machine translation method, device and readable storage medium storing program for executing
US20160162569A1 (en) Methods and systems for improving machine learning performance
CN107515855B (en) Microblog emotion analysis method and system combined with emoticons
CN109165384A (en) A kind of name entity recognition method and device
CN106682170B (en) Application search method and device
CN108228576B (en) Text translation method and device
CN109657204A (en) Use the automatic matching font of asymmetric metric learning
CN107748744B (en) Method and device for establishing drawing box knowledge base
US11651015B2 (en) Method and apparatus for presenting information
CN116127020A (en) Method for training generated large language model and searching method based on model
CN114547274B (en) Multi-turn question and answer method, device and equipment
CN110059183A (en) A kind of automobile industry User Perspective sensibility classification method based on big data
CN116303537A (en) Data query method and device, electronic equipment and storage medium
CN109359198A (en) A kind of file classification method and device
CN111144137B (en) Method and device for generating corpus of machine post-translation editing model
CN105701743A (en) Cloud word learning system and method
Rahman A Cross Modal Deep Learning Based Approach for Caption Prediction and Concept Detection by CS Morgan State.
US20210142002A1 (en) Generation of slide for presentation
CN113190692B (en) Self-adaptive retrieval method, system and device for knowledge graph
CN113255331B (en) Text error correction method, device and storage medium
CN110738050A (en) Text recombination method, device and medium based on word segmentation and named entity recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant