CN104050160B - Spoken translation method and apparatus combining machine translation with human translation - Google Patents

Spoken translation method and apparatus combining machine translation with human translation

Info

Publication number
CN104050160B
CN104050160B (application CN201410090457.6A)
Authority
CN
China
Prior art keywords
translation
statement
sentence
machine
language
Prior art date
Application number
CN201410090457.6A
Other languages
Chinese (zh)
Other versions
CN104050160A (en)
Inventor
高鹏
Original Assignee
北京紫冬锐意语音科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京紫冬锐意语音科技有限公司
Priority to CN201410090457.6A
Publication of CN104050160A
Application granted
Publication of CN104050160B

Abstract

The present invention provides a spoken translation method and apparatus that combines machine translation with human translation. The method includes: recognizing a continuous speech passage and segmenting it into sentences to obtain input text in sentence units; searching a database for the input text and, if a corresponding target sentence is found, outputting that sentence directly as speech; otherwise translating the input text by machine translation to obtain a target sentence and scoring the confidence of that sentence; obtaining a target sentence by human translation of the input text; and evaluating the machine translation by its confidence score and the human translation by a quality estimate, then synthesizing the better target sentence into speech with an adjustable prosody.

Description

Spoken translation method and apparatus combining machine translation with human translation
Technical field
The present invention relates to the field of automatic computer information processing, and in particular to a spoken language translation method and apparatus that combines machine translation with human translation.
Background technology
Research and development of automatic translation systems has been under way for over fifty years: ever since the electronic computer was born in the 1940s, people have explored applying computers to translation. Since the late 1980s, researchers have devoted themselves to speech-to-speech translation, that is, having a computer translate speech in one language into speech in another. The basic vision is to let the computer act between speakers of different languages, as a human interpreter does. Because speakers generally use everyday spoken language, people hope that a machine translation system can accept and translate arbitrary spoken utterances; with the rapid development of speech recognition and spoken language understanding, this hope is no longer a boundless fantasy. Speech translation is therefore often referred to as spoken language translation (SLT).
Spoken language translation involves automatic speech recognition, machine translation, speech synthesis, and other disciplines and technologies, and thus has great research significance. In the early 1990s, along with the continuous improvement of speech recognition, spoken language translation gradually flourished. With the continuing development of its core technologies, spoken language translation is no longer confined to high-quality translation within a fixed domain; its goal is to enable communication across languages. Research in spoken language translation focuses on: 1) mining dialogue structure in interactive speech translation; 2) studying how spoken language differs from written language; 3) system performance and robustness. In recent years, following the establishment of statistical frameworks in speech recognition, more and more machine translation and spoken translation systems have adopted statistical modeling. Traditional spoken translation systems, limited by the technology of their time, could only be applied under constrained speech conditions, and the major challenge for current techniques is solving real-life spoken communication. Application scenarios range from integrated information services for international conferences (including sports events) to travel information consultation, so from both a social and an economic perspective, spoken language translation (SLT) is highly relevant to our globalized world.
Spoken language is the main medium of people's daily communication, and the terseness of its expression and the breadth of its use are receiving increasing attention. Developing machine translation systems for spoken content, on convenient and portable embedded platforms, is an important direction for practical machine translation. However, several characteristics of spoken language make practical spoken translation systems hard to develop. The main difficulties are: 1) in real scenes such as spoken dialogue and Internet chat, input sentences are often non-standard, making it hard to capture their underlying grammatical structure, so statistical translation output tends to be stiff and unstable; 2) statistical machine translation is data driven and depends for its existence on bilingual data resources, yet accumulated bilingual corpora for spoken language (Chinese-English) remain quite scarce, so at present a spoken translation system relying entirely on statistical methods cannot fully meet the broad demands of daily life; 3) unlike text translation, spoken translation mainly serves communication between people of different languages, so its real-time requirements are high, and optimizing the translation workflow in a network environment is key to improving the user experience.
Summary of the invention
The present invention aims to provide a spoken translation method combining machine translation with human translation, which can address the stiffness and instability of purely machine spoken translation and, by continuously accumulating a large-scale sentence-level spoken translation database, improve the degree of automation of translation and accelerate the commercialization of spoken translation systems.
According to one aspect of the present invention, a spoken translation method combining machine translation with human translation is provided, including:
Step 1: recognizing a continuous speech passage and segmenting it into sentences to obtain input text in sentence units;
Step 2: searching a database for the input text; if a corresponding target sentence is found, outputting it directly as speech, otherwise going to step 3;
Step 3: translating the input text by machine translation to obtain a target sentence, and scoring the confidence of the target sentence;
Step 4: obtaining a target sentence by human translation of the input text;
Step 5: evaluating the machine translation by its confidence score and the human translation by a quality estimate, and synthesizing the better target sentence into speech with an adjustable prosody.
According to another aspect of the present invention, a spoken translation apparatus combining machine translation with human translation is disclosed, including:
a speech recognition and segmentation module, which recognizes a continuous speech passage and segments it into sentences to obtain input text in sentence units;
a template retrieval and substitution module, which searches a database for the input text; if a corresponding target sentence is found, it is output directly as speech, otherwise control passes to the machine translation module;
a hierarchical-phrase-based machine translation module, which translates the input text by machine translation to obtain a target sentence and scores its confidence;
an intelligent crowdsourced human translation module, which obtains a target sentence by human translation of the input text;
a quality assessment module, which evaluates the machine translation by its confidence score and the human translation by a quality estimate, and decides the final translation result;
a speech synthesis output module, which synthesizes the final translation result decided by the quality assessment module into speech with an adjustable prosody.
In the scheme proposed above, the results of the different translation methods are uniformly delivered to the user as synthesized speech, meeting the real-time demands of spoken communication. To convey the source-language speaker's emotion vividly, the method recognizes the user's emotion and presents it to the target-language user by illustration or synthesis, improving the experience in the user's application scenario.
By segmenting passages into sentences, the present invention enables high-quality human translations to be stored and effectively reused, overcoming the problem that traditional free human translations are hard to exploit by statistical translation methods; by speech synthesis, it stably renders the results of the different translation methods as emotion-bearing speech, well solving the real-time communication problem in speech translation; and it can solve the spoken translation problems of massive numbers of mobile users in scenes such as daily life, tourism, study, and simple dialogue, accelerating the commercialization of automatic spoken translation technology, which is of very great practical significance.
Description of the drawings
Fig. 1 is a flowchart of the spoken translation method combining machine translation with human translation proposed by the present invention;
Fig. 2 is a schematic diagram of sentence segmentation performed on the character string "ABCDEFGH";
Fig. 3 is the projected user growth curve for the coming two years;
Fig. 4 is a flow diagram of the translation confidence assessment method of the present invention;
Fig. 5 is a schematic diagram of the spoken translation apparatus combining machine translation with human translation proposed by the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Referring to Fig. 1, the flow of the spoken translation method combining machine translation with crowdsourced human translation according to the present invention is as follows:
S11: the source-language speech is input through a terminal device; the continuous speech is segmented into sentences, with automatic speech recognition and automatic punctuation insertion at the core, to obtain the sentence-level input text required for translation;
S12: after each text segment enters the system, the translation database is searched first; if the match succeeds, the retrieved translation text is output and the flow jumps to step S17;
S13: if no corresponding data can be retrieved, the intelligent management module is started; it computes the complexity of the input sentence and, combined with the user's tier, decides whether to enable human translation, while simultaneously sending the data to the machine translation module;
S14: the hierarchical-phrase-based statistical machine translation method, fused with the parameter training results of forced alignment, performs translation confidence estimation;
S15: if human translation is enabled, crowdsourced human translation is used, with a visualization terminal in the back end that lets interpreters complete the text translation sentence by sentence;
S16: the machine translation result and the human translation result are comprehensively assessed by translation confidence, and the better translation text is output;
S17: the output translation text is synthesized into speech of consistent quality, which can convey the user's emotion vividly and well meets the practical demand for face-to-face real-time translation.
The above method steps are described in detail below through specific embodiments.
Sentence segmentation of continuous speech in step S11
In online continuous speech recognition, speech flows endlessly into the recognition module. Although feature transformation and per-frame posterior probabilities can be computed in real time, neither the decoder nor memory can bear arbitrarily long speech data. Segmentation is therefore extremely important in online continuous speech recognition: it not only guarantees stable system operation but also has a vital effect on recognition performance. Traditional online continuous speech recognition segments according to the length of silences in the speech. Such segmentation works reasonably well on read or news speech, but its effect drops sharply on natural speech: in natural speaking, pausing in the middle of a sentence is a common phenomenon, and silence-based segmentation cuts wherever there is a pause, causing recognition errors in both speech fragments around it. Moreover, in natural speaking, emotional fluctuation can make many sentences run together without a single pause; silence-based segmentation has no way to detect such boundaries, which in turn hurts recognition.
Our online continuous speech recognition uses a segmentation method with recognition accuracy at its core, so that segmentation is carried out under the precondition of not losing accuracy. Such a segmentation method can automatically judge sentence boundaries in natural speech.
Fig. 2 shows how the present invention segments the character string "ABCDEFGH". As shown in Fig. 2, suppose a stretch of speech is searched according to the acoustic model and language model scores and the result so far is ABCDEFG. The next character is H, and two cases must be considered when deciding whether to cut here.
In the first case we cut here, i.e., the upper path in the figure, and the language model probability is:
P1 = Ph · P(</s> | wn-x+2 ... wn) · P(w1 = H | <s>)   (1)
where w = w1w2...wn is a character string of length n and x is the order of the n-gram language model. In this example the string is ABCDEFG, of length 7. Ph is the history probability of the recognized string ABCDEFG, which is the common starting state of both cases. P(</s> | ...) is the probability that the last character G of the recognized string ends a sentence, and P(w1 = H | <s>) is the probability that the not-yet-recognized character H begins a new sentence.
In the second case we do not cut here, i.e., the lower path in the figure, and the language model probability is:
P2 = Ph · P(H | wn-x+2 ... wn)   (2)
where Ph is the same as in formula (1), and P(H | ...) is the probability that character H follows character G.
Comparing the two language model probabilities: if P1 > P2, we cut here and start a new sentence at H; otherwise the sentence is not over, H is appended after G to form the new string ABCDEFGH, and the same procedure is applied to the following character I. This method not only excludes the false alarms brought by pauses within a sentence but also avoids misses in continuous speech without pauses, so that recognition accuracy is maximized. Moreover, since this segmentation method does not depend on silence, it also avoids the erroneous cuts caused by inaccurate silence detection.
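Since the shared history probability Ph appears in both P1 and P2, it cancels out of the comparison, and the cut decision reduces to comparing two products of conditional probabilities. A minimal sketch with toy bigram probabilities (the numbers are invented for illustration, not model values from the patent):

```python
import math

# Toy bigram log-probabilities; in the real system these would come from
# the trained n-gram language model.
LOG_P = {
    ("G", "</s>"): math.log(0.4),  # probability that G ends a sentence
    ("<s>", "H"): math.log(0.5),   # probability that H starts a new sentence
    ("G", "H"): math.log(0.1),     # probability that H simply follows G
}

def should_cut(prev_char, next_char):
    """Compare P1 (cut: prev ends the sentence, next starts a new one)
    with P2 (no cut: next follows prev). Ph cancels out of the comparison."""
    log_p1 = LOG_P[(prev_char, "</s>")] + LOG_P[("<s>", next_char)]
    log_p2 = LOG_P[(prev_char, next_char)]
    return log_p1 > log_p2

print(should_cut("G", "H"))  # True: log(0.4 * 0.5) > log(0.1)
```

Working in log space avoids underflow when the history grows long, which matters in the online setting described above.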
Large-scale translation database retrieval in step S12
The translation database stores high-quality translation results produced by human translation or human proofreading. The database serves two tasks in the system:
After a text segment has been sentence-segmented, retrieval is performed first in the translation database; if the match succeeds, the result is returned immediately. Matching here is realized by computing the similarity between sentences. Sentence similarity refers to the degree of semantic match between two sentences, a real value in [0, 1]: the larger the value, the more similar the two sentences. A value of 1 indicates that the two sentences are semantically identical; the smaller the value, the lower the similarity, and a value of 0 indicates that the two sentences are semantically entirely different. Here we mainly compute sentence similarity with a TF-IDF method over a vector space model fused with semantic information. Because spoken language lacks standard grammatical structure, the syntactic-semantic structure of the input text is hard to capture; the semantic information used here therefore mainly consists of shallow semantics such as time expressions, numerals, and named entities, on the basis of which substitutions are performed to generate translation templates that match the input text.
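A minimal vector-space TF-IDF similarity, without the shallow semantic substitutions the text layers on top, can be sketched as follows (the toy corpus is invented for illustration):

```python
import math
from collections import Counter

def tfidf_vectors(sentences):
    """TF-IDF vectors over a tokenized corpus, with a smoothed IDF."""
    n = len(sentences)
    df = Counter(t for s in sentences for t in set(s))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: c / len(s) * idf[t] for t, c in Counter(s).items()}
            for s in sentences]

def cosine(a, b):
    """Cosine similarity of two sparse vectors, in [0, 1] for TF-IDF."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = [["where", "is", "the", "station"],
          ["where", "is", "the", "hotel"],
          ["good", "morning"]]
vecs = tfidf_vectors(corpus)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # True
```

The entity substitutions described above would replace, say, a place name with a slot before vectorizing, so that "where is the station" and "where is the hotel" could match the same translation template.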
Since a statistical machine translation system is corpus driven, its performance is inseparable from the corpora fed into it. In particular, our translation database collects real-time, accurate online corpora; as this data grows in scale and completeness, the performance of the machine translation system will keep improving.
Intelligent coordination in step S13
If the input sentence is not retrieved in the translation database, the system starts the intelligent coordination module, which, according to the actual user's tier and the computed complexity of the input text, decides whether the translation should also be completed by human interpreters.
Users are divided by the fees they pay into three tiers: free users, ordinary users (who pay a small fee), and premium users (who have higher accuracy requirements).
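The routing decision of the intelligent coordination module can be sketched as a complexity threshold per user tier; the threshold values and the complexity scale below are illustrative assumptions, not values from the patent:

```python
def needs_human_translation(complexity, user_tier):
    """Dispatch a sentence to crowdsourced human translation when its
    estimated complexity exceeds the tier's tolerance; higher-paying tiers
    get human help sooner. Thresholds here are invented for illustration."""
    tolerance = {"free": 0.9, "ordinary": 0.6, "premium": 0.3}
    return complexity > tolerance[user_tier]

print(needs_human_translation(0.5, "premium"))  # True: premium gets help early
print(needs_human_translation(0.5, "free"))     # False: free tier stays machine-only
```

Tying the threshold to the tier is one simple way to realize the text's trade-off between human translation cost and the accuracy demands of paying users.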
Fig. 3 shows the projected user growth curve for the coming two years. As seen in Fig. 3, the early development phase mainly takes in free users while the intelligent translation platform is built and improved; as the user base grows, combined with the platform's own self-optimizing updates, more and more paying users will be attracted. In addition, the intelligent management module also controls the addition of the better translation results to the database, ensuring its continuous updating. Once this material accumulates to a certain scale, part of the corpus can be added to the statistical translation system to update the translation model and the language model, so that the whole translation system keeps improving as real-time corpora are injected. Therefore, as the scale of data processed by the system grows, the proportion of translation tasks that automatic modules such as database retrieval and machine translation can complete rises steadily, which greatly reduces the cost of human translation and improves system efficiency.
In step S14, the hierarchical-phrase-based statistical machine translation method, fused with the parameter training results of forced alignment, performs translation confidence assessment.
For spoken translation, a hierarchical phrase statistical machine translation method based on synchronous context-free grammar (SCFG) is used. The basic idea is to extract massive numbers of aligned phrase fragments from a large bilingual corpus and store them on a storage medium as a knowledge-source "memory". For the input text produced by speech recognition, an efficient search algorithm matches phrase fragments and combines them into the sentence to be translated. A confidence computation method fused with parameter training based on forced alignment is then established, chiefly studying the translation uncertainty brought by the translation probability features, and finally generating a confidence score for each target-language translation result.
The hierarchical phrase statistical translation model is built on synchronous context-free grammar (SCFG) and belongs to the category of formal grammars. A hierarchical-phrase-based statistical translation system has the general characteristics of statistical machine translation based on formal grammar: it borrows the structure of a formal grammar so that the translation process is hierarchical and structured. It can therefore turn long-range reordering within a sentence into local reordering, and can introduce variables into the tree structure to solve problems such as generalization. Meanwhile, compared with statistical translation systems that introduce linguistic syntax, formal grammar has the advantages of needing no extra syntactic analysis resources and being fully compatible with phrase-based systems. In particular, in real scenes such as spoken dialogue and Internet chat, input sentences are often non-standard and differ from real written sentences, which greatly degrades the performance of syntactic analysis. A statistical machine translation system built on linguistic grammar in such an environment would suffer greatly from the low syntactic analysis performance; a statistical machine translation system based on formal grammar, however, is not much affected in such a language environment, because the formal grammar model retains the advantages of the phrase-based model in this respect. The hierarchical phrase translation method with confidence-scored output is described below:
1) Hierarchical phrase translation
Under the hierarchical-phrase-based translation scheme, the task of the decoder is to find the optimal target-language string ê satisfying
ê = e( argmaxD P(D) )
where e denotes the target-language string and D denotes the derivation process that generates the target string through a series of steps. That is, the decoder finds the synchronous grammar derivation D that generates the source-language string; the target string generated at the target-language side of the synchronous grammar in D is the final translation. Note that what is found here is the derivation of maximum probability rather than the target-language string of maximum probability, because many derivations may generate the same target string, and finding the target string of maximum probability is computationally more expensive.
According to the log-linear model, the probability P(D) may be considered a log-linear combination of multiple features:
P(D) ∝ ∏i Φi(D)^λi
where Φi and λi are the i-th feature function and its corresponding weight. Apart from the language model feature PLM, every other feature used can be written as a product over rules:
Φi(D) = ∏X→<γ,α>∈D φi(X → <γ,α>)
where X → <γ,α> ranges over the context-free rules used in the derivation D of the hierarchical phrase system, so P(D) can be rewritten as:
P(D) ∝ PLM(e)^λLM · ∏i≠LM ∏X→<γ,α>∈D φi(X → <γ,α>)^λi
Traditional statistical translation systems such as Pharaoh use linear decoding algorithms such as beam search or A*; although they can incorporate a simple reordering model and an N-gram language model, the word order of their output is always barely satisfactory. A hierarchical-phrase-based translation system, however, combines tree-structure information at the front end, so the decoding algorithm at the back end must likewise use tree parsing algorithms. The present invention therefore borrows from chart-based syntactic analysis and implements a CYK-style decoder.
The CYK algorithm is an improved bottom-up shift-reduce algorithm. Decoding produces a large number of hypotheses; to avoid searching all possibilities, we employ a stack structure and perform the necessary pruning during search using different strategies. An AddEdge function is set up for each edge to be entered into the chart: it first checks whether the edge survives threshold pruning, then, if so, whether the edge should be merged by recombination, and finally whether it survives histogram pruning. Only edges that pass all three pruning checks are finally stored in the chart.
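The three-stage AddEdge check can be sketched as follows. The chart cell is modeled here as a dict keyed by a recombination signature (for instance, the target-side language model state); the beam and histogram parameters are illustrative, and scores are log-probabilities:

```python
def add_edge(cell, edge, beam=5.0, histogram=10):
    """Sketch of the AddEdge pruning described in the text: threshold
    (beam) pruning, then recombination, then histogram pruning.
    cell maps a recombination signature to its best-scoring edge."""
    score, sig = edge["score"], edge["signature"]
    best = max((e["score"] for e in cell.values()), default=score)
    if score < best - beam:                       # 1. threshold pruning
        return False
    old = cell.get(sig)
    if old is not None:                           # 2. recombination
        if score <= old["score"]:
            return False                          # worse duplicate, drop it
        cell[sig] = edge                          # better duplicate, replace
        return True
    if len(cell) >= histogram:                    # 3. histogram pruning
        worst = min(cell, key=lambda s: cell[s]["score"])
        if score <= cell[worst]["score"]:
            return False
        del cell[worst]                           # evict the worst edge
    cell[sig] = edge
    return True
```

Recombination keeps only the best edge per signature, which is safe for finding the 1-best derivation because two edges with the same signature are interchangeable in any larger derivation.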
2) Translation confidence assessment fused with parameter training based on forced alignment
In the ordinary sense, confidence is a measure of the probability that something is correct; it reflects the reliability of a particular event occurring. In machine translation it denotes the assessment of a given translation result without a reference answer. In this system we use not only system-external information such as source/target-language perplexity and length, but also fuse various kinds of information from the translation process itself, better simulating a human's scoring of a translation result.
Fig. 4 shows the flow of the translation confidence assessment method of the present invention. As shown in Fig. 4, in the confidence assessment method based on forced-alignment parameter training, the parameter training of the model is first completed with the forced alignment technique, retaining information such as the probabilities in the alignment process; meanwhile, knowledge such as the perplexity and length of the source string S and the target string T is incorporated into the confidence assessment as features. Through machine learning, these kinds of information can be fused into one unified framework that approximates the objective function as closely as possible. In this system we use support vector regression (SVR) as the machine learning tool, because the SVR objective can well approximate our target: mimicking human scoring of translation results.
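The feature fusion can be sketched as a weighted combination squashed into (0, 1). The patent describes an SVR for this fusion, so the plain linear-plus-sigmoid form and the example weights below are simplifying assumptions for illustration:

```python
import math

def confidence(features, weights, bias=0.0):
    """Fuse confidence features (e.g. model log-prob, negated source
    perplexity, length ratio) into one score in (0, 1). A stand-in for
    the SVR fusion described in the text; weights here are illustrative."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))  # squash to (0, 1)

# model score, negated perplexity term, length-ratio term (invented numbers)
score = confidence([0.8, -0.2, 0.1], [2.0, 1.0, 1.0])
print(round(score, 3))
```

The trained regressor replaces the hand-set weights; the point is only that heterogeneous features reduce to a single comparable confidence value for step S16.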
Crowdsourced human translation in step S15:
1) Crowdsourced human translation
Crowdsourcing refers to the practice by which a company outsources tasks, on a free and voluntary basis, to a non-specific crowd on the network; it is an Internet-based socialized production pattern that can be embedded into every link of business operation, forming a new type of organization. The concept was proposed by the well-known US IT magazine Wired and has been called, after the "long tail theory", another important business concept in the industry. In organizing human translation, this invention adopts the crowdsourcing pattern, organizing passers-by with a certain spoken translation ability and using collective intelligence to create a socialized service from which everyone benefits.
Although the concept of crowdsourced translation has been proposed, and websites such as Yeeyan (译言), Hupu (虎扑), and Guokr (果壳) run article- and book-translation communities in the crowdsourcing pattern, the main translation objects of these community media are texts circulating in real time on the network, such as political news and sports events; suitable source texts are generally found by senior forum members and website staff, who also manage the skill level and division of labor of the "translation group". The crowdsourced translation in this invention, by contrast, is oriented to the mutual spoken information exchanged in daily life, which has already been sentence-segmented. Compared with crowdsourced translation of written texts such as news, the spoken crowdsourced translation proposed by this invention therefore has the following advantages:
Spoken content is expressed for the purpose of communication in real life, so more "humanized" translation results are needed. Crowdsourced translation can use the strength of the crowd, not only reducing the costs of development, operation, and manpower, but also bringing the translation results closer to user needs.
Spoken content touches on many aspects of people's clothing, food, housing, and travel; crowdsourced translation can gather more amateur translation talent, fully demonstrating the advantage of collective intelligence.
In the visualization terminal, what the interpreter faces is the sentence-segmented text. In general, interpreters are good at giving suggestions on well-posed problems and providing translation answers to targeted problems, so this setting helps interpreters translate in real time and efficiently.
By loading the segmented corpus onto mobile terminals (iPad, mobile phone) for interpreters to translate and edit, and finally letting interpreters optionally enter the translation result into the terminal by voice, the purpose of real-time translation is realized.
2) interpreter terminal
By providing a background visualization terminal that lets the translator complete the text translation sentence by sentence, the invention lowers the difficulty of interpretation and reduces the cost of human translation while still meeting the demands of real-time translation. This is mainly reflected in the following aspects:
The fragmented text is displayed directly on the background visualization terminal, so the translator no longer faces long, continuous speech but fragmented, visualized text. On the one hand, because the fragmented text is organized in units of sentences, the translator no longer needs to memorize large amounts of information, or even whole passages, which reduces the demand on the translator's ability to summarize. On the other hand, the input is converted into visualized text, so the translator can quickly and conveniently translate the text he or she sees into the target language. This working mode relaxes the demands on the translator's various skills, so that "amateur" translators can also join quickly, greatly increasing the applicability of the system.
The words that need human translation are highlighted by setting different states for the current task. For example, different colors can be used in the terminal to indicate whether the data were retrieved from the database, whether they have been machine translated, and whether human translation is needed. When human translation is needed, a highlight distinct from the other states can be used. Highlighting particular states in this way lets the translator conveniently and quickly complete the sentence-by-sentence translation of the text, lowering the difficulty of interpretation and the cost of human translation.
Finally, thanks to the excellent speech recognition system, the translator can choose to enter the translation result into the terminal by voice; the embedded speech recognition module converts it into text automatically. Typing in the translation result character by character is no longer necessary, which reduces manual effort and improves efficiency.
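The state-dependent highlighting described above can be sketched as a simple state-to-color mapping (all state names and colors here are illustrative assumptions, not specified by the patent):

```python
from enum import Enum

class TaskState(Enum):
    """Illustrative display states for a sentence-level translation task."""
    RETRIEVED = "green"           # target sentence found in the translation database
    MACHINE_TRANSLATED = "blue"   # machine translation result available
    NEEDS_HUMAN = "yellow"        # highlighted: waiting for a human translator

def display_color(db_hit: bool, mt_done: bool, needs_human: bool) -> str:
    """Map a task's status flags to a highlight color for the terminal."""
    if db_hit:
        return TaskState.RETRIEVED.value
    if needs_human:
        return TaskState.NEEDS_HUMAN.value
    if mt_done:
        return TaskState.MACHINE_TRANSLATED.value
    return "gray"  # still pending
```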
In step S16, the method for comprehensively assessing the automatic translation and the human translation is as follows:
After the automatic translation and the human translation are completed, all results are fed into this module for quality assessment and combined output. The module scores according to the confidence of the automatic translation and the rank of each translator, and, drawing on the BLEU computation used in automatic translation evaluation, comprehensively assesses the results of the different translation methods; the better result is returned to the speech synthesis module.
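The combination of machine confidence and translator rank might look like the following sketch (the 0.5/0.5 weighting and the candidate format are assumptions; the patent's BLEU-based comparison is not reproduced here):

```python
def assess(candidates):
    """Pick the best translation among machine and human candidates.

    Each candidate is a dict with 'text' and 'confidence' in [0, 1];
    human candidates additionally carry 'translator_rank' in [0, 1].
    The equal weighting below is illustrative, not from the patent.
    """
    def score(c):
        if "translator_rank" in c:
            # Human output: blend translation confidence with translator rank.
            return 0.5 * c["confidence"] + 0.5 * c["translator_rank"]
        return c["confidence"]
    return max(candidates, key=score)
```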
Fig. 5 shows a schematic diagram of the spoken language translation apparatus, proposed by the present invention, in which machine translation and human translation are blended. As shown in Fig. 5:
S11: The speech of the source language is input through a terminal device (such as a mobile phone); the continuous input speech is segmented into sentences, with automatic speech recognition and automatic punctuation insertion at the core, to obtain the sentence-level input text required for translation;
S12: After each text corpus enters the system, retrieval is first performed in the translation database; if the match succeeds, the retrieved translated text is output, and after quality assessment the speech synthesis result of the translated text is output directly;
S13: If no corresponding data can be retrieved, the intelligent management module is started; it decides whether to enable human translation by computing the complexity of the input sentence and considering the level of the user, while the data are also sent to the machine translation module;
S14: A statistical machine translation method based on hierarchical phrases, fused with the parameter training results of forced alignment, is used to estimate the translation confidence;
S15: If it is decided to enable human translation, the crowdsourced human translation method is used, and a background visualization terminal (an iPad or another computer) lets the translator complete the text translation sentence by sentence;
S16: The machine translation result and the human translation result are comprehensively assessed according to translation confidence, and the better translated text is output;
S17: The output translated text is synthesized into speech output of consistent quality; such speech output can well satisfy the practical demand for face-to-face real-time translation.
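Steps S12-S16 above can be sketched as the following routing logic (a minimal illustrative Python sketch; the interfaces `db`, `machine_translate`, and `human_translate`, the word-count complexity proxy, and the 0.9 confidence threshold are all assumptions, not taken from the patent):

```python
def complexity(sentence: str) -> float:
    """Crude proxy: longer sentences are considered harder (illustrative only)."""
    return len(sentence.split()) / 10.0

def translate_sentence(sentence, db, machine_translate, human_translate, user_level=1.0):
    """Illustrative control flow for steps S12-S16.

    db: dict mapping source sentences to target sentences (S12);
    machine_translate(s) -> (text, confidence) (S14);
    human_translate(s) -> text (S15). All interfaces are assumptions.
    """
    if sentence in db:                                 # S12: database retrieval
        return db[sentence]
    mt_text, mt_conf = machine_translate(sentence)     # S14: MT + confidence
    if complexity(sentence) > user_level:              # S13: route to a human?
        human_text = human_translate(sentence)         # S15: crowdsourced translation
        # S16: prefer the human result unless MT confidence is very high
        return mt_text if mt_conf > 0.9 else human_text
    return mt_text
```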
The invention also proposes a speech recognition and spoken language translation conversion apparatus, which includes:
A speech recognition and fragmentation module: it performs automatic speech recognition and automatic punctuation insertion on continuous speech, forming sentence-level text to be translated;
A template retrieval and substitution module for the large-scale translation database: it uses a TF-IDF method, based on the vector space model and fused with semantic information, to complete large-scale database retrieval of spoken translation example sentences, and uses shallow syntactic information to complete partial-content substitution of the translation data, strengthening the generalization capacity of the database;
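The retrieval step can be illustrated with a plain TF-IDF/cosine computation over tokenized sentences (a minimal sketch of the vector space model only; the semantic-information fusion and shallow-syntax substitution described above are not modeled here):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (word -> weight) for tokenized sentences."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))  # document frequency: count each word once per doc
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({w: tf[w] * idf[w] for w in tf})
    return vecs, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

The most similar database sentence (highest cosine score) would then supply the stored target sentence.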
An intelligent management module: its main function is to decide whether to enable human translation by computing the complexity of the input sentence and considering the level of the user, while the data are also sent to the machine translation module. It also continuously collects and organizes the results of crowdsourced human translation, forming a high-quality translation database that provides efficient retrieval services and supplies high-quality bilingual corpora for updating and optimizing the machine translation model. As the translation database and the machine translation system accumulate and improve, the proportion of database retrieval and machine translation is stepped up, realizing dynamic updating and optimization of the whole translation system;
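The accumulation of vetted crowdsourced results into a high-quality translation database might be sketched as follows (the class name, the quality threshold, and the interface are illustrative assumptions):

```python
class TranslationMemory:
    """Illustrative store that accumulates high-quality crowdsourced
    translations so that future requests hit the database instead of MT."""

    def __init__(self, min_quality=0.8):
        self.db = {}
        self.min_quality = min_quality

    def add(self, source, target, quality):
        # Only sufficiently high-quality pairs enter the database.
        if quality >= self.min_quality and source not in self.db:
            self.db[source] = target

    def lookup(self, source):
        """Return the stored target sentence, or None on a miss."""
        return self.db.get(source)
```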
A machine translation module based on hierarchical phrases: it comprises three parts, a translation model, a language model, and a decoder. Its basic idea is to extract massive aligned phrase fragments from a large-scale bilingual corpus and "memorize" them on a storage medium as a knowledge source. For a Chinese sentence typed in by the user, an efficient search algorithm matches phrase fragments and combines them into the English sentence to be translated. On this basis the present invention also establishes a confidence computation method fused with the parameter training of a forced alignment model, focusing on the translation uncertainty introduced by translation probability features, and finally generates a confidence score for the translation result in each target language.
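As a toy illustration of composing a target sentence from stored phrase fragments, the following greedy longest-match lookup shows the flavor of the approach (a real hierarchical phrase-based decoder searches over many derivations and scores them with translation and language models; this sketch does neither, and the phrase table contents are invented):

```python
def phrase_translate(tokens, phrase_table):
    """Greedily translate by matching the longest known source phrase first.

    tokens: list of source-language tokens.
    phrase_table: dict mapping tuples of source tokens to a target string.
    Unknown tokens are passed through unchanged.
    """
    out, i = [], 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):  # try the longest span first
            key = tuple(tokens[i:j])
            if key in phrase_table:
                out.append(phrase_table[key])
                i = j
                break
        else:
            out.append(tokens[i])  # no phrase matched: copy the token
            i += 1
    return " ".join(out)
```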
An intelligent crowdsourced human translation module: crowdsourcing is used to attract translators who have language skills and are willing to turn them into market value, forming a service community that gathers such talent; tasks are subcontracted through an intelligent platform to provide instant human translation services. The fragmented corpus is pushed to mobile terminals (iPads, mobile phones) for the translator to translate, edit, evaluate, and return. The visualization terminal, which lets the translator complete the text translation sentence by sentence in real time, includes: displaying the fragmented text directly on the background visualization terminal, so that the translator no longer directly faces continuous speech and text, which to a great extent reduces the specific demands on the translator's listening, memory, and summarizing abilities; highlighting the words that need human translation by setting different states for the current task, so that the translator can quickly and conveniently complete the sentence-by-sentence translation of the text, lowering the difficulty of interpretation and reducing the cost of human translation; and finally, after the translation is completed, the translator no longer needs to type in the text character by character but can choose to complete voice input through the terminal device, whose embedded speech recognition module automatically converts it into text and submits it to the system, greatly improving the translator's efficiency.
A quality assessment module: it analyzes the results of database retrieval and the confidence scores provided by the machine translation apparatus, and combines multi-faceted information such as the crowdsourced human translation results, the translation time, the translator's rank, and suitability for the user's level, to comprehensively judge each result and give the final translation result.
A speech synthesis module: it outputs the translation result by way of speech synthesis. To vividly convey the emotion of the source-language user, the recognized user emotion can also be presented to the target-language user by annotation or synthesis, improving the user experience in practical application scenarios.
The specific embodiments described above further explain the purpose, technical solution, and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (9)

1. A spoken language translation method blending machine translation with human translation, characterized by comprising:
Step 1: recognizing a continuous speech passage and performing sentence segmentation on it to obtain input text in units of sentences;
Step 2: performing a database search according to the input text to find whether a corresponding target sentence exists; if so, outputting the target sentence directly as speech, otherwise going to step 3;
Step 3: translating the input text by machine translation to obtain a target sentence, and scoring the confidence of the target sentence;
Step 4: obtaining a target sentence by human translation of the input text;
Step 5: assessing according to the translation confidence of the machine translation and estimating the quality of the human translation, and generating prosody-adjustable speech output from the better-assessed target sentence by the method of speech synthesis;
wherein in step 3 a machine translation system using a statistical method based on hierarchical phrases outputs a target sentence with a confidence measure, the detailed process comprising:
extracting massive aligned phrase fragments from a bilingual corpus and "memorizing" them on a storage medium as a knowledge source; for the input text recognized from speech, matching phrase fragments with a search algorithm and combining them into the target sentence to be translated; and generating a confidence score for each target sentence by a confidence computation method fused with the parameter training of a forced alignment model.
2. The method according to claim 1, characterized in that performing sentence segmentation in step 1 to obtain input text in units of sentences specifically comprises:
segmenting the continuous speech, input in units of speech passages, into sentences with prosody as the principal feature, combined with automatic speech recognition and automatic punctuation insertion, under the precondition of not losing recognition accuracy.
3. The method according to claim 1, characterized in that step 2 specifically comprises:
for each input text, computing the similarity between the input text and the sentences in the database using a TF-IDF method, based on the vector space model and fused with semantic information, and thereby obtaining the target sentence.
4. The method according to claim 1, characterized in that before performing step 4 the following judgment is also needed: computing the complexity of the input text and, considering the category of the user, determining whether human translation is needed.
5. The method according to any one of claims 1-3, characterized in that in step 1 whether to break a sentence is judged according to the recognition probabilities of different language models, wherein the language model recognition probability is the product of the historical probability of the recognized character string, the probability that the last character of the recognized character string is a sentence-final character, and the probability that the first unknown character string starts a sentence.
6. the method for claim 1, it is characterised in that the statistical translation based on level phrase is referred specifically to from source language Speech string finds out the derivation of maximum probability in multiple derivation generating process of target strings, by the derivation corresponding target The result output gone here and there as machine translation.
7. the method for claim 1, it is characterised in that by source in support vector machine just translation process in step 3 The puzzled degree of language strings and target strings, length fusion study, and then confidence level marking is carried out to final object statement.
8. A spoken language translation apparatus blending machine translation with human translation, characterized by comprising:
a speech recognition and fragmentation module, which recognizes a continuous speech passage and performs sentence segmentation on it to obtain input text in units of sentences;
a template retrieval and substitution module, which performs a database search according to the input text to find whether a corresponding target sentence exists; if so, the target sentence is output directly as speech, otherwise the first translation module is entered;
a machine translation module based on hierarchical phrases, which translates the input text by machine translation to obtain a target sentence and scores the confidence of the target sentence;
an intelligent crowdsourced human translation module, which obtains a target sentence by human translation of the input text;
a quality assessment module, which assesses according to the translation confidence of the machine translation, estimates the quality of the human translation, and gives the final translation result;
a speech synthesis output module, which generates prosody-adjustable speech output from the final translation result determined by the quality assessment module, using the method of speech synthesis;
wherein the machine translation module based on hierarchical phrases obtains the target sentence as follows:
extracting massive aligned phrase fragments from a bilingual corpus and "memorizing" them on a storage medium as a knowledge source; for the input text recognized from speech, matching phrase fragments with a search algorithm and combining them into the target sentence to be translated; and generating a confidence score for each target sentence by a confidence computation method fused with the parameter training of a forced alignment model.
9. The apparatus as claimed in claim 8, characterized by further comprising:
an intelligent management module, which decides whether to enable human translation by computing the complexity of the input sentence and considering the category of the user, while sending the data to the machine translation module based on hierarchical phrases.
CN201410090457.6A 2014-03-12 2014-03-12 Interpreter's method and apparatus that a kind of machine is blended with human translation CN104050160B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410090457.6A CN104050160B (en) 2014-03-12 2014-03-12 Interpreter's method and apparatus that a kind of machine is blended with human translation

Publications (2)

Publication Number Publication Date
CN104050160A CN104050160A (en) 2014-09-17
CN104050160B true CN104050160B (en) 2017-04-05

Family

ID=51503013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410090457.6A CN104050160B (en) 2014-03-12 2014-03-12 Interpreter's method and apparatus that a kind of machine is blended with human translation

Country Status (1)

Country Link
CN (1) CN104050160B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104601834B (en) * 2014-12-19 2017-03-22 国家电网公司 Multilingual automatic speech calling and answering device and using method thereof
CN104731775B (en) * 2015-02-26 2017-11-14 北京捷通华声语音技术有限公司 The method and apparatus that a kind of spoken language is converted to written word
CN105389305B (en) * 2015-10-30 2019-01-01 北京奇艺世纪科技有限公司 A kind of text recognition method and device
CN106649283B (en) * 2015-11-02 2020-04-28 姚珍强 Translation device and method based on natural conversation mode for mobile equipment
CN105761201B (en) * 2016-02-02 2019-03-22 山东大学 A kind of method of text in translation picture
CN107632980B (en) * 2017-08-03 2020-10-27 北京搜狗科技发展有限公司 Voice translation method and device for voice translation
WO2019071607A1 (en) * 2017-10-09 2019-04-18 华为技术有限公司 Voice information processing method and device, and terminal
CN108009138A (en) * 2017-12-25 2018-05-08 中译语通科技(青岛)有限公司 A kind of interactive system of corpus crowdsourcing alignment
CN108962228A (en) * 2018-07-16 2018-12-07 北京百度网讯科技有限公司 model training method and device
CN110516063A (en) * 2019-07-11 2019-11-29 网宿科技股份有限公司 A kind of update method of service system, electronic equipment and readable storage medium storing program for executing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101739867A (en) * 2008-11-19 2010-06-16 中国科学院自动化研究所 Method for scoring interpretation quality by using computer
CN102214166A (en) * 2010-04-06 2011-10-12 三星电子(中国)研发中心 Machine translation system and machine translation method based on syntactic analysis and hierarchical model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9070363B2 (en) * 2007-10-26 2015-06-30 Facebook, Inc. Speech translation with back-channeling cues


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on an object-oriented classification pattern library for the Interactive Hybrid-Strategy Machine Translation System (IHSMTS); Hu Zengjian; China Master's Theses Full-text Database, Information Science and Technology; 2007-02-15 (No. 2); abstract, section 2.3.2.6, Fig. 2.3.3 *
A human-machine interactive spoken language translation method; Liu Peng; Journal of Chinese Information Processing; May 2009; Vol. 23, No. 3; pp. 58-64 *
Design, implementation, and system evaluation of a speech translation dictionary; Cheng Jie; China Master's Theses Full-text Database, Information Science and Technology; 2005-01-15 (No. 1); sections 2.3.1 and 2.3.2, Figs. 2.1 and 2.2 *

Also Published As

Publication number Publication date
CN104050160A (en) 2014-09-17

Similar Documents

Publication Publication Date Title
Mei et al. What to talk about and how? selective generation using lstms with coarse-to-fine alignment
CN106484682B (en) Machine translation method, device and electronic equipment based on statistics
Chung et al. Speech2vec: A sequence-to-sequence framework for learning word embeddings from speech
CN104050256B (en) Initiative study-based questioning and answering method and questioning and answering system adopting initiative study-based questioning and answering method
Deng et al. Deep learning in natural language processing
CN104573028B (en) Realize the method and system of intelligent answer
Mairesse et al. Stochastic language generation in dialogue using factored language models
Keneshloo et al. Deep reinforcement learning for sequence-to-sequence models
Tur et al. Spoken language understanding: Systems for extracting semantic information from speech
EP3508991A1 (en) Man-machine interaction method and apparatus based on artificial intelligence
CN106776581B (en) Subjective text emotion analysis method based on deep learning
CN106919646B (en) Chinese text abstract generating system and method
US7475010B2 (en) Adaptive and scalable method for resolving natural language ambiguities
Hahn et al. Comparing stochastic approaches to spoken language understanding in multiple languages
US20140324435A1 (en) Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
WO2018118546A1 (en) Systems and methods for an emotionally intelligent chat bot
US20160364377A1 (en) Language Processing And Knowledge Building System
WO2018218705A1 (en) Method for recognizing network text named entity based on neural network probability disambiguation
CN106534548B (en) Voice error correction method and device
Mathews et al. Semstyle: Learning to generate stylised image captions using unaligned text
Kahn et al. Libri-light: A benchmark for asr with limited or no supervision
US20170076204A1 (en) Natural language question expansion and extraction
CN103761975B (en) Method and device for oral evaluation
CN102262634B (en) Automatic questioning and answering method and system
Morrissey et al. An example-based approach to translating sign language

Legal Events

Date Code Title Description
PB01 Publication
C06 Publication
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
GR01 Patent grant
GR01 Patent grant