WO2019223437A1 - Speech translation method and apparatus - Google Patents

Speech translation method and apparatus

Info

Publication number
WO2019223437A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
translation
translated text
translated
user
Application number
PCT/CN2019/082040
Other languages
French (fr)
Chinese (zh)
Inventor
占萌萌
刘俊华
Original Assignee
科大讯飞股份有限公司
Application filed by 科大讯飞股份有限公司
Publication of WO2019223437A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/42: Data-driven translation
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/005: Language recognition
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present application discloses a speech translation method and apparatus. The method comprises: translating original speech data of a user to obtain a first translated text, the language of the first translated text being different from the language of the original speech data; and then determining, by interacting with the user, whether the first translated text is correct as a translation result of the original speech data. Because this determination is made, the first translated text can be processed on the basis of the determination result, which can improve the accuracy of the translation result.

Description

Speech translation method and apparatus
This application claims priority to Chinese patent application No. 201810503163.X, entitled "Speech translation method and apparatus" and filed with the Chinese Patent Office on May 23, 2018, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of artificial intelligence, and in particular to a speech translation method and apparatus.
Background
Speech translation is the process of automatically translating speech data in a source language into speech data in a target language, where the source language and the target language are different languages. Existing speech translation techniques translate the source-language speech data directly and output the result, but that result may be inaccurate.
For example, suppose the source-language speech data is the Chinese utterance "行李必须经过安检吗?" ("Does luggage have to go through security?") and the target-language speech data is the English utterance "Does Lee have to go through security?". The Chinese meaning of that English utterance is actually "李先生通过安检了吗?" ("Has Mr. Li passed the security check?"). The meaning of the Chinese speech before translation therefore differs from the actual meaning of the English speech after translation; that is, the translation result is inaccurate.
Summary of the invention
The main purpose of the embodiments of the present application is to provide a speech translation method and apparatus that can improve the accuracy of speech translation results.
An embodiment of the present application provides a speech translation method, including:
translating source speech data of a user to obtain a first translated text, where the language of the first translated text is different from the language of the source speech data;
determining, by interacting with the user, whether the first translated text is correct as a translation result of the source speech data.
Optionally, after determining whether the first translated text is correct as a translation result of the source speech data, the method further includes:
if the first translated text is determined to be incorrect as a translation result of the source speech data, correcting the first translated text and using the corrected text as the translation result of the source speech data.
Optionally, before interacting with the user, the method further includes:
determining whether the translation quality of the first translated text is greater than a preset quality threshold, where the translation quality of the first translated text characterizes the correctness of the first translated text as a translation result of the source speech data;
if not, performing the step of interacting with the user.
Optionally, determining whether the translation quality of the first translated text is greater than the preset quality threshold includes:
translating the first translated text to obtain a second translated text, where the language of the second translated text is the same as the language of the source speech data;
determining, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
Optionally, determining, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold includes:
determining, according to the recognized text of the source speech data and the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
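The back-translation check described above can be sketched as follows. This is a minimal illustration, not the method of the application: the text here does not say how the recognized text and the second translated text are compared, so character-level similarity from Python's `difflib` is used as a stand-in, and the function names and the threshold value of 0.8 are assumptions.

```python
from difflib import SequenceMatcher


def round_trip_quality(recognized_text: str, second_translated_text: str) -> float:
    """Score in [0, 1] comparing two texts that are both in the source language.

    Assumption: character-level similarity stands in for whatever quality
    measure a real system would use.
    """
    return SequenceMatcher(None, recognized_text, second_translated_text).ratio()


def passes_quality_check(recognized_text: str,
                         second_translated_text: str,
                         threshold: float = 0.8) -> bool:
    """True only if the score is strictly greater than the preset threshold."""
    return round_trip_quality(recognized_text, second_translated_text) > threshold
```

A score at or below the threshold would lead to the user-interaction step rather than to immediate acceptance of the first translated text.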
Optionally, determining, by interacting with the user, whether the first translated text is correct as a translation result of the source speech data includes:
interacting with the user by using the second translated text to determine whether the first translated text is correct as a translation result of the source speech data.
Optionally, interacting with the user by using the second translated text to determine whether the first translated text is correct as a translation result of the source speech data includes:
outputting a first query speech to the user, where the first query speech asks whether the source speech data and the second translated text are semantically similar;
if an affirmative answer to the first query speech is received from the user, the first translated text is correct as a translation result of the source speech data;
if a negative answer to the first query speech is received from the user, the first translated text is incorrect as a translation result of the source speech data.
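The first-query interaction might look like the sketch below. It assumes a hypothetical `ask` callable that presents a question to the user (by speech or text) and returns the reply as a string; the question wording and the sets of accepted replies are illustrative only.

```python
from typing import Callable, Optional

# Illustrative reply vocabularies; a real system would interpret spoken answers.
AFFIRMATIVE = {"yes", "y", "correct", "right"}
NEGATIVE = {"no", "n", "wrong", "incorrect"}


def build_first_query(second_translated_text: str) -> str:
    """Ask whether the back-translation matches what the user meant."""
    return f'Did you mean: "{second_translated_text}"?'


def translation_confirmed(second_translated_text: str,
                          ask: Callable[[str], str]) -> Optional[bool]:
    """True for an affirmative answer, False for a negative one, None if unclear."""
    reply = ask(build_first_query(second_translated_text)).strip().lower()
    if reply in AFFIRMATIVE:
        return True
    if reply in NEGATIVE:
        return False
    return None
```

Returning `None` for an unrecognized reply is a design choice of this sketch; the text above only covers the affirmative and negative cases.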
Optionally, correcting the first translated text includes:
correcting the first translated text by means of text matching.
Optionally, correcting the first translated text by means of text matching includes:
matching the recognized text of the source speech data against text data in a database, where the database stores at least one sentence pair, each sentence pair includes a first sample text and a second sample text that is a correct translation of the first sample text, the language of the first sample text is the same as the language of the source speech data, and the language of the second sample text is the same as the language of the first translated text;
obtaining, through the matching operation, the first sample text that is most similar to the recognized text of the source speech data;
correcting the first translated text according to the most similar first sample text.
Optionally, correcting the first translated text according to the most similar first sample text includes:
interacting with the user by using the most similar first sample text, so as to correct the first translated text.
Optionally, interacting with the user by using the most similar first sample text so as to correct the first translated text includes:
outputting a second query speech to the user, where the second query speech asks whether the source speech data and the most similar first sample text are semantically similar;
if an affirmative answer to the second query speech is received from the user, obtaining the second sample text from the sentence pair to which the most similar first sample text belongs, and using it as the successfully corrected version of the first translated text.
Optionally, the method further includes:
if a negative answer to the second query speech is received from the user, outputting a prompt speech, where the prompt speech prompts the user to repeat the source speech data or to rephrase it.
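The text-matching correction can be sketched as below. The sentence-pair list, the use of `difflib` similarity for matching, and the `ask` callable are all assumptions made for illustration; the application does not prescribe a particular matching algorithm in this passage.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple

# Hypothetical database of (first sample text, second sample text) pairs:
# source-language sentence and its known-correct target-language translation.
SENTENCE_PAIRS: List[Tuple[str, str]] = [
    ("行李必须经过安检吗?", "Does luggage have to go through security?"),
    ("洗手间在哪里?", "Where is the restroom?"),
]


def most_similar_pair(recognized_text: str,
                      pairs: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Return the pair whose first sample text best matches the recognized text.

    Assumes `pairs` is non-empty (the database stores at least one pair).
    """
    return max(pairs,
               key=lambda pair: SequenceMatcher(None, recognized_text, pair[0]).ratio())


def correct_translation(recognized_text: str,
                        pairs: List[Tuple[str, str]],
                        ask: Callable[[str], str]) -> Optional[str]:
    """Return the corrected translation, or None if the user rejects the match."""
    first_sample, second_sample = most_similar_pair(recognized_text, pairs)
    reply = ask(f'Did you mean: "{first_sample}"?').strip().lower()
    if reply in {"yes", "y"}:
        return second_sample
    return None  # the caller should prompt the user to repeat or rephrase
```

With the misrecognized text "李必须经过安检吗?" and an affirmative reply, this sketch returns the stored translation of the luggage sentence; a `None` result corresponds to the prompt-speech branch.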
An embodiment of the present application further provides a speech translation apparatus, including:
a speech translation unit, configured to translate source speech data of a user to obtain a first translated text, where the language of the first translated text is different from the language of the source speech data;
a user interaction unit, configured to determine, by interacting with the user, whether the first translated text is correct as a translation result of the source speech data.
Optionally, the apparatus further includes:
a text correction unit, configured to correct the first translated text if the first translated text is determined to be incorrect as a translation result of the source speech data, and to use the corrected text as the translation result of the source speech data.
Optionally, the apparatus further includes:
a quality determination unit, configured to determine whether the translation quality of the first translated text is greater than a preset quality threshold, where the translation quality of the first translated text characterizes the correctness of the first translated text as a translation result of the source speech data; and, if not, to trigger the user interaction unit to determine, by interacting with the user, whether the first translated text is correct as a translation result of the source speech data.
Optionally, the quality determination unit includes:
a reverse translation subunit, configured to translate the first translated text to obtain a second translated text, where the language of the second translated text is the same as the language of the source speech data;
a quality determination subunit, configured to determine, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
Optionally, the quality determination subunit is specifically configured to determine, according to the recognized text of the source speech data and the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
Optionally, the user interaction unit is specifically configured to interact with the user by using the second translated text to determine whether the first translated text is correct as a translation result of the source speech data.
Optionally, the user interaction unit includes:
a first query subunit, configured to output a first query speech to the user, where the first query speech asks whether the source speech data and the second translated text are semantically similar;
a result determination subunit, configured to determine that the first translated text is correct as a translation result of the source speech data if an affirmative answer to the first query speech is received from the user, and that the first translated text is incorrect as a translation result of the source speech data if a negative answer to the first query speech is received from the user.
Optionally, the text correction unit is specifically configured to correct the first translated text by means of text matching.
Optionally, the text correction unit includes:
a text matching subunit, configured to match the recognized text of the source speech data against text data in a database, where the database stores at least one sentence pair, each sentence pair includes a first sample text and a second sample text that is a correct translation of the first sample text, the language of the first sample text is the same as the language of the source speech data, and the language of the second sample text is the same as the language of the first translated text;
a text obtaining subunit, configured to obtain, through the matching operation, the first sample text that is most similar to the recognized text of the source speech data;
a text correction subunit, configured to correct the first translated text according to the most similar first sample text.
Optionally, the text correction subunit is specifically configured to interact with the user by using the most similar first sample text, so as to correct the first translated text.
Optionally, the text correction subunit includes:
a second query subunit, configured to output a second query speech to the user, where the second query speech asks whether the source speech data and the most similar first sample text are semantically similar;
a correction completion subunit, configured to obtain, if an affirmative answer to the second query speech is received from the user, the second sample text from the sentence pair to which the most similar first sample text belongs, and to use it as the successfully corrected version of the first translated text.
Optionally, the text correction subunit further includes:
a speech prompt subunit, configured to output a prompt speech if a negative answer to the second query speech is received from the user, where the prompt speech prompts the user to repeat the source speech data or to rephrase it.
An embodiment of the present application further provides a speech translation apparatus, including a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs including instructions that, when executed by the processor, cause the processor to perform any implementation of the speech translation method described above.
An embodiment of the present application further provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform any implementation of the speech translation method described above.
An embodiment of the present application further provides a computer program product that, when run on a terminal device, causes the terminal device to perform any implementation of the speech translation method described above.
The speech translation method and apparatus provided in the embodiments of the present application translate source speech data of a user to obtain a first translated text whose language is different from the language of the source speech data, and then determine, by interacting with the user, whether the first translated text is correct as a translation result of the source speech data. Because this determination is made, the first translated text can be processed on the basis of the determination result, which can improve the accuracy of the translation result.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a speech translation method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a method for determining translation quality according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a method for determining whether a translation result is credible according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a method for correcting a translated text according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the composition of a speech translation apparatus according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the hardware structure of a speech translation apparatus according to an embodiment of the present application.
Detailed description
Speech translation is the process of automatically translating speech data in a source language (the speech data before translation) into speech data in a target language (the speech data after translation). Speech translation technology generally involves three main components: speech recognition, machine translation, and speech synthesis. Speech recognition converts the source-language speech data into source-language text; machine translation translates the source-language text into target-language text; and speech synthesis synthesizes the target-language text into target-language speech data.
As speech translation technology is applied more and more widely, users demand increasingly accurate translation results. One speech translation method performs translation in a single round of human-machine dialogue, that is, with one input and one output: the input is source-language speech data and the output is target-language speech data. Specifically, the user inputs the source-language speech data to be translated into a speech translation device, which automatically translates it into target-language speech data through speech recognition, machine translation, and speech synthesis, and feeds the result back to the user. In this process, however, both the speech recognition result and the machine translation result may contain errors, so the target-language speech data that is finally output may be inaccurate. In other words, the user can only passively accept the device's one-shot translation result, and when that result is wrong the device cannot correct it in time, which reduces the accuracy of the translation result.
For this reason, the embodiments of the present application provide a speech translation method that adds a correction function for the translation result. The accuracy of the one-shot translation result can be evaluated, and when the evaluation indicates that the accuracy is low, the translation result can be corrected, specifically by interacting with the user and correcting according to the interaction result, which improves the accuracy of the translation result.
It should be noted that the speech translation method provided in the embodiments of the present application is not limited to particular application scenarios. For example, the method can be used wherever translation is needed, such as when a user travels abroad or goes through entry and exit security checks.
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
First embodiment
FIG. 1 is a schematic flowchart of the speech translation method provided in this embodiment. The speech translation method includes the following steps:
S101: Translate source speech data of a user to obtain a first translated text, where the language of the first translated text is different from the language of the source speech data.
This embodiment refers to the speech data before translation (that is, the speech to be translated) as the source speech data. This embodiment does not limit the language of the source speech data; for example, the source speech data may be Chinese speech or English speech.
This embodiment refers to the text data after translation as the first translated text. This embodiment does not limit the language of the first translated text, as long as the first translated text and the source speech data are in different languages. For example, the source speech data may be Chinese speech and the first translated text English text, or the source speech data may be English speech and the first translated text Chinese text.
In this embodiment, speech recognition may be performed on the source speech data to obtain a recognized text A1 of the source speech data, and machine translation may then be performed on the recognized text A1 to obtain the first translated text B1. The speech recognition technology in this embodiment may be any existing or future speech recognition technology; likewise, the machine translation technology in this embodiment may be any existing or future machine translation technology.
For example, during an entry or exit security check, a user wants to talk with security personnel through a speech translation device. Suppose the source speech data spoken by the user is "行李必须经过安检吗?" ("Does luggage have to go through security?"). After speech recognition, the device obtains the recognized text A1 "李必须经过安检吗?" ("Does Lee have to go through security?"), and translating A1 from Chinese to English yields the first translated text B1 "Does Lee have to go through security?". In this case the recognized text A1 contains a recognition error, and that error carries through to B1.
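Step S101 can be sketched as a two-stage pipeline. The recognizer and translator below are stub functions that only reproduce the security-check example; they are hypothetical placeholders for real speech recognition and machine translation engines, and the class and function names are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FirstPass:
    recognized_text: str   # A1: source-language text from speech recognition
    translated_text: str   # B1: target-language text from machine translation


def translate_speech(audio: bytes,
                     recognize: Callable[[bytes], str],
                     translate: Callable[[str], str]) -> FirstPass:
    """Run recognition, then translate whatever text recognition produced."""
    recognized = recognize(audio)
    return FirstPass(recognized, translate(recognized))


# Stub engines (hypothetical) that reproduce the security-check example.
def stub_recognize(audio: bytes) -> str:
    return "李必须经过安检吗?"  # misrecognition: the character "行" was lost


def stub_translate(text: str) -> str:
    table = {
        "行李必须经过安检吗?": "Does luggage have to go through security?",
        "李必须经过安检吗?": "Does Lee have to go through security?",
    }
    return table[text]


result = translate_speech(b"", stub_recognize, stub_translate)
print(result.translated_text)
```

Because the translator only ever sees A1, a recognition error propagates unchanged into B1, which is why a later verification step is useful.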
S102: Determine, by interacting with the user, whether the first translated text is correct as a translation result of the source speech data.
In this embodiment, the speech translation device may interact with the user, for example through voice interaction or text interaction, and determine from the interaction result whether the first translated text is correct as a translation result of the source speech data. If the first translated text is determined to be correct, the first translated text B1 may be used as the translation result of the source speech data.
In that case, speech synthesis may further be performed on the first translated text B1 to obtain target speech data, and the target speech data may be fed back directly to the user, ending the current round of translation. Of course, once the first translated text B1 has been taken as the text translation result of the source speech data, other processing may also be performed on it; this embodiment does not limit the subsequent processing.
It should be noted that if the first translated text is determined to be incorrect as a translation result of the source speech data, the first translated text B1 may be corrected as described in the subsequent fourth embodiment. Alternatively, the user may be asked to repeat the source speech data, or to rephrase it in a semantically similar way, so as to start a new round of translation interaction.
In summary, the speech translation method provided in this embodiment translates source speech data of a user to obtain a first translated text whose language is different from the language of the source speech data, and then determines, by interacting with the user, whether the first translated text is correct as a translation result of the source speech data. Because this determination is made, the first translated text can be processed on the basis of the determination result, which can improve the accuracy of the translation result.
Second embodiment
In this embodiment, before the determination step S102 of the first embodiment, that is, before determining through human-computer interaction whether the first translated text is correct as the translation result of the source speech data, the machine (i.e. the speech translation device) may first make that determination itself.
Accordingly, before the determination step S102 of the first embodiment, the method may further include: determining whether the translation quality of the first translated text is greater than a preset quality threshold, where the translation quality of the first translated text characterizes the correctness of the first translated text as the translation result of the source speech data; and if not, performing the determination step S102 of the first embodiment.
In this embodiment, the translation quality of the first translated text B1 may be evaluated. If the translation quality is not higher than a predefined quality threshold, referred to here as the preset quality threshold, the first translated text B1 is considered untrustworthy as the translation result of the source speech data, i.e. it is considered incorrect. In that case, step S102 may then be performed to further check the correctness of the first translated text B1 as the translation result.
Conversely, if the translation quality of the first translated text B1 is higher than the preset quality threshold, the first translated text B1 is considered trustworthy as the translation result of the source speech data, i.e. it is considered correct. In that case, the first translated text B1 may be taken as the translation result of the source speech data; further, speech synthesis may be performed on the first translated text B1 to obtain target speech data, and the target speech data may be fed back directly to the user, thereby ending the current round of translation. Of course, once the first translated text B1 has been taken as the text translation result of the source speech data, other processing may also be applied to it; this embodiment does not limit the subsequent processing.
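The two-stage check described above can be sketched as follows. This is only an illustrative sketch: the function names `decide_next_step` and `run_round`, the string labels, and the `ask_user` callback are assumptions introduced here, not part of the described method. The sketch uses a strict "greater than" comparison, as in the determination step above.

```python
from typing import Callable


def decide_next_step(quality: float, threshold: float) -> str:
    """Machine check: trust B1 only if its quality is greater than the threshold."""
    # Hypothetical labels: "accept" means B1 is used as the translation result,
    # "ask_user" means the device falls back to step S102.
    return "accept" if quality > threshold else "ask_user"


def run_round(quality: float, threshold: float, ask_user: Callable[[], bool]) -> str:
    """Run the machine check first, then the user check only when needed."""
    if decide_next_step(quality, threshold) == "accept":
        return "accept"
    # Machine check failed: let the user confirm or reject (step S102).
    if ask_user():
        return "accept"
    # User rejected: B1 would be corrected, or the user asked to repeat/rephrase.
    return "correct_or_retry"


if __name__ == "__main__":
    print(run_round(80.0, 50.0, ask_user=lambda: False))
    print(run_round(20.56, 50.0, ask_user=lambda: True))
    print(run_round(20.56, 50.0, ask_user=lambda: False))
```

With this arrangement the user is only interrupted when the machine check does not already trust the translation.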
A specific implementation of the above translation quality determination step (i.e. "determining whether the translation quality of the first translated text is greater than a preset quality threshold") is described below.
FIG. 2 is a schematic flowchart of a translation quality determination method provided in this embodiment. The method includes the following steps:
S201: Translate the first translated text to obtain a second translated text, where the language of the second translated text is the same as the language of the source speech data.
In this embodiment, the first translated text B1 may be reverse-translated (back-translated) to obtain a second translated text A2. The language of the first translated text B1 is the target language of the translation, for example English; the language of the second translated text A2 is the source language of the translation, for example Chinese.
For example, continuing the example above, suppose the first translated text B1 is "Does Lee have to go through security?"; the second translated text A2 obtained by reverse-translating it is "李先生通过安检了吗?" ("Has Mr. Li passed the security check?").
S202: Determine, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
In this embodiment, the translation quality of the first translated text B1 may be determined on the basis of the second translated text A2. In one implementation, step S202 may specifically include: determining, according to the recognized text of the source speech data and the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
In a specific implementation of step S202, the BLEU (bilingual evaluation understudy) algorithm may be used to determine whether the translation quality of the first translated text is greater than the preset quality threshold.
Specifically, BLEU is an algorithm for evaluating machine translation results; it assesses the quality of a translation from one natural language into another. The algorithm proceeds as follows:
First, to assess the translation of the first translated text B1 comprehensively, the number of basic units that match between the recognized text A1 and the second translated text A2 is counted at several granularities in turn, from single-word basic units (1-gram) up to multi-word basic units (n-gram). The position of each basic unit within the text is ignored during counting. Then, from the number of matched basic units, the matching precision of the second translated text A2 is calculated for each order of basic unit.
The matching precision of the second translated text A2 for each order of basic unit i-gram (i = 1, 2, …, n) may be calculated with the following formula:
$$\mathrm{precision} = \frac{\mathrm{correct}}{\mathrm{output\_length}} \qquad (1)$$
Here, correct is the number of basic units of a given order in the second translated text A2 that correctly match the recognized text A1, and output_length is the total number of basic units of that order in the second translated text A2.
For example, continuing the example above, suppose the recognized text A1 is "李必须经过安检吗?" ("Does Lee have to go through security?") and the second translated text A2 is "李先生通过安检了吗?" ("Has Mr. Li passed the security check?"). The resulting matching precisions are shown in Table 1 below.
Table 1
Basic unit | Correctly matched basic units | Matching precision
1-gram | 李, 过, 安, 检, 吗, ? | 6/10 = 0.6
2-gram | 过安, 安检, 吗? | 3/9 = 0.33
3-gram | 过安检 | 1/8 = 0.125
4-gram | (none) | 0/7 = 0
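The counts in Table 1 can be reproduced with a short sketch. It assumes character-level basic units for Chinese text and counts each matched unit at most as often as it appears in the recognized text (the example contains no repeated units, so this choice does not affect the result). The helper names `ngrams` and `ngram_precision` are introduced here for illustration only.

```python
from collections import Counter
from typing import Tuple


def ngrams(text: str, n: int) -> Counter:
    """Count every run of n consecutive characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_precision(recognized: str, back_translated: str, n: int) -> Tuple[int, int]:
    """Return (correct, output_length) for n-grams, ignoring their positions."""
    ref = ngrams(recognized, n)
    cand = ngrams(back_translated, n)
    correct = sum((ref & cand).values())   # matched units of this order
    output_length = sum(cand.values())     # all units of this order in A2
    return correct, output_length


A1 = "李必须经过安检吗?"      # recognized text
A2 = "李先生通过安检了吗?"    # second (back-translated) text

if __name__ == "__main__":
    for n in range(1, 5):
        correct, total = ngram_precision(A1, A2, n)
        print(f"{n}-gram: {correct}/{total}")
```

Dividing `correct` by `output_length` for each order gives the precision column of the table.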
Next, redundant words in the second translated text A2 also need to be penalized, so a length penalty factor is introduced. The principle is that the longer the second translated text A2 is, the larger the penalty. The length penalty factor C is calculated as follows:
$$C = \min\left(1, \frac{L1}{L2}\right) \qquad (2)$$
Here, L1 is the length of the recognized text A1 and L2 is the length of the second translated text A2.
In formula (2), if the recognized text A1 and the second translated text A2 are Chinese text, the text length may be counted in characters. For example, the recognized text A1 "李必须经过安检吗?" has a length of 9, and the second translated text A2 "李先生通过安检了吗?" has a length of 10.
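As a small worked sketch of formula (2), the function below computes the length penalty factor with lengths counted in characters. The function name and the guard against an empty second translated text are assumptions added for illustration.

```python
def length_penalty(recognized: str, back_translated: str) -> float:
    """Length penalty C = min(1, L1 / L2), with lengths counted in characters."""
    l1 = len(recognized)        # L1: length of the recognized text A1
    l2 = len(back_translated)   # L2: length of the second translated text A2
    if l2 == 0:
        # Assumption: an empty back-translation is treated as an error.
        raise ValueError("second translated text must not be empty")
    return min(1.0, l1 / l2)


if __name__ == "__main__":
    # L1 = 9 and L2 = 10 for the running example, so C = min(1, 9/10).
    print(length_penalty("李必须经过安检吗?", "李先生通过安检了吗?"))
```

When the second translated text is no longer than the recognized text, the factor stays at 1 and no penalty is applied.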
Finally, once the matching precisions and the length penalty factor C have been calculated according to formulas (1) and (2) above, the BLEU score of the second translated text A2 can be calculated. The BLEU score corresponding to a particular order of basic unit may be chosen, for example the score corresponding to 4-gram, which is calculated as follows:
$$\mathrm{bleu}_{4\text{-gram}} = C \cdot f(4\text{-gram}) \qquad (3)$$
Here, bleu_4-gram is the BLEU score of the second translated text A2, C is the length penalty factor, and f is a function that processes the matching precisions for 1-gram, 2-gram, 3-gram and 4-gram.
For example, when the recognized text A1 is "李必须经过安检吗?" and the second translated text A2 is "李先生通过安检了吗?", substituting the matching precisions calculated with formula (1) (as listed in Table 1) and the length penalty factor calculated with formula (2) into formula (3) gives a BLEU score of 20.56 for the second translated text A2.
In this embodiment, a translation score threshold may be set in advance and used as the preset quality threshold, for example a threshold of 50. Because the score of 20.56 calculated above is less than the threshold of 50, the first translated text B1 can be judged untrustworthy as the translation result of the source speech data; that is, the first translated text B1 "Does Lee have to go through security?" above is untrustworthy. Conversely, when the BLEU score is greater than or equal to the threshold of 50, the first translated text B1 is judged trustworthy as the translation result of the source speech data.
In summary, the translation quality determination method provided in this embodiment reverse-translates the first translated text to obtain the second translated text, and scores the second translated text with the BLEU algorithm on the basis of the recognized text of the source speech data and the second translated text. The translation quality of the first translated text can then be judged from the score, which makes it possible to evaluate translation quality.
Third embodiment
In this embodiment, if the second embodiment determines that the translation quality of the first translated text is not greater than the preset quality threshold, i.e. that the first translated text B1 is untrustworthy as the translation result of the source speech data, the judgment of the speech translation device may nevertheless be inaccurate. The speech translation device may therefore interact with the user through step S102 of the first embodiment and determine, from the user's feedback, whether the first translated text B1 is a correct translation result for the source speech data.
In one implementation of this embodiment, step S102 of the first embodiment may specifically include: interacting with the user by means of the second translated text to determine whether the first translated text is correct as the translation result of the source speech data. In this implementation, the second translated text A2 may be used as the content of the interaction with the user, and the determination is made according to the user's feedback.
Specifically, this determination step may be implemented as follows.
FIG. 3 is a schematic flowchart of a method for determining whether a translation result is trustworthy, provided in this embodiment. The method may include the following steps:
S301: Output a first query voice to the user, where the first query voice asks whether the source speech data and the second translated text are semantically similar.
In this embodiment, the second translated text A2 may be synthesized into speech and used to interact with the user. The purpose of the interaction is to ask the user whether the sentence the user wants translated is the second translated text A2 (i.e. whether the source speech data and the second translated text A2 are semantically similar). For ease of description and distinction, this embodiment calls the voice used to ask the user the first query voice; it may be, for example, "Is what you want to translate the second translated text A2?".
For example, suppose the first translated text B1 is "Does Lee have to go through security?", and reverse-translating it in step S201 of the second embodiment gives the second translated text A2 "李先生通过安检了吗?". When the second translated text A2 is scored with the BLEU algorithm and receives, say, 20.56 points, which is below the preset quality threshold of 50 points, the reverse-translated second translated text A2 is synthesized into the first query voice, for example "请问您想翻译的是“李先生通过安检了吗?”" ("Is what you want to translate 'Has Mr. Li passed the security check?'").
The speech translation device then plays the first query voice to the user and waits for the user's answer.
S302: If an affirmative answer to the first query voice is received from the user, the first translated text is correct as the translation result of the source speech data.
The user may answer the first query voice affirmatively by voice, by pressing a key, or in a similar way. For example, the user may say "是的" ("yes") to the speech translation device, or press an "OK" or "Confirm" key on it. In this case, the speech translation device considers the first translated text B1 trustworthy as the translation result of the source speech data, i.e. considers it correct, and may therefore take the first translated text B1 as the translation result of the source speech data through step S103.
S303: If a negative answer to the first query voice is received from the user, the first translated text is incorrect as the translation result of the source speech data.
The user may answer the first query voice negatively by voice, by pressing a key, or in a similar way. For example, the user may say "不是" ("no") to the speech translation device, or press a "NO" key on it. In this case, the speech translation device considers the first translated text B1 untrustworthy as the translation result of the source speech data, i.e. considers it incorrect.
In summary, the method provided in this embodiment for determining whether a translation result is trustworthy outputs a first query voice to the user, asking whether the source speech data and the second translated text are semantically similar. If an affirmative answer is received, the first translated text is considered trustworthy as the translation result of the source speech data; conversely, if a negative answer is received, it is considered untrustworthy. Through this human-computer interaction with the user, it can be confirmed whether the first translated text is correct, which safeguards the accuracy of the translation result.
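Steps S301 to S303 can be sketched as below. The sketch assumes the user's reply has already been reduced to a short string (recognized speech or a key label); the reply sets, the function names, and the handling of unrecognized replies are assumptions made for illustration and are not specified in this embodiment.

```python
from typing import Optional

# Hypothetical reply vocabularies, based on the examples given in the text.
AFFIRMATIVE = {"是的", "OK", "确认"}
NEGATIVE = {"不是", "NO"}


def build_first_query(second_translated_text: str) -> str:
    """S301: wrap the back-translated text A2 in a confirmation question."""
    return f"请问您想翻译的是“{second_translated_text}”"


def interpret_reply(reply: str) -> Optional[bool]:
    """S302/S303: True if B1 is confirmed, False if rejected, None if unclear."""
    reply = reply.strip()
    if reply in AFFIRMATIVE:
        return True
    if reply in NEGATIVE:
        return False
    return None  # Assumption: the caller decides how to handle unclear replies.


if __name__ == "__main__":
    print(build_first_query("李先生通过安检了吗?"))
    print(interpret_reply("是的"))
    print(interpret_reply("NO"))
```

The query string would be passed to speech synthesis before being played to the user.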
Fourth embodiment
In this embodiment, when step S102 of the first embodiment determines that the first translated text is incorrect as the translation result of the source speech data, the first translated text may further be corrected, and the corrected text taken as the translation result of the source speech data.
When the correction succeeds, the corrected text data may be taken as the text translation result of the source speech data. Speech synthesis may then be performed on the corrected text data to obtain target speech data, and the target speech data may be fed back directly to the user, thereby ending the current round of translation. Of course, once the successfully corrected text data has been taken as the text translation result of the source speech data, other processing may also be applied to it; this embodiment does not limit the subsequent processing.
This embodiment thus adds a correction function for the translation result: the translation quality of the first translated text can be evaluated, and when the evaluation indicates that the quality of the first translated text as the translation result is low, the translation result can be corrected, which improves its accuracy.
It should be noted that, on the basis of any of the above embodiments, the present application may correct the first translated text B1 with the correction method provided in this embodiment.
In one implementation of this embodiment, the first translated text B1 may be corrected by means of text matching. A specific implementation of this correction step is described next.
FIG. 4 is a schematic flowchart of a translated text correction method provided in this embodiment. The method includes the following steps:
S401: Perform a matching operation between the recognized text of the source speech data and the text data in a database.
In this embodiment, a database may be built in advance. The database stores at least one sentence pair; each sentence pair includes a first sample text and a second sample text that is a correct translation of the first sample text. The language of the first sample text is the same as the language of the source speech data (the source language of the translation), and the language of the second sample text is the same as the language of the first translated text (the target language of the translation).
Specifically, a large number of first sample texts, together with second sample texts that are correct translations of them, may be collected in advance. Corresponding first and second sample texts are formed into sentence pairs, and the database is built from these pairs. The database may be a local database of the speech translation device, or a database on a cloud server that communicates with the speech translation device.
In this embodiment, the database may be built according to specific application requirements; that is, it may store only sentence pairs relevant to a specific application scenario. For example, if the user needs to use the speech translation device at immigration security checks, sentence pairs commonly used at immigration security checks may be stored in the database in advance. Of course, the database may also store sentence pairs for several application scenarios; in actual use, the application scenario may be determined automatically from the user's source speech data, and the set of sentence pairs for that scenario is then selected.
It should be noted that this embodiment does not limit the number of sentence pairs for a given application scenario; it may be, for example, roughly 10,000 to 40,000 pairs. To make correction effective, however, the pairs should cover as many sentences relevant to the scenario as possible, whether commonly or rarely used.
Taking the immigration security check scenario as an example, a sentence pair is stored in the database in the following format:
{"cn":"行李必须经过安检吗?","update_time":"20171018T173941",
"en":"Must the luggage be checked by security?","create_time":"20171018T173941","id":"00000001"}
Here: cn: the Chinese sentence;
en: the corresponding English sentence;
update_time: the time the pair was uploaded to the database;
create_time: the time the sentence pair was created;
id: the unique identifier of the pair in the database.
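A record in this format can be loaded as sketched below. The `SentencePair` class and `parse_pair` function are hypothetical names, and the timestamp layout `%Y%m%dT%H%M%S` is an assumption inferred from the single sample value "20171018T173941".

```python
import json
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout, inferred from the sample record only.
TIME_FORMAT = "%Y%m%dT%H%M%S"


@dataclass(frozen=True)
class SentencePair:
    id: str                 # unique identifier in the database
    cn: str                 # first sample text (Chinese)
    en: str                 # second sample text (English translation)
    create_time: datetime   # when the pair was created
    update_time: datetime   # when the pair was uploaded


def parse_pair(raw: str) -> SentencePair:
    """Parse one JSON record into a SentencePair."""
    data = json.loads(raw)
    return SentencePair(
        id=data["id"],
        cn=data["cn"],
        en=data["en"],
        create_time=datetime.strptime(data["create_time"], TIME_FORMAT),
        update_time=datetime.strptime(data["update_time"], TIME_FORMAT),
    )


RAW = (
    '{"cn":"行李必须经过安检吗?","update_time":"20171018T173941",'
    '"en":"Must the luggage be checked by security?",'
    '"create_time":"20171018T173941","id":"00000001"}'
)

if __name__ == "__main__":
    pair = parse_pair(RAW)
    print(pair.id, pair.cn, pair.en, pair.create_time.isoformat())
```

Keeping the identifier as a string preserves its leading zeros.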
In this embodiment, the recognized text A1 of the source speech data is matched against the text data in the database. For example, the Doc2Vec algorithm may be used for the matching; Doc2Vec, also called paragraph2vec or sentence embeddings, is an unsupervised algorithm.
S402: Through the matching operation, obtain the first sample text most similar to the recognized text of the source speech data.
By matching the recognized text A1 against the text data in the database, the first sample text in the database that is most similar to the recognized text A1 is obtained; it is referred to here as sample text A3. During matching, the recognized text A1 may first be vectorized to obtain its sentence vector. Then, for each first sample text in the database that is in the same language as the recognized text A1, the distance between the sentence vector of the recognized text A1 and the sentence vector of that first sample text is calculated, and the first sample text with the smallest distance is selected as the sample text A3 most similar to the recognized text A1.
For example, when the Doc2Vec algorithm is used for matching, suppose the recognized text A1 is "李必须经过安检吗?" and it is matched against the database. If the sentence vector of the first sample text with id "00000001", "行李必须经过安检吗?" ("Must the luggage be checked by security?"), is found to be at the shortest distance from the sentence vector of "李必须经过安检吗?", then that first sample text is taken as the sample text A3 most similar to the recognized text A1.
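The nearest-sentence selection in S402 can be illustrated with a toy sketch. It does not use Doc2Vec: as a stand-in, each sentence is represented by its character counts and compared by cosine distance, purely to show the "vectorize, measure distance, pick the closest" pattern. The sample entries with ids "00000002" and "00000003" are invented for the example.

```python
import math
from collections import Counter
from typing import Dict


def char_vector(text: str) -> Counter:
    """Toy sentence vector: character counts (a stand-in for a Doc2Vec vector)."""
    return Counter(text)


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def most_similar(recognized: str, samples: Dict[str, str]) -> str:
    """Return the id of the first sample text closest to the recognized text."""
    query = char_vector(recognized)
    return min(
        samples,
        key=lambda sid: cosine_distance(query, char_vector(samples[sid])),
    )


# id -> first sample text; only "00000001" comes from the description.
SAMPLES = {
    "00000001": "行李必须经过安检吗?",
    "00000002": "洗手间在哪里?",
    "00000003": "我可以带液体登机吗?",
}

if __name__ == "__main__":
    print(most_similar("李必须经过安检吗?", SAMPLES))
```

A real system would replace `char_vector` with learned sentence vectors, while the selection of the minimum-distance sample would stay the same.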
S403: Correct the first translated text according to the most similar first sample text.
In this embodiment, once the first sample text most similar to the recognized text A1 of the source speech data, i.e. sample text A3, has been obtained, sample text A3 may be used to correct the first translated text B1.
In one implementation of this embodiment, the second sample text in the sentence pair to which sample text A3 belongs may be taken directly as the successfully corrected version of the first translated text.
In another implementation of this embodiment, step S403 may specifically use the most similar first sample text to interact with the user in order to correct the first translated text. In this implementation, the most similar first sample text, i.e. sample text A3, may be used as the content of the interaction with the user, and the first translated text is corrected according to the user's feedback.
This specific implementation of step S403 may include the following steps A and B:
Step A: Output a second query voice to the user, where the second query voice asks whether the source speech data and the most similar first sample text are semantically similar.
The sample text A3 matched from the database may be synthesized into speech and used to interact with the user. The purpose of the interaction is to ask the user whether the sentence the user wants translated is sample text A3 (i.e. whether the source speech data and sample text A3 are semantically similar). For ease of description and distinction, this embodiment calls the voice used to ask the user the second query voice; it may be, for example, "Is what you want to translate the sample text A3?".
For example, suppose sample text A3 is "行李必须经过安检吗?"; the second query voice may then be "请问您想翻译的是“行李必须经过安检吗?”" ("Is what you want to translate 'Must the luggage be checked by security?'").
The speech translation device then plays the second query voice to the user and waits for the user's answer.
Step B: If an affirmative answer to the second query voice is received from the user, obtain the second sample text from the sentence pair to which the most similar first sample text belongs, and take it as the successfully corrected version of the first translated text.
The user may answer the second query voice affirmatively by voice, by pressing a key, or in a similar way. For example, the user may say "是的" ("yes") to the speech translation device, or press an "OK" or "Confirm" key on it. In this case, the second sample text, referred to here as sample text B3, may be obtained by querying the database for the sentence pair to which sample text A3 belongs, and sample text B3 is taken as the successfully corrected version of the first translated text.
For example, the user hears the speech translation device play the second query voice "请问你想翻译的是“行李必须经过安检吗?”". If the user answers "是的" ("yes"), the speech translation device concludes that what the user wants translated is sample text A3, "行李必须经过安检吗?", and takes the corresponding sample text B3 in the sentence pair, "Must the luggage be checked by security?", as the successfully corrected version of the first translated text B1. The correction has succeeded.
Further, the user may also answer the second query voice negatively. This embodiment may therefore also include:
Step C: If a negative answer to the second query voice is received from the user, output a prompt voice, where the prompt voice prompts the user to repeat the source speech data or to rephrase it.
The user may answer the second query voice negatively by voice, by pressing a key, or in a similar way. For example, the user may say "不是" ("no") to the speech translation device, or press a "NO" key on it. In this case the correction is considered to have failed, and the speech translation device may ask the user by voice to repeat the source speech data, or to rephrase it in a semantically similar way, so as to start a new round of translation interaction.
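Steps A to C can be sketched as a single function. The names `correct_translation` and `RETRY_PROMPT`, the wording of the prompt, and the `confirm` callback (which stands in for playing the second query voice and reading the user's answer) are all assumptions made for this illustration.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical prompt wording; the description only states its purpose.
RETRY_PROMPT = "请再说一遍,或者换一种说法。"


def correct_translation(
    sample_a3: str,
    pairs: Dict[str, str],
    confirm: Callable[[str], bool],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (corrected_text, prompt); exactly one of the two is not None."""
    # Step A: build the second query from the most similar first sample text.
    query = f"请问您想翻译的是“{sample_a3}”"
    if confirm(query):
        # Step B: the paired second sample text B3 replaces B1.
        return pairs[sample_a3], None
    # Step C: correction failed, so prompt the user to repeat or rephrase.
    return None, RETRY_PROMPT


# first sample text -> second sample text
PAIRS = {"行李必须经过安检吗?": "Must the luggage be checked by security?"}

if __name__ == "__main__":
    print(correct_translation("行李必须经过安检吗?", PAIRS, confirm=lambda q: True))
    print(correct_translation("行李必须经过安检吗?", PAIRS, confirm=lambda q: False))
```

Returning the prompt instead of playing it keeps the sketch independent of any particular speech synthesis component.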
In summary, the translated-text correction method provided in this embodiment matches the recognized text of the source speech data against the text data in the database to obtain the sentence most similar to the recognized text, and then corrects the first translated text according to that most similar sentence. Sentence pairs for each translation direction and each application scenario can therefore be accumulated in advance and stored in the database. A matching algorithm then searches the database for the sentence most similar to the recognized text, and the translation of that sentence is used as the corrected text, thereby achieving text correction.
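One way to picture a database that accumulates sentence pairs per translation direction and application scenario is sketched below. The class and field names (`SentencePairStore`, `SentencePair`, `scenario`) and the in-memory dictionary are assumptions made for illustration; the application does not specify a storage layout.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class SentencePair(NamedTuple):
    first_sample: str   # sentence in the source language
    second_sample: str  # its verified translation in the target language


class SentencePairStore:
    """In-memory stand-in for the sentence-pair database (hypothetical layout)."""

    def __init__(self) -> None:
        # Key: (source language, target language, scenario).
        self._pairs: Dict[Tuple[str, str, str], List[SentencePair]] = defaultdict(list)

    def add(self, src_lang: str, tgt_lang: str, scenario: str, pair: SentencePair) -> None:
        self._pairs[(src_lang, tgt_lang, scenario)].append(pair)

    def candidates(
        self, src_lang: str, tgt_lang: str, scenario: Optional[str] = None
    ) -> List[SentencePair]:
        """Return all pairs for a translation direction, optionally limited to one scenario."""
        found: List[SentencePair] = []
        for (src, tgt, scen), pairs in self._pairs.items():
            if src == src_lang and tgt == tgt_lang and scenario in (None, scen):
                found.extend(pairs)
        return found
```

Restricting candidates to the current direction keeps the first sample text in the same language as the source speech and the second sample text in the same language as the first translated text.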
Fifth Embodiment
This embodiment describes a speech translation apparatus. For related content, refer to the foregoing method embodiments. It should be noted that the speech translation apparatus may be the speech translation device described above, or may be a part of that device.
Referring to FIG. 5, which is a schematic diagram of the composition of a speech translation apparatus provided in this embodiment, the apparatus 500 includes:
a speech translation unit 501, configured to translate source speech data of a user to obtain a first translated text, wherein the language of the first translated text is different from the language of the source speech data;
a user interaction unit 502, configured to determine, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data.
In an implementation of this embodiment, the apparatus 500 may further include:
a text correction unit, configured to, if it is determined that the first translated text is incorrect as the translation result of the source speech data, correct the first translated text and use the corrected text as the translation result of the source speech data.
In an implementation of this embodiment, the apparatus 500 may further include:
a quality judgment unit, configured to determine whether the translation quality of the first translated text is greater than a preset quality threshold, wherein the translation quality of the first translated text characterizes the correctness of the first translated text as the translation result of the source speech data; and, if not, to trigger the user interaction unit 502 to determine, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data.
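The gating role of the quality judgment unit can be sketched as follows. The function name and the callback `confirm_with_user` are hypothetical, and returning `True` without asking the user when the quality exceeds the threshold is an assumed reading of the "if not" branch, since the text only states what happens when the threshold is not exceeded.

```python
from typing import Callable


def first_translation_accepted(
    quality: float,
    threshold: float,
    confirm_with_user: Callable[[], bool],
) -> bool:
    """Return True if the first translated text is treated as correct.

    The user is consulted only when the quality is NOT greater than the threshold.
    """
    if quality > threshold:
        # Assumption: a sufficiently high quality score needs no confirmation.
        return True
    # Quality is at or below the threshold: trigger the user interaction step.
    return confirm_with_user()
```

Note that a quality score exactly equal to the threshold still triggers the interaction, because the condition in the text is "greater than".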
In an implementation of this embodiment, the quality judgment unit includes:
a reverse translation subunit, configured to translate the first translated text to obtain a second translated text, wherein the language of the second translated text is the same as the language of the source speech data;
a quality judgment subunit, configured to determine, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
In an implementation of this embodiment, the quality judgment subunit is specifically configured to determine, according to the recognized text of the source speech data and the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
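A minimal sketch of this back-translation check is given below. It assumes, purely for illustration, that translation quality is scored as the character-level similarity between the recognized text and the second translated text, computed with Python's standard `difflib.SequenceMatcher`. The application does not name a specific metric, and the default threshold of 0.8 is likewise an arbitrary placeholder.

```python
from difflib import SequenceMatcher


def back_translation_quality(recognized_text: str, second_translated_text: str) -> float:
    """Score in [0, 1]: similarity of the back-translation to the recognized text.

    Stand-in metric (assumption); any semantic similarity measure could be used instead.
    """
    return SequenceMatcher(None, recognized_text, second_translated_text).ratio()


def exceeds_quality_threshold(
    recognized_text: str,
    second_translated_text: str,
    threshold: float = 0.8,  # hypothetical preset quality threshold
) -> bool:
    """True if the first translated text is judged good enough via its back-translation."""
    return back_translation_quality(recognized_text, second_translated_text) > threshold
```

The intuition is that if translating the first translated text back into the source language reproduces something close to what was recognized, the forward translation probably preserved the meaning.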
In an implementation of this embodiment, the user interaction unit 502 may be specifically configured to interact with the user by using the second translated text, so as to determine whether the first translated text is correct as the translation result of the source speech data.
In an implementation of this embodiment, the user interaction unit 502 may include:
a first query subunit, configured to output a first query voice to the user, wherein the first query voice is used to ask whether the source speech data and the second translated text are semantically similar;
a result determination subunit, configured to determine that the first translated text is correct as the translation result of the source speech data if an affirmative answer from the user to the first query voice is received, and to determine that the first translated text is incorrect as the translation result of the source speech data if a negative answer from the user to the first query voice is received.
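The two subunits above can be sketched as a pair of small functions. The wording of the query and the names `build_first_query`, `determine_result`, and `Verdict` are assumptions for illustration; speech synthesis and answer recognition are left out.

```python
from enum import Enum


class Verdict(Enum):
    CORRECT = "correct"
    INCORRECT = "incorrect"


def build_first_query(second_translated_text: str) -> str:
    """Text of the first query voice, built from the back-translation (hypothetical wording)."""
    return f'Did you mean: "{second_translated_text}"?'


def determine_result(user_answered_yes: bool) -> Verdict:
    """Affirmative answer: first translated text is correct; negative answer: incorrect."""
    return Verdict.CORRECT if user_answered_yes else Verdict.INCORRECT
```

Because the query is phrased in the user's own language, the user can judge the translation without understanding the target language.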
In an implementation of this embodiment, the text correction unit may be specifically configured to correct the first translated text by means of text matching.
In an implementation of this embodiment, the text correction unit may include:
a text matching subunit, configured to match the recognized text of the source speech data against text data in a database, wherein the database stores at least one sentence pair, each sentence pair includes a first sample text and a second sample text that is a correct translation of the first sample text, the language of the first sample text is the same as the language of the source speech data, and the language of the second sample text is the same as the language of the first translated text;
a text acquisition subunit, configured to obtain, through the matching operation, the first sample text most similar to the recognized text of the source speech data;
a text correction subunit, configured to correct the first translated text according to the most similar first sample text.
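The matching and acquisition steps can be sketched as a search for the sentence pair whose first sample text is closest to the recognized text. The similarity measure used here, `difflib.SequenceMatcher` from the Python standard library, is an assumed stand-in, since the application does not specify the matching algorithm, and `most_similar_pair` is a hypothetical name.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

# A sentence pair: (first sample text, second sample text).
Pair = Tuple[str, str]


def most_similar_pair(recognized_text: str, pairs: List[Pair]) -> Optional[Pair]:
    """Return the pair whose first sample text is most similar to the recognized text."""
    if not pairs:
        return None
    return max(
        pairs,
        key=lambda pair: SequenceMatcher(None, recognized_text, pair[0]).ratio(),
    )
```

The second element of the returned pair is the candidate corrected translation; whether it is actually adopted depends on the subsequent confirmation by the user.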
In an implementation of this embodiment, the text correction subunit may be specifically configured to interact with the user by using the most similar first sample text, so as to correct the first translated text.
In an implementation of this embodiment, the text correction subunit may include:
a second query subunit, configured to output a second query voice to the user, wherein the second query voice is used to ask whether the source speech data and the most similar first sample text are semantically similar;
a correction completion subunit, configured to, if an affirmative answer from the user to the second query voice is received, obtain the second sample text from the sentence pair to which the most similar first sample text belongs, as the successfully corrected version of the first translated text.
In an implementation of this embodiment, the text correction subunit may further include:
a voice prompt subunit, configured to output a prompt voice if a negative answer from the user to the second query voice is received, wherein the prompt voice is used to prompt the user to repeat the source speech data or to rephrase the source speech data.
Sixth Embodiment
This embodiment describes another speech translation apparatus. For related content, refer to the foregoing method embodiments.
Referring to FIG. 6, which is a schematic diagram of the hardware structure of a speech translation apparatus provided in this embodiment, the apparatus 600 includes a memory 601, a receiver 602, and a processor 603 connected to the memory 601 and the receiver 602 respectively. The memory 601 is configured to store a set of program instructions, and the processor 603 is configured to invoke the program instructions stored in the memory 601 to perform the following operations:
translating source speech data of a user to obtain a first translated text, wherein the language of the first translated text is different from the language of the source speech data;
determining, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operation:
if it is determined that the first translated text is incorrect as the translation result of the source speech data, correcting the first translated text and using the corrected text as the translation result of the source speech data.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operations:
determining whether the translation quality of the first translated text is greater than a preset quality threshold, wherein the translation quality of the first translated text characterizes the correctness of the first translated text as the translation result of the source speech data;
if not, performing the step of determining, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operations:
translating the first translated text to obtain a second translated text, wherein the language of the second translated text is the same as the language of the source speech data;
determining, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operation:
determining, according to the recognized text of the source speech data and the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operation:
interacting with the user by using the second translated text, so as to determine whether the first translated text is correct as the translation result of the source speech data.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operations:
outputting a first query voice to the user, wherein the first query voice is used to ask whether the source speech data and the second translated text are semantically similar;
if an affirmative answer from the user to the first query voice is received, determining that the first translated text is correct as the translation result of the source speech data;
if a negative answer from the user to the first query voice is received, determining that the first translated text is incorrect as the translation result of the source speech data.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operation:
correcting the first translated text by means of text matching.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operations:
matching the recognized text of the source speech data against text data in a database, wherein the database stores at least one sentence pair, each sentence pair includes a first sample text and a second sample text that is a correct translation of the first sample text, the language of the first sample text is the same as the language of the source speech data, and the language of the second sample text is the same as the language of the first translated text;
obtaining, through the matching operation, the first sample text most similar to the recognized text of the source speech data;
correcting the first translated text according to the most similar first sample text.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operation:
interacting with the user by using the most similar first sample text, so as to correct the first translated text.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operations:
outputting a second query voice to the user, wherein the second query voice is used to ask whether the source speech data and the most similar first sample text are semantically similar;
if an affirmative answer from the user to the second query voice is received, obtaining the second sample text from the sentence pair to which the most similar first sample text belongs, as the successfully corrected version of the first translated text.
In an implementation of this embodiment, the processor 603 is further configured to invoke the program instructions stored in the memory 601 to perform the following operation:
if a negative answer from the user to the second query voice is received, outputting a prompt voice, wherein the prompt voice is used to prompt the user to repeat the source speech data or to rephrase the source speech data.
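The operations listed for the processor 603 can be combined into one round of translation, sketched below. This is one possible ordering of the optional implementations, not the flow mandated by the application. Every callable parameter (`translate`, `back_translate`, `quality`, `ask_user`, `find_similar`) is a hypothetical placeholder for a component the application describes only functionally.

```python
from typing import Callable, Optional, Tuple


def run_translation_round(
    recognized_text: str,
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
    quality: Callable[[str, str], float],
    threshold: float,
    ask_user: Callable[[str], bool],
    find_similar: Callable[[str], Optional[Tuple[str, str]]],
) -> Optional[str]:
    """Return the final translation, or None if the user must repeat or rephrase."""
    first = translate(recognized_text)   # first translated text
    second = back_translate(first)       # second translated text (source language)

    # Quality gate: skip the interaction when the back-translation is close enough.
    if quality(recognized_text, second) > threshold:
        return first

    # First query: does the back-translation match what the user meant?
    if ask_user(second):
        return first

    # Correction by text matching: look up the most similar sentence pair.
    match = find_similar(recognized_text)
    if match is not None and ask_user(match[0]):   # second query
        return match[1]                            # second sample text

    # Negative answer (or no candidate): caller outputs the prompt voice.
    return None
```

Passing the components in as callables keeps the sketch independent of any particular recognition, translation, or speech synthesis engine.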
In some implementations, the processor 603 may be a central processing unit (CPU), the memory 601 may be an internal memory of the random access memory (RAM) type, and the receiver 602 may include a common physical interface, which may be an Ethernet interface or an asynchronous transfer mode (ATM) interface. The processor 603, the receiver 602, and the memory 601 may be integrated into one or more independent circuits or pieces of hardware, such as an application-specific integrated circuit (ASIC).
Further, this embodiment also provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform any implementation of the speech translation method described above.
Further, this embodiment also provides a computer program product that, when run on a terminal device, causes the terminal device to perform any implementation of the speech translation method described above.
From the description of the foregoing implementations, those skilled in the art can clearly understand that all or some of the steps in the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway) to perform the methods described in the embodiments of the present application or in certain parts of the embodiments.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (25)

  1. A speech translation method, comprising:
    translating source speech data of a user to obtain a first translated text, wherein the language of the first translated text is different from the language of the source speech data;
    determining, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data.
  2. The method according to claim 1, wherein after the determining whether the first translated text is correct as the translation result of the source speech data, the method further comprises:
    if it is determined that the first translated text is incorrect as the translation result of the source speech data, correcting the first translated text and using the corrected text as the translation result of the source speech data.
  3. The method according to claim 1, wherein before the interacting with the user, the method further comprises:
    determining whether the translation quality of the first translated text is greater than a preset quality threshold, wherein the translation quality of the first translated text characterizes the correctness of the first translated text as the translation result of the source speech data;
    if not, performing the step of interacting with the user.
  4. The method according to claim 3, wherein the determining whether the translation quality of the first translated text is greater than a preset quality threshold comprises:
    translating the first translated text to obtain a second translated text, wherein the language of the second translated text is the same as the language of the source speech data;
    determining, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
  5. The method according to claim 4, wherein the determining, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold comprises:
    determining, according to the recognized text of the source speech data and the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
  6. The method according to claim 4, wherein the determining, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data comprises:
    interacting with the user by using the second translated text, so as to determine whether the first translated text is correct as the translation result of the source speech data.
  7. The method according to claim 6, wherein the interacting with the user by using the second translated text, so as to determine whether the first translated text is correct as the translation result of the source speech data, comprises:
    outputting a first query voice to the user, wherein the first query voice is used to ask whether the source speech data and the second translated text are semantically similar;
    if an affirmative answer from the user to the first query voice is received, determining that the first translated text is correct as the translation result of the source speech data;
    if a negative answer from the user to the first query voice is received, determining that the first translated text is incorrect as the translation result of the source speech data.
  8. The method according to any one of claims 2 to 7, wherein the correcting the first translated text comprises:
    correcting the first translated text by means of text matching.
  9. The method according to claim 8, wherein the correcting the first translated text by means of text matching comprises:
    matching the recognized text of the source speech data against text data in a database, wherein the database stores at least one sentence pair, each sentence pair includes a first sample text and a second sample text that is a correct translation of the first sample text, the language of the first sample text is the same as the language of the source speech data, and the language of the second sample text is the same as the language of the first translated text;
    obtaining, through the matching operation, the first sample text most similar to the recognized text of the source speech data;
    correcting the first translated text according to the most similar first sample text.
  10. The method according to claim 9, wherein the correcting the first translated text according to the most similar first sample text comprises:
    interacting with the user by using the most similar first sample text, so as to correct the first translated text.
  11. The method according to claim 10, wherein the interacting with the user by using the most similar first sample text, so as to correct the first translated text, comprises:
    outputting a second query voice to the user, wherein the second query voice is used to ask whether the source speech data and the most similar first sample text are semantically similar;
    if an affirmative answer from the user to the second query voice is received, obtaining the second sample text from the sentence pair to which the most similar first sample text belongs, as the successfully corrected version of the first translated text.
  12. The method according to claim 11, wherein the method further comprises:
    if a negative answer from the user to the second query voice is received, outputting a prompt voice, wherein the prompt voice is used to prompt the user to repeat the source speech data or to rephrase the source speech data.
  13. A speech translation apparatus, comprising:
    a speech translation unit, configured to translate source speech data of a user to obtain a first translated text, wherein the language of the first translated text is different from the language of the source speech data;
    a user interaction unit, configured to determine, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data.
  14. The apparatus according to claim 13, wherein the apparatus further comprises:
    a text correction unit, configured to, if it is determined that the first translated text is incorrect as the translation result of the source speech data, correct the first translated text and use the corrected text as the translation result of the source speech data.
  15. The apparatus according to claim 13, wherein the apparatus further comprises:
    a quality judgment unit, configured to determine whether the translation quality of the first translated text is greater than a preset quality threshold, wherein the translation quality of the first translated text characterizes the correctness of the first translated text as the translation result of the source speech data; and, if not, to trigger the user interaction unit to determine, by interacting with the user, whether the first translated text is correct as the translation result of the source speech data.
  16. The apparatus according to claim 15, wherein the quality judgment unit comprises:
    a reverse translation subunit, configured to translate the first translated text to obtain a second translated text, wherein the language of the second translated text is the same as the language of the source speech data;
    a quality judgment subunit, configured to determine, according to the second translated text, whether the translation quality of the first translated text is greater than the preset quality threshold.
  17. 根据权利要求16所述的装置,其特征在于,所述用户交互单元,具 体用于利用所述第二翻译文本与所述用户进行交互,判断所述第一翻译文本作为所述源语音数据的翻译结果是否正确。The device according to claim 16, wherein the user interaction unit is specifically configured to use the second translated text to interact with the user to determine the first translated text as the source speech data. Whether the translation result is correct.
  18. 根据权利要求17所述的装置,其特征在于,所述用户交互单元包括:The apparatus according to claim 17, wherein the user interaction unit comprises:
    第一询问子单元,用于向所述用户输出第一询问语音,其中,所述第一询问语音用于询问所述源语音数据与所述第二翻译文本的语义是否相似;A first query subunit, configured to output a first query voice to the user, wherein the first query voice is used to query whether the source voice data is semantically similar to the second translated text;
    结果确定子单元,用于若接收到所述用户对所述第一询问语音的肯定回答,则所述第一翻译文本作为所述源语音数据的翻译结果是正确的;若接收到所述用户对所述第一询问语音的否定回答,则所述第一翻译文本作为所述源语音数据的翻译结果是错误的。A result determining subunit, configured to: if a positive answer to the first query voice is received by the user, the first translated text is correct as a translation result of the source voice data; if the user is received For a negative answer to the first query voice, the first translation text is incorrect as a translation result of the source voice data.
  19. The apparatus according to any one of claims 14 to 18, wherein the text correction unit is specifically configured to correct the first translated text by means of text matching.
  20. The apparatus according to claim 19, wherein the text correction unit comprises:
    a text matching subunit, configured to match the recognized text of the source speech data against text data in a database, wherein the database stores at least one sentence pair, each sentence pair comprises a first sample text and a second sample text that is a correct translation of the first sample text, the language of the first sample text is the same as the language of the source speech data, and the language of the second sample text is the same as the language of the first translated text;
    a text obtaining subunit, configured to obtain, through the matching operation, the first sample text that is most similar to the recognized text of the source speech data;
    a text correction subunit, configured to correct the first translated text according to the most similar first sample text.
  21. The apparatus according to claim 20, wherein the text correction subunit is specifically configured to interact with the user by using the most similar first sample text, so as to correct the first translated text.
  22. The apparatus according to claim 21, wherein the text correction subunit comprises:
    a second query subunit, configured to output a second query speech to the user, wherein the second query speech asks whether the source speech data and the most similar first sample text are semantically similar;
    a correction completion subunit, configured to, if an affirmative answer from the user to the second query speech is received, obtain the second sample text from the sentence pair to which the most similar first sample text belongs, as the successfully corrected version of the first translated text.
  23. A speech translation apparatus, comprising a processor, a memory, and a system bus, wherein:
    the processor and the memory are connected through the system bus;
    the memory is configured to store one or more programs, the one or more programs comprising instructions that, when executed by the processor, cause the processor to perform the method according to any one of claims 1 to 12.
  24. A computer-readable storage medium, comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 12.
  25. A computer program product, wherein the computer program product, when run on a terminal device, causes the terminal device to perform the method according to any one of claims 1 to 12.
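
The flow recited in claims 15 to 22 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. All function and parameter names (`similarity`, `translate_with_checks`, `correct_by_text_matching`, `ask_user`, the 0.8 threshold) are hypothetical. The claims only say that quality is judged "according to the second translated text"; comparing the back-translation with the recognized text, and doing so with a character-level ratio from the standard library's `difflib`, are assumptions made here for illustration. The claims shown also do not state what happens after a negative answer to the second query, so the sketch simply returns `None` in that case.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple

# (first sample text in the source language, its correct translation)
SentencePair = Tuple[str, str]


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed stand-in for a semantic measure."""
    return SequenceMatcher(None, a, b).ratio()


def correct_by_text_matching(
    recognized: str,
    pairs: List[SentencePair],
    ask_user: Callable[[str], bool],
) -> Optional[str]:
    """Claims 20 to 22: find the most similar first sample text, confirm it with the user."""
    if not pairs:
        return None
    first_sample, second_sample = max(
        pairs, key=lambda pair: similarity(recognized, pair[0])
    )
    # Second query: is the source utterance semantically similar to the sample?
    if ask_user(f'Did you mean: "{first_sample}"?'):
        return second_sample  # the stored correct translation replaces the first one
    return None  # assumption: no correction is available after a negative answer


def translate_with_checks(
    recognized: str,
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
    pairs: List[SentencePair],
    ask_user: Callable[[str], bool],
    threshold: float = 0.8,  # hypothetical preset quality threshold
) -> Optional[str]:
    first_translation = translate(recognized)
    # Claim 16: translate back into the source language.
    second_translation = back_translate(first_translation)
    # Assumed quality measure: closeness of the back-translation to the recognized text.
    if similarity(recognized, second_translation) > threshold:
        return first_translation
    # Claim 18, first query: an affirmative answer means the translation is correct.
    if ask_user(f'Did you mean: "{second_translation}"?'):
        return first_translation
    # Claim 19: otherwise correct the first translated text by text matching.
    return correct_by_text_matching(recognized, pairs, ask_user)
```

In this sketch `ask_user` stands in for the query speech and the user's spoken yes/no answer, and `translate` and `back_translate` stand in for whatever translation engines the apparatus uses.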
PCT/CN2019/082040 2018-05-23 2019-04-10 Speech translation method and apparatus WO2019223437A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810503163.X 2018-05-23
CN201810503163.XA CN108710616A (en) 2018-05-23 2018-05-23 A kind of voice translation method and device

Publications (1)

Publication Number Publication Date
WO2019223437A1 (en) 2019-11-28

Family

ID=63869422

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/082040 WO2019223437A1 (en) 2018-05-23 2019-04-10 Speech translation method and apparatus

Country Status (2)

Country Link
CN (1) CN108710616A (en)
WO (1) WO2019223437A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710616A (en) * 2018-05-23 2018-10-26 科大讯飞股份有限公司 A kind of voice translation method and device
CN111508484B (en) * 2019-01-31 2024-04-19 阿里巴巴集团控股有限公司 Voice data processing method and device
CN110047488B (en) * 2019-03-01 2022-04-12 北京彩云环太平洋科技有限公司 Voice translation method, device, equipment and control equipment
CN111507113B (en) * 2020-03-18 2021-03-02 北京捷通华声科技股份有限公司 Method and device for machine-assisted manual translation
CN111245460B (en) * 2020-03-25 2020-10-27 广州锐格信息技术科技有限公司 Wireless interphone with artificial intelligence translation
CN112215015A (en) * 2020-09-02 2021-01-12 文思海辉智科科技有限公司 Translation text revision method, translation text revision device, computer equipment and storage medium
CN112818703B (en) * 2021-01-19 2024-02-27 传神语联网网络科技股份有限公司 Multilingual consensus translation system and method based on multithread communication
CN112818702B (en) * 2021-01-19 2024-02-27 传神语联网网络科技股份有限公司 Multi-user multi-language cooperative speech translation system and method
CN113362818A (en) * 2021-05-08 2021-09-07 山西三友和智慧信息技术股份有限公司 Voice interaction guidance system and method based on artificial intelligence
CN114783437A (en) * 2022-06-15 2022-07-22 湖南正宇软件技术开发有限公司 Man-machine voice interaction realization method and system and electronic equipment

Citations (3)

Publication number Priority date Publication date Assignee Title
CN103744843A (en) * 2013-12-25 2014-04-23 北京百度网讯科技有限公司 Online voice translation method and device
US20150179174A1 (en) * 2008-06-25 2015-06-25 Verint Systems Ltd. System and method for context sensitive inference in a speech processing system
CN108710616A (en) * 2018-05-23 2018-10-26 科大讯飞股份有限公司 A kind of voice translation method and device

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US20150254238A1 (en) * 2007-10-26 2015-09-10 Facebook, Inc. System and Methods for Maintaining Speech-To-Speech Translation in the Field
CN102043774A (en) * 2011-01-13 2011-05-04 北京交通大学 Machine translation evaluation device and method
CN102662934A (en) * 2012-04-01 2012-09-12 百度在线网络技术(北京)有限公司 Method and device for proofing translated texts in inter-lingual communication
CN103810158A (en) * 2012-11-07 2014-05-21 中国移动通信集团公司 Speech-to-speech translation method and device
CN107844470B (en) * 2016-09-18 2021-04-30 腾讯科技(深圳)有限公司 Voice data processing method and equipment thereof

Also Published As

Publication number Publication date
CN108710616A (en) 2018-10-26

Similar Documents

Publication Publication Date Title
WO2019223437A1 (en) Speech translation method and apparatus
US10713441B2 (en) Hybrid learning system for natural language intent extraction from a dialog utterance
WO2015096564A1 (en) On-line voice translation method and device
US10127909B2 (en) Query rewrite corrections
JP6334815B2 (en) Learning apparatus, method, program, and spoken dialogue system
US11205052B2 (en) Deriving multiple meaning representations for an utterance in a natural language understanding (NLU) framework
US20120166942A1 (en) Using parts-of-speech tagging and named entity recognition for spelling correction
CN110083819B (en) Spelling error correction method, device, medium and electronic equipment
WO2018055983A1 (en) Translation device, translation system, and evaluation server
CN108228574B (en) Text translation processing method and device
WO2012079247A1 (en) Machine translation evaluation device and method
CN109299471A (en) A kind of method, apparatus and terminal of text matches
US11699435B2 (en) System and method to interpret natural language requests and handle natural language responses in conversation
WO2021051872A1 (en) Entity identification method, device, apparatus, and computer readable storage medium
WO2022105493A1 (en) Semantic recognition-based data query method and apparatus, device and storage medium
WO2019218809A1 (en) Chapter-level text translation method and device
JP2015170094A (en) Translation device and translation method
KR101740671B1 (en) Multilingual translation method
CN108304389B (en) Interactive voice translation method and device
US20170229116A1 (en) Method of and system for processing a user-generated input command
US20230020574A1 (en) Disfluency removal using machine learning
TW201812612A (en) Method and associated processor for adaptive linkify of a text
CN114358026A (en) Speech translation method, device, equipment and computer readable storage medium
WO2024042963A1 (en) Error correcting translation device, error correcting translation method, program and storage medium for same
WO2023159749A1 (en) Dialogue process control method and apparatus of customer service robot, server and medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 19808263

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 19808263

Country of ref document: EP

Kind code of ref document: A1