WO2016101571A1 - Speech translation method, communication method and related apparatus - Google Patents

Speech translation method, communication method and related apparatus

Info

Publication number
WO2016101571A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
speech
module
translated
communication application
Application number
PCT/CN2015/082390
Inventor
尚国强
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-12-22
Filing date
2015-06-25
Publication date
2016-06-30
Application filed by 中兴通讯股份有限公司
Publication of WO2016101571A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language

Definitions

  • This document relates to the field of speech translation, and in particular to a speech translation method, a communication method and related apparatus.
  • With the development of hardware and software, including the rapid development of cloud computing, artificial-intelligence pattern recognition algorithms have found broad application. Collected data can be processed quickly on a cloud computing platform, which makes good training results easier to obtain and makes speech feature libraries better suited to real usage environments.
  • Current mobile phones do not provide instant translation, so communication barriers arise when the two parties speak different languages or dialects. There is therefore a need for a voice technology based on a cloud computing platform that provides instant translation during communication.
  • The technical problem to be solved by the present invention is to provide a speech translation method, a communication method and related apparatus in which the translated second voice retains the speaker's voice features, improving the listener's experience.
  • A speech translation method, comprising: acquiring a first voice; extracting a voice feature of the first voice; converting the first voice to obtain translated voice data; and performing voice fitting on the translated voice data according to the voice feature to obtain a second voice.
  • the step of converting the first voice to obtain translated voice data includes:
  • the first voice is converted based on a language database to obtain translated voice data.
  • the voice feature comprises: a pitch of the first voice, or a pitch and an overtone of the first voice.
  • the step of acquiring the first voice includes:
  • the first voice to be translated is obtained based on the communication application.
  • the method further includes:
  • the second voice is output.
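  • the step of acquiring the first voice of the language to be converted based on the communication application includes: acquiring, based on the communication application, the first voice of the language to be converted sent by the peer user.
  • the step of acquiring the first voice of the language to be converted based on the communication application further includes: acquiring, based on the communication application, the first voice of the language to be converted input by the local user.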
  • the step of outputting the second voice includes:
  • the second voice is output to the local user.
  • the step of outputting the second voice further includes:
  • the second voice is output to the peer user based on the communication application.
  • a speech translation apparatus includes a first acquisition module, an extraction module, a first conversion module, and a first fitting module, wherein:
  • the first acquisition module is configured to: acquire a first voice;
  • the extraction module is configured to: extract a voice feature of the first voice;
  • the first conversion module is configured to: convert the first voice to obtain translated voice data;
  • the first fitting module is configured to perform a voice fitting on the translated voice data according to the voice feature to obtain a second voice.
  • the voice feature comprises: a pitch of the first voice, or a pitch and an overtone of the first voice.
  • the first acquiring module is configured to acquire the first voice as follows:
  • after the terminal starts the communication application, the first voice of the language to be converted is acquired based on the communication application.
  • the device further includes an output module, wherein
  • the output module is configured to: output the second voice.
  • the first obtaining module includes a first acquiring subunit, where
  • the first obtaining sub-module is configured to: acquire, according to the communication application, the first voice of the language to be converted sent by the peer user.
  • the first obtaining module further includes a second acquiring submodule, where:
  • the second obtaining sub-module is configured to: acquire, according to the communication application, the first voice to be translated input by the local user.
  • the solution of the invention can translate the voice transmitted by the communication application, thereby facilitating communication between users. Since the translated second voice can retain the speaker's voice feature, it brings a more realistic experience to the listening party when applied to the terminal communication.
  • FIG. 1 is a schematic diagram of steps of a voice translation method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of steps of a communication method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a communication method applied to a voice call according to an embodiment of the present invention
  • FIG. 4 is a schematic flowchart of a communication method applied to communication software according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a voice translation apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • an embodiment of the present invention provides a voice translation method, as shown in FIG. 1, including:
  • Step 11 Acquire a first voice
  • Step 12 Extract a voice feature of the first voice
  • Step 13 converting the first voice to obtain translated voice data
  • Step 14 Perform speech fitting on the translated voice data according to the voice feature to obtain a second voice.
  • the present embodiment extracts the voice features of the original voice before translation and, after translation, restores the translated voice data to the speaker's tone according to the extracted features, so that the listener gets a more realistic experience and can understand the speech more easily.
  • the first voice is converted based on a language database to obtain translated voice data.
  • the language database may be saved locally, and after the first voice is acquired, the first voice is language-recognized and translated according to the local language database.
  • the language database can also be set on the server side to achieve online translation. It should be noted that the translation of this embodiment may be a conversion between languages, or a conversion between different dialects in the same language.
  • the voice feature in this embodiment includes: a pitch of the first voice, or a pitch and an overtone of the first voice.
  • the pitch here is the fundamental tone, that is, the sound produced by the vibration of the sounding body as a whole; the fundamental tone determines how high the voice sounds.
  • the sound produced by the vibration of parts of the sounding body is called an overtone, and the overtones determine the timbre.
  • Using the pitch feature alone, this embodiment can restore the translated voice data to the speaker's original tone.
  • As a preferred option, the overtone feature can also be combined when restoring the translated voice data, to achieve a better result.
  • another embodiment of the present invention provides a communication method applied to a terminal, which can translate the voice of one communicating party for the other party in real time.
  • the communication method includes:
  • Step 21 After the terminal starts the communication application, acquiring the first voice to be translated based on the communication application;
  • Step 22 extracting a voice feature of the first voice
  • Step 23 Convert the first voice to obtain translated voice data.
  • Step 24 Perform a speech fitting on the translated voice data according to the voice feature to obtain a second voice
  • Step 25: output the second voice.
  • the translation process may be performed on the receiving-side terminal: in step 21, the first voice to be translated sent by the peer user is acquired based on the communication application run by the terminal; in step 25, the second voice is output to the local user.
  • in practice, the user can translate, on his or her own terminal, the voice received from the peer.
  • alternatively, the translation process may be performed on the sending-side terminal: in step 21, the first voice to be translated input by the local user is acquired based on the communication application run by the terminal; in step 25, the second voice is output to the peer user based on the communication application run by the terminal.
  • the user can translate the spoken voice on his own terminal and then send it to the peer. Even if the peer device does not adopt the scheme of the embodiment, the translated voice can be received, thereby achieving normal communication.
  • in application scenario one, the two parties communicate by voice call.
  • the calling terminal converts the Cantonese spoken by the calling user into Shanghainese and then sends it to the called terminal, as shown in FIG. 2; the specific process includes:
  • a translation system is configured on the calling terminal, together with its language feature library, for example a mixed feature library of Min/Cantonese (闽粤语) and Shanghainese;
  • the calling terminal establishes a voice call with the called terminal, and obtains the first voice input by the calling user through the microphone of the calling terminal;
  • the calling terminal extracts the pitch of the first voice (may also include overtones);
  • the calling terminal converts the first voice based on the mixed feature library to obtain the translated voice data.
  • the calling terminal performs voice fitting on the translated voice data according to the extracted pitch, and obtains a second voice that conforms to the speaking voice of the calling user.
  • the calling end performs voice processing and modulation on the second voice
  • the calling terminal sends the modulated signal to the called terminal, and the called terminal receives the modulated signal and performs demodulation processing to obtain and play the second voice.
  • the second voice played at the called terminal is then already the translated Shanghainese.
  • the called terminal does not need to perform additional configuration, so the solution has high practicability.
  • the calling terminal may only send the translated second voice to the called terminal, so as to prevent the first voice from causing interference to the called user.
  • in application scenario two, the two parties communicate through instant messaging software.
  • after receiving the Japanese voice file sent by the calling user, the called terminal translates it into Chinese and plays it to the called user.
  • the specific process includes:
  • the called user obtains and saves the Japanese voice file sent by the calling user through the instant messaging software
  • the instant messaging software extracts the pitch of the Japanese voice file, and invokes the Japanese translation software to translate the Japanese voice file to obtain a Chinese voice file;
  • the instant messaging software performs speech fitting on the Chinese voice file through the extracted pitch, and restores the Chinese voice file to the pitch of the calling user;
  • the instant messaging software may, but need not, save the fitted Chinese voice file in place of the pre-translation Japanese voice file, and plays the saved Chinese voice file to the called user either when the called user requests it or automatically.
  • the translation step can be performed by a voice translation software provided by a third party, and the instant messaging software only needs to invoke the voice translation software to perform real-time voice translation.
  • the called user can download and install the corresponding translation APP according to their own translation requirements, and then associate the instant messaging software with the translation APP.
  • another embodiment of the present invention further provides a voice translation apparatus, as shown in FIG. 5, including:
  • the first obtaining module 501 is configured to: acquire the first voice
  • the extracting module 502 is configured to: extract a voice feature of the first voice
  • the first conversion module 503 is configured to: convert the first voice to obtain translated voice data
  • the first fitting module 504 is configured to perform voice fitting on the translated voice data according to the voice feature to obtain a second voice.
  • the voice feature includes: a pitch of the first voice, or a pitch and an overtone of the first voice.
  • the first obtaining module 501 is configured to acquire the first voice as follows:
  • after the terminal starts the communication application, the first voice of the language to be converted is acquired based on the communication application.
  • the device further includes an output module, wherein
  • the output module is configured to: output the second voice.
  • the first obtaining module 501 includes a first acquiring subunit, where
  • the first obtaining sub-module is configured to: acquire, according to the communication application, the first voice of the language to be converted sent by the peer user.
  • the first obtaining module 501 further includes a second acquiring submodule, where:
  • the second obtaining sub-module is configured to: acquire, according to the communication application, the first voice to be translated input by the local user.
  • the output module includes a first output submodule, wherein:
  • the first output submodule is configured to output the second voice to the local user.
  • the output module further includes a second output submodule, wherein:
  • the second output submodule is configured to output the second voice to a peer user based on the communication application.
  • the speech translation apparatus of the present embodiment extracts the speech features of the original speech before performing the translation, and after the translation, restores the translated speech data to the speaker's tone according to the extracted speech features, so that the listener can more easily understand.
  • the speech translation apparatus of the present embodiment can achieve the same technical effects as the speech translation method described above.
  • an embodiment of the present invention further provides a terminal, as shown in FIG. 6, including:
  • the second obtaining module 601 is configured to: after the terminal starts the communication application, acquire the first voice of the language to be converted based on the communication application;
  • the second extraction module 602 is configured to: extract a voice feature of the first voice
  • the second conversion module 603 is configured to: convert the first voice to obtain translated voice data
  • the second fitting module 604 is configured to perform a voice fitting on the translated voice data according to the voice feature to obtain a second voice;
  • the output module 605 is configured to: output the second voice.
  • the second obtaining module 601 includes:
  • the first obtaining sub-module is configured to: obtain, according to the communication application, a first voice of a language to be converted sent by the peer user;
  • the output module 605 includes:
  • the first output submodule is configured to: output the second voice to the local user.
  • the second obtaining module 601 further includes:
  • the second obtaining sub-module is configured to: obtain, according to the communication application, the first voice to be translated input by the local user;
  • the output module 605 further includes:
  • the second output submodule is configured to: output the second voice to the peer user based on the communication application.
  • the embodiment of the invention further discloses a computer program, comprising program instructions, when the program instruction is executed by the terminal, so that the terminal can perform any of the above methods for detecting wireless network access security.
  • the embodiment of the invention also discloses a carrier carrying the computer program.
  • the terminal of this embodiment can achieve the same technical effect corresponding to the communication method described above.
  • the translated second voice can retain the speaker's voice feature, so that when applied to the terminal communication, a more realistic experience is brought to the listening party. Therefore, the present invention has strong industrial applicability.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Telephonic Communication Services (AREA)

Abstract

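A speech translation method, a communication method and related apparatus, relating to the field of terminal applications. The method includes: acquiring a first voice; extracting a voice feature of the first voice; converting the first voice to obtain translated voice data; and performing voice fitting on the translated voice data according to the voice feature to obtain a second voice. With the solution of the present invention, the translated second voice retains the speaker's voice features, so that when applied to terminal communication it gives the listening party a more realistic experience.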

Description

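Speech translation method, communication method and related apparatus
Technical Field
This document relates to the field of speech translation, and in particular to a speech translation method, a communication method and related apparatus.
Background
With the development of hardware and software, including the rapid development of cloud computing, artificial-intelligence pattern recognition algorithms have found broad application. Collected data can be processed quickly and at scale on a cloud computing platform, which makes good training results easier to obtain and makes speech feature libraries better suited to real usage environments.
The use of Apple's Siri application has sparked a wave of interest in speech technologies. The development of speech technology further frees the hands of smart-terminal users and is also a considerable boost to social productivity.
Current mobile phones do not provide instant translation, so communication barriers arise when the two parties speak different languages or dialects. There is therefore an urgent need for a speech technology based on a cloud computing platform that provides instant translation during communication.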
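Summary of the Invention
The technical problem to be solved by the present invention is to provide a speech translation method, a communication method and related apparatus in which the translated second voice retains the speaker's voice features, improving the listener's experience.
To solve the above technical problem, the following technical solutions are adopted:
A speech translation method, comprising:
acquiring a first voice;
extracting a voice feature of the first voice;
converting the first voice to obtain translated voice data;
performing voice fitting on the translated voice data according to the voice feature to obtain a second voice.
Optionally, the step of converting the first voice to obtain translated voice data comprises:
converting the first voice based on a language database to obtain the translated voice data.
Optionally, the voice feature comprises: the pitch of the first voice, or the pitch and overtones of the first voice.
Optionally, the step of acquiring the first voice comprises:
after a terminal starts a communication application, acquiring, based on the communication application, the first voice to be translated.
Optionally, after the step of performing voice fitting on the translated voice data according to the voice feature to obtain the second voice, the method further comprises:
outputting the second voice.
Optionally, the step of acquiring, based on the communication application, the first voice of the language to be converted comprises:
acquiring, based on the communication application, the first voice of the language to be converted sent by the peer user.
Optionally, the step of acquiring, based on the communication application, the first voice of the language to be converted further comprises:
acquiring, based on the communication application, the first voice of the language to be converted input by the local user.
Optionally, the step of outputting the second voice comprises:
outputting the second voice to the local user.
Optionally, the step of outputting the second voice further comprises:
outputting, based on the communication application, the second voice to the peer user.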
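A speech translation apparatus, comprising a first acquisition module, an extraction module, a first conversion module and a first fitting module, wherein:
the first acquisition module is configured to: acquire a first voice;
the extraction module is configured to: extract a voice feature of the first voice;
the first conversion module is configured to: convert the first voice to obtain translated voice data;
the first fitting module is configured to: perform voice fitting on the translated voice data according to the voice feature to obtain a second voice.
Optionally, the voice feature comprises: the pitch of the first voice, or the pitch and overtones of the first voice.
Optionally, the first acquisition module is configured to acquire the first voice as follows:
after a terminal starts a communication application, acquiring, based on the communication application, the first voice of the language to be converted.
Optionally, the apparatus further comprises an output module, wherein
the output module is configured to: output the second voice.
Optionally, the first acquisition module comprises a first acquisition subunit, wherein
the first acquisition submodule is configured to: acquire, based on the communication application, the first voice of the language to be converted sent by the peer user.
Optionally, the first acquisition module further comprises a second acquisition submodule, wherein:
the second acquisition submodule is configured to: acquire, based on the communication application, the first voice to be translated input by the local user.
The beneficial effects of the above technical solutions of the present invention are as follows:
The solution of the present invention can translate the voice transmitted by a communication application, which makes communication between users easier. Because the translated second voice retains the speaker's voice features, it gives the listening party a more realistic experience when applied to terminal communication.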
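Brief Description of the Drawings
FIG. 1 is a schematic diagram of the steps of a speech translation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the steps of a communication method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of the communication method applied to a voice call according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of the communication method applied to communication software according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a speech translation apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.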
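Preferred Embodiments of the Invention
A detailed description follows with reference to the drawings and specific embodiments.
The purpose of the present invention is to provide a solution that enables instant translation during communication. Related speech translation technologies do not retain the speaker's voice features, so the translated voice data sounds unnatural in tone, which makes it harder for the user to understand. To solve this problem, an embodiment of the present invention proposes a speech translation method which, as shown in FIG. 1, comprises:
Step 11: acquire a first voice;
Step 12: extract a voice feature of the first voice;
Step 13: convert the first voice to obtain translated voice data;
Step 14: perform voice fitting on the translated voice data according to the voice feature to obtain a second voice.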
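Steps 11 to 14 can be read as a small pipeline. The following is a minimal sketch under stated assumptions: the names `VoiceFeature` and `translate_speech` are hypothetical, audio is represented as a plain list of float samples, and the three stages are passed in as callables because the patent does not specify how extraction, conversion or fitting are implemented.

```python
from dataclasses import dataclass
from typing import Callable, List

Samples = List[float]  # assumed representation of a voice signal


@dataclass(frozen=True)
class VoiceFeature:
    """Hypothetical container for the extracted feature (pitch only here)."""
    pitch_hz: float


def translate_speech(
    first_voice: Samples,
    extract: Callable[[Samples], VoiceFeature],
    convert: Callable[[Samples], Samples],
    fit: Callable[[Samples, VoiceFeature], Samples],
) -> Samples:
    """Run steps 12 to 14 on a first voice that step 11 has already acquired."""
    feature = extract(first_voice)      # step 12: extract the voice feature
    translated = convert(first_voice)   # step 13: obtain translated voice data
    return fit(translated, feature)     # step 14: voice fitting -> second voice
```

Note that the feature is extracted from the original voice before conversion, which is what allows the fitting stage to restore the speaker's tone afterwards.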
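From steps 11 to 14 above it can be seen that this embodiment extracts the voice features of the original voice before translation and, after translation, restores the translated voice data to the speaker's tone according to the extracted features, so that the listener gets a more realistic experience and can understand the speech more easily.
Specifically, in step 13 above, the first voice is converted based on a language database to obtain the translated voice data.
For example, the language database may be stored locally; after the first voice is acquired, language recognition and translation are performed on it according to the local language database. The language database may also be placed on a server to provide online translation. It should be noted that translation in this embodiment may be a conversion between languages, or a conversion between different dialects of the same language.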
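The local and server-side placements of the language database can be combined in one lookup routine. The sketch below is only an illustration: it assumes the first voice has already been recognized as text, models the local database as a dictionary, and models the server as an optional callable. The function name `convert_text` and the "local first, then server" order are assumptions, not something the patent prescribes.

```python
from typing import Callable, Dict, Optional

PhraseTable = Dict[str, str]  # assumed shape of a local language database


def convert_text(
    recognized: str,
    local_db: PhraseTable,
    remote_lookup: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Translate recognized text, trying the local database before the server."""
    if recognized in local_db:
        return local_db[recognized]
    if remote_lookup is not None:
        # Hypothetical online translation call; returns None when it has no answer.
        result = remote_lookup(recognized)
        if result is not None:
            return result
    raise KeyError(f"no translation available for {recognized!r}")
```

The same routine covers dialect conversion if the table maps phrases of one dialect to another.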
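Specifically, the voice feature in this embodiment comprises: the pitch of the first voice, or the pitch and overtones of the first voice.
The pitch here is the fundamental tone, that is, the sound produced by the vibration of the sounding body as a whole; the fundamental tone determines how high the voice sounds. The sound produced by the vibration of parts of the sounding body is called an overtone, and the overtones determine the timbre. Using the pitch feature alone, this embodiment can restore the translated voice data to the speaker's original tone. As a preferred option, the overtone feature can also be combined when restoring the translated voice data, to achieve a better result.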
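The patent does not say how the pitch is extracted. One common textbook approach is autocorrelation: the lag at which a signal best matches a shifted copy of itself approximates the fundamental period. The sketch below assumes a single voiced frame of float samples; `estimate_pitch_hz`, `sine` and the 50 to 500 Hz search range are illustrative choices, and overtone analysis is not covered.

```python
import math
from typing import List, Sequence


def sine(freq_hz: float, sample_rate: int, count: int, phase: float = 0.0) -> List[float]:
    """Generate a test tone (helper for the example only)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate + phase) for n in range(count)]


def estimate_pitch_hz(
    samples: Sequence[float],
    sample_rate: int,
    min_hz: float = 50.0,
    max_hz: float = 500.0,
) -> float:
    """Estimate the fundamental frequency of one frame by autocorrelation."""
    min_lag = max(1, int(sample_rate / max_hz))
    max_lag = min(len(samples) - 1, int(sample_rate / min_hz))
    if max_lag < min_lag:
        raise ValueError("signal too short for the requested pitch range")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with itself shifted by `lag` samples.
        score = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag
```

For example, `estimate_pitch_hz(sine(200.0, 8000, 400), 8000)` should return a value close to 200 Hz. Real speech would need framing, voicing detection and smoothing on top of this.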
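In addition, another embodiment of the present invention provides a communication method applied to a terminal, which can translate the voice of one communicating party for the other party in real time. As shown in FIG. 2, the communication method comprises:
Step 21: after the terminal starts a communication application, acquire, based on the communication application, a first voice to be translated;
Step 22: extract a voice feature of the first voice;
Step 23: convert the first voice to obtain translated voice data;
Step 24: perform voice fitting on the translated voice data according to the voice feature to obtain a second voice;
Step 25: output the second voice.
Specifically, the translation process may be performed on the receiving-side terminal: in step 21, the first voice to be translated sent by the peer user is acquired based on the communication application run by the terminal; in step 25, the second voice is output to the local user.
In practice, the user can translate, on his or her own terminal, the voice received from the peer.
Alternatively, the translation process may be performed on the sending-side terminal: in step 21, the first voice to be translated input by the local user is acquired based on the communication application run by the terminal; in step 25, the second voice is output to the peer user based on the communication application run by the terminal.
In practice, the user can have his or her own speech translated on his or her own terminal and then send it to the peer. Even if the peer device does not implement the solution of this embodiment, it can still receive the translated voice, so normal communication is achieved.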
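The receiving-side and sending-side variants differ only in where step 21 reads the first voice and where step 25 delivers the second voice. A minimal sketch of that routing follows; `Side`, `run_communication` and the callable parameters are hypothetical names, and `translate` stands in for steps 22 to 24.

```python
from enum import Enum
from typing import Callable, TypeVar

V = TypeVar("V")  # whatever type the terminal uses to represent a voice


class Side(Enum):
    RECEIVING = "receiving"  # translate what the peer sent, play it locally
    SENDING = "sending"      # translate local speech, send it to the peer


def run_communication(
    side: Side,
    from_peer: Callable[[], V],
    from_local_user: Callable[[], V],
    translate: Callable[[V], V],
    to_local_user: Callable[[V], None],
    to_peer: Callable[[V], None],
) -> V:
    """Steps 21 to 25 with the source and sink chosen by the translating side."""
    first_voice = from_peer() if side is Side.RECEIVING else from_local_user()  # step 21
    second_voice = translate(first_voice)                                      # steps 22-24
    if side is Side.RECEIVING:
        to_local_user(second_voice)                                            # step 25
    else:
        to_peer(second_voice)                                                  # step 25
    return second_voice
```

In the sending-side case the peer receives only ordinary voice data, which is why the peer device needs no support for this scheme.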
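Application scenarios of the communication method of the embodiments of the present invention are described below.
<Application scenario one>
In application scenario one, the two parties communicate by voice call. The calling terminal converts the Cantonese spoken by the calling user into Shanghainese and then sends it to the called terminal. As shown in FIG. 2, the specific process comprises:
A1: a translation system is configured on the calling terminal, together with its language feature library, for example a mixed feature library of Min/Cantonese (闽粤语) and Shanghainese;
A2: the calling terminal establishes a voice call with the called terminal and acquires, through its microphone, the first voice input by the calling user;
A3: the calling terminal extracts the pitch of the first voice (overtones may also be included);
A4: the calling terminal converts the first voice based on the mixed feature library to obtain translated voice data;
A5: the calling terminal performs voice fitting on the translated voice data according to the extracted pitch to obtain a second voice that matches the calling user's speaking tone;
A6: the calling terminal performs voice processing on the second voice and modulates it;
A7: the calling terminal sends the modulated signal to the called terminal; the called terminal receives the modulated signal, demodulates it, and obtains and plays the second voice. The second voice played at the called terminal is then already the translated Shanghainese.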
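Step A5 leaves the voice-fitting algorithm open. As a deliberately naive illustration, the translated audio can be resampled so that its pitch moves to the speaker's pitch. This sketch is an assumption rather than the patent's method: resampling also changes the duration, so a practical system would need a duration-preserving pitch modification technique instead. The names `fit_pitch_by_resampling` and `sine` are hypothetical.

```python
import math
from typing import List, Sequence


def sine(freq_hz: float, sample_rate: int, count: int, phase: float = 0.0) -> List[float]:
    """Generate a test tone (helper for the example only)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate + phase) for n in range(count)]


def fit_pitch_by_resampling(
    translated: Sequence[float],
    translated_pitch_hz: float,
    speaker_pitch_hz: float,
) -> List[float]:
    """Shift the pitch of `translated` to the speaker's pitch by linear resampling."""
    if translated_pitch_hz <= 0 or speaker_pitch_hz <= 0:
        raise ValueError("pitches must be positive")
    if not translated:
        return []
    ratio = speaker_pitch_hz / translated_pitch_hz  # > 1 raises the pitch
    out_len = int(len(translated) / ratio)
    last = len(translated) - 1
    fitted = []
    for i in range(out_len):
        pos = i * ratio                  # fractional read position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Linear interpolation between the two neighbouring input samples.
        fitted.append(translated[lo] * (1.0 - frac) + translated[hi] * frac)
    return fitted
```

With a ratio of 2 the output keeps every second input sample, so a 100 Hz tone becomes a 200 Hz tone of half the length when played at the same sample rate.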
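In application scenario one, the called terminal needs no additional configuration, so the solution is highly practical. In addition, the calling terminal may send only the translated second voice to the called terminal, so that the first voice does not disturb the called user.
<Application scenario two>
In application scenario two, the two parties communicate through instant messaging software. After receiving a Japanese voice file sent by the calling user, the called terminal translates it into Chinese and plays it to the called user. The specific process comprises:
B1: Japanese translation software is installed on the called terminal, and the instant messaging software is allowed to invoke it;
B2: the called user obtains and saves, through the instant messaging software, the Japanese voice file sent by the calling user;
B3: the instant messaging software extracts the pitch of the Japanese voice file and invokes the Japanese translation software to translate the file, obtaining a Chinese voice file;
B4: the instant messaging software performs voice fitting on the Chinese voice file using the extracted pitch, restoring the Chinese voice file to the calling user's tone;
B5: the instant messaging software may, but need not, save the fitted Chinese voice file in place of the pre-translation Japanese voice file, and plays the saved Chinese voice file to the called user either when the called user requests it or automatically.
In application scenario two, the translation step can be performed by speech translation software provided by a third party; the instant messaging software only needs to invoke that software to perform real-time speech translation. In practice, the called user can download and install a translation app that suits his or her needs and then bind the instant messaging software to that app.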
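The binding between the instant messaging software and third-party translation apps can be pictured as a registry keyed by language pair. The following sketch is hypothetical throughout: `TranslatorRegistry`, the language codes and the byte-string representation of a voice file are assumptions made for illustration, not an API of any real messaging product.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[bytes], bytes]  # voice file in, translated voice file out


class TranslatorRegistry:
    """Keeps track of which translation app handles which language pair."""

    def __init__(self) -> None:
        self._translators: Dict[Tuple[str, str], Translator] = {}

    def bind(self, source_lang: str, target_lang: str, translator: Translator) -> None:
        """Associate an installed translation app with a language pair (step B1)."""
        self._translators[(source_lang, target_lang)] = translator

    def translate(self, source_lang: str, target_lang: str, voice_file: bytes) -> bytes:
        """Invoke the bound translation app on a received voice file (step B3)."""
        try:
            translator = self._translators[(source_lang, target_lang)]
        except KeyError:
            raise LookupError(
                f"no translation app bound for {source_lang}->{target_lang}"
            ) from None
        return translator(voice_file)
```

Keeping the translator behind a plain callable means the messaging software does not need to know how the third-party app works, only how to invoke it.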
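In addition, another embodiment of the present invention further provides a speech translation apparatus which, as shown in FIG. 5, comprises:
a first acquisition module 501, configured to: acquire a first voice;
an extraction module 502, configured to: extract a voice feature of the first voice;
a first conversion module 503, configured to: convert the first voice to obtain translated voice data;
a first fitting module 504, configured to: perform voice fitting on the translated voice data according to the voice feature to obtain a second voice.
The voice feature comprises: the pitch of the first voice, or the pitch and overtones of the first voice.
Optionally, the first acquisition module 501 is configured to acquire the first voice as follows:
after a terminal starts a communication application, acquiring, based on the communication application, the first voice of the language to be converted.
Optionally, the apparatus further comprises an output module, wherein
the output module is configured to: output the second voice.
Optionally, the first acquisition module 501 comprises a first acquisition subunit, wherein
the first acquisition submodule is configured to: acquire, based on the communication application, the first voice of the language to be converted sent by the peer user.
Optionally, the first acquisition module 501 further comprises a second acquisition submodule, wherein:
the second acquisition submodule is configured to: acquire, based on the communication application, the first voice to be translated input by the local user.
Optionally, the output module comprises a first output submodule, wherein:
the first output submodule is configured to: output the second voice to the local user.
Optionally, the output module further comprises a second output submodule, wherein:
the second output submodule is configured to: output, based on the communication application, the second voice to the peer user.
The speech translation apparatus of this embodiment extracts the voice features of the original voice before translation and, after translation, restores the translated voice data to the speaker's tone according to the extracted features, making it easier for the listener to understand.
Clearly, the speech translation apparatus of this embodiment corresponds to the speech translation method described above, and both achieve the same technical effect.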
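In addition, an embodiment of the present invention further provides a terminal which, as shown in FIG. 6, comprises:
a second acquisition module 601, configured to: after the terminal starts a communication application, acquire, based on the communication application, a first voice of the language to be converted;
a second extraction module 602, configured to: extract a voice feature of the first voice;
a second conversion module 603, configured to: convert the first voice to obtain translated voice data;
a second fitting module 604, configured to: perform voice fitting on the translated voice data according to the voice feature to obtain a second voice;
an output module 605, configured to: output the second voice.
The second acquisition module 601 comprises:
a first acquisition submodule, configured to: acquire, based on the communication application, the first voice of the language to be converted sent by the peer user;
the output module 605 comprises:
a first output submodule, configured to: output the second voice to the local user.
In addition, on the above basis, the second acquisition module 601 further comprises:
a second acquisition submodule, configured to: acquire, based on the communication application, the first voice to be translated input by the local user;
the output module 605 further comprises:
a second output submodule, configured to: output, based on the communication application, the second voice to the peer user.
An embodiment of the present invention further discloses a computer program comprising program instructions which, when executed by a terminal, enable the terminal to perform any of the above methods for detecting wireless network access security.
An embodiment of the present invention further discloses a carrier carrying the computer program.
Clearly, the terminal of this embodiment corresponds to the communication method described above, and both achieve the same technical effect.
The above are preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make various improvements and refinements without departing from the principles described in the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.
Industrial Applicability
With the solution of the present invention, the translated second voice retains the speaker's voice features, so that when applied to terminal communication it gives the listening party a more realistic experience. The present invention therefore has strong industrial applicability.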

Claims (15)

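  1. A speech translation method, comprising:
    acquiring a first voice;
    extracting a voice feature of the first voice;
    converting the first voice to obtain translated voice data;
    performing voice fitting on the translated voice data according to the voice feature to obtain a second voice.
  2. The speech translation method according to claim 1, wherein the step of converting the first voice to obtain translated voice data comprises:
    converting the first voice based on a language database to obtain the translated voice data.
  3. The speech translation method according to claim 1, wherein
    the voice feature comprises: the pitch of the first voice, or the pitch and overtones of the first voice.
  4. The speech translation method according to claim 1, wherein the step of acquiring the first voice comprises:
    after a terminal starts a communication application, acquiring, based on the communication application, the first voice to be translated.
  5. The speech translation method according to claim 1, wherein, after the step of performing voice fitting on the translated voice data according to the voice feature to obtain the second voice, the method further comprises:
    outputting the second voice.
  6. The speech translation method according to claim 4, wherein the step of acquiring, based on the communication application, the first voice of the language to be converted comprises:
    acquiring, based on the communication application, the first voice of the language to be converted sent by the peer user.
  7. The speech translation method according to claim 6, wherein the step of acquiring, based on the communication application, the first voice of the language to be converted further comprises:
    acquiring, based on the communication application, the first voice of the language to be converted input by the local user.
  8. The speech translation method according to claim 4, wherein the step of outputting the second voice comprises:
    outputting the second voice to the local user.
  9. The speech translation method according to claim 8, wherein the step of outputting the second voice further comprises:
    outputting, based on the communication application, the second voice to the peer user.
  10. A speech translation apparatus, comprising a first acquisition module, an extraction module, a first conversion module and a first fitting module, wherein:
    the first acquisition module is configured to: acquire a first voice;
    the extraction module is configured to: extract a voice feature of the first voice;
    the first conversion module is configured to: convert the first voice to obtain translated voice data;
    the first fitting module is configured to: perform voice fitting on the translated voice data according to the voice feature to obtain a second voice.
  11. The speech translation apparatus according to claim 10, wherein
    the voice feature comprises: the pitch of the first voice, or the pitch and overtones of the first voice.
  12. The speech translation apparatus according to claim 10, wherein the first acquisition module is configured to acquire the first voice as follows:
    after a terminal starts a communication application, acquiring, based on the communication application, the first voice of the language to be converted.
  13. The speech translation apparatus according to claim 10, wherein the apparatus further comprises an output module, wherein
    the output module is configured to: output the second voice.
  14. The speech translation apparatus according to claim 10, wherein the first acquisition module comprises a first acquisition subunit, wherein
    the first acquisition submodule is configured to: acquire, based on the communication application, the first voice of the language to be converted sent by the peer user.
  15. The speech translation apparatus according to claim 14, wherein the first acquisition module further comprises a second acquisition submodule, wherein:
    the second acquisition submodule is configured to: acquire, based on the communication application, the first voice to be translated input by the local user.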
PCT/CN2015/082390 2014-12-22 2015-06-25 Speech translation method, communication method and related apparatus WO2016101571A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410810783.X 2014-12-22
CN201410810783.XA CN105786801A (zh) 2014-12-22 2014-12-22 Speech translation method, communication method and related apparatus

Publications (1)

Publication Number Publication Date
WO2016101571A1 (zh) 2016-06-30

Family

ID=56149130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/082390 WO2016101571A1 (zh) 2014-12-22 2015-06-25 Speech translation method, communication method and related apparatus

Country Status (2)

Country Link
CN (1) CN105786801A (zh)
WO (1) WO2016101571A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
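CN106919562B (zh) * 2017-04-28 2024-01-05 深圳市大乘科技股份有限公司 Real-time translation system, method and apparatus
CN107170453B (zh) 2017-05-18 2020-11-03 百度在线网络技术(北京)有限公司 Artificial-intelligence-based cross-language speech transcription method, device and readable medium
CN107248947B (zh) * 2017-05-22 2019-01-08 腾讯科技(深圳)有限公司 Expression processing method and apparatus, computer device and storage medium
CN107341148A (zh) * 2017-06-27 2017-11-10 深圳市沃特沃德股份有限公司 Translation method, translation device and translation system
WO2019071541A1 (zh) * 2017-10-12 2019-04-18 深圳市沃特沃德股份有限公司 Speech translation method, apparatus and terminal device
CN107945806B (zh) * 2017-11-10 2022-03-08 北京小米移动软件有限公司 User identification method and apparatus based on voice features
CN108231062B (zh) * 2018-01-12 2020-12-22 科大讯飞股份有限公司 Speech translation method and apparatus
CN108447486B (zh) * 2018-02-28 2021-12-03 科大讯飞股份有限公司 Speech translation method and apparatus
CN110164414B (zh) * 2018-11-30 2023-02-14 腾讯科技(深圳)有限公司 Speech processing method, apparatus and smart device
CN110516238B (zh) * 2019-08-20 2023-12-19 广州国音智能科技有限公司 Speech translation method, apparatus, terminal and computer storage medium
CN112201224A (zh) * 2020-10-09 2021-01-08 北京分音塔科技有限公司 Method, device and system for simultaneous translation of instant calls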

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258539A (zh) * 2012-02-15 2013-08-21 展讯通信(上海)有限公司 Method and apparatus for transforming speech signal characteristics
CN103810158A (zh) * 2012-11-07 2014-05-21 中国移动通信集团公司 Speech translation method and apparatus

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005301817A (ja) * 2004-04-14 2005-10-27 Ricoh Co Ltd Translation support system
JP4271224B2 (ja) * 2006-09-27 2009-06-03 株式会社東芝 Speech translation apparatus, speech translation method, speech translation program and system
US20080133245A1 (en) * 2006-12-04 2008-06-05 Sehda, Inc. Methods for speech-to-speech translation
JP2009048003A (ja) * 2007-08-21 2009-03-05 Toshiba Corp Speech translation apparatus and method
CN101727904B (zh) * 2008-10-31 2013-04-24 国际商业机器公司 Speech translation method and apparatus
CN102196100A (zh) * 2010-03-04 2011-09-21 深圳富泰宏精密工业有限公司 Instant call translation system and method

Also Published As

Publication number Publication date
CN105786801A (zh) 2016-07-20

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 15871646; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 15871646; Country of ref document: EP; Kind code of ref document: A1