WO2019084962A1 - Voice translation method, apparatus, and translation machine - Google Patents

Voice translation method, apparatus, and translation machine

Info

Publication number
WO2019084962A1
Authority
WO
WIPO (PCT)
Prior art keywords
bluetooth headset
connection
voice information
received
button
Prior art date
Application number
PCT/CN2017/109563
Other languages
English (en)
French (fr)
Inventor
郑勇
王文祺
Original Assignee
深圳市沃特沃德股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市沃特沃德股份有限公司
Priority to PCT/CN2017/109563
Publication of WO2019084962A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/80 Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication

Definitions

  • the present invention relates to the field of electronic technologies, and in particular, to a voice translation method, apparatus, and translation machine.
  • two users who speak different languages can communicate with each other through a translator, thereby achieving barrier-free communication.
  • two users can wear a Bluetooth headset separately, and the translator establishes a connection with the Bluetooth headset, so that two users can have a private conversation.
  • the Bluetooth headset connected to the translator must be a dedicated Bluetooth translation headset to implement the above translation process.
  • Bluetooth translation headphones are improved on the basis of ordinary Bluetooth headsets.
  • One improvement is to modify the Bluetooth protocol of ordinary Bluetooth headsets.
  • Another is to add special hardware to ordinary Bluetooth headsets. Either way, the implementation cost undoubtedly increases.
  • the user must purchase a dedicated Bluetooth translation headset and cannot use a general-purpose Bluetooth headset, which limits the application range of the translation machine, raises the user's cost, and makes for a poor user experience.
  • the main object of the present invention is to provide a speech translation method, apparatus and translation machine, which aim to reduce the implementation cost of speech translation and expand the application range.
  • an embodiment of the present invention provides a voice translation method, where the method includes the following steps: establishing a connection with a first Bluetooth headset;
  • the step of establishing a connection with the first Bluetooth headset includes:
  • the step of determining whether the first instruction is received includes:
  • the step of disconnecting the connection with the first Bluetooth headset and establishing a connection with the second Bluetooth headset includes:
  • the step of determining whether the second instruction is received includes:
  • the step of disconnecting the connection with the first Bluetooth headset and establishing a connection with the second Bluetooth headset includes:
  • the step of detecting whether the first Bluetooth headset has stopped transmitting the voice information comprises:
  • the preset time is 2-5 seconds.
  • the embodiment of the present invention also provides a voice translation apparatus, where the apparatus includes:
  • a first connection module configured to establish a connection with the first Bluetooth headset
  • a processing module configured to receive voice information sent by the first Bluetooth headset, and perform translation processing on the voice information
  • a second connection module configured to disconnect the connection with the first Bluetooth headset and establish a connection with a second Bluetooth headset
  • the sending module is configured to send the translated voice information to the second Bluetooth headset.
  • the first connection module includes:
  • the first determining unit is configured to determine whether the first instruction is received
  • the first connection unit is configured to establish a connection with the first Bluetooth headset when the first command is received.
  • the first determining unit includes:
  • a first detecting subunit configured to detect whether the first button is triggered
  • a first decision subunit configured to: when the first button is triggered, determine to receive the first instruction
  • the second connection module includes:
  • the second determining unit is configured to determine whether the second instruction is received
  • the second connection unit is configured to, when the second instruction is received, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
  • the second determining unit includes:
  • a second detecting subunit configured to detect whether the second button is triggered
  • the second determining subunit is configured to: when the second button is triggered, determine to receive the second instruction.
  • the second connecting module includes:
  • a sending detecting unit configured to detect whether the first Bluetooth headset has stopped transmitting the voice information
  • a third connecting unit configured to: when the first Bluetooth headset stops transmitting the voice information, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
  • the sending detection unit includes:
  • a receiving determining subunit configured to determine whether the voice information sent by the first Bluetooth headset is not received after the preset time interval is exceeded
  • the stop decision subunit is configured to determine that the first Bluetooth headset has stopped transmitting the voice information when the voice information sent by the first Bluetooth headset is not received.
  • Embodiments of the present invention also provide a translation machine including a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, the application being configured to perform the aforementioned speech translation method.
  • the speech translation method provided by an embodiment of the present invention exploits the half-duplex interaction of the translation machine: by switching its connection between two Bluetooth headsets, the translation machine communicates with each headset in turn, which achieves translation of speech in different languages.
  • only two ordinary Bluetooth headsets are needed alongside the translation machine to implement voice translation; the ordinary Bluetooth headsets do not need to be modified into dedicated Bluetooth translation headsets. This reduces the implementation cost, expands the application range, lowers the user's cost, and improves the user experience.
  • because switching the connection between the translator and the Bluetooth headsets takes less time than translating the voice information, the switch requires no additional waiting time. It therefore does not affect the output of the voice information or add output delay, so the user experience is not affected.
  • FIG. 1 is a flow chart of a first embodiment of a speech translation method of the present invention
  • FIG. 2 is a flow chart of a second embodiment of a speech translation method of the present invention.
  • FIG. 3 is a block diagram showing an embodiment of a speech translation apparatus of the present invention.
  • FIG. 4 is a schematic block diagram of the first connection module of FIG. 3;
  • FIG. 5 is a block diagram of the first determining unit of FIG. 4;
  • FIG. 6 is a block diagram of a second connection module of FIG. 3;
  • FIG. 7 is a block diagram of a second determining unit of FIG. 6;
  • FIG. 8 is a block diagram of still another module of the second connection module of FIG. 3;
  • FIG. 9 is a block diagram of the transmission detecting unit of FIG. 8.
  • "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver without transmitting capability and devices that have both receiving and transmitting hardware.
  • Such a device may comprise: a cellular or other communication device with a single-line display, a multi-line display, or no multi-line display; a PCS (Personal Communications Service) device, which may combine voice, data processing, fax and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, pager, Internet/Intranet access, web browser, notepad, calendar and/or GPS (Global Positioning System) receiver; and a conventional laptop and/or palmtop computer or other device that has and/or includes a radio frequency receiver.
  • a terminal may be portable, transportable, installed in a vehicle (air, sea and/or land), or adapted and/or configured to operate locally and/or to run in a distributed fashion at any other location on Earth and/or in space.
  • the "terminal" and "terminal device" used herein may also be a communication terminal, an internet terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device) and/or a mobile phone with a music/video playback function, or a device such as a smart TV or set-top box.
  • the server used herein includes, but is not limited to, a computer, a network host, a single network server, a plurality of network server sets, or a cloud composed of a plurality of servers.
  • the cloud consists of a large number of computers or network servers based on Cloud Computing, which is a kind of distributed computing, a super virtual computer composed of a group of loosely coupled computers.
  • communication between the server, the terminal device and the WNS server may be implemented by any communication means, including but not limited to mobile communication based on 3GPP, LTE, or WIMAX, computer network communication based on the TCP/IP and UDP protocols, and short-range wireless transmission based on Bluetooth and infrared transmission standards.
  • the voice translation method and apparatus of the embodiments of the present invention are mainly applied to a translation machine, and may of course be applied to other terminal devices, such as mobile terminals such as mobile phones and tablets.
  • the following is a detailed description of the application to the translation machine.
  • a first embodiment of a speech translation method of the present invention includes the following steps: S11. Establish a connection with a first Bluetooth headset.
  • the translation machine is a terminal device that supports connection technologies such as mobile communication (e.g., 4G), Bluetooth, and WIFI. Over a wireless connection such as 4G or WIFI, it interacts with remote speech recognition, translation, and synthesis server engines to translate between different languages and output speech.
  • the translation machine adopts a half-duplex human-machine voice interaction mode: at any moment its voice channel is in only one state, either input or output.
  • the translator activates Bluetooth and pairs with each of the two Bluetooth headsets. The translator can display a list of paired Bluetooth devices on the user interface for the user to view. After pairing succeeds, the translator establishes a connection with the first Bluetooth headset, worn by the user who wants to speak.
  • the translation machine determines whether the first instruction is received, and when the first instruction is received, the translation machine establishes a connection with the first Bluetooth headset.
  • a first button can be provided on the translator; the first button may be a physical button or a virtual button. The translator detects whether the first button is triggered and, when it is, determines that the first instruction has been received.
  • the first instruction may also be a gesture, a voice command, or the like. The translator captures gestures through a camera or collects voice commands through a microphone, and determines that the first instruction has been received when it captures a specific gesture or collects a specific voice command.
  • the translator may also treat the Bluetooth headset that was paired first as the first Bluetooth headset and connect to it first. Other ways are not enumerated here.
  • S12. Receive voice information sent by the first Bluetooth headset, and perform translation processing on the voice information.
  • the first Bluetooth headset collects the user's voice information, and sends the voice information to the translator in the form of a PCM (Pulse-Code Modulation) code stream.
  • the translation machine receives the voice information sent by the first Bluetooth headset, and translates the voice information.
  • the translation machine receives and stores the voice information sent by the first Bluetooth headset, establishes an HTTP connection with the speech recognition, translation, and synthesis servers over a wireless network such as 4G or WIFI, and passes the voice information through the recognition, translation, and synthesis servers in turn to obtain a voice stream in another language.
  • the entire translation process (including processing time and network transmission delay) takes about 2 seconds.
  • the translation machine first sends the voice information to the speech recognition server, which recognizes it as a character string in the first language and returns the string to the translator. The translator sends the first-language string to the translation server, which translates it into a string in the second language and returns it. The translator then sends the second-language string to the synthesis server, which synthesizes it into a voice code stream in the second language and returns that stream to the translator.
  • the translation machine receives the voice code stream in the second language; this voice code stream is the translated voice information.
  • the translator may also perform speech recognition, translation, and synthesis on the voice information locally.
  • the entire translation process is about 2 seconds.
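The three-stage flow described above (recognition, then translation, then synthesis) can be sketched as a simple chain of functions. This is only an illustration: `recognize`, `translate`, `synthesize`, and the tiny glossary are hypothetical stand-ins for the remote or local engines, which the text does not specify, and the "PCM" bytes here are plain text so the sketch can run without audio.

```python
from typing import Callable

# Hypothetical stand-ins for the recognition, translation and synthesis
# engines; a real device would call servers over HTTP or run local models.
def recognize(pcm: bytes) -> str:
    # Pretend the PCM bytes already are first-language text.
    return pcm.decode("utf-8")

def translate(text: str) -> str:
    # Toy word-for-word glossary, purely illustrative.
    glossary = {"hello": "bonjour", "thanks": "merci"}
    return " ".join(glossary.get(word, word) for word in text.split())

def synthesize(text: str) -> bytes:
    # Pretend the encoded text is a second-language voice code stream.
    return text.encode("utf-8")

def translate_voice(pcm: bytes,
                    asr: Callable[[bytes], str] = recognize,
                    mt: Callable[[str], str] = translate,
                    tts: Callable[[str], bytes] = synthesize) -> bytes:
    """Run the stages in order: speech -> text -> translated text -> speech."""
    first_language_text = asr(pcm)
    second_language_text = mt(first_language_text)
    return tts(second_language_text)

translated_stream = translate_voice(b"hello thanks")
```

Passing the stages in as parameters keeps the same chain usable whether the engines are remote servers or local components.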
  • the translation machine disconnects from the first Bluetooth headset and switches to establishing a connection with the second Bluetooth headset.
  • while the connection is being switched, the translation machine is still translating the last part of the voice information. The entire switch takes only about 1 second, which is shorter than the roughly 2 seconds of translation processing, so there is no additional waiting time.
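The timing argument can be checked with a little arithmetic, assuming (as the text states) that the connection switch runs while translation is still in progress. The function names below are illustrative, and the 1-second and 2-second figures are the approximate values quoted in the description.

```python
def time_until_output(translation_s: float, switch_s: float) -> float:
    """Both activities start together, so output waits for the slower one."""
    return max(translation_s, switch_s)

def extra_output_delay(translation_s: float, switch_s: float) -> float:
    """Delay added by the switch beyond the translation time itself."""
    return max(0.0, switch_s - translation_s)

# Approximate figures from the description: ~2 s translation, ~1 s switch.
delay = extra_output_delay(translation_s=2.0, switch_s=1.0)
```

With these figures the switch is hidden entirely behind the translation; extra delay would appear only if switching took longer than translating.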
  • the translator determines whether the second instruction is received, and when the second command is received, disconnects from the first Bluetooth headset and establishes a connection with the second Bluetooth headset.
  • a second button may be provided on the translator; the second button may be a physical button or a virtual button. The translator detects whether the second button is triggered and, when it is, determines that the second instruction has been received. In this way, the user on the second Bluetooth headset side can cut in and speak when needed, without waiting for the user on the first Bluetooth headset side to finish, which gives high flexibility.
  • alternatively, a single button may serve both purposes: when the first button is pressed, the translator determines that the first instruction has been received, and when the first button is released, it determines that the second instruction has been received; or, when the first button is triggered for the first time, the translator determines that the first instruction has been received, and when the first button is triggered again, it determines that the second instruction has been received.
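The two single-button schemes just described (press/release, and first trigger/second trigger) can be sketched as mappings from button events to instructions. The event names `"press"`, `"release"`, and `"trigger"` and the `Instruction` enum are assumptions made for this illustration.

```python
from enum import Enum
from typing import Iterable, List

class Instruction(Enum):
    FIRST = "connect to first headset"
    SECOND = "switch to second headset"

def hold_mode(events: Iterable[str]) -> List[Instruction]:
    """Press yields the first instruction; release yields the second."""
    mapping = {"press": Instruction.FIRST, "release": Instruction.SECOND}
    return [mapping[event] for event in events if event in mapping]

def toggle_mode(events: Iterable[str]) -> List[Instruction]:
    """Successive triggers alternate: first, second, first, ..."""
    instructions = []
    expecting_first = True
    for event in events:
        if event != "trigger":
            continue  # ignore anything that is not a button trigger
        instructions.append(Instruction.FIRST if expecting_first else Instruction.SECOND)
        expecting_first = not expecting_first
    return instructions
```

For example, `hold_mode(["press", "release"])` and `toggle_mode(["trigger", "trigger"])` both produce the first instruction followed by the second.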
  • the translator determines whether a specific voice command is detected, and when one is detected, determines that the second instruction has been received.
  • the voice command may be a keyword such as "finish" or "end", which the user says at the end of the speech.
  • the translator treats such a keyword as a voice command only when it comes at the end of the sentence: for example, if no further voice information is received within the preset time (e.g., 2-5 seconds) after the keyword is detected, the translator determines that the second instruction has been received.
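A minimal sketch of the end-of-sentence keyword check, assuming the recognized text and the length of the following silence are already available. The function name, the keyword tuple, and the 2-second default are illustrative choices within the range the text gives, not values fixed by the source.

```python
END_KEYWORDS = ("finish", "end")  # example keywords from the description

def is_second_instruction(transcript: str,
                          silence_after_s: float,
                          preset_s: float = 2.0) -> bool:
    """True only if the last word is an end keyword AND silence follows it."""
    words = transcript.lower().rstrip(".!? ").split()
    ends_with_keyword = bool(words) and words[-1] in END_KEYWORDS
    return ends_with_keyword and silence_after_s >= preset_s
```

Requiring both conditions avoids switching when a keyword merely appears mid-sentence, as in "the end of the road".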
  • the translation machine detects whether the first Bluetooth headset has stopped transmitting voice information, and when it has, disconnects from the first Bluetooth headset and establishes a connection with the second Bluetooth headset. Specifically, the translation machine determines whether no voice information from the first Bluetooth headset has been received for longer than the preset time; if none has been received within the preset time, it determines that the first Bluetooth headset has stopped transmitting voice information.
  • the preset time can be set to 2-5 seconds and can of course be adjusted according to actual needs.
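The silence-based stop detection can be sketched as a timeout on the arrival times of voice packets. The timestamps, function names, and the 3-second default (chosen from within the stated 2-5 second range) are assumptions for illustration only.

```python
from typing import Optional, Sequence

def has_stopped_sending(last_packet_time: float, now: float,
                        preset_s: float = 3.0) -> bool:
    """True once no voice packet has arrived for at least preset_s seconds."""
    return (now - last_packet_time) >= preset_s

def stop_decision_time(packet_times: Sequence[float],
                       preset_s: float = 3.0) -> Optional[float]:
    """Time at which the stream would first be judged stopped.

    packet_times are ascending arrival times in seconds. The decision falls
    preset_s after the first packet that is followed by a long enough gap,
    or after the last packet if no such gap occurs earlier.
    """
    if not packet_times:
        return None
    for current, following in zip(packet_times, packet_times[1:]):
        if following - current >= preset_s:
            return current + preset_s
    return packet_times[-1] + preset_s
```

A shorter preset switches sooner but risks cutting off a speaker who pauses mid-sentence, which is why the text leaves the value adjustable.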
  • connection switching can be performed in other manners in the prior art, and the present invention will not be described again.
  • S14. Send the translated voice information to the second Bluetooth headset.
  • after establishing a connection with the second Bluetooth headset, the translation machine sends the translated voice information to the second Bluetooth headset as a voice stream, and the second Bluetooth headset receives and outputs it. This completes one speech translation pass.
  • the user on the second Bluetooth headset side can then speak; in that case the second Bluetooth headset takes the role of the first Bluetooth headset, and the flow returns to step S12, with that headset sending voice information to the translator.
  • alternatively, the user on the second Bluetooth headset side may not speak and the user on the first Bluetooth headset side continues speaking; the flow then returns to step S11, and the translator disconnects from the second Bluetooth headset and switches back to a connection with the first Bluetooth headset.
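Steps S11 to S14 and the role swap between turns can be sketched as a small state machine that holds at most one headset connection at a time. The `Translator` class, its method names, and the log strings are hypothetical; no real Bluetooth API is used, and the translation function is passed in as a placeholder.

```python
from typing import Callable, List, Optional

class Translator:
    """Toy model of a half-duplex translator with one active connection."""

    def __init__(self, paired_headsets: List[str]) -> None:
        self.paired = list(paired_headsets)
        self.connected: Optional[str] = None
        self.log: List[str] = []

    def connect(self, headset: str) -> None:
        """Switch the single active connection to the given paired headset."""
        if headset not in self.paired:
            raise ValueError(f"{headset} is not paired")
        if self.connected == headset:
            return  # already connected, nothing to switch
        if self.connected is not None:
            self.log.append(f"disconnect {self.connected}")
        self.log.append(f"connect {headset}")
        self.connected = headset

    def run_turn(self, speaker: str, listener: str, voice: str,
                 translate: Callable[[str], str]) -> str:
        self.connect(speaker)                      # S11
        self.log.append(f"receive from {speaker}")
        translated = translate(voice)              # S12
        self.connect(listener)                     # S13
        self.log.append(f"send to {listener}")     # S14
        return translated
```

Calling `run_turn("A", "B", ...)` and then `run_turn("B", "A", ...)` models one round of conversation; because headset B is still connected after the first turn, the second turn starts without a switch.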
  • the method includes the following steps:
  • the user wearing the Bluetooth headset A triggers the button A, and the translator detects that the button A is triggered, and establishes a connection with the Bluetooth headset A.
  • the Bluetooth headset A collects the voice information of the user A and sends the voice information to the translator.
  • the translator receives the voice information sent by the Bluetooth headset A, and performs translation processing on the voice information.
  • the pressed button A is released, or button A is pressed again, so that button A is triggered again. When the translation machine detects that button A has been triggered again, it disconnects from Bluetooth headset A and establishes a connection with Bluetooth headset B.
  • the user B wearing the Bluetooth headset B triggers the button.
  • the translator sends the translated voice information to the Bluetooth headset B.
  • the Bluetooth headset B receives the translated voice information and outputs the voice information.
  • the Bluetooth headset B collects the voice information of the user B and sends the voice information to the translator.
  • the translator receives the voice information sent by the Bluetooth headset B, and performs translation processing on the voice information.
  • the pressed button B is released, or button B is pressed again, so that button B is triggered again. When the translation machine detects that button B has been triggered again, it disconnects from Bluetooth headset B and establishes a connection with Bluetooth headset A.
  • alternatively, user A wearing Bluetooth headset A triggers button A; when the translation machine detects that button A has been triggered, it disconnects from Bluetooth headset B and establishes a connection with Bluetooth headset A.
  • the Bluetooth headset A receives the translated voice information and outputs the voice information.
  • user A and user B have then completed one round of voice communication, and the translation machine has completed one round of interactive voice translation; repeating the above steps realizes multiple rounds of interactive voice translation.
  • the speech translation method of the embodiment of the present invention exploits the half-duplex interaction of the translation machine: by switching its connection between two Bluetooth headsets, the translation machine communicates with each headset in turn, which achieves translation of speech in different languages.
  • only two ordinary Bluetooth headsets are needed alongside the translation machine to implement voice translation; the ordinary Bluetooth headsets do not need to be modified into dedicated Bluetooth translation headsets. This reduces the implementation cost, expands the application range, lowers the user's cost, and improves the user experience.
  • because switching the connection between the translator and the Bluetooth headsets takes less time than translating the voice information, the switch requires no additional waiting time. It therefore does not affect the output of the voice information or add output delay, so the user experience is not affected.
  • the apparatus includes a first connection module 10, a processing module 20, a second connection module 30, and a sending module 40, where: the first connection module 10 is configured to establish a connection with the first Bluetooth headset; the processing module 20 is configured to receive the voice information sent by the first Bluetooth headset and perform translation processing on it; the second connection module 30 is configured to disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset; and the sending module 40 is configured to send the translated voice information to the second Bluetooth headset.
  • the translator activates Bluetooth, and pairs with two Bluetooth headsets respectively.
  • the translator can display a list of Bluetooth paired devices on the user interface for the user to view.
  • the first connection module 10 establishes a connection with the first Bluetooth headset worn by the user who wants to speak.
  • the first connection module 10 includes a first determining unit 11 and a first connection unit 12, where: the first determining unit 11 is configured to determine whether the first instruction is received; and the first connection unit 12 is configured to establish a connection with the first Bluetooth headset when the first instruction is received.
  • the first button may be set for the translator, and the first button may be a physical button or a virtual button.
  • the first determining unit 11 includes a first detecting subunit 111 and a first decision subunit 112 as shown in FIG., where: the first detecting subunit 111 is configured to detect whether the first button is triggered; and the first decision subunit 112 is configured to determine, when the first button is triggered, that the first instruction has been received.
  • the first instruction may also be a gesture, a voice command, or the like.
  • the first determining unit 11 captures gestures through a camera or collects voice commands through a microphone, and determines that the first instruction has been received when it captures a specific gesture or collects a specific voice command.
  • the first connection module 10 may also treat the Bluetooth headset that was paired first as the first Bluetooth headset and connect to it first. Other ways are not enumerated here.
  • the first Bluetooth headset collects the voice information of the user, and sends the voice information to the translator in the form of a PCM stream.
  • the processing module 20 receives the voice information sent by the first Bluetooth headset and performs translation processing on the voice information.
  • the processing module 20 receives and stores the voice information sent by the first Bluetooth headset, establishes an HTTP connection with the speech recognition, translation, and synthesis servers over a wireless network such as 4G or WIFI, and passes the voice information through the recognition, translation, and synthesis servers in turn to obtain a voice stream in another language.
  • the entire translation process (including processing time and network transmission delay) takes about 2 seconds.
  • the processing module 20 first sends the voice information to the speech recognition server, which recognizes it as a character string in the first language and returns the string to the translator. The processing module 20 sends the first-language string to the translation server, which translates it into a string in the second language and returns it. The processing module 20 then sends the second-language string to the synthesis server, which synthesizes it into a voice code stream in the second language and returns that stream to the translator. The processing module 20 receives the voice code stream in the second language; this voice code stream is the translated voice information.
  • the processing module 20 may also perform speech recognition, translation, and synthesis on the voice information locally.
  • the entire translation process is about 2 seconds.
  • when the user on the first Bluetooth headset side finishes speaking, or the user on the second Bluetooth headset side wants to speak, the second connection module 30 disconnects from the first Bluetooth headset and switches to establishing a connection with the second Bluetooth headset.
  • while the second connection module 30 switches the connection, the processing module 20 is still translating the last part of the voice information. The entire switch takes only about 1 second, which is shorter than the roughly 2 seconds of translation processing, so there is no extra waiting time.
  • the second connection module 30 includes a second determining unit 31 and a second connection unit 32, where: the second determining unit 31 is configured to determine whether the second instruction is received; and the second connection unit 32 is configured to disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset when the second instruction is received.
  • a second button may be set for the translator, and the second button may be a physical button or a virtual button.
  • the second determining unit 31 includes a second detecting subunit 311 and a second determining subunit 312 as shown in FIG. 7, where: the second detecting subunit 311 is configured to detect whether the second button is triggered; and the second determining subunit 312 is configured to determine, when the second button is triggered, that the second instruction has been received. In this way, the user on the second Bluetooth headset side can cut in and speak when needed, without waiting for the user on the first Bluetooth headset side to finish, which gives high flexibility.
  • alternatively, when the first button is pressed, the first determining unit 11 determines that the first instruction has been received, and when the first button is released, the second determining unit 31 determines that the second instruction has been received; or, when the first button is triggered for the first time, the first determining unit 11 determines that the first instruction has been received, and when the first button is triggered again, the second determining unit 31 determines that the second instruction has been received.
  • the second determining unit 31 determines whether a specific voice command is detected, and when a specific voice command is detected, determines that the second command is received.
  • the voice command such as "finish", "end” and other keywords, the user can say the above keyword at the end after the speech is completed.
  • the second determining unit 31 treats such a keyword as a voice command only when it comes at the end of the sentence: for example, if no further voice information is received within the preset time (e.g., 2-5 seconds) after the keyword is detected, it determines that the second instruction has been received.
  • the second connection module 30 includes a transmission detecting unit 33 and a third connecting unit 34, where: the sending detecting unit 33 is configured to detect whether the first Bluetooth headset has stopped transmitting voice information.
  • the third connecting unit 34 is configured to: when the first Bluetooth headset stops transmitting the voice message, disconnect the first Bluetooth headset, and establish a connection with the second Bluetooth headset.
  • the sending detection unit 33 includes a receiving determining subunit 331 and a stop decision subunit 332, where: the receiving determining subunit 331 is configured to determine whether no voice information from the first Bluetooth headset has been received for longer than the preset time; and the stop decision subunit 332 is configured to determine that the first Bluetooth headset has stopped transmitting voice information when none has been received.
  • a pause in the middle of speech generally does not exceed 2-5 seconds; beyond that, the speech can be considered finished. The preset time can therefore be set to 2-5 seconds, and can of course be adjusted according to actual needs.
  • connection switching can be performed in other manners in the prior art, and the present invention will not be described again.
  • after the connection with the second Bluetooth headset is established, the sending module 40 sends the translated voice information to the second Bluetooth headset as a voice stream, and the second Bluetooth headset receives and outputs the translated voice information, thus completing one voice translation pass.
  • after the second Bluetooth headset finishes outputting the translated voice information, the user on the second Bluetooth headset side can speak; the second Bluetooth headset then takes the role of the first Bluetooth headset and sends voice information to the translation machine.
  • the user on the second Bluetooth headset side may also not speak, and the user on the first Bluetooth headset side continues to speak; in that case the first connection module 10 disconnects from the second Bluetooth headset and switches back to a connection with the first Bluetooth headset.
  • the voice translation apparatus of the embodiment of the present invention uses the half-duplex interaction of the translation machine: by switching the connection between the two Bluetooth headsets in a time-division manner, it realizes time-division communication between the translation machine and the two Bluetooth headsets, achieving mutual translation of speech in different languages.
  • only two ordinary Bluetooth headsets are needed together with the translation machine to implement voice translation; there is no need to convert an ordinary Bluetooth headset into a dedicated Bluetooth translation headset, which reduces the implementation cost, extends the range of application, lowers the user's cost, and improves the user experience.
  • the connection switching time between the translation machine and the Bluetooth headsets is shorter than the translation processing time of the voice information, so the connection switching requires no additional waiting time, does not affect the output of the voice information, and introduces no additional output delay, ensuring that the user experience is not affected.
  • the present invention also provides a translation machine including a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, the application being configured to execute a voice translation method.
  • the voice translation method includes the following steps: establishing a connection with a first Bluetooth headset; receiving the voice information sent by the first Bluetooth headset, and translating the voice information; disconnecting from the first Bluetooth headset and establishing a connection with the second Bluetooth headset; sending the translated voice information to the second Bluetooth headset.
  • the speech translation method described in this embodiment is the speech translation method involved in the foregoing embodiment of the present invention, and details are not described herein again.
  • the present invention includes apparatus that is directed to performing one or more of the operations described herein.
  • These devices may be specially designed and manufactured for the required purposes, or may also include known devices in a general purpose computer.
  • These devices have computer programs stored therein that are selectively activated or reconfigured.
  • Such computer programs may be stored in a device (e.g., computer) readable medium or in any type of medium suitable for storing electronic instructions and coupled to a bus, including but not limited to any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards.
  • a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
  • each block of the structure diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them, can be implemented with computer program instructions.
  • Those skilled in the art will appreciate that these computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, so that the processor executes the schemes specified in the block or blocks of the structure diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

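The present invention discloses a voice translation method, an apparatus, and a translation machine. The method comprises the following steps: establishing a connection with a first Bluetooth headset; receiving voice information sent by the first Bluetooth headset, and translating the voice information; disconnecting from the first Bluetooth headset, and establishing a connection with a second Bluetooth headset; and sending the translated voice information to the second Bluetooth headset. Time-division communication between the translation machine and the two Bluetooth headsets is thereby realized, achieving mutual translation of speech in different languages.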

Description

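Technical Field
[0001] The present invention relates to the field of electronic technology, and in particular to a voice translation method, an apparatus, and a translation machine.
Background Art
[0002] At present, when two users who speak different languages communicate, a translation machine can translate for them so that they communicate without barriers. For greater privacy, each user can wear a Bluetooth headset; once the translation machine is connected to the Bluetooth headsets, the two users can converse privately.
[0003] In the prior art, a Bluetooth headset connected to a translation machine must be a dedicated Bluetooth translation headset for the above translation process to work. A Bluetooth translation headset is a modified ordinary Bluetooth headset: one modification changes the Bluetooth protocol of the ordinary headset, and another adds special hardware to it. Either way increases the implementation cost. Moreover, users must purchase a dedicated Bluetooth translation headset and cannot use a general-purpose Bluetooth headset, which limits the range of application of the translation machine, raises the user's cost, and results in a poor user experience.
Technical Problem
[0004] The main object of the present invention is to provide a voice translation method, an apparatus, and a translation machine, aiming to reduce the implementation cost of voice translation and extend its range of application.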
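Solution to Problem
Technical Solution
[0005] To achieve the above object, an embodiment of the present invention proposes a voice translation method, the method comprising the following steps:
[0006] establishing a connection with a first Bluetooth headset;
[0007] receiving voice information sent by the first Bluetooth headset, and translating the voice information;
[0008] disconnecting from the first Bluetooth headset, and establishing a connection with a second Bluetooth headset;
[0009] sending the translated voice information to the second Bluetooth headset.
[0010] Optionally, the step of establishing a connection with the first Bluetooth headset comprises:
[0011] determining whether a first instruction is received;
[0012] when the first instruction is received, establishing a connection with the first Bluetooth headset.
[0013] Optionally, the step of determining whether the first instruction is received comprises:
[0014] detecting whether a first button is triggered;
[0015] when the first button is triggered, determining that the first instruction has been received.
[0016] Optionally, the step of disconnecting from the first Bluetooth headset and establishing a connection with the second Bluetooth headset comprises:
[0017] determining whether a second instruction is received;
[0018] when the second instruction is received, disconnecting from the first Bluetooth headset, and establishing a connection with the second Bluetooth headset.
[0019] Optionally, the step of determining whether the second instruction is received comprises:
[0020] detecting whether a second button is triggered;
[0021] when the second button is triggered, determining that the second instruction has been received.
[0022] Optionally, the step of disconnecting from the first Bluetooth headset and establishing a connection with the second Bluetooth headset comprises:
[0023] detecting whether the first Bluetooth headset has stopped sending the voice information;
[0024] when the first Bluetooth headset has stopped sending the voice information, disconnecting from the first Bluetooth headset, and establishing a connection with the second Bluetooth headset.
[0025] Optionally, the step of detecting whether the first Bluetooth headset has stopped sending the voice information comprises:
[0026] determining whether no voice information has been received from the first Bluetooth headset for longer than a preset time;
[0027] when no voice information has been received from the first Bluetooth headset for longer than the preset time, determining that the first Bluetooth headset has stopped sending the voice information.
[0028] Optionally, the preset time is 2-5 seconds.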
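[0029] An embodiment of the present invention also proposes a voice translation apparatus, the apparatus comprising:
[0030] a first connection module, configured to establish a connection with a first Bluetooth headset;
[0031] a processing module, configured to receive voice information sent by the first Bluetooth headset, and translate the voice information;
[0032] a second connection module, configured to disconnect from the first Bluetooth headset, and establish a connection with a second Bluetooth headset;
[0033] a sending module, configured to send the translated voice information to the second Bluetooth headset.
[0034] Optionally, the first connection module comprises:
[0035] a first determining unit, configured to determine whether a first instruction is received;
[0036] a first connecting unit, configured to establish a connection with the first Bluetooth headset when the first instruction is received.
[0037] Optionally, the first determining unit comprises:
[0038] a first detecting subunit, configured to detect whether a first button is triggered;
[0039] a first determining subunit, configured to determine, when the first button is triggered, that the first instruction has been received.
[0040] Optionally, the second connection module comprises:
[0041] a second determining unit, configured to determine whether a second instruction is received;
[0042] a second connecting unit, configured to, when the second instruction is received, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
[0043] Optionally, the second determining unit comprises:
[0044] a second detecting subunit, configured to detect whether a second button is triggered;
[0045] a second determining subunit, configured to determine, when the second button is triggered, that the second instruction has been received.
[0046] Optionally, the second connection module comprises:
[0047] a transmission detecting unit, configured to detect whether the first Bluetooth headset has stopped sending the voice information;
[0048] a third connecting unit, configured to, when the first Bluetooth headset has stopped sending the voice information, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
[0049] Optionally, the transmission detecting unit comprises:
[0050] a receiving determining subunit, configured to determine whether no voice information has been received from the first Bluetooth headset for longer than a preset time;
[0051] a stop determining subunit, configured to determine, when no voice information has been received from the first Bluetooth headset for longer than the preset time, that the first Bluetooth headset has stopped sending the voice information.
[0052] An embodiment of the present invention further proposes a translation machine, comprising a memory, a processor, and at least one application program stored in the memory and configured to be executed by the processor, the application program being configured to perform the foregoing voice translation method.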
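Advantageous Effects of Invention
Advantageous Effects
[0053] The voice translation method provided by the embodiment of the present invention uses the half-duplex interaction of the translation machine: by switching the connection between the two Bluetooth headsets in a time-division manner, it realizes time-division communication between the translation machine and the two Bluetooth headsets, achieving mutual translation of speech in different languages. The embodiment needs only two ordinary Bluetooth headsets working with the translation machine to implement voice translation, with no need to convert an ordinary Bluetooth headset into a dedicated Bluetooth translation headset; this reduces the implementation cost, extends the range of application, lowers the user's cost, and improves the user experience. In addition, the connection switching time between the translation machine and the Bluetooth headsets is shorter than the translation processing time of the voice information, so the connection switching requires no additional waiting time; it therefore does not affect the output of the voice information and introduces no additional output delay, ensuring that the user experience is not affected.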
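Brief Description of Drawings
Description of Drawings
[0054] FIG. 1 is a flowchart of a first embodiment of the voice translation method of the present invention;
[0055] FIG. 2 is a flowchart of a second embodiment of the voice translation method of the present invention;
[0056] FIG. 3 is a block diagram of an embodiment of the voice translation apparatus of the present invention;
[0057] FIG. 4 is a block diagram of the first connection module in FIG. 3;
[0058] FIG. 5 is a block diagram of the first determining unit in FIG. 4;
[0059] FIG. 6 is a block diagram of the second connection module in FIG. 3;
[0060] FIG. 7 is a block diagram of the second determining unit in FIG. 6;
[0061] FIG. 8 is another block diagram of the second connection module in FIG. 3;
[0062] FIG. 9 is a block diagram of the transmission detecting unit in FIG. 8.
[0063] The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the embodiments and the accompanying drawings.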
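Best Mode for Carrying Out the Invention
Best Mode of the Invention
[0064] It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
[0065] Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it.
[0066] Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used here may also include the plural. It should be further understood that the word "comprising" used in the specification of the present invention refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups of them. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may be present. In addition, "connected" or "coupled" as used here may include wireless connection or wireless coupling. The phrase "and/or" as used here includes all or any unit and all combinations of one or more of the associated listed items.
[0067] Those skilled in the art will understand that, unless otherwise defined, all terms used here (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as here, are not to be interpreted in an idealized or overly formal sense.
[0068] Those skilled in the art will understand that "terminal" and "terminal device" as used here include both devices having only a wireless signal receiver without transmitting capability, and devices having receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such devices may include: cellular or other communication devices, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio-frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio-frequency receiver. A "terminal" or "terminal device" as used here may be portable, transportable, installed in a vehicle (air, sea, and/or land), or suitable for and/or configured to operate locally and/or, in distributed form, at any other location on Earth and/or in space. A "terminal" or "terminal device" as used here may also be a communication terminal, an Internet terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback functions, or a device such as a smart TV or a set-top box.
[0069] Those skilled in the art will understand that a server as used here includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud formed by multiple servers. Here, a cloud consists of a large number of computers or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers. In the embodiments of the present invention, the server, the terminal device, and the WNS server may communicate by any means of communication, including but not limited to mobile communication based on 3GPP, LTE, or WiMAX, computer network communication based on the TCP/IP or UDP protocols, and short-range wireless transmission based on the Bluetooth or infrared transmission standards.
[0070] The voice translation method and apparatus of the embodiments of the present invention are mainly applied to a translation machine, but can of course also be applied to other terminal devices, such as mobile terminals like mobile phones and tablets. The following description takes application to a translation machine as an example.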
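[0071] Referring to FIG. 1, a first embodiment of the voice translation method of the present invention is proposed. The method comprises the following steps:
[0072] S11. Establish a connection with a first Bluetooth headset.
[0073] In the embodiment of the present invention, the translation machine is a terminal device that supports connection technologies such as mobile communication (e.g., 4G), Bluetooth, and WiFi. Over wireless connections such as 4G and WiFi, it interacts with remote server engines for speech recognition, translation, and synthesis to translate between different languages and output speech. It uses half-duplex human-machine voice interaction: at any given moment, the voice path of the translation machine can be in only one state, either input or output.
[0074] The translation machine turns on Bluetooth and pairs with each of the two Bluetooth headsets. The translation machine may display the list of paired Bluetooth devices on its user interface for the user to view. After pairing succeeds, the translation machine establishes a connection with the first Bluetooth headset, worn by the user who wishes to speak.
[0075] Optionally, the translation machine determines whether a first instruction is received, and when the first instruction is received, it establishes a connection with the first Bluetooth headset. In a specific implementation, a first button may be provided on the translation machine; the first button may be a physical button or a virtual button. The translation machine detects whether the first button is triggered, and when it detects that the first button is triggered, it determines that the first instruction has been received.
[0076] In addition, the first instruction may also be a gesture, a voice command, or the like. The translation machine captures gestures with a camera or collects voice commands with a microphone, and when it captures a specific gesture or collects a specific voice command, it determines that the first instruction has been received.
[0077] Besides establishing the connection with the first Bluetooth headset on receipt of the first instruction, other ways known in the prior art may be used. For example, the translation machine may treat the Bluetooth headset that was paired first as the first Bluetooth headset and connect to it first. The present invention does not enumerate these alternatives.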
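[0078] S12. Receive the voice information sent by the first Bluetooth headset, and translate the voice information.
[0079] After the translation machine establishes a connection with the first Bluetooth headset, the first Bluetooth headset collects the user's voice information and sends it to the translation machine as a PCM (Pulse-Code Modulation) stream. The translation machine receives the voice information sent by the first Bluetooth headset and translates it.
[0080] Specifically, the translation machine receives and stores the voice information sent by the first Bluetooth headset, establishes HTTP connections with the speech recognition, translation, and synthesis servers over a wireless network such as 4G or WiFi, and passes the voice information through the speech recognition, translation, and synthesis servers in turn to obtain a voice stream in another language. The whole translation process (including processing time and network transmission delay) takes about 2 seconds.
[0081] For example, the translation machine first sends the voice information to the speech recognition server, which recognizes the voice information as a character string in the first language and returns it to the translation machine. The translation machine receives the first-language character string and sends it to the translation server, which translates it into a character string in the second language and returns it to the translation machine. The translation machine receives the second-language character string and sends it to the synthesis server, which synthesizes it into a voice stream in the second language and returns it to the translation machine. The translation machine receives the second-language voice stream, which is the translated voice information.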
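The three-stage chain in paragraph [0081] can be sketched as a composition of three callables. This is only an illustrative sketch: the function name `translate_speech` and the three stand-in engines are hypothetical and not part of the patent, which leaves the server interfaces unspecified.

```python
from typing import Callable


def translate_speech(pcm: bytes,
                     recognize: Callable[[bytes], str],
                     translate: Callable[[str], str],
                     synthesize: Callable[[str], bytes]) -> bytes:
    """Chain recognition, translation and synthesis in the order of [0081]."""
    source_text = recognize(pcm)          # speech -> first-language string
    target_text = translate(source_text)  # first-language -> second-language string
    return synthesize(target_text)        # string -> second-language voice stream


# Stand-in engines for illustration only (assumptions, not real services).
def fake_recognize(pcm: bytes) -> str:
    return pcm.decode("utf-8")


def fake_translate(text: str) -> str:
    return {"你好": "hello"}.get(text, text)


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")
```

Passing the engines in as parameters keeps the sketch neutral between the remote-server variant described above and a local-processing variant.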
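[0082] In other embodiments, the translation machine may also perform speech recognition, translation, and synthesis on the voice information locally. The whole translation process likewise takes about 2 seconds.
[0083] S13. Disconnect from the first Bluetooth headset, and establish a connection with the second Bluetooth headset.
[0084] When the user on the first Bluetooth headset side has finished speaking, or the user on the second Bluetooth headset side wants to speak, the translation machine disconnects from the first Bluetooth headset and switches to a connection with the second Bluetooth headset. While switching the connection, the translation machine is still translating the last part of the voice information. The whole connection switch takes only about 1 second, which is shorter than the roughly 2 seconds needed to translate the voice information, so it adds no extra waiting time.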
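The timing argument in paragraph [0084] amounts to simple arithmetic: when the switch runs concurrently with the translation, the delay before output is the longer of the two durations rather than their sum. A minimal sketch, with hypothetical function names and the approximate figures from the text:

```python
def output_delay(translation_s: float, switch_s: float) -> float:
    """Delay before output when switching overlaps with translation."""
    return max(translation_s, switch_s)


def extra_wait(translation_s: float, switch_s: float) -> float:
    """Waiting time added by the switch beyond the translation time itself."""
    return max(0.0, switch_s - translation_s)
```

With the figures quoted in the text (about 2 seconds of translation, about 1 second of switching), `extra_wait(2.0, 1.0)` is zero; the switch would only add delay if it took longer than the translation.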
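[0085] Optionally, the translation machine determines whether a second instruction is received, and when the second instruction is received, it disconnects from the first Bluetooth headset and establishes a connection with the second Bluetooth headset. In a specific implementation, a second button may be provided on the translation machine; the second button may be a physical button or a virtual button. The translation machine detects whether the second button is triggered, and when it detects that the second button is triggered, it determines that the second instruction has been received. In this way, the user on the second Bluetooth headset side can cut in to speak whenever needed, without waiting for the user on the first Bluetooth headset side to finish speaking, which offers high flexibility.
[0086] In some embodiments, when the first button is pressed, the translation machine determines that the first instruction has been received, and when the first button is released, it determines that the second instruction has been received; or, when the first button is triggered for the first time, the translation machine determines that the first instruction has been received, and when the first button is triggered again, it determines that the second instruction has been received.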
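The two single-button variants of paragraph [0086] can be sketched as follows. The names `Instruction`, `hold_to_talk`, and `ToggleButton` are hypothetical, and the assumption that a toggle button keeps alternating after the second trigger goes beyond what the text states.

```python
from enum import Enum


class Instruction(Enum):
    FIRST = 1   # connect to the speaker's headset
    SECOND = 2  # switch to the listener's headset


def hold_to_talk(event: str) -> Instruction:
    """Variant 1: pressing means FIRST, releasing means SECOND."""
    return {"press": Instruction.FIRST, "release": Instruction.SECOND}[event]


class ToggleButton:
    """Variant 2: first trigger means FIRST, the next trigger means SECOND.

    Alternating on later triggers is an assumption made for this sketch.
    """

    def __init__(self) -> None:
        self.count = 0

    def trigger(self) -> Instruction:
        self.count += 1
        return Instruction.FIRST if self.count % 2 == 1 else Instruction.SECOND
```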
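[0087] In other embodiments, the translation machine determines whether a specific voice command is detected, and when a specific voice command is detected, it determines that the second instruction has been received. The voice command may be a keyword such as "完毕" ("finished") or "结束" ("end"), which the user can say at the end of a speech. To prevent misjudgment, the translation machine treats the keyword as a voice command only when it occurs at the end of a sentence: for example, if no further voice information is received for longer than a preset time (e.g., 2-5 seconds) after the keyword is detected, it determines that the second instruction has been received.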
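A rough sketch of the end-of-sentence check in paragraph [0087], assuming the speech has already been recognized as text and the silence after the last word has been measured. The function name, the punctuation handling, and the 2-second default are illustrative assumptions.

```python
END_KEYWORDS = ("完毕", "结束")  # the example keywords from the text


def is_end_command(transcript: str, silence_after_s: float,
                   preset_s: float = 2.0) -> bool:
    """Treat a keyword as a command only at the end, followed by silence."""
    text = transcript.rstrip(" 。.!！?？")  # ignore trailing punctuation
    return text.endswith(END_KEYWORDS) and silence_after_s > preset_s
```

A keyword in the middle of a sentence, or one followed quickly by more speech, is not treated as a command.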
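[0088] Optionally, the translation machine detects whether the first Bluetooth headset has stopped sending voice information, and when it detects that the first Bluetooth headset has stopped sending voice information, it disconnects from the first Bluetooth headset and establishes a connection with the second Bluetooth headset. In a specific implementation, the translation machine determines whether no voice information has been received from the first Bluetooth headset for longer than a preset time, and if so, it determines that the first Bluetooth headset has stopped sending voice information.
[0089] While a user is speaking continuously, a pause in the middle generally does not exceed 2-5 seconds, and beyond that time the speech can generally be considered finished. The preset time can therefore be set to 2-5 seconds, and can of course be adjusted according to actual needs.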
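The timeout rule of paragraphs [0088] and [0089] can be sketched as a small detector that records when voice data last arrived. The class name and the choice to pass timestamps in explicitly (rather than read a clock) are assumptions made to keep the sketch deterministic.

```python
from typing import Optional


class SilenceDetector:
    """Decides that the headset has stopped sending after a preset silence."""

    def __init__(self, preset_s: float = 2.0) -> None:
        self.preset_s = preset_s  # the text suggests 2-5 seconds
        self.last_voice_at: Optional[float] = None

    def on_voice(self, now: float) -> None:
        """Record the arrival time of a voice packet."""
        self.last_voice_at = now

    def has_stopped(self, now: float) -> bool:
        """True once the silence since the last packet exceeds the preset time."""
        if self.last_voice_at is None:
            return False  # nothing received yet, so nothing has "stopped"
        return now - self.last_voice_at > self.preset_s
```

In a real device `now` would come from a monotonic clock; a larger `preset_s` tolerates longer pauses at the cost of a slower hand-over.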
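[0090] Those skilled in the art will understand that connection switching can also be performed in other ways known in the prior art, which the present invention does not enumerate.
[0091] S14. Send the translated voice information to the second Bluetooth headset.
[0092] After the connection with the second Bluetooth headset is established, the translation machine sends the translated voice information to the second Bluetooth headset as a voice stream, and the second Bluetooth headset receives and outputs the translated voice information, thus completing one voice translation pass.
[0093] After the second Bluetooth headset finishes outputting the translated voice information, the user on the second Bluetooth headset side can speak. The second Bluetooth headset then takes the role of the first Bluetooth headset, and the flow returns to step S12, with this headset sending voice information to the translation machine. Alternatively, the user on the second Bluetooth headset side may not speak and the user on the first Bluetooth headset side continues to speak; the flow then returns to step S11, and the translation machine disconnects from the second Bluetooth headset and switches back to a connection with the first Bluetooth headset.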
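Steps S11 to S14 can be summarized in a small model that holds at most one headset connection at a time. This is a sketch under stated assumptions: the class `TranslationMachine`, its string-based log, and the `translate` callback are hypothetical stand-ins, and no real Bluetooth API is involved.

```python
from typing import Callable, List, Optional


class TranslationMachine:
    """Toy model of the time-division connection described in S11-S14."""

    def __init__(self, translate: Callable[[str], str]) -> None:
        self.translate = translate
        self.connected: Optional[str] = None  # at most one headset at a time
        self.log: List[str] = []

    def switch_to(self, headset: str) -> None:
        """Drop the current connection, if any, and connect to `headset`."""
        if self.connected == headset:
            return
        if self.connected is not None:
            self.log.append(f"disconnect {self.connected}")
        self.connected = headset
        self.log.append(f"connect {headset}")

    def translate_once(self, speaker: str, listener: str, voice: str) -> str:
        self.switch_to(speaker)              # S11: connect to the speaker
        translated = self.translate(voice)   # S12: receive and translate
        self.switch_to(listener)             # S13: switch to the listener
        self.log.append(f"send {listener}")  # S14: deliver the translation
        return translated
```

The single `connected` field reflects the half-duplex design: the machine never talks to both headsets at once.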
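[0094] Referring to FIG. 2, a second embodiment of the voice translation method of the present invention is proposed. The method comprises the following steps:
[0095] S21. The translation machine establishes a connection with Bluetooth headset A.
[0096] User A, wearing Bluetooth headset A, triggers button A. When the translation machine detects that button A is triggered, it establishes a connection with Bluetooth headset A.
[0097] S22. Bluetooth headset A collects user A's voice information and sends it to the translation machine.
[0098] S23. The translation machine receives the voice information sent by Bluetooth headset A and translates it.
[0099] S24. The translation machine disconnects from Bluetooth headset A and establishes a connection with Bluetooth headset B.
[0100] Optionally, after user A finishes speaking, user A releases the pressed button A, or presses button A again so that button A is triggered again. When the translation machine detects that button A is triggered again, it disconnects from Bluetooth headset A and connects to Bluetooth headset B instead.
[0101] Optionally, after user A finishes speaking or while user A is speaking, user B, wearing Bluetooth headset B, triggers button B. When the translation machine detects that button B is triggered, it disconnects from Bluetooth headset A and connects to Bluetooth headset B instead.
[0102] S25. The translation machine sends the translated voice information to Bluetooth headset B.
[0103] S26. Bluetooth headset B receives the translated voice information and outputs it.
[0104] S27. Bluetooth headset B collects user B's voice information and sends it to the translation machine.
[0105] S28. The translation machine receives the voice information sent by Bluetooth headset B and translates it.
[0106] S29. The translation machine disconnects from Bluetooth headset B and establishes a connection with Bluetooth headset A.
[0107] Optionally, after user B finishes speaking, user B releases the pressed button B, or presses button B again so that button B is triggered again. When the translation machine detects that button B is triggered again, it disconnects from Bluetooth headset B and connects to Bluetooth headset A instead.
[0108] Optionally, after user B finishes speaking or while user B is speaking, user A, wearing Bluetooth headset A, triggers button A. When the translation machine detects that button A is triggered, it disconnects from Bluetooth headset B and connects to Bluetooth headset A instead.
[0109] S30. The translation machine sends the translated voice information to Bluetooth headset A.
[0110] S31. Bluetooth headset A receives the translated voice information and outputs it.
[0111] User A and user B have thus completed one round of voice communication, and the translation machine has completed one round of interactive voice translation. Repeating the above steps realizes multi-round interactive voice translation.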
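The round trip S21 to S31 can be sketched as a function that expands a sequence of speakers into the connection actions the translation machine performs. The function name and the tuple representation are hypothetical; only two headsets, labelled "A" and "B" as in the text, are assumed.

```python
from typing import List, Optional, Tuple


def conversation_steps(speakers: List[str]) -> List[Tuple[str, str]]:
    """Expand a list of speakers ("A" or "B") into (action, headset) steps."""
    steps: List[Tuple[str, str]] = []
    connected: Optional[str] = None

    def switch_to(headset: str) -> None:
        nonlocal connected
        if connected == headset:
            return  # already connected, e.g. the listener replies directly
        if connected is not None:
            steps.append(("disconnect", connected))
        steps.append(("connect", headset))
        connected = headset

    for speaker in speakers:
        listener = "B" if speaker == "A" else "A"
        switch_to(speaker)
        steps.append(("receive", speaker))  # voice in, then translation
        switch_to(listener)
        steps.append(("send", listener))    # translated voice out
    return steps
```

For `["A", "B"]` the sketch yields one connection switch per turn, because after hearing the translation user B is already connected and can reply without a new switch. For `["A", "A"]` it inserts an extra switch back to headset A, matching the case where the same user continues speaking.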
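[0112] The voice translation method of the embodiment of the present invention uses the half-duplex interaction of the translation machine: by switching the connection between the two Bluetooth headsets in a time-division manner, it realizes time-division communication between the translation machine and the two Bluetooth headsets, achieving mutual translation of speech in different languages. The embodiment needs only two ordinary Bluetooth headsets working with the translation machine to implement voice translation, with no need to convert an ordinary Bluetooth headset into a dedicated Bluetooth translation headset; this reduces the implementation cost, extends the range of application, lowers the user's cost, and improves the user experience. In addition, the connection switching time between the translation machine and the Bluetooth headsets is shorter than the translation processing time of the voice information, so the connection switching requires no additional waiting time; it therefore does not affect the output of the voice information and introduces no additional output delay, ensuring that the user experience is not affected.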
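[0113] Referring to FIG. 3, an embodiment of the voice translation apparatus of the present invention is proposed. The apparatus includes a first connection module 10, a processing module 20, a second connection module 30, and a sending module 40, wherein: the first connection module 10 is configured to establish a connection with the first Bluetooth headset; the processing module 20 is configured to receive the voice information sent by the first Bluetooth headset and translate it; the second connection module 30 is configured to disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset; and the sending module 40 is configured to send the translated voice information to the second Bluetooth headset.
[0114] The translation machine turns on Bluetooth and pairs with each of the two Bluetooth headsets. The translation machine may display the list of paired Bluetooth devices on its user interface for the user to view. After pairing succeeds, the first connection module 10 establishes a connection with the first Bluetooth headset, worn by the user who wishes to speak.
[0115] Optionally, as shown in FIG. 4, the first connection module 10 includes a first determining unit 11 and a first connecting unit 12, wherein: the first determining unit 11 is configured to determine whether a first instruction is received; and the first connecting unit 12 is configured to establish a connection with the first Bluetooth headset when the first instruction is received.
[0116] In a specific implementation, a first button may be provided on the translation machine; the first button may be a physical button or a virtual button. In this case, the first determining unit 11, as shown in FIG. 5, includes a first detecting subunit 111 and a first determining subunit 112, wherein: the first detecting subunit 111 is configured to detect whether the first button is triggered; and the first determining subunit 112 is configured to determine, when the first button is triggered, that the first instruction has been received.
[0117] In addition, the first instruction may also be a gesture, a voice command, or the like. The first determining unit 11 captures gestures with a camera or collects voice commands with a microphone, and when it captures a specific gesture or collects a specific voice command, it determines that the first instruction has been received.
[0118] Besides establishing the connection with the first Bluetooth headset on receipt of the first instruction, other ways known in the prior art may be used. For example, the first connection module 10 may treat the Bluetooth headset that was paired first as the first Bluetooth headset and connect to it first. The present invention does not enumerate these alternatives.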
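[0119] After the first connection module 10 establishes a connection with the first Bluetooth headset, the first Bluetooth headset collects the user's voice information and sends it to the translation machine as a PCM stream. The processing module 20 receives the voice information sent by the first Bluetooth headset and translates it.
[0120] Specifically, the processing module 20 receives and stores the voice information sent by the first Bluetooth headset, establishes HTTP connections with the speech recognition, translation, and synthesis servers over a wireless network such as 4G or WiFi, and passes the voice information through the speech recognition, translation, and synthesis servers in turn to obtain a voice stream in another language. The whole translation process (including processing time and network transmission delay) takes about 2 seconds.
[0121] For example, the processing module 20 first sends the voice information to the speech recognition server, which recognizes the voice information as a character string in the first language and returns it to the translation machine. The processing module 20 receives the first-language character string and sends it to the translation server, which translates it into a character string in the second language and returns it to the translation machine. The processing module 20 receives the second-language character string and sends it to the synthesis server, which synthesizes it into a voice stream in the second language and returns it to the translation machine. The processing module 20 receives the second-language voice stream, which is the translated voice information.
[0122] In other embodiments, the processing module 20 may also perform speech recognition, translation, and synthesis on the voice information locally. The whole translation process likewise takes about 2 seconds.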
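[0123] When the user on the first Bluetooth headset side has finished speaking, or the user on the second Bluetooth headset side wants to speak, the second connection module 30 disconnects from the first Bluetooth headset and switches to a connection with the second Bluetooth headset. While the second connection module 30 is switching the connection, the processing module 20 is still translating the last part of the voice information. The whole connection switch takes only about 1 second, which is shorter than the roughly 2 seconds needed to translate the voice information, so it adds no extra waiting time.
[0124] Optionally, as shown in FIG. 6, the second connection module 30 includes a second determining unit 31 and a second connecting unit 32, wherein: the second determining unit 31 is configured to determine whether a second instruction is received; and the second connecting unit 32 is configured to, when the second instruction is received, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
[0125] In a specific implementation, a second button may be provided on the translation machine; the second button may be a physical button or a virtual button. In this case, the second determining unit 31, as shown in FIG. 7, includes a second detecting subunit 311 and a second determining subunit 312, wherein: the second detecting subunit 311 is configured to detect whether the second button is triggered; and the second determining subunit 312 is configured to determine, when the second button is triggered, that the second instruction has been received. In this way, the user on the second Bluetooth headset side can cut in to speak whenever needed, without waiting for the user on the first Bluetooth headset side to finish speaking, which offers high flexibility.
[0126] In some embodiments, when the first button is pressed, the first determining unit 11 determines that the first instruction has been received, and when the first button is released, the second determining unit 31 determines that the second instruction has been received; or, when the first button is triggered for the first time, the first determining unit 11 determines that the first instruction has been received, and when the first button is triggered again, the second determining unit 31 determines that the second instruction has been received.
[0127] In other embodiments, the second determining unit 31 determines whether a specific voice command is detected, and when a specific voice command is detected, it determines that the second instruction has been received. The voice command may be a keyword such as "完毕" ("finished") or "结束" ("end"), which the user can say at the end of a speech. To prevent misjudgment, the second determining unit 31 treats the keyword as a voice command only when it occurs at the end of a sentence: for example, if no further voice information is received for longer than a preset time (e.g., 2-5 seconds) after the keyword is detected, it determines that the second instruction has been received.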
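[0128] Optionally, as shown in FIG. 8, the second connection module 30 includes a transmission detecting unit 33 and a third connecting unit 34, wherein: the transmission detecting unit 33 is configured to detect whether the first Bluetooth headset has stopped sending voice information; and the third connecting unit 34 is configured to, when the first Bluetooth headset has stopped sending voice information, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
[0129] The transmission detecting unit 33, as shown in FIG. 9, includes a receiving determining subunit 331 and a stop determining subunit 332, wherein: the receiving determining subunit 331 is configured to determine whether no voice information has been received from the first Bluetooth headset for longer than a preset time; and the stop determining subunit 332 is configured to determine, when no voice information has been received from the first Bluetooth headset for longer than the preset time, that the first Bluetooth headset has stopped sending voice information.
[0130] While a user is speaking continuously, a pause in the middle generally does not exceed 2-5 seconds, and beyond that time the speech can generally be considered finished. The preset time can therefore be set to 2-5 seconds, and can of course be adjusted according to actual needs.
[0131] Those skilled in the art will understand that connection switching can also be performed in other ways known in the prior art, which the present invention does not enumerate.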
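[0132] After the connection with the second Bluetooth headset is established, the sending module 40 sends the translated voice information to the second Bluetooth headset as a voice stream, and the second Bluetooth headset receives and outputs the translated voice information, thus completing one voice translation pass.
[0133] After the second Bluetooth headset finishes outputting the translated voice information, the user on the second Bluetooth headset side can speak; the second Bluetooth headset then takes the role of the first Bluetooth headset and sends voice information to the translation machine. Alternatively, the user on the second Bluetooth headset side may not speak and the user on the first Bluetooth headset side continues to speak; in that case the first connection module 10 disconnects from the second Bluetooth headset and switches back to a connection with the first Bluetooth headset.
[0134] The voice translation apparatus of the embodiment of the present invention uses the half-duplex interaction of the translation machine: by switching the connection between the two Bluetooth headsets in a time-division manner, it realizes time-division communication between the translation machine and the two Bluetooth headsets, achieving mutual translation of speech in different languages. The embodiment needs only two ordinary Bluetooth headsets working with the translation machine to implement voice translation, with no need to convert an ordinary Bluetooth headset into a dedicated Bluetooth translation headset; this reduces the implementation cost, extends the range of application, lowers the user's cost, and improves the user experience. In addition, the connection switching time between the translation machine and the Bluetooth headsets is shorter than the translation processing time of the voice information, so the connection switching requires no additional waiting time; it therefore does not affect the output of the voice information and introduces no additional output delay, ensuring that the user experience is not affected.
[0135] The present invention also proposes a translation machine, comprising a memory, a processor, and at least one application program stored in the memory and configured to be executed by the processor, the application program being configured to perform a voice translation method. The voice translation method comprises the following steps: establishing a connection with a first Bluetooth headset; receiving voice information sent by the first Bluetooth headset, and translating the voice information; disconnecting from the first Bluetooth headset, and establishing a connection with a second Bluetooth headset; and sending the translated voice information to the second Bluetooth headset. The voice translation method described in this embodiment is the voice translation method of the foregoing embodiments of the present invention and is not described again here.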
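[0136] Those skilled in the art will understand that the present invention includes devices for performing one or more of the operations described in this application. These devices may be specially designed and manufactured for the required purposes, or may include known devices in general-purpose computers. These devices have computer programs stored in them that are selectively activated or reconfigured. Such computer programs may be stored in a device (e.g., computer) readable medium, or in any type of medium suitable for storing electronic instructions and coupled to a bus, the computer-readable medium including but not limited to any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
[0137] Those skilled in the art will understand that each block of these structure diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them, can be implemented with computer program instructions. Those skilled in the art will understand that these computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, so that the processor executes the schemes specified in the block or blocks of the structure diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention.
[0138] Those skilled in the art will understand that the steps, measures, and schemes in the various operations, methods, and processes discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and processes discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art that correspond to those in the various operations, methods, and processes disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.
[0139] The above are only preferred embodiments of the present invention and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.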

Claims

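1. A voice translation method, comprising the following steps:
establishing a connection with a first Bluetooth headset;
receiving voice information sent by the first Bluetooth headset, and translating the voice information;
disconnecting from the first Bluetooth headset, and establishing a connection with a second Bluetooth headset;
sending the translated voice information to the second Bluetooth headset.
2. The voice translation method according to claim 1, wherein the step of establishing a connection with the first Bluetooth headset comprises:
determining whether a first instruction is received;
when the first instruction is received, establishing a connection with the first Bluetooth headset.
3. The voice translation method according to claim 2, wherein the step of determining whether the first instruction is received comprises:
detecting whether a first button is triggered;
when the first button is triggered, determining that the first instruction has been received.
4. The voice translation method according to claim 1, wherein the step of disconnecting from the first Bluetooth headset and establishing a connection with the second Bluetooth headset comprises:
determining whether a second instruction is received;
when the second instruction is received, disconnecting from the first Bluetooth headset, and establishing a connection with the second Bluetooth headset.
5. The voice translation method according to claim 4, wherein the step of determining whether the second instruction is received comprises:
detecting whether a second button is triggered;
when the second button is triggered, determining that the second instruction has been received.
6. The voice translation method according to claim 1, wherein the step of disconnecting from the first Bluetooth headset and establishing a connection with the second Bluetooth headset comprises:
detecting whether the first Bluetooth headset has stopped sending the voice information;
when the first Bluetooth headset has stopped sending the voice information, disconnecting from the first Bluetooth headset, and establishing a connection with the second Bluetooth headset.
7. The voice translation method according to claim 6, wherein the step of detecting whether the first Bluetooth headset has stopped sending the voice information comprises:
determining whether no voice information has been received from the first Bluetooth headset for longer than a preset time;
when no voice information has been received from the first Bluetooth headset for longer than the preset time, determining that the first Bluetooth headset has stopped sending the voice information.
8. The voice translation method according to claim 7, wherein the preset time is 2-5 seconds.
9. A voice translation apparatus, comprising:
a first connection module, configured to establish a connection with a first Bluetooth headset;
a processing module, configured to receive voice information sent by the first Bluetooth headset, and translate the voice information;
a second connection module, configured to disconnect from the first Bluetooth headset, and establish a connection with a second Bluetooth headset;
a sending module, configured to send the translated voice information to the second Bluetooth headset.
10. The voice translation apparatus according to claim 9, wherein the first connection module comprises:
a first determining unit, configured to determine whether a first instruction is received;
a first connecting unit, configured to establish a connection with the first Bluetooth headset when the first instruction is received.
11. The voice translation apparatus according to claim 10, wherein the first determining unit comprises:
a first detecting subunit, configured to detect whether a first button is triggered;
a first determining subunit, configured to determine, when the first button is triggered, that the first instruction has been received.
12. The voice translation apparatus according to claim 11, wherein the second connection module comprises:
a second determining unit, configured to determine whether a second instruction is received;
a second connecting unit, configured to, when the second instruction is received, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
13. The voice translation apparatus according to claim 12, wherein the second determining unit comprises:
a second detecting subunit, configured to detect whether a second button is triggered;
a second determining subunit, configured to determine, when the second button is triggered, that the second instruction has been received.
14. The voice translation apparatus according to claim 9, wherein the second connection module comprises:
a transmission detecting unit, configured to detect whether the first Bluetooth headset has stopped sending the voice information;
a third connecting unit, configured to, when the first Bluetooth headset has stopped sending the voice information, disconnect from the first Bluetooth headset and establish a connection with the second Bluetooth headset.
15. The voice translation apparatus according to claim 14, wherein the transmission detecting unit comprises:
a receiving determining subunit, configured to determine whether no voice information has been received from the first Bluetooth headset for longer than a preset time;
a stop determining subunit, configured to determine, when no voice information has been received from the first Bluetooth headset for longer than the preset time, that the first Bluetooth headset has stopped sending the voice information.
16. The voice translation apparatus according to claim 15, wherein the preset time is 2-5 seconds.
17. A translation machine, comprising a memory, a processor, and at least one application program stored in the memory and configured to be executed by the processor, wherein the application program is configured to perform the voice translation method according to claim 1.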
PCT/CN2017/109563 2017-11-06 2017-11-06 Voice translation method, apparatus, and translation machine WO2019084962A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/109563 WO2019084962A1 (zh) 2017-11-06 2017-11-06 Voice translation method, apparatus, and translation machine

Publications (1)

Publication Number Publication Date
WO2019084962A1 (zh) 2019-05-09

Family

ID=66332043

Country Status (1)

Country Link
WO (1) WO2019084962A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1602483A (zh) * 2001-12-17 2005-03-30 内维尼·加雅拉特尼 进行多语种口述词语实时翻译的实时翻译装置与方法
CN102547486A (zh) * 2011-01-04 2012-07-04 上海华勤通讯技术有限公司 蓝牙耳机对讲系统
WO2013163293A1 (en) * 2012-04-25 2013-10-31 Kopin Corporation Instant translation system
CN104540175A (zh) * 2014-11-26 2015-04-22 青岛歌尔声学科技有限公司 一种不间断蓝牙连接的切换方法、蓝牙设备和系统
CN105101058A (zh) * 2015-07-13 2015-11-25 惠州Tcl移动通信有限公司 多个蓝牙耳机协同工作的实现方法及设备
CN106911857A (zh) * 2017-03-08 2017-06-30 青岛中云时代信息技术有限公司 一种语音数据交互方法及装置
CN107885731A (zh) * 2017-11-06 2018-04-06 深圳市沃特沃德股份有限公司 语音翻译方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17930908

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17930908

Country of ref document: EP

Kind code of ref document: A1