WO2019186639A1 - Translation system, translation method, translation device, and voice input/output device - Google Patents

Translation system, translation method, translation device, and voice input/output device Download PDF

Info

Publication number
WO2019186639A1
Authority
WO
WIPO (PCT)
Prior art keywords
language
translation
user
unit
data
Prior art date
Application number
PCT/JP2018/012098
Other languages
English (en)
Japanese (ja)
Inventor
純 葛西
Original Assignee
株式会社フォルテ
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社フォルテ
Priority to PCT/JP2018/012098 (WO2019186639A1)
Priority to JP2018545518A (JP6457706B1)
Priority to TW108102574A (TWI695281B)
Publication of WO2019186639A1

Links

Images

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/28 - Constructional details of speech recognition systems

Definitions

  • The present invention relates to a translation system, a translation method, a translation device, and a speech input/output device for translating input speech.
  • Japanese Patent Application Laid-Open No. 2004-133830 discloses a technique for translating input speech to generate a character code string, transmitting the character code string to a communication partner apparatus, and displaying the character code string as a caption on the communication partner apparatus.
  • The present invention has been made in view of these points, and its purpose is to provide a translation system, a translation method, a translation device, and a voice input/output device that can improve the quality of communication with a partner who uses a different language.
  • a translation system includes a first user terminal used by a first user, and a translation apparatus capable of communicating with the first user terminal.
  • The first user terminal includes a voice input unit that receives voice input in a first language, a terminal transmission unit that transmits to the translation device first language text data obtained by converting the first language voice received by the voice input unit, a terminal reception unit that receives from the translation device first language translation data in which the second language speech input at the second user terminal used by a second user has been translated into the first language, and a voice output unit that outputs the first language voice into which the first language translation data received by the terminal reception unit has been converted.
  • The translation device includes a specifying unit that specifies the first language used by the first user and the second language used by the second user, a device reception unit that receives the first language text data from the first user terminal, a language conversion unit that converts the received first language text data into second language translation data, and a device transmission unit that transmits the second language translation data to the second user terminal. The language conversion unit may further convert the second language translation data back into the first language to generate retranslation data, the device transmission unit may transmit the retranslation data to the first user terminal, and the voice output unit may output the retranslation data as speech while the voice input unit is receiving input of the first language voice.
  • When the language conversion unit converts the second language speech input at the second user terminal into first language translation data, the device transmission unit may stop transmitting the retranslation data and transmit the first language translation data instead.
  • After the device transmission unit has sent the retranslation data to the first user terminal, the transmission of the second language translation data to the second user terminal may be stopped.
  • The device reception unit receives, from the second user terminal, second language text data obtained by converting second language speech, and the language conversion unit converts the received second language text data into first language translation data. If the language conversion unit generates new first language translation data while the device transmission unit is transmitting first language translation data, the transmission in progress may be stopped and transmission of the new first language translation data may be started.
  • the voice output unit may include a bone conduction speaker, and the first language voice may be output from the bone conduction speaker while the voice input unit receives the first language voice.
  • The voice input unit includes a plurality of main microphones for acquiring the user's voice and a sub microphone for acquiring ambient sound. The terminal transmission unit may transmit, as the first language text data corresponding to the first language voice, text data corresponding to the sound obtained by removing the ambient sound acquired by the sub microphone from the sounds acquired by the plurality of main microphones.
  • a distance between the sub microphone and the mouth of the first user may be larger than a distance between the plurality of main microphones and the mouth of the first user.
  • The first user terminal may include an extraction unit that extracts the first language speech from the sound input to at least one of the plurality of main microphones, based on a result of comparing the sound input to each of the plurality of main microphones with the sound input to the sub microphone, and a text conversion unit that converts the first language speech extracted by the extraction unit into the first language text data.
  • The extraction unit may identify a sound estimated to be first language speech based on a result of comparing the sounds input to the plurality of main microphones, and may extract the first language speech based on a result of comparing the identified sound with the sound input to the sub microphone.
  • The terminal transmission unit may transmit to the translation device user identification information specifying the second user in association with the first language text data, and the specifying unit may specify the second language by referring to a storage unit that stores, for each of a plurality of users, the language that user uses, and reading out the language stored in association with the second user indicated by the user identification information transmitted by the terminal transmission unit.
  • The terminal transmission unit may transmit language information specifying the language of the second user to the translation device before starting transmission of the first language text data, and the specifying unit may specify the second language based on the language information transmitted by the terminal transmission unit.
  • the specifying unit may specify the second language based on a word included in the first language text data.
  • According to another aspect, the translation device includes a specifying unit that specifies the first language used by the first user and the second language used by the second user, a device reception unit that receives, from the first user terminal used by the first user, first language text data into which first language speech has been converted, a language conversion unit that converts the first language text data received by the device reception unit into second language translation data, and a device transmission unit that transmits the second language translation data to the second user terminal used by the second user.
  • A voice input/output device includes a voice input unit that receives voice input in a first language, a terminal transmission unit that transmits text data obtained by converting the first language voice received by the voice input unit to a translation device that translates the first language into a second language, a terminal reception unit that receives from the translation device first language translation data in which second language speech input at a second user terminal used by a second user has been translated into the first language, and a voice output unit that outputs, while voice is being input to the voice input unit, the first language voice into which the first language translation data received by the terminal reception unit has been converted.
  • FIG. 1 is a diagram showing a configuration of the translation system S1 of the present embodiment.
  • The translation system S1 includes headsets 1 (1a, 1b, 1c), information terminals 2 (2a, 2b, 2c), and a translation device 3. A headset 1 and an information terminal 2, operating in cooperation, function as a user terminal that serves as a voice input/output device.
  • FIG. 1 shows a user U1 who speaks a first language (eg, Japanese), a user U2 who speaks a second language (eg, English), and a user U3 who speaks a third language (eg, Chinese).
  • The translation system S1 translates the words spoken by each user into the languages spoken by the other users and outputs them as audio, so that a plurality of users who speak different languages can understand one another.
  • the user U1 is a guide for guiding tourists in a sightseeing spot, and the user U2 and the user U3 are tourists receiving explanation from the user U1.
  • users U1, U2, and U3 wear headsets 1a, 1b, and 1c, respectively.
  • users U1, U2, and U3 hold information terminals 2a, 2b, and 2c, respectively.
  • The information terminals 2a, 2b, and 2c are, for example, smartphones.
  • the headsets 1a, 1b, and 1c can transmit and receive data to and from the information terminals 2a, 2b, and 2c via wireless channels B1, B2, and B3, respectively.
  • the wireless channel is, for example, Bluetooth (registered trademark).
  • the headsets 1a, 1b, and 1c have the same configuration, and in the following description, when there is no need to distinguish each, the headsets 1a, 1b, and 1c may be referred to as headsets 1.
  • the information terminals 2a, 2b, and 2c have the same configuration. In the following description, when there is no need to distinguish each, the information terminals 2a, 2b, and 2c may be referred to as information terminals 2.
  • the headset 1 is configured so that the user can wear it on the head, receives the voice spoken by the user, and converts the inputted voice into a digital voice signal.
  • The headset 1 transmits the digital audio signal to the information terminal 2 with which it has been associated in advance, via the wireless channel.
  • the information terminal 2 recognizes the voice included in the digital voice signal received from the headset 1 and converts it into text data.
  • When voice in the first language is input to the headset 1, the information terminal 2 creates text data in the first language.
  • For example, when Japanese speech is input to the headset 1, the information terminal 2 creates Japanese text data.
  • the information terminal 2 transmits the created text data in the first language to the translation device 3.
  • the information terminal 2 transmits text data in the first language to the translation device 3 via the wireless communication line W, the access point 4 and the network N.
  • the wireless communication line W is, for example, a Wi-Fi (registered trademark) line, but may be a line using another wireless communication method.
  • When the translation device 3 receives the text data in the first language, it converts the received first language text data into text data in the second language specified in advance. For example, when the translation device 3 receives Japanese text data from the information terminal 2a used by the user U1, who speaks Japanese, it creates text data translated into English so that the user U2, selected as the partner with whom the user U1 is speaking, can understand it. In this specification, text data generated by translation by the translation device 3 is referred to as translation data. The translation device 3 transmits the created second language translation data to the information terminal 2b.
  • the information terminal 2b converts the received text data in the second language into a digital audio signal, and transmits the converted digital audio signal to the headset 1b via the wireless channel B2.
  • the headset 1b converts the received digital audio signal into an analog audio signal and outputs the analog audio signal so that the user U2 can recognize it.
  • After creating the text data in the second language, the translation device 3 converts the second language text data back into text data in the first language, creating first language retranslation data.
  • the translation apparatus 3 transmits the created retranslation data to the information terminal 2a.
  • the information terminal 2a converts the received retranslation data of the first language into a digital audio signal, and transmits the digital audio signal to the headset 1a via the radio channel B1.
  • the headset 1a converts the received digital audio signal into an analog audio signal and outputs the analog audio signal so that the user U1 can recognize it.
  • After the headset 1a transmits to the translation device 3 the text data based on the speech input by the user U1 during a predetermined period, and the time required for the translation device 3 to generate the retranslation data has elapsed, the voice based on the retranslation data corresponding to that period is output.
  • the predetermined period is, for example, a period during which the user U1 is inputting words to be translated, which are set by operating the information terminal 2a. Details of the operation for setting the predetermined period by the user U1 will be described later.
  • Because the translation device 3 creates the retranslation data and the headset 1 outputs voice based on it, the user U1 can compare the words he or she spoke with the words indicated by the retranslation data and check whether the spoken words were translated correctly. When the user U1 confirms that the words were not translated correctly, he or she can make a correcting gesture toward the user U2 and the user U3, or restate the words in another way.
  • The translation system S1 outputs the translation of the speech input to one headset 1 as speech from another headset 1, so a user wearing the headset 1 can understand what a partner who uses a different language is saying while talking face to face. The translation system S1 can therefore improve the quality of communication with a partner who uses a different language.
  • The headset 1 has a bone conduction speaker. The user can therefore listen, through the bone conduction speaker, to the translation of the partner's speech and to the retranslation of his or her own speech, while still hearing the partner's live voice with the ears. With this configuration, the user can understand what the partner says even without understanding the partner's language, which further enhances the quality of communication with a partner who uses a different language.
  • the headset 1, the information terminal 2, and the translation device 3 will be described.
  • FIG. 2 is a diagram illustrating an appearance of the headset 1.
  • The headset 1 includes a first main microphone 11, a second main microphone 12, a sub microphone 13, a bone conduction speaker 14, a control unit 15, a cable 16, a microphone housing portion 17, a connecting member 18, and a main body 19.
  • the first main microphone 11, the second main microphone 12, and the sub microphone 13 function as an audio input unit.
  • The first main microphone 11 and the second main microphone 12 are main microphones for acquiring the voice uttered by the user U, and the sub microphone 13 is a sub microphone for acquiring ambient sound.
  • The distance between the sub microphone 13 and the mouth of the user U is greater than the distance between the mouth of the user U and the first main microphone 11 and the second main microphone 12, which are the plurality of main microphones.
  • the first main microphone 11 and the second main microphone 12 are provided side by side in the microphone housing portion 17 connected to the main body portion 19 via a flexible connecting member 18.
  • The sub microphone 13 is provided in the vicinity of the bone conduction speaker 14, which is worn so as to contact the area below the user's ear. Since the user holds the microphone housing portion 17 close to the mouth, the first main microphone 11 and the second main microphone 12 acquire the voice uttered by the user U at a position closer to the user U's mouth than the sub microphone 13.
  • The sub microphone 13 is provided on the outer side (that is, the side not in contact with the user U) of the bone conduction speaker 14L, which is on the opposite side from the bone conduction speaker 14R to which the connecting member 18 is connected. Because the sub microphone 13 is thus provided on the bone conduction speaker 14L, which is more electrically separated from the first main microphone 11 and the second main microphone 12 than the bone conduction speaker 14R, the sound signals input to the first main microphone 11 and the second main microphone 12 and the sound signal input to the sub microphone 13 are less likely to interfere with each other, which improves the noise removal performance described later.
  • the bone conduction speaker 14 is a speaker that can transmit sound to the user U by vibrating the bone with sound pressure.
  • The bone conduction speaker 14R is worn so as to contact the condylar process below the user's right ear, and the bone conduction speaker 14L is worn so as to contact the condylar process below the user's left ear.
  • the position where the bone conduction speaker 14R and the bone conduction speaker 14L are mounted is arbitrary.
  • the bone conduction speaker 14 outputs the sound in the first language while the first main microphone 11 and the second main microphone 12 receive the sound input in the language used by the user (for example, the first language).
  • The first language sound output from the bone conduction speaker 14 is either sound based on translation data obtained by translating the speech of another user speaking the second language, or sound based on retranslation data.
  • Since the headset 1 has the bone conduction speaker 14, the user can hear the translated voice by bone conduction while listening to the other party's live voice with the ears, and can thus understand what the other party says.
  • the control unit 15 accommodates various electric circuits electrically connected to the first main microphone 11, the second main microphone 12, the sub microphone 13, the bone conduction speaker 14R, and the bone conduction speaker 14L via the cable 16.
  • The electric circuits include, for example, a circuit that functions as an extraction unit that removes noise from the sound input from the first main microphone 11, the second main microphone 12, and the sub microphone 13 and extracts the voice uttered by the user, a codec circuit that converts analog audio signals into digital audio signals and vice versa, and a communication circuit that transmits and receives digital audio signals to and from the information terminal 2.
  • FIG. 3 is a diagram showing the internal configuration of the headset 1 and the information terminal 2.
  • the control unit 15 includes an audio processing unit 151, a communication unit 152, and a control unit 153.
  • Here, the headset 1 is the headset 1a used by the user U1, who speaks the first language.
  • The audio processing unit 151 functions as an extraction unit that extracts the voice uttered by the user U1 by removing ambient sounds other than that voice, based on the sound signals input from the first main microphone 11, the second main microphone 12, and the sub microphone 13.
  • the audio processing unit 151 generates a digital audio signal by encoding the extracted audio, for example, by PCM (Pulse Code Modulation).
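  • As a rough illustration only (not taken from the patent), linear PCM encoding of the extracted audio could be sketched as follows in Python; the 16-bit sample width and little-endian packing are assumptions.

```python
# Hedged sketch of PCM (Pulse Code Modulation) encoding: quantize floating-point
# samples in [-1.0, 1.0] to 16-bit integers and pack them little-endian.
import struct

def encode_pcm16(samples):
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = (int(round(s * 32767)) for s in clipped)
    return b"".join(struct.pack("<h", i) for i in ints)
```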
  • the audio processing unit 151 generates an analog audio signal by decoding the digital audio signal input from the communication unit 152.
  • the audio processing unit 151 outputs the generated analog audio signal via the bone conduction speaker 14.
  • The audio processing unit 151 has a function of removing noise such as ambient sound from the sound signals input from the first main microphone 11 and the second main microphone 12. To remove noise, the audio processing unit 151 first identifies the sound estimated to be first language speech based on the result of comparing the sounds input to the first main microphone 11 and the second main microphone 12.
  • For example, the audio processing unit 151 extracts, from the analog audio signal input to the first main microphone 11, the components whose phase difference from the analog audio signal input to the second main microphone 12 is within a predetermined range, and removes the components whose amplitude difference is outside a predetermined range, thereby generating an estimated speech signal that extracts the speech estimated to be first language speech.
  • The predetermined range is, for example, a range equal to or less than the amplitude difference expected when the difference between the distance from the user U's mouth to the first main microphone 11 and the distance from the user U's mouth to the second main microphone 12, with the headset 1 worn, takes its assumed maximum value.
  • The audio processing unit 151 then extracts first language speech with the ambient sound component removed, based on the result of comparing the sound identified by comparing the analog audio signals input to the first main microphone 11 and the second main microphone 12 with the sound input to the sub microphone 13. Specifically, the audio processing unit 151 removes at least part of the ambient sound signal input to the sub microphone 13 from the generated estimated speech signal.
  • Before removing the ambient sound signal from the estimated speech signal, the audio processing unit 151 attenuates the ambient sound signal input to the sub microphone 13 so that, in the estimated speech signal, the attenuation of components at or above a level that is clearly the user U's voice stays within a predetermined range.
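  • Purely as an illustrative sketch of the two-stage noise removal described above (and not the circuit actually used in the audio processing unit 151), the processing could look roughly like this; the frame size, thresholds, gain, and the use of NumPy are all assumptions.

```python
# Hedged sketch: (1) keep spectral components on which the two main microphones
# agree (small phase/amplitude difference), treating the rest as ambient sound,
# then (2) subtract an attenuated version of the sub-microphone (ambient) signal.
import numpy as np

def extract_user_speech(main1, main2, sub, frame=512,
                        max_phase=0.35, max_amp_db=6.0, ambient_gain=0.5):
    out = np.zeros_like(main1)
    for start in range(0, len(main1) - frame, frame):
        s1 = np.fft.rfft(main1[start:start + frame])
        s2 = np.fft.rfft(main2[start:start + frame])
        sa = np.fft.rfft(sub[start:start + frame])

        # Stage 1: estimated speech signal from the two main microphones.
        phase_diff = np.abs(np.angle(s1 * np.conj(s2)))
        amp_diff_db = 20 * np.log10((np.abs(s1) + 1e-9) / (np.abs(s2) + 1e-9))
        speech_mask = (phase_diff < max_phase) & (np.abs(amp_diff_db) < max_amp_db)
        estimated = s1 * speech_mask

        # Stage 2: remove part of the ambient sound picked up by the sub microphone,
        # attenuated so that clear user-speech components are not overly reduced.
        cleaned_mag = np.maximum(np.abs(estimated) - ambient_gain * np.abs(sa), 0.0)
        cleaned = cleaned_mag * np.exp(1j * np.angle(estimated))
        out[start:start + frame] = np.fft.irfft(cleaned, frame)
    return out
```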
  • the communication unit 152 includes a wireless communication module for transmitting the digital audio signal generated by the audio processing unit 151 to the information terminal 2 and receiving the digital audio signal from the information terminal 2.
  • the communication unit 152 functions as a terminal transmission unit that transmits the first language text data obtained by converting the speech of the first language received by the first main microphone 11 and the second main microphone 12 to the translation device 3.
  • The communication unit 152 also functions as a terminal reception unit that receives from the translation device 3 the first language translation data in which the second language speech input to the headset 1b used by the user U2 has been translated into the first language. Further, the communication unit 152 receives retranslation data obtained by translating back into the first language the second language translation data that was generated from the first language text data transmitted to the translation device 3.
  • the control unit 153 is, for example, a one-chip microcomputer incorporating a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory).
  • The control unit 153 controls the audio processing unit 151 and the communication unit 152 by having the CPU execute the program stored in the ROM and write the data values described in the program to the registers of the audio processing unit 151 and the communication unit 152.
  • the information terminal 2 includes a first communication unit 21, a second communication unit 22, a display unit 23, an operation unit 24, a storage unit 25, and a control unit 26.
  • the control unit 26 includes a text conversion unit 261 and a UI processing unit 262.
  • the first communication unit 21 includes a wireless communication module for receiving a digital audio signal from the headset 1 via the wireless channel B and transmitting the digital audio signal output from the audio processing unit 151 to the headset 1.
  • the second communication unit 22 includes a wireless communication module for transmitting / receiving text data to / from the access point 4 via the wireless communication line W.
  • The second communication unit 22 transmits, as the first language text data corresponding to the first language speech, text data corresponding to the sound obtained by removing the ambient sound input to the sub microphone 13 from the sound input to the first main microphone 11 and the second main microphone 12.
  • the display unit 23 is a display that displays various types of information.
  • Under the control of the UI processing unit 262, the display unit 23 displays, for example, the first language text data generated by the text conversion unit 261 based on the digital audio signal transmitted from the headset 1, and the first language translation data that the second communication unit 22 receives from the translation device 3. An example of the data displayed on the display unit 23 will be described later.
  • the operation unit 24 is a device for accepting a user operation, and is, for example, a touch panel provided so as to overlap the display unit 23.
  • the operation unit 24 inputs an electrical signal generated according to a user operation to the UI processing unit 262.
  • the storage unit 25 is a storage medium such as a ROM or a RAM.
  • the storage unit 25 stores a program executed by the control unit 26.
  • The storage unit 25 stores the name of a language that the user of the information terminal 2 can speak, which is input via the operation unit 24.
  • The storage unit 25 also stores a speech recognition dictionary for converting digital speech signals into text data and a speech synthesis dictionary for converting text data into digital speech signals, which are used by the text conversion unit 261.
  • the storage unit 25 stores a plurality of speech recognition dictionaries and a plurality of speech synthesis dictionaries in association with a plurality of language names.
  • the control unit 26 is, for example, a CPU, and functions as a text conversion unit 261 and a UI processing unit 262 by executing a program stored in the storage unit 25.
  • The text conversion unit 261 converts the first language speech extracted by the audio processing unit 151, functioning as the extraction unit, into first language text data. Specifically, the text conversion unit 261 first analyzes the digital audio signal input from the first communication unit 21 and identifies the phonemes. Then, referring to the storage unit 25, the text conversion unit 261 uses the speech recognition dictionary corresponding to the language that the user of the information terminal 2 can speak to identify the words contained in the digital audio signal, and converts the digital audio signal into first language text data.
  • The text conversion unit 261 transmits the generated first language text data to the translation device 3 via the second communication unit 22, in association with the account name of the user U1 as user identification information and the account names of the user U2 and the user U3, who are the conversation partners. For example, the text conversion unit 261 transmits the account names of the users U2 and U3 together with the first language text data in response to receiving an instruction from the UI processing unit 262 to start translating the input speech. When the text conversion unit 261 acquires, via the UI processing unit 262, the name of the second language usable by the user U2, it can transmit language information specifying the language usable by the user U2 to the translation device 3 before starting transmission of the first language text data.
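  • As an illustration of the data sent to the translation device, the request from the information terminal 2a might carry the recognized text together with the sender's and partners' account names roughly as below; the field names, endpoint, and JSON-over-HTTP transport are assumptions, not part of the patent.

```python
# Hedged sketch of the payload the information terminal could send to the
# translation device 3 so that languages can be looked up per account name.
import json
import urllib.request

def send_first_language_text(server_url, text, sender_account, partner_accounts):
    payload = {
        "sender": sender_account,       # e.g. "Taro" (user U1)
        "partners": partner_accounts,   # e.g. ["Tom", ...] (users U2, U3)
        "text": text,                   # first language text data
    }
    req = urllib.request.Request(
        server_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # e.g. retranslation data to display
```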
  • The text conversion unit 261 converts the first language text data received from the translation device 3 via the second communication unit 22 into a digital speech signal by referring to the speech synthesis dictionary stored in the storage unit 25.
  • the text conversion unit 261 transmits the generated digital audio signal to the headset 1 via the first communication unit 21.
  • the UI processing unit 262 causes the display unit 23 to display the text data acquired from the text conversion unit 261.
  • the UI processing unit 262 specifies the operation content indicated by the electrical signal input from the operation unit 24 and notifies the text conversion unit 261 of the specified operation content.
  • the UI processing unit 262 notifies the text conversion unit 261 of the account name of the user U1 set by the user and the account names of the user U2 and the user U3 and causes the storage unit 25 to store them.
  • FIG. 4 is a diagram illustrating a configuration of the translation apparatus 3.
  • the translation device 3 includes a communication unit 31, a storage unit 32, and a control unit 33.
  • the communication unit 31 includes a communication interface for transmitting / receiving text data to / from the information terminal 2 via the network N.
  • the communication unit 31 includes, for example, a LAN (Local Area Network) controller.
  • the storage unit 32 includes storage media such as a ROM, a RAM, and a hard disk.
  • the storage unit 32 stores a program executed by the control unit 33.
  • the storage unit 32 stores a dictionary for the language conversion unit 332 to convert the text data of the first language into text data of another language.
  • the storage unit 32 stores a use language table in which account names of a plurality of users who can use the translation system S1 are associated with language names that can be used by each user.
  • the control unit 33 is, for example, a CPU, and functions as a translation control unit 331 and a language conversion unit 332 by executing a program stored in the storage unit 32.
  • the translation control unit 331 controls the language conversion unit 332 to convert the first language text data received from the communication unit 31 into the second language translation data. Also, the translation control unit 331 controls the language conversion unit 332 to convert the second language text data received from the information terminal 2b via the communication unit 31 into the first language translation data.
  • The translation control unit 331 transmits the second language translation data generated by the language conversion unit 332 to the information terminal 2b via the communication unit 31, and transmits the first language translation data generated by the language conversion unit 332 to the information terminal 2a via the communication unit 31.
  • When the translation control unit 331 acquires from the language conversion unit 332 the second language translation data generated from the first language text data, it controls the language conversion unit 332 so as to translate the acquired second language translation data back into first language text data and generate retranslation data.
  • the translation control unit 331 transmits the retranslation data generated by the language conversion unit 332 to the information terminal 2a via the communication unit 31.
  • the translation control unit 331 functions as a specifying unit that specifies the first language used by the user U1 and the second language used by the user U2.
  • The translation control unit 331 refers to the use language table stored in the storage unit 32 and identifies the first language name corresponding to the account name of the user U1 transmitted from the second communication unit 22 in association with the first language text data, and the second language name corresponding to the account name of the user U2 received in association with the first language text data.
  • the translation control unit 331 notifies the language conversion unit 332 of the identified result.
  • the translation control unit 331 may specify a second language that can be used by the user U2 based on the language information transmitted from the information terminal 2a.
  • The translation control unit 331 specifies, for example, the first language usable by the user U1 and the second language usable by the user U2 based on information input when the user U1 performs login processing using the information terminal 2a.
  • the translation control unit 331 causes the language conversion unit 332 to translate based on the identified first language and second language until the user U1 logs off.
  • The translation control unit 331 may specify that the language used by the user U1 is the first language by analyzing the received first language text data. The translation control unit 331 may also specify the second language based on a word contained in the first language text data. For example, the translation control unit 331 specifies the language used by the user U2 as the second language based on an account name included in the received first language text data. Specifically, when the content of the first language text data is "Tom, nice to meet you", the translation control unit 331 detects that "Tom" is an account name included in the use language table, determines that the language "Tom" can use is English, and thereby specifies the second language as English.
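  • Purely as an illustration of this lookup, a specifying unit along these lines could be sketched as follows; the table contents, function names, and the fallback inference from the text are assumptions.

```python
# Hedged sketch of the specifying unit: look languages up in a use-language table
# keyed by account name, optionally inferring a partner's language from an
# account name that appears in the received first language text data.
USE_LANGUAGE_TABLE = {
    "Taro": "Japanese",   # user U1 in the example screens
    "Tom": "English",     # user U2 in the example screens
    # further registered accounts would appear here
}

def specify_languages(sender_account, partner_accounts, first_language_text):
    first_language = USE_LANGUAGE_TABLE[sender_account]
    second_languages = {a: USE_LANGUAGE_TABLE[a] for a in partner_accounts}
    # Fallback: "Tom, nice to meet you" -> "Tom" -> English.
    for account, language in USE_LANGUAGE_TABLE.items():
        if account != sender_account and account in first_language_text:
            second_languages.setdefault(account, language)
    return first_language, second_languages
```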
  • The language conversion unit 332 converts the first language text data received from the information terminal 2a into second language translation data, based on the language types notified by the translation control unit 331. Upon receiving the first language text data from the translation control unit 331, the language conversion unit 332 translates it to generate the second language translation data and notifies the translation control unit 331 of the result. Likewise, the language conversion unit 332 converts the second language text data received from the information terminal 2b into first language translation data: upon receiving the second language text data from the translation control unit 331, it translates it to generate the first language translation data and notifies the translation control unit 331 of the result.
  • the language conversion unit 332 translates the second language translation data generated based on the first language text data into the first language based on an instruction from the translation control unit 331, and generates retranslation data.
  • the language conversion unit 332 notifies the translation control unit 331 of the retranslation data.
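  • For illustration, the translate-and-retranslate round trip handled by the translation control unit 331 and the language conversion unit 332 could look roughly like this, where translate is a stand-in for whatever machine-translation backend is used (an assumption).

```python
# Hedged sketch of the conversion flow: forward translation for the partner's
# terminal plus a back-translation (retranslation data) for the speaker to check.
def convert_and_retranslate(translate, first_language_text, first_language, second_language):
    # translate(text, source, target) -> translated text (backend is assumed)
    second_language_translation = translate(first_language_text,
                                            first_language, second_language)
    retranslation = translate(second_language_translation,
                              second_language, first_language)
    # second_language_translation goes to the partner's terminal (2b);
    # retranslation goes back to the speaker's terminal (2a).
    return second_language_translation, retranslation
```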
  • FIG. 5 is a diagram illustrating a user selection screen displayed on the display unit 23 when the user U1 starts using the information terminal 2a.
  • the information terminal 2a is not a terminal of the user U1 itself but a rented terminal. Therefore, first, the user U1 needs to set a language that can be used by the user U1.
  • On this screen, the account names and language names of users who can use the translation service provided by the translation device 3 are displayed in association with each other.
  • The user U1 can use the translation service on the information terminal 2a by touching the check box to the left of "Taro", which is his account name.
  • the UI processing unit 262 stores in the storage unit 25 that the account name of the user U1 is “Taro” and the language to be used is Japanese.
  • FIG. 6 is a diagram showing a partner selection screen displayed on the display unit 23 when selecting a partner to talk with. In FIG. 6A, as in FIG. 5, the account names and language names of users who can use the translation service provided by the translation device 3 are displayed in association with each other. As illustrated in FIG. 6B, when the user U1 touches the check box to the left of "Tom" and the check box to the left of the account name of the user U3, the UI processing unit 262 stores in the storage unit 25 that the user U2 is "Tom" and which user is the user U3.
  • FIG. 7 is a diagram showing a screen for conversation displayed on the display unit 23 when having a conversation.
  • The conversation screen includes a first region R1 in which the first language text data converted from the speech uttered by the user is displayed, a second region R2 in which the retranslation data is displayed, and a third region R3 in which the first language translation data translated from the speech uttered by the other party is displayed.
  • the conversation screen includes a “speak” icon that the user operates while inputting voice.
  • The text conversion unit 261 converts the voice input to the headset 1 into first language text data while a finger is touching the "speak" icon, and ends the conversion when the finger is released from the "speak" icon. The text conversion unit 261 then transmits to the translation device 3 the first language text data corresponding to the voice input between the moment the finger touched the "speak" icon and the moment it was released. In this way, only the period designated by the user is subject to translation, which prevents ambient sounds picked up by the headset 1 while the user is not speaking from being erroneously translated.
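  • A minimal sketch of this push-to-talk behavior, with the recognizer and sender objects and their method names assumed for illustration, might look like this:

```python
# Hedged sketch: recognize speech only while the "speak" icon is pressed and
# send the accumulated first language text data when the finger is released.
class SpeakButtonHandler:
    def __init__(self, recognizer, sender):
        self.recognizer = recognizer  # converts audio frames to text (assumed)
        self.sender = sender          # sends text data to the translation device (assumed)
        self.buffer = []
        self.pressed = False

    def on_press(self):
        self.pressed = True
        self.buffer.clear()

    def on_audio_frame(self, frame):
        if self.pressed:              # only audio captured while pressed is converted
            self.buffer.append(self.recognizer.transcribe(frame))

    def on_release(self):
        self.pressed = False
        text = " ".join(t for t in self.buffer if t)
        if text:
            self.sender.send_first_language_text(text)
```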
  • The first language text data indicating the content of the voice uttered by the user U1 is displayed in the first region R1.
  • the user U1 can visually confirm the displayed first language text data and confirm that the voice has been correctly recognized.
  • When there is an error in the first language text data that the user U1 has visually checked, the information terminal 2a may operate so as to cancel the input voice when the user U1 utters a predetermined word. For example, when the word "redo" is detected in the voice input to the headset 1a, the text conversion unit 261 deletes the first language text data generated between the moment the "speak" icon was touched and the moment the word "redo" was detected. The text conversion unit 261 also instructs the UI processing unit 262 to delete the first language text data displayed on the display unit 23. In this way, the text conversion unit 261 is prevented from transmitting incorrect first language text data to the translation device 3 when the voice uttered by the user U1 cannot be converted into correct first language text data.
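  • As a sketch of the cancellation by a predetermined word (again with assumed helper objects), the check before sending could be written as follows:

```python
# Hedged sketch of the "redo" cancellation: if the predetermined cancel word is
# detected in the recognized text, discard what was captured since the "speak"
# icon was pressed instead of sending it to the translation device.
CANCEL_WORD = "redo"  # the predetermined word from the example

def finalize_utterance(buffered_texts, sender, ui):
    text = " ".join(buffered_texts)
    if CANCEL_WORD in text.lower():
        ui.clear_first_language_text()   # remove the text shown in region R1
        return None                      # nothing is sent to the translation device
    sender.send_first_language_text(text)
    return text
```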
  • FIG. 8 is a diagram illustrating the screen of the information terminal 2a and the screen of the information terminal 2b after the user U1 releases his / her finger from the “speak” icon.
  • The retranslation data is displayed in the second region R2 of the screen of the information terminal 2a shown in FIG. 8A.
  • FIG. 8B shows a screen of the information terminal 2b.
  • In the third region R3 of the information terminal 2b, an English sentence translated from the Japanese sentence "Nice to meet you" uttered by the user U1 is displayed.
  • the user U1 can confirm whether or not the translation has been correctly performed by viewing the retranslation data displayed on the information terminal 2a.
  • When the text conversion unit 261 detects the word "redo" uttered by the user U1 while the retranslation data is displayed, it may notify the translation device 3 that there is an error in the second language translation data.
  • In this case, the translation device 3 notifies the information terminal 2b that there was an error in the translation, and the information terminal 2b erases the English sentence displayed in the third region R3 and may display a word indicating cancellation (for example, "canceled"). In this way, the translation device 3 is prevented from continuing to display incorrect second language translation data on the information terminal 2b when the speech uttered by the user U1 cannot be converted into correct first language text data.
  • The translation control unit 331 may transmit the second language translation data to the information terminal 2b only after waiting, following transmission of the retranslation data to the information terminal 2a, for the time the user U1 needs to confirm the contents of the retranslation data. In this case, if the communication unit 31 receives first language text data containing a predetermined word (for example, "redo") after the retranslation data has been transmitted to the information terminal 2a, the translation control unit 331 may stop the transmission of the second language translation data to the user U2. In this way, when the voice uttered by the user U1 could not be converted into correct first language text data, erroneous second language translation data is prevented from being transmitted to the information terminal 2b.
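  • The confirmation window described above could be sketched as a delayed, cancellable send; the timing value and the asyncio-based structure are illustrative assumptions.

```python
# Hedged sketch: hold the second language translation data for a short period so
# the speaker can cancel it (e.g. by saying "redo") before it reaches terminal 2b.
import asyncio

CONFIRMATION_SECONDS = 3.0  # assumed time for the speaker to check the retranslation

async def send_after_confirmation(send_to_partner, translation_data, cancel_event):
    try:
        await asyncio.wait_for(cancel_event.wait(), timeout=CONFIRMATION_SECONDS)
        return False                         # cancelled: nothing sent to terminal 2b
    except asyncio.TimeoutError:
        await send_to_partner(translation_data)
        return True                          # sent after the window elapsed
```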
  • FIG. 9 is a diagram showing the screen of the information terminal 2a and the screen of the information terminal 2b after the user U2 utters the second language voice following the state of FIG.
  • The second language text data corresponding to the voice uttered by the user U2 is displayed on the information terminal 2b, and the first language translation data generated by translating that second language text data is displayed in the third region R3 of the information terminal 2a shown in FIG. 9A.
  • re-translated data obtained by translating the first language translation data is displayed in the second region R2 of the information terminal 2b shown in FIG. 9B.
  • From the headset 1a, the first language speech corresponding to the retranslation data shown in the second region R2 of FIG. 9A and the first language speech corresponding to the first language translation data shown in the third region R3 are output.
  • When the language conversion unit 332 converts the second language speech input to the headset 1b into first language translation data, the translation control unit 331 may stop the transmission of the retranslation data and transmit the first language translation data to the information terminal 2a. This enables the user U1 to grasp the content of the user U2's speech.
  • If the language conversion unit 332 newly generates first language translation data while first language translation data is being transmitted, the transmission in progress may be stopped and transmission of the new first language translation data may be started.
  • For example, while the translation control unit 331 is transmitting to the information terminal 2a the first language translation data based on the second language voice uttered by the user U2, if the language conversion unit 332 newly generates first language translation data based on the third language voice uttered by the user U3, the transmission in progress may be stopped and transmission of the first language translation data based on the third language speech uttered by the user U3 may be started.
  • In this case, the translation control unit 331 may notify the information terminal 2b that the transmission of the first language translation data based on the voice uttered by the user U2 has been interrupted, and the information terminal 2b may display that the transmission was interrupted. In this way, the user U2 can recognize that what he or she said was not conveyed to the user U1 and can respond appropriately, for example by speaking again.
  • FIG. 10 is a diagram showing a processing sequence in the translation system S1.
  • FIG. 10 starts from the point in time when the user U1 starts inputting the voice in the first language in the headset 1a (step S11).
  • The headset 1a transmits digital voice data corresponding to the first language voice to the information terminal 2a.
  • The information terminal 2a converts the received digital voice data into first language text data (step S12). During this time, the UI processing unit 262 of the information terminal 2a monitors whether or not the voice input has been completed (step S13), and the text conversion unit 261 continues to generate the first language text data until the voice input is completed. When the UI processing unit 262 determines that the voice input has been completed (YES in step S13), the text conversion unit 261 transmits the generated first language text data to the translation device 3 via the second communication unit 22.
  • the language conversion unit 332 converts the first language text data received via the communication unit 31 into the second language text data, and generates second language translation data (step S14).
  • the translation control unit 331 transmits the second language translation data generated by the language conversion unit 332 to the information terminal 2b via the communication unit 31.
  • the text conversion unit 261 of the information terminal 2b converts the received second language translation data into a second language digital audio signal (step S15).
  • The text conversion unit 261 of the information terminal 2b transmits the digital audio signal in the second language to the headset 1b via the first communication unit 21.
  • the voice processing unit 151 of the headset 1b converts the digital voice signal received from the information terminal 2b into an analog voice signal, and outputs the second language voice from the bone conduction speaker 14 (step S16).
  • the translation control unit 331 causes the language conversion unit 332 to translate the second language translation data into the first language and create retranslation data (step S17).
  • the retranslation data is transmitted to the information terminal 2a.
  • the text conversion unit 261 of the information terminal 2a converts the received retranslation data into a digital speech signal in the first language (step S18).
  • The text conversion unit 261 of the information terminal 2a transmits the digital audio signal in the first language to the headset 1a via the first communication unit 21.
  • the voice processing unit 151 of the headset 1a converts the digital voice signal received from the information terminal 2a into an analog voice signal, and outputs the first language voice from the bone conduction speaker 14 (step S19).
  • Note that, after step S14, the retranslation data may be transmitted to the information terminal 2a before the second language translation data is transmitted to the information terminal 2b, and the second language translation data may then be transmitted to the information terminal 2b after the time necessary for the user U1 to confirm the retranslation data has elapsed.
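  • Summarizing the sequence of FIG. 10, a very rough end-to-end sketch (in which every helper is a placeholder assumption for the corresponding unit described above) might read:

```python
# Hedged sketch of steps S11-S19 of FIG. 10.
def run_translation_sequence(headset_a, terminal_a, translation_device, terminal_b, headset_b):
    audio = headset_a.capture_speech()                        # S11: U1 speaks (first language)
    first_text = terminal_a.speech_to_text(audio)             # S12-S13: convert once input ends
    second_translation = translation_device.translate(
        first_text, source="first", target="second")          # S14: second language translation data
    headset_b.play(terminal_b.text_to_speech(second_translation))  # S15-S16: output to U2
    retranslation = translation_device.translate(
        second_translation, source="second", target="first")  # S17: retranslation data
    headset_a.play(terminal_a.text_to_speech(retranslation))  # S18-S19: output back to U1
```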
  • FIG. 11 is a diagram showing a configuration of a translation system S2 according to a modification of the present embodiment.
  • the translation system S2 is different from the translation system S1 in that the user U1 uses a headset 10 having a part of the functions of the information terminal 2a instead of the headset 1a.
  • The translation system S2 also differs from the translation system S1 in that the user U2 and the user U3 do not use the headset 1b and the headset 1c, and use the information terminal 20b and the information terminal 20c instead of the information terminal 2b and the information terminal 2c.
  • The headset 10 has the function of the text conversion unit 261 in addition to the functions of the headset 1 shown in FIG. 3, and has the function of the second communication unit 22 in place of the communication unit 152 of the headset 1. In this way, the user U1 can talk with the user U2, who uses the second language, and the user U3, who uses the third language, using only the headset 10, without using the information terminal 2a.
  • The headset 10 may be configured to acquire operation details from an information terminal 2 connected via a wireless channel, so that the user U1 can set the timing for starting translation using the information terminal 2. Further, the headset 10 may display the retranslation data and the first language translation data received from the translation device 3 via the access point 4a on the information terminal 2 connected via the wireless channel, so that the user U1 can visually check them.
  • The information terminal 20 has, in addition to the functions of the information terminal 2, a microphone for inputting voice and a speaker for outputting voice.
  • the user U2 and the user U3 can talk with the user U1 without wearing the headset 1b and the headset 1c.
  • As described above, the translation device 3 translates the first language speech input to the headset 1a into the second language and transmits the resulting data to the headset 1b, and the headset 1b outputs the speech translated into the second language.
  • Similarly, the translation device 3 translates the second language voice input to the headset 1b into the first language and transmits it to the headset 1a, and the headset 1a outputs the voice translated into the first language.
  • The headset 1 has the bone conduction speaker 14, so the user can hear the translated voice by bone conduction while listening to the other party's live voice with the ears. As a result, the quality of communication with a partner who uses a different language can be further enhanced.
  • FIG. 12 is a diagram illustrating a configuration of the translation device 3 when the function of the language conversion unit 332 is realized by the language conversion server 5 different from the translation device 3.
  • the translation control unit 331 may execute the translation process by interlocking with the external language conversion server 5 that operates in the same manner as the language conversion unit 332 via the communication unit 31.
  • the information terminal 2 is a terminal rented to the user U.
  • the information terminal 2 may be a terminal with which the user U has contracted with a mobile phone operator.
  • In this case, the information terminal 2 may store the language information used by the user U and need not display the user selection screen shown in FIG. 5.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a translation method comprising: a step in which a translation device (3) identifies a first language used by a first user and a second language used by a second user; a step in which a first user terminal (U1) used by the first user receives voice input in the first language; a step in which the first user terminal converts the first language speech into text data and transmits the obtained text data to the translation device (3); a step in which the translation device (3) converts the text data into translation data in the second language; a step in which the translation device (3) transmits the translation data to a second user terminal used by the second user; and a step in which the second user terminal outputs speech in the second language into which the translation data has been converted.
PCT/JP2018/012098 2018-03-26 2018-03-26 Système de traduction, procédé de traduction, dispositif de traduction, et dispositif d'entrée/sortie vocale WO2019186639A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/JP2018/012098 WO2019186639A1 (fr) 2018-03-26 2018-03-26 Système de traduction, procédé de traduction, dispositif de traduction, et dispositif d'entrée/sortie vocale
JP2018545518A JP6457706B1 (ja) 2018-03-26 2018-03-26 翻訳システム、翻訳方法、及び翻訳装置
TW108102574A TWI695281B (zh) 2018-03-26 2019-01-23 翻譯系統、翻譯方法、以及翻譯裝置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2018/012098 WO2019186639A1 (fr) 2018-03-26 2018-03-26 Système de traduction, procédé de traduction, dispositif de traduction, et dispositif d'entrée/sortie vocale

Publications (1)

Publication Number Publication Date
WO2019186639A1 true WO2019186639A1 (fr) 2019-10-03

Family

ID=65270550

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/012098 WO2019186639A1 (fr) 2018-03-26 2018-03-26 Système de traduction, procédé de traduction, dispositif de traduction, et dispositif d'entrée/sortie vocale

Country Status (3)

Country Link
JP (1) JP6457706B1 (fr)
TW (1) TWI695281B (fr)
WO (1) WO2019186639A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476040A (zh) * 2020-03-27 2020-07-31 深圳光启超材料技术有限公司 语言输出方法、头戴设备、存储介质及电子设备
CN111696552A (zh) * 2020-06-05 2020-09-22 北京搜狗科技发展有限公司 一种翻译方法、装置和耳机

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110457716B (zh) * 2019-07-22 2023-06-06 维沃移动通信有限公司 一种语音输出方法及移动终端

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015060332A (ja) * 2013-09-18 2015-03-30 株式会社東芝 音声翻訳装置、音声翻訳方法およびプログラム
JP2017038911A (ja) * 2015-03-16 2017-02-23 和弘 谷口 耳装着型装置
JP2017083583A (ja) * 2015-10-26 2017-05-18 日本電信電話株式会社 雑音抑圧装置、その方法及びプログラム
JP2017126042A (ja) * 2016-01-15 2017-07-20 シャープ株式会社 コミュニケーション支援システム、コミュニケーション支援方法、およびプログラム
WO2018008227A1 (fr) * 2016-07-08 2018-01-11 パナソニックIpマネジメント株式会社 Dispositif de traduction et procédé de traduction

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008077601A (ja) * 2006-09-25 2008-04-03 Toshiba Corp 機械翻訳装置、機械翻訳方法および機械翻訳プログラム
JP4481972B2 (ja) * 2006-09-28 2010-06-16 株式会社東芝 音声翻訳装置、音声翻訳方法及び音声翻訳プログラム

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015060332A (ja) * 2013-09-18 2015-03-30 株式会社東芝 音声翻訳装置、音声翻訳方法およびプログラム
JP2017038911A (ja) * 2015-03-16 2017-02-23 和弘 谷口 耳装着型装置
JP2017083583A (ja) * 2015-10-26 2017-05-18 日本電信電話株式会社 雑音抑圧装置、その方法及びプログラム
JP2017126042A (ja) * 2016-01-15 2017-07-20 シャープ株式会社 コミュニケーション支援システム、コミュニケーション支援方法、およびプログラム
WO2018008227A1 (fr) * 2016-07-08 2018-01-11 パナソニックIpマネジメント株式会社 Dispositif de traduction et procédé de traduction

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476040A (zh) * 2020-03-27 2020-07-31 深圳光启超材料技术有限公司 语言输出方法、头戴设备、存储介质及电子设备
CN111696552A (zh) * 2020-06-05 2020-09-22 北京搜狗科技发展有限公司 一种翻译方法、装置和耳机
CN111696552B (zh) * 2020-06-05 2023-09-22 北京搜狗科技发展有限公司 一种翻译方法、装置和耳机

Also Published As

Publication number Publication date
JPWO2019186639A1 (ja) 2020-04-30
TW201941084A (zh) 2019-10-16
TWI695281B (zh) 2020-06-01
JP6457706B1 (ja) 2019-02-06

Similar Documents

Publication Publication Date Title
JP2019175426A (ja) 翻訳システム、翻訳方法、翻訳装置、及び音声入出力装置
KR102108500B1 (ko) 번역 기반 통신 서비스 지원 방법 및 시스템과, 이를 지원하는 단말기
US9280539B2 (en) System and method for translating speech, and non-transitory computer readable medium thereof
US10599785B2 (en) Smart sound devices and language translation system
KR101861006B1 (ko) 통역 장치 및 방법
KR100819928B1 (ko) 휴대 단말기의 음성 인식장치 및 그 방법
US20050261890A1 (en) Method and apparatus for providing language translation
US10872605B2 (en) Translation device
JP6165321B2 (ja) 装置及び方法
JP6457706B1 (ja) 翻訳システム、翻訳方法、及び翻訳装置
KR20070026452A (ko) 음성 인터랙티브 메시징을 위한 방법 및 장치
JP2021150946A (ja) ワイヤレスイヤホンデバイスとその使用方法
JP3820245B2 (ja) 3者通話方式の自動通訳システム及び方法
KR101517975B1 (ko) 동시 통/번역 기능을 가지는 이어폰 장치
CN111783481A (zh) 耳机控制方法、翻译方法、耳机和云端服务器
KR20080054591A (ko) 휴대단말기의 통화 서비스 방법
JP2014186713A (ja) 会話システムおよびその会話処理方法
WO2021080362A1 (fr) Système de traitement de langue utilisant un écouteur
JP3225682U (ja) 音声翻訳端末、モバイル端末及び翻訳システム
CN211319717U (zh) 用于语言交互的配件、移动终端及交互系统
WO2006001204A1 (fr) Dispositif de traduction automatique et procédé de traduction automatique
KR102181583B1 (ko) 음성인식 교감형 로봇, 교감형 로봇 음성인식 시스템 및 그 방법
WO2022113189A1 (fr) Dispositif de traitement de traduction de parole
CN210691322U (zh) 一种双模蓝牙翻译机
CN110489764B (zh) 一种双模蓝牙翻译机及其使用方法

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2018545518

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18911519

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18911519

Country of ref document: EP

Kind code of ref document: A1