KR20150025750A

KR20150025750A - user terminal apparatus and two way translation method thereof

Info

Publication number: KR20150025750A
Application number: KR20130103746A
Authority: KR
Inventors: 권오윤; 남대현; 전병조
Original assignee: 삼성전자주식회사
Priority date: 2013-08-30
Filing date: 2013-08-30
Publication date: 2015-03-11

Abstract

Provided are a user terminal apparatus capable of translating a received voice or a user voice and transmitting and receiving the result, and a two-way translation method thereof. The user terminal apparatus includes: an interface unit configured to receive, from another device, a voice signal for a voice spoken in a first language; a voice signal processing unit configured to decode the voice signal to detect the voice; a buffer unit configured to store the detected voice; an output unit; a control unit configured to call back the voice from the buffer unit, convert the voice into text, translate the text into a predetermined second language, and output the translated text through the output unit; and a microphone unit configured to receive a user voice, wherein the control unit translates the user voice into the first language and transmits the translated user voice to the other device through the interface unit.

Description

USER TERMINAL APPARATUS AND TWO WAY TRANSLATION METHOD THEREOF

The present invention relates to a user terminal device and a two-way translation method thereof, and more particularly, to a user terminal device capable of translating a received voice or a user voice and transmitting and receiving the result, and a two-way translation method thereof.

Advances in electronic and communication technology have led to the development and spread of many kinds of electronic devices, and the exchange of various data between electronic devices has become increasingly active.

In particular, with these advances in electronic and communication technology, users can easily use social network services (SNS), text messaging, and video calls, and with such services, interaction between people from different language regions is no longer difficult.

Accordingly, various translation services are offered to minimize the inconvenience caused by using different languages.

However, such translation services are mostly one-way and run on a single electronic device, so they are of limited use when users at remote locations communicate with each other.

Therefore, there is a need for a technology that allows users at remote locations who use different preferred languages to communicate freely, each using their own preferred language.

The present invention has been made in view of the above need, and an object of the present invention is to provide a user terminal device capable of translating a received voice or a user voice and transmitting and receiving the result, and a two-way translation method thereof.

To achieve the above object, a user terminal device according to an embodiment of the present invention may include: an interface unit that receives, from another device, a voice signal for a voice spoken in a first language; a voice signal processing unit that decodes the voice signal to detect the voice; a buffer unit that stores the detected voice; an output unit; a control unit that calls back the voice from the buffer unit, converts it into text, translates the converted text into a preset second language, and outputs the result through the output unit; and a microphone unit for receiving a user voice, wherein the control unit translates the user voice into the first language and transmits it to the other device through the interface unit.

The output unit may include a speaker unit that outputs the voice stored in the buffer unit, and a display unit, and the control unit may display, on the display unit, an image received together with the voice signal through the interface unit and the text translated into the second language.

Meanwhile, the output unit may include a speaker unit for outputting the voice stored in the buffer unit, and the control unit may sequentially output the voice spoken in the first language and the voice translated into the second language through the speaker unit.

The output unit may include a speaker unit, and the control unit may output the voice translated into the second language through the speaker unit instead of the voice spoken in the first language.

Meanwhile, the buffer unit may include a first buffer unit for storing the voice detected by the voice signal processing unit, a second buffer unit for storing the voice translated into the second language, and a switching unit for connecting one of the first buffer unit and the second buffer unit to the speaker unit, and when the detected voice has been translated, the control unit may control the switching unit to connect the second buffer unit to the speaker unit so that the voice translated into the second language is output through the speaker unit.

The control unit may include: a speech recognition engine that calls back the voice from the buffer unit, transmits it to an ASR server, and receives text corresponding to the voice from the ASR server; a text processing module that transmits the text and translation language setting information set through a UI to a translation server and receives the text translated into the second language according to the translation language setting information; and a graphic processing module that displays the received text in a GUI (Graphic User Interface).

Meanwhile, the ASR server and the translation server may be embedded in the user terminal device.

The control unit may further include a speech processing module that converts the text translated into the second language into a voice signal.

Meanwhile, the output unit may further include a display unit, and the control unit may display, on the display unit, an image received together with the voice signal through the interface unit and the text translated into the second language.

Meanwhile, a two-way translation method of a user terminal device according to an embodiment of the present invention may include: receiving, from another device, a voice signal for a voice spoken in a first language; decoding the voice signal to detect the voice; storing the detected voice in a buffer unit; calling back the voice from the buffer unit, converting it into text, and translating the converted text into a preset second language; outputting the voice translated into the second language; receiving a user voice; and translating the user voice into the first language and transmitting it to the other device.

The outputting may include displaying the voice as text in the second language.

Meanwhile, the outputting may include sequentially outputting the voice spoken in the first language and the voice translated into the second language through a speaker unit.

The outputting may include outputting the voice translated into the second language through a speaker unit instead of the voice spoken in the first language.

Meanwhile, the translating into the second language may include: calling back the voice from the buffer unit, transmitting it to an ASR server, and receiving text corresponding to the voice from the ASR server; and transmitting the text and translation language setting information set through a UI to a translation server and receiving the text translated into the second language according to the translation language setting information.

The outputting may further include displaying an image received together with the voice signal and the text translated into the second language.

According to various embodiments of the present invention, a two-way translation service can be provided by translating a received voice or a user voice and transmitting and receiving the result.

FIG. 1 is a block diagram illustrating the configuration of a user terminal device according to an embodiment of the present invention,
FIGS. 2 to 4 are diagrams illustrating various embodiments of a two-way translation method,
FIG. 5 is a diagram illustrating a detailed configuration of the buffer unit of the user terminal device,
FIG. 6 is a diagram illustrating a detailed configuration of the control unit of the user terminal device, and
FIG. 7 is a flowchart illustrating a two-way translation method according to an embodiment of the present invention.

Hereinafter, various embodiments of the present invention will be described in more detail with reference to the accompanying drawings. In describing the present invention, detailed descriptions of related well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

FIG. 1 is a block diagram illustrating the configuration of a user terminal device 100 according to an embodiment of the present invention. As shown in FIG. 1, the user terminal device 100 includes an interface unit 110, a voice signal processing unit 120, a buffer unit 130, an output unit 140, a microphone unit 150, and a control unit 160. The user terminal device 100 may be a mobile phone, but this is only one embodiment; it may be implemented as various electronic devices such as a TV, a tablet PC, a digital camera, a camcorder, a notebook PC, or a PDA.

Meanwhile, FIG. 1 comprehensively shows various components, taking as an example the case where the user terminal device 100 has a variety of functions such as a communication function, a voice recognition function, a display function, and a voice translation function. Depending on the embodiment, some of the components shown in FIG. 1 may therefore be omitted or changed, and other components may be added.

The interface unit 110 is a component for receiving a voice signal for a voice spoken on another device or for transmitting a voice from the user terminal device 100 to the other device. That is, it can receive a voice signal for a voice uttered in the first language on the other device, and a user voice input to the user terminal device 100 in a language other than the first language can be translated into the first language and transmitted to the other device.

The interface unit 110 may include various hardware such as an antenna, a modem, an equalization circuit, and a communication chip. The interface unit 110 can transmit and receive signals for voice calls or video calls with another device over a PSTN (public switched telephone network) or over various mobile networks such as 3G, 4G, and Wi-Fi networks. The interface unit 110 demultiplexes the signal received through the antenna to detect a voice signal, performs processing such as demodulation and equalization on the voice signal, and provides it to the voice signal processing unit 120.
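
As a rough illustration of the demultiplexing step, the sketch below separates a received stream into voice and video payloads. The `Packet` type and its `kind` tags are hypothetical, since the specification does not define a packet format, and the demodulation and equalization stages are omitted.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Packet:
    """Hypothetical multiplexed packet: a stream tag plus a payload."""
    kind: str       # "voice" or "video" (assumed tags, not from the specification)
    payload: bytes


def demultiplex(packets: Iterable[Packet]) -> dict[str, list[bytes]]:
    """Split a multiplexed stream into per-kind payload lists, preserving order."""
    streams: dict[str, list[bytes]] = {"voice": [], "video": []}
    for packet in packets:
        # Packets with an unknown kind are ignored rather than guessed at.
        if packet.kind in streams:
            streams[packet.kind].append(packet.payload)
    return streams


received = [Packet("voice", b"v0"), Packet("video", b"f0"), Packet("voice", b"v1")]
streams = demultiplex(received)
```

In this sketch only the `"voice"` list would be handed on to the voice signal processing unit 120; the `"video"` list corresponds to what a video processor would decode during a video call.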

The voice signal processing unit 120 decodes the voice signal to detect the voice. The detected voice is stored in the buffer unit 130.

The buffer unit 130 is a component that stores the voice detected by the voice signal processing unit 120 and provides it to the output unit 140.

The output unit 140 is a component for outputting at least one of a voice signal and a video signal according to the signal received from another device. Specifically, the output unit 140 includes a speaker unit 141 and a display unit 142.

The speaker unit 141 can output the voice stored in the buffer unit 130, and the display unit 142 can output the video detected from the signal received through the interface unit 110. For example, when the user terminal device 100 makes a video call with another device, it can receive from the other device a signal containing a voice signal and a video signal. The user terminal device 100 can decode the video signal using a video processor (not shown) and display the corresponding video frames on the display unit 142.

Meanwhile, the microphone unit 150 receives the user's voice. The microphone unit 150 may be built into the user terminal device 100 (all-in-one) or implemented as a separate unit. A separate microphone unit 150 may be connected to the user terminal device 100 through a wired or wireless network.

The control unit 160 controls the overall operation of the user terminal device 100. For example, when a telephone call with another device is in progress, the control unit 160 provides a two-way translation service that translates both the voice uttered by the user of the other device and the voice uttered by the user of the user terminal device 100.

Specifically, when a signal for a voice uttered in the first language is received through the interface unit 110 and the voice is stored in the buffer unit 130, the control unit 160 calls back the stored voice and translates it into the second language set by the user of the user terminal device 100. The control unit 160 also translates the second-language voice uttered by the user into the first language and provides it to the other device.
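
The two directions handled by the control unit can be sketched as follows. This is a minimal illustration, assuming the voice has already been converted to text; the `TwoWayTranslator` class, the `translate` helper, and the one-entry phrase table are hypothetical stand-ins for a real translation engine.

```python
# Toy phrase table standing in for a real translation engine (illustrative data only).
PHRASES = {
    ("ko", "en"): {"좋은 아침": "Good morning"},
    ("en", "ko"): {"Good morning": "좋은 아침"},
}


def translate(text: str, source: str, target: str) -> str:
    """Look up a phrase; fall back to the untranslated text if it is unknown."""
    return PHRASES.get((source, target), {}).get(text, text)


class TwoWayTranslator:
    """Translates incoming text into the user's language and outgoing text into the peer's."""

    def __init__(self, first_language: str, second_language: str) -> None:
        self.first_language = first_language    # language spoken on the other device
        self.second_language = second_language  # language preset by the local user

    def on_received(self, recognized_text: str) -> str:
        # Incoming direction: first language -> second language.
        return translate(recognized_text, self.first_language, self.second_language)

    def on_user_speech(self, recognized_text: str) -> str:
        # Outgoing direction: second language -> first language.
        return translate(recognized_text, self.second_language, self.first_language)


phone = TwoWayTranslator(first_language="ko", second_language="en")
incoming = phone.on_received("좋은 아침")         # shown or spoken to the local user
outgoing = phone.on_user_speech("Good morning")  # sent to the other device
```

Keeping both directions in one object mirrors the description, in which a single control unit serves the received voice and the user's own voice.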

In the translation process, the control unit 160 may recognize the voice, convert the recognized voice into text, and perform the translation using the converted text.

The translated language may be provided in various ways depending on the embodiment.

For example, when the user terminal device 100 has a video call function or can run an application that provides one, the user terminal device 100 may include a photographing unit (not shown) and may transmit an image captured by the photographing unit through the interface unit 110 to the other device with which it is conducting the video call.

Likewise, an image captured by the photographing unit of the other device can be received together with the voice signal through the interface unit 110, and the display unit 142 can display the received image.

When the received image is displayed on the display unit 142, the control unit 160 can display the text translated into the second language on the display unit 142 together with the received image. The text translated into the second language may be displayed along one edge of the image, in a form such as a subtitle.
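
To make the subtitle placement concrete, the sketch below composites translated text onto the bottom edge of a frame. A list of equal-width strings stands in for the received image, and `overlay_subtitle` is a hypothetical helper; an actual device would draw onto pixel data through its graphics stack.

```python
def overlay_subtitle(frame: list[str], subtitle: str) -> list[str]:
    """Return a copy of the frame with the subtitle centred on its bottom row."""
    if not frame:
        return []
    width = len(frame[0])
    # Truncate subtitles wider than the frame, then centre what remains.
    text = subtitle[:width].center(width)
    return frame[:-1] + [text]


frame = ["." * 20 for _ in range(4)]   # stand-in for a received video frame
composited = overlay_subtitle(frame, "Good morning")
```

Only the bottom row changes, so the rest of the received image stays visible above the subtitle.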

Alternatively, the control unit 160 may control the speaker unit 141 to output both the voice spoken in the first language and the voice translated into the second language. In this case, the control unit 160 may control the speaker unit 141 to output the original voice in the first language first and then the translated voice, that is, the voice in the second language. Alternatively, it may control the speaker unit 141 to output only the voice translated into the second language instead of the voice uttered in the first language.

FIGS. 2 to 4 show various embodiments of the two-way translation method. The embodiments shown in FIGS. 2 to 4 may use functions built into the user terminal device or may be carried out through a separate application.

First, FIG. 2 is a diagram illustrating an embodiment in which the translated language is output as text.

FIG. 2 shows a video call between a first user 10, who uses a TV 100-2 and speaks Korean as the first language, and a second user 20, who uses a mobile phone 100-1 and speaks English as the second language. The mobile phone 100-1 and the TV 100-2 are merely examples of the user terminal device 100; various user terminal devices having an image capture function, a call function, a text messaging function, or the like may be used.

The mobile phone 100-1 can receive, through the interface unit 110, a voice signal for "좋은 아침" ("good morning") uttered in Korean by the first user 10 through the TV 100-2, decode the received voice signal to detect the voice, and store the detected voice in the buffer unit 130. Thus, in another embodiment in which the Korean utterance "좋은 아침" is itself output, the "좋은 아침" stored in the buffer unit 130 can be output.

The control unit 160 can call back the voice from the buffer unit 130, convert it into text, translate the converted text into English, and output it on the display unit 142 in the form of an English subtitle 210. That is, the control unit 160 can call back the voice stored in the buffer unit 130 in order to translate and output it. Converting the voice into text and translating the converted text into English may be done inside the mobile phone 100-1, but an external server may also be used.

The mobile phone 100-1 can also receive, through the interface unit 110, an image of the first user 10 together with the voice signal. In this case, the display unit 142 can display the English subtitle 210 on part of the display unit 142 while displaying the image 11 of the first user, either at the same time or at a different time. Alternatively, unlike what is shown in FIG. 2, the display unit 142 may be split so that the image 11 of the first user is output in one part and the English subtitle 210 is displayed in another part.

Likewise, the TV 100-2 can, in the same manner as the mobile phone 100-1 described above, receive through the interface unit 110 a voice signal for "hello" uttered in English by the second user 20 through the mobile phone 100-1, translate it into Korean, and output it in the form of a Korean subtitle 220. The TV 100-2 may also receive the image 21 of the second user and display the image 21 of the second user and the Korean subtitle 220 at the same time or at different times.

Meanwhile, FIGS. 3 and 4 are diagrams illustrating embodiments in which voice is output through the speaker unit 141.

Referring to FIG. 3, the mobile phone 100-1 receives, through the interface unit 110, a voice signal for "좋은 아침" ("good morning") uttered in Korean by the first user 10 through the TV 100-2, decodes the received voice signal to detect the voice, and stores the detected voice in the first buffer unit 131. The speaker unit 141 then outputs the voice for "좋은 아침" stored in the first buffer unit 131.

Meanwhile, the control unit 160 calls back the voice from the buffer unit 130, converts it into text, translates the converted text into English, the second language, and stores the translated voice "Good Morning" in the second buffer unit 132. The speaker unit 141 then outputs the voice "Good Morning" stored in the second buffer unit 132.

That is, the speaker unit 141 outputs the voices stored in the first buffer unit 131 and the second buffer unit 132 in order. Specifically, the control unit 160 controls the switching unit 133, which connects one of the first buffer unit 131 and the second buffer unit 132 to the speaker unit 141, so that the speaker unit 141 outputs the voice "좋은 아침" stored in the first buffer unit 131 and then the voice "Good Morning" stored in the second buffer unit 132.

According to this embodiment, the second user 20, who speaks English, hears "좋은 아침" as uttered by the first user 10 and then hears "Good Morning" translated into English.

Meanwhile, FIG. 4 shows an embodiment in which, in the arrangement of FIG. 3, the control unit 160 controls the switching unit 133 to connect the second buffer unit 132 to the speaker unit 141, so that "좋은 아침" stored in the first buffer unit 131 is not output and only "Good Morning" stored in the second buffer unit 132 is output through the speaker unit 141.

That is, the second user 20 hears only the voice translated into English, the second language and his or her preferred language, and can therefore respond quickly in the video call with the first user 10.

Although FIGS. 3 and 4 show only the case where voice is output through the speaker unit 141, this is merely one embodiment; the image received together with the voice signal through the interface unit 110 and the translated text may also be displayed on the display unit 142.

That is, the video of the other party received together with the voice signal can be displayed on the display unit 142, and the text translated into the user's preferred language can be displayed on the display unit 142 in a form such as a subtitle.

Meanwhile, FIG. 5 is a block diagram illustrating the configuration of the buffer unit 130 described above with reference to FIGS. 3 and 4. The buffer unit 130 may include a first buffer unit 131 for storing the voice detected by the voice signal processing unit 120, a second buffer unit 132 for storing the voice translated into the second language, and a switching unit 133 for connecting one of the first buffer unit 131 and the second buffer unit 132 to the speaker unit 141.

That is, the buffer unit 130 not only stores the first-language voice uttered on the other device so that the user terminal device 100 can output it, but can also store the voice translated into the second language so that the translation produced by the user terminal device 100 can be output as voice.

The first-language and second-language voices may be stored for output in different parts of the buffer unit 130, namely the first buffer unit 131 and the second buffer unit 132, respectively.

The switching unit 133 can connect one of the first buffer unit 131 and the second buffer unit 132 to the speaker unit 141 so that the first language and the second language are output sequentially.

When the buffer unit 130 consists of the first buffer unit 131 and the second buffer unit 132 and the detected voice has been translated, the control unit 160 can control the switching unit 133 of the buffer unit 130 to connect the first buffer unit 131 and the second buffer unit 132 to the speaker unit 141 in turn, so that the speaker unit 141 outputs the first language and the second language sequentially. The control unit 160 can also control the switching unit 133 to connect the second buffer unit 132 to the speaker unit 141 so that only the voice translated into the second language, stored in the second buffer unit 132, is output through the speaker unit 141.
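
The buffer arrangement and the two switching policies described above can be sketched as follows. This is a simplified model, assuming each buffer is a FIFO queue of audio clips (strings stand in for audio); the `BufferUnit` class and the `play` function are hypothetical names, not taken from the specification.

```python
from collections import deque
from typing import Optional


class BufferUnit:
    """Two FIFO buffers plus a switch selecting which one feeds the speaker."""

    def __init__(self) -> None:
        self.first = deque()    # detected voice in the first language
        self.second = deque()   # translated voice in the second language
        self.selected = self.first

    def connect(self, which: str) -> None:
        # Switching unit: route exactly one buffer to the speaker.
        self.selected = self.first if which == "first" else self.second

    def read(self) -> Optional[str]:
        # The speaker pulls the next clip from whichever buffer is connected.
        return self.selected.popleft() if self.selected else None


def play(buffers: BufferUnit, translated_only: bool = False) -> list[str]:
    """Drain the buffers in the order the control unit would switch them."""
    played = []
    order = ["second"] if translated_only else ["first", "second"]
    for which in order:
        buffers.connect(which)
        while (clip := buffers.read()) is not None:
            played.append(clip)
    return played


unit = BufferUnit()
unit.first.append("좋은 아침")
unit.second.append("Good Morning")
sequential = play(unit)   # original voice first, then the translation
```

Calling `play(unit, translated_only=True)` on a freshly filled unit models the FIG. 4 case: the switch never connects the first buffer, so only the translated clip reaches the speaker.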

FIG. 6 is a diagram showing a detailed configuration of the control unit 160 of the user terminal device 100. The process in which the control unit 160 calls back the voice uttered in the first language from the buffer unit 130, converts it into text, translates the converted text into the second language, and outputs the result through the output unit 140 is described below using this detailed configuration.

As shown in FIG. 6, the control unit 160 may include a speech recognition engine 161, a text processing module 162, and a graphic processing module 163. The speech recognition engine 161, the text processing module 162, and the graphic processing module 163 may be implemented in software as components of the control unit 160.

The speech recognition engine 161 may call back the voice from the buffer unit 130, transmit it to the ASR server 510, and receive text corresponding to the voice from the ASR server 510. The ASR server 510 is a server that recognizes voice and converts it into text matched against a database. It may be embedded in the user terminal device 100, or it may reside outside the user terminal device 100 and exchange data with it through wireless communication.

The text processing module 162 may transmit the text, together with the translation language setting information set through the UI, to the translation server 520, and receive the text translated into the second language according to the translation language setting information.

The translation language may be preset in the user terminal device 100, or it may be set according to the user's preferred language. That is, the user may directly set a preferred language as the translation language through the UI.

The translation server 520 is a server that embeds a multilingual dictionary or can call one, and it may receive the text corresponding to the first voice and translate it into the second language set as the translation language. The translation server 520 may be embedded in the user terminal device 100, or it may reside outside the user terminal device 100 and exchange data with it through wireless communication.
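The recognition and translation path just described (speech recognition engine 161 to the ASR server 510, then text processing module 162 to the translation server 520) might be sketched as follows. This is only an illustration under stated assumptions: the two servers are replaced by local stub functions (`fake_asr`, `fake_translate`), the `TranslationSettings` fields and the phrase table are invented for the example, and no real ASR or translation API is used.

```python
from dataclasses import dataclass


@dataclass
class TranslationSettings:
    """Hypothetical stand-in for the translation language setting information set through the UI."""
    source_language: str  # first language (assumed to be known to the terminal)
    target_language: str  # second language: preset or chosen by the user


def recognize_and_translate(voice, asr, translate, settings):
    """Send the called-back voice to ASR, then send the text and settings to translation."""
    text = asr(voice)                  # role of speech recognition engine 161 + ASR server 510
    return translate(text, settings)   # role of text processing module 162 + translation server 520


def fake_asr(voice: bytes) -> str:
    # Stub: pretends the audio bytes already contain the spoken words.
    return voice.decode("utf-8")


# Stub multilingual dictionary; a real server would hold far more than this.
PHRASES = {("en", "ko", "hello"): "안녕하세요"}


def fake_translate(text: str, settings: TranslationSettings) -> str:
    key = (settings.source_language, settings.target_language, text.lower())
    return PHRASES.get(key, text)  # fall back to the untranslated text


settings = TranslationSettings(source_language="en", target_language="ko")
print(recognize_and_translate(b"Hello", fake_asr, fake_translate, settings))
```

Passing the ASR and translation steps in as callables mirrors the description's point that either server may be embedded in the terminal or reached over a wireless link; the calling code does not change.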

The graphic processing module 163 may display the received text through a GUI (Graphic User Interface). That is, the graphic processing module 163 may render the received text, translated into the second language, as subtitles and display it on the display unit 142.

The control unit 160 may further include a voice processing module (not shown), which converts the text translated into the second language into a voice signal. That is, so that the first language can be translated into the second language and output through the speaker unit 141, the voice processing module (not shown) may convert the text translated into the second language into a voice signal.

The received text translated into the second language and rendered as subtitles by the graphic processing module 163, and the second-language voice signal produced by the voice processing module (not shown), may be output through the output unit 140 simultaneously or at different times.

FIG. 7 is a flowchart illustrating a two-way translation method according to an embodiment of the present invention.

The user terminal device 100 receives a voice signal for a voice uttered in the first language at another device (S605), and decodes the voice signal to detect the voice (S610). That is, because the voice signal is received in the form of a digital signal, the received voice signal is decoded to detect the voice.

The user terminal device 100 then stores the detected voice in the buffer unit 130 (S615). That is, the detected voice is stored in the buffer unit 130 so that it can be output through the speaker unit 141. The user terminal device 100 also calls back the voice from the buffer unit 130, converts it into text, and translates the converted text into a predetermined second language (S620).

The user terminal device 100 then outputs the voice translated into the second language (S625). The second language may be preset in the user terminal device 100, or the user may set it directly according to the user's preferred language.

The user terminal device 100 receives the user's voice (S630), translates the user voice into the first language, and transmits it to the other device (S635). That is, while the user terminal device 100 translates the voice uttered in the first language at the other device into the second language, which is the preferred language of the user of the user terminal device 100, and outputs it, the user terminal device 100 may, simultaneously or at a different time, receive the voice of its user, translate it into the first language, and transmit it to the other device.
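The flow of steps S605 to S635 can be summarized in a short Python sketch. All names here (`Terminal`, `on_voice_received`, `on_user_voice`, `toy_translate`) are hypothetical, decoding is reduced to a UTF-8 decode, speech is represented as text, and translation is a lookup in a two-entry table, so this is an illustration of the ordering of the steps rather than a working translator.

```python
# Toy phrase table keyed by (source language, target language, phrase); purely illustrative.
TABLE = {
    ("en", "ko", "hello"): "안녕하세요",
    ("ko", "en", "반갑습니다"): "nice to meet you",
}


def toy_translate(text, source, target):
    return TABLE.get((source, target, text.lower()), text)


class Terminal:
    """Hypothetical model of user terminal device 100 for the two-way flow of FIG. 7."""

    def __init__(self, first_language, second_language, translate):
        self.first_language = first_language    # language spoken at the other device
        self.second_language = second_language  # user's preferred language
        self.translate = translate
        self.buffer = []   # stands in for buffer unit 130
        self.outbox = []   # stands in for transmission through the interface unit

    def on_voice_received(self, encoded: bytes) -> str:
        voice = encoded.decode("utf-8")    # S605/S610: receive and decode
        self.buffer.append(voice)          # S615: store in the buffer
        called_back = self.buffer.pop(0)   # S620: call back from the buffer
        # S620/S625: translate into the second language and return it for output
        return self.translate(called_back, self.first_language, self.second_language)

    def on_user_voice(self, user_voice: str) -> str:
        # S630/S635: translate the user's voice into the first language and send it
        translated = self.translate(user_voice, self.second_language, self.first_language)
        self.outbox.append(translated)
        return translated


terminal = Terminal(first_language="en", second_language="ko", translate=toy_translate)
print(terminal.on_voice_received(b"Hello"))
print(terminal.on_user_voice("반갑습니다"))
```

The two handlers are independent of each other, which reflects the statement that reception and transmission may proceed simultaneously or at different times.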

According to this embodiment, language is translated in both directions, so the user can communicate freely with a user of another language while using his or her own preferred language.

The two-way translation method of the user terminal device according to the various embodiments described above may be coded in software and stored in a non-transitory readable medium. Such a non-transitory readable medium may be mounted and used in various devices.

For example, program code for performing the steps of receiving a voice signal for a voice uttered in a first language at another device, decoding the voice signal to detect the voice, storing the detected voice in a buffer unit, calling back the voice from the buffer unit to convert it into text and translating the converted text into a predetermined second language, outputting the voice translated into the second language, receiving a user voice, and translating the user voice into the first language and transmitting it to the other device may be stored in and provided on a non-transitory readable medium. In addition, the display method described in the various embodiments above may be coded as a program and stored in a non-transitory readable medium.

A non-transitory readable medium is not a medium that stores data for a short time, such as a register, cache, or memory, but a medium that stores data semi-permanently and can be read by a device. Specifically, it may be a CD, a DVD, a hard disk, a Blu-ray disc, a USB device, a memory card, a ROM, or the like.

Although preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described. Various modifications may of course be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or scope of the present invention.

100: user terminal device 110: interface unit
120: voice signal processing unit 130: buffer unit
140: output unit 150: microphone unit
160: control unit

Claims

1. A user terminal device comprising:
an interface unit for receiving, from another device, a voice signal for a voice uttered in a first language;
a voice signal processing unit for decoding the voice signal to detect the voice;
a buffer unit for storing the detected voice;
an output unit;
a control unit for calling back the voice from the buffer unit, converting the voice into text, translating the converted text into a predetermined second language, and outputting the translation through the output unit; and
a microphone unit for receiving a user voice,
wherein the control unit translates the user voice into the first language and transmits the translated user voice to the other device through the interface unit.

2. The user terminal device according to claim 1,
wherein the output unit includes:
a speaker unit for outputting the voice stored in the buffer unit; and
a display unit,
wherein the control unit displays, on the display unit, an image received together with the voice signal through the interface unit and the text translated into the second language.

3. The user terminal device according to claim 1,
wherein the output unit includes a speaker unit for outputting the voice stored in the buffer unit,
wherein the control unit outputs the voice uttered in the first language and the voice translated into the second language sequentially through the speaker unit.

4. The user terminal device according to claim 1,
wherein the output unit includes a speaker unit,
wherein the control unit outputs the voice translated into the second language through the speaker unit instead of the voice uttered in the first language.

5. The user terminal device according to claim 4,
wherein the buffer unit includes:
a first buffer unit for storing the voice detected by the voice signal processing unit;
a second buffer unit for storing the voice translated into the second language; and
a switching unit for connecting one of the first buffer unit and the second buffer unit to the speaker unit,
wherein the control unit, when the detected voice has been translated, controls the switching unit to connect the second buffer unit to the speaker unit and outputs the voice translated into the second language through the speaker unit.

6. The user terminal device according to any one of claims 1 to 5,
wherein the control unit includes:
a speech recognition engine that calls back the voice from the buffer unit, transmits the voice to an ASR server, and receives text corresponding to the voice from the ASR server;
a text processing module that transmits translation language setting information set through a UI and the text to a translation server and receives the text translated into the second language according to the translation language setting information; and
a graphic processing module that displays the received text through a GUI (Graphic User Interface).

7. The user terminal device according to claim 6,
wherein the ASR server and the translation server are embedded in the user terminal device.

8. The user terminal device according to claim 6,
wherein the control unit further includes a voice processing module for converting the text translated into the second language into a voice signal.

9. The user terminal device according to claim 3 or 4,
wherein the output unit further includes a display unit,
wherein the control unit displays, on the display unit, an image received together with the voice signal through the interface unit and the text translated into the second language.

10. A two-way translation method of a user terminal device, the method comprising:
receiving a voice signal for a voice uttered in a first language at another device;
decoding the voice signal to detect the voice;
storing the detected voice in a buffer unit;
calling back the voice from the buffer unit, converting the voice into text, and translating the converted text into a predetermined second language;
outputting the voice translated into the second language;
receiving a user voice; and
translating the user voice into the first language and transmitting the translated user voice to the other device.

11. The method of claim 10,
wherein the outputting comprises displaying the voice as text in the second language.

12. The method of claim 10,
wherein the outputting comprises sequentially outputting the voice uttered in the first language and the voice translated into the second language through a speaker unit.

13. The method of claim 10,
wherein the outputting comprises outputting the voice translated into the second language through a speaker unit instead of the voice uttered in the first language.

14. The method according to any one of claims 10 to 13,
wherein the translating into the second language comprises:
calling back the voice from the buffer unit, transmitting the voice to an ASR server, and receiving text corresponding to the voice from the ASR server; and
transmitting translation language setting information set through a UI and the text to a translation server, and receiving the text translated into the second language according to the translation language setting information.

15. The method according to claim 12 or 13,
wherein the outputting further comprises displaying an image received together with the voice signal and the text translated into the second language.