KR102170684B1

KR102170684B1 - System for interpreting and translating using smart clothes

Info

Publication number: KR102170684B1
Application number: KR1020190007783A
Authority: KR
Inventors: 오순영
Original assignee: (주)한컴인터프리
Priority date: 2019-01-21
Filing date: 2019-01-21
Publication date: 2020-10-27
Also published as: KR20200090580A

Abstract

The present invention discloses a smart clothing interpretation and translation system including smart clothing that comprises an input unit, which collects speech uttered between conversation participants and converts it into a voice signal, and an output unit, which outputs a translation of the speech content recognized from the voice signal as at least one of text and synthesized speech. According to the present invention, interpretation and translation results can be output through audio and video using smart clothing.

Description

Interpretation and translation system using smart clothing {SYSTEM FOR INTERPRETING AND TRANSLATING USING SMART CLOTHES}

The present invention relates to an interpretation and translation system using smart clothing and, more particularly, to a system that recognizes speech between conversation participants or between a speaker and an audience and outputs the translated and interpreted content through the smart clothing.

A wearable device refers to any electronic device that can be attached to the body to perform computing tasks; the concept also covers applications that perform some computing functions.

A wearable device is also understood as a next-generation electronic device that is made small and light enough to be worn on the body or on clothing, so that the user can use it freely while moving or active, and that can communicate with the user from the position closest to the body.

Wearable devices are broadly classified into accessory, clothing-integrated, and body-attached types. Clothing-integrated wearable devices are called smart clothing.

The interpretation and translation system using smart clothing according to an embodiment of the present invention combines smart clothing, which has so far been used only for biometric recognition or context-based color and pattern changes, with an interpretation and translation algorithm. It performs interpretation and translation in audio and visual form between conversation participants, or between a speaker and an audience, using the input and output functions of the smart clothing. It is thereby distinguished from the conventional technology described above and is intended to solve the above problem.

The present invention has been made to solve the above problem, and an object thereof is to provide an interpretation and translation system using smart clothing that can output interpretation and translation results through audio and video.

Another object is to provide an interpretation and translation system using smart clothing that can output interpretation and translation results obtained from a remote interpretation and translation server, using the communication function of a user terminal connected by wire or wirelessly.

Another object is to provide an interpretation and translation system using smart clothing in which the volume of the output speech and the size of the output text are adjusted automatically according to the distance between conversation participants or between a speaker and an audience.

Another object is to provide an interpretation and translation system using smart clothing that can recognize external noise and actively remove it.

An interpretation and translation system using smart clothing according to an embodiment of the present invention includes smart clothing comprising: an input unit that collects uttered speech for conversion into a voice signal; and an output unit that outputs a translation of the speech content recognized from the voice signal as at least one of text and synthesized speech.

Here, the system further includes an interpretation and translation unit that performs translation and interpretation through a speech recognition process using the voice signal.

Here, the interpretation and translation unit is implemented as a user terminal that is connected to the smart clothing by wire or wirelessly and performs translation and interpretation of the uttered speech.

Here, the smart clothing further includes: a distance sensor that senses the distance between the user and a conversation partner or an audience; and a control unit that uses the sensing information of the distance sensor to control the volume of the speech and the size of the text output through the output unit.

Here, the smart clothing includes a plurality of microphones of the input unit distributed over its front and rear surfaces, and further includes an audio processing unit that recognizes the voice signal using the difference in speech volume according to microphone position and filters it.
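As a rough illustration of using the volume difference between microphone positions, the following Python sketch compares the level of a front microphone with that of a rear microphone frame by frame and keeps only the frames in which the front microphone is clearly louder. The function names, the 6 dB margin, and the reading of "filtering" as frame selection are assumptions made for illustration; the specification does not state how the audio processing unit is implemented.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def level_difference_db(front, rear, floor=1e-12):
    """Front-to-rear level difference in decibels (floor avoids log of zero)."""
    return 20 * math.log10((rms(front) + floor) / (rms(rear) + floor))

def keep_front_dominant(front_frames, rear_frames, margin_db=6.0):
    """Keep front-microphone frames that are at least margin_db louder than the rear."""
    kept = []
    for front, rear in zip(front_frames, rear_frames):
        if level_difference_db(front, rear) >= margin_db:
            kept.append(front)
    return kept

# Hypothetical data: frame 0 is speech arriving from the front, frame 1 is ambient sound.
front_frames = [[0.5, -0.5, 0.5, -0.5], [0.1, -0.1, 0.1, -0.1]]
rear_frames = [[0.1, -0.1, 0.1, -0.1], [0.1, -0.1, 0.1, -0.1]]
kept = keep_front_dominant(front_frames, rear_frames)
```

A fixed margin is the simplest possible rule; a practical design would likely adapt the threshold to the environment.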

Here, the smart clothing further includes a noise processing unit that analyzes the waveform of a noise signal contained in the voice signal, generates an interference wave that cancels the analyzed waveform, receives the composite wave of the noise signal and the interference wave as feedback, and corrects the interference wave so that the composite wave is canceled.
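The feedback loop described here resembles adaptive noise cancellation. The Python sketch below uses a least-mean-squares (LMS) adaptive filter, which is a technique chosen for illustration and is not named in the specification: it generates an anti-noise (interference) sample, adds it to the noise, and uses the resulting composite sample as feedback to correct the filter weights. The signal shapes, tap count, and step size are assumptions.

```python
import math

def lms_cancel(reference, primary, taps=4, mu=0.05):
    """Return the composite (residual) wave after adaptive cancellation."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for x, d in zip(reference, primary):
        history = [x] + history[:-1]
        # Interference wave: negated estimate of the noise at this instant.
        anti_noise = -sum(w * h for w, h in zip(weights, history))
        # Composite wave of noise and interference wave, fed back as the error.
        composite = d + anti_noise
        # Correct the weights so that the composite wave shrinks over time.
        weights = [w + mu * composite * h for w, h in zip(weights, history)]
        residual.append(composite)
    return residual

def power(samples):
    """Mean squared amplitude of a list of samples."""
    return sum(s * s for s in samples) / len(samples)

# Hypothetical test signals: a noise reference and the same noise as it arrives
# at the voice microphone, scaled and delayed by one sample.
omega = 2 * math.pi * 0.05
reference = [math.sin(omega * n) for n in range(2000)]
primary = [0.8 * math.sin(omega * (n - 1)) for n in range(2000)]
residual = lms_cancel(reference, primary)
```

In this toy setup the primary signal contains only noise, so a vanishing residual means the interference wave has converged; with speech present, the residual would ideally be the speech itself.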

Here, the smart clothing further includes: a power supply unit that supplies power to the input unit and the output unit; and a charging unit that charges the user terminal using the power supply unit.

Here, the output unit includes a display device. The display device is located on at least one of the front (chest and abdomen) and the rear (back) of the smart clothing so that people other than the user can see it, and on an arm of the smart clothing so that the user can see it.

An interpretation and translation method using smart clothing according to an embodiment of the present invention includes the steps of: collecting uttered speech through an input unit of the smart clothing and converting it into a voice signal; performing translation and interpretation through a speech recognition process using the voice signal; and outputting the translation and interpretation results through the smart clothing as at least one of text and synthesized speech.

Here, the step of performing translation and interpretation is carried out by a user terminal that is connected to and communicates with the smart clothing by wire or wirelessly.

Here, the interpretation and translation method using smart clothing further includes at least one of the steps of: recognizing the voice signal using the difference in speech volume according to microphone position and filtering it; and analyzing the waveform of a noise signal contained in the voice signal, generating an interference wave that cancels the analyzed waveform, receiving the composite wave of the noise signal and the interference wave as feedback, and correcting the interference wave so that the composite wave is canceled.

According to the present invention, interpretation and translation results can be output through audio and video using smart clothing.

In addition, interpretation and translation results can be output through the service of a remote interpretation and translation server, using the communication function of a user terminal connected by wire or wirelessly.

In addition, interpretation and translation results can be output through the smart clothing such that the volume of the speech and the size of the text are adjusted automatically according to the distance between conversation participants or between a speaker and an audience.

In addition, external noise can be recognized and actively removed through the smart clothing.

FIG. 1 is an exemplary diagram of a network environment of an interpretation and translation system using smart clothing according to an embodiment of the present invention.
FIG. 2 is a block diagram of an interpretation and translation server according to an embodiment of the present invention.
FIG. 3 is a block diagram of smart clothing according to an embodiment of the present invention.
FIG. 4 is an exemplary diagram of smart clothing according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram of an interpretation and translation method using smart clothing according to an embodiment of the present invention.
FIG. 6 is an exemplary diagram of a voice signal preprocessing step according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the interpretation and translation method and system using smart clothing are described in detail with reference to the accompanying drawings. The same reference numerals in the drawings denote the same members. Specific structural and functional descriptions of the embodiments are given only to illustrate embodiments according to the present invention. Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meaning in the context of the related art and, unless explicitly defined in this specification, should not be interpreted in an idealized or excessively formal sense.

The network environment of the interpretation and translation system using smart clothing according to an embodiment of the present invention is described first.

FIG. 1 is an exemplary diagram of a network environment of an interpretation and translation system using smart clothing according to an embodiment of the present invention.

Referring to FIG. 1, an interpretation and translation system 1 using smart clothing according to an embodiment of the present invention is shown. The system 1 basically includes smart clothing 100 and may additionally include a user terminal 200 and an interpretation and translation server 300.

The smart clothing 100 performs input and output functions: it receives foreign-language speech between conversation participants or between a speaker and an audience, recognizes it, performs translation and interpretation using the recognized content, and outputs the result as synthesized speech or text. Depending on which device performs the speech recognition, translation, interpretation, and speech synthesis functions, the interpretation and translation system 1 may include only the smart clothing 100 or may further include the user terminal 200 and the interpretation and translation server 300.

The user terminal 200 receives the voice signal input through the smart clothing 100 and may use it to perform at least one of the speech recognition, translation, interpretation, and speech synthesis functions. For example, the user terminal 200 may perform all of these functions itself, or it may delegate speech recognition, translation, and interpretation, which require large databases, to the interpretation and translation server 300 and perform only speech synthesis on its own.

The user terminal 200 includes a communication unit 210, a display unit 220, a speech synthesis unit 230, a storage unit 240, an input unit 250, an output unit 260, a playback unit 270, a power supply unit 280, and a control unit 290.

Embodiments of the user terminal 200 may include, without limitation, a cellular phone, a smartphone with a wireless communication function, a personal digital assistant (PDA) with a wireless communication function, a wireless modem, a portable computer with a wireless communication function, an imaging device such as a digital camera with a wireless communication function, a gaming device with a wireless communication function, a music storage and playback appliance with a wireless communication function, an Internet appliance capable of wireless Internet access and browsing, and portable units or terminals that integrate combinations of these functions.

The interpretation and translation server 300 may selectively perform the speech recognition, translation, interpretation, and speech synthesis functions, sharing the work with the user terminal 200.
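The division of work between the user terminal 200 and the interpretation and translation server 300 can be pictured as a pipeline whose stages are interchangeable. The Python sketch below is illustrative only: `run_pipeline` and the stub stages are hypothetical names that do not appear in the specification, and the stubs stand in for real recognition, translation, and synthesis engines.

```python
def run_pipeline(voice_signal, recognize, translate, synthesize=None):
    """Run recognition and translation, plus synthesis when a synthesizer is supplied."""
    source_text = recognize(voice_signal)
    target_text = translate(source_text)
    result = {"text": target_text}
    if synthesize is not None:
        result["speech"] = synthesize(target_text)
    return result

# Hypothetical stubs: recognition and translation stand in for server-side engines,
# synthesis stands in for an engine running on the user terminal.
def server_recognize(voice_signal):
    return "annyeonghaseyo"

def server_translate(text):
    return {"annyeonghaseyo": "hello"}.get(text, text)

def terminal_synthesize(text):
    return f"<tts:{text}>"

result = run_pipeline(b"\x00\x01", server_recognize, server_translate, terminal_synthesize)
text_only = run_pipeline(b"\x00\x01", server_recognize, server_translate)
```

Passing each stage as a callable keeps the caller indifferent to where the stage actually runs, which mirrors the selective sharing of functions described above.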

FIG. 2 is a block diagram of an interpretation and translation server according to an embodiment of the present invention.

Referring to FIG. 2, the interpretation and translation server 300 includes a speech recognition module 310, a speech DB 315, a translation module 320, a translation DB 325, and a speech synthesis module 330.

The speech recognition module 310 records the speaker's voice and performs speech recognition on the recorded data. It automatically recognizes the voice signal uttered by the speaker and converts it into a character string. Speech recognition is also called automatic speech recognition (ASR), voice recognition, or speech-to-text (STT).

The speech recognition module 310 may be based on probabilistic and statistical methods. That is, it uses models based on probability and statistics as the acoustic model (AM) and the language model (LM) used in the speech recognition process. The hidden Markov model (HMM), a core algorithm, may likewise be based on probability and statistics. These models are examples and are not intended to limit the present invention.

A Gaussian mixture model (GMM) may be used as the acoustic model and an N-gram as the language model. A deep neural network (DNN), one of the deep learning architectures, is preferably used in place of the GMM. To improve recognition performance, high-quality speech and language models are set up, and these models can be trained with a deep learning algorithm. The training DB preferably includes colloquial, conversational speech and language data.
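To make the N-gram language model concrete, the following Python sketch trains a bigram model (N = 2) on a toy corpus and estimates the probability of a word given the preceding word. The corpus, the function names, and the use of add-one (Laplace) smoothing are assumptions for illustration; the specification does not prescribe a particular N or smoothing scheme.

```python
from collections import Counter

def train_bigram(sentences):
    """Count context words and word pairs, with sentence boundary markers."""
    contexts, pairs = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        words = sentence.split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])           # every token that has a successor
        pairs.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return contexts, pairs, vocab

def bigram_prob(prev, word, model):
    """P(word | prev) with add-one smoothing over the vocabulary."""
    contexts, pairs, vocab = model
    return (pairs[(prev, word)] + 1) / (contexts[prev] + len(vocab))

# Toy conversational corpus (hypothetical).
model = train_bigram(["where is the station", "where is the hotel"])
p_seen = bigram_prob("where", "is", model)
p_unseen = bigram_prob("where", "hotel", model)
```

A recognizer can use such probabilities to prefer word sequences that are common in conversational speech over acoustically similar but unlikely ones.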

When the speech recognition module 310 outputs, as text, the speech uttered by the speaker in the source language, the translation module 320 translates the output text into text in the target language. The interpretation and translation server 300 according to an embodiment of the present invention is characterized by including the translation module 320 together with the speech recognition module 310.

The translation performed by the translation module 320 uses at least one of a rule-based method, a corpus-based method, and neural machine translation (NMT). Rule-based methods are further divided, by depth of analysis, into direct translation, indirect transfer, and interlingua methods. Corpus-based methods include example-based and statistical methods.

Statistical machine translation (SMT) translates sentences by learning model parameters from a bilingual corpus through statistical analysis. Instead of developing grammars or semantic representations by hand, it builds the models needed for translation from a corpus for the language pair to be translated. As long as a corpus can be obtained, it is therefore relatively easy to extend the system to new languages.

The drawbacks of statistical machine translation are that it requires a large bilingual corpus and that it has no common semantic representation linking multiple languages.

Neural machine translation (NMT) is a technology that compensates for these drawbacks.

SMT splits a sentence into words, or into phrases of a few words, and translates them on the basis of a statistical model. Its core is modeling statistical translation rules from a large body of training data.

In contrast, in NMT an artificial intelligence (AI) translates the sentence as a whole. Sentence-level translation is possible because the artificial neural network converts sentence information into a vector (coordinate values) that denotes a specific point in a virtual space.

For example, the word 'person' is represented in the form '[a, b, c, …, x, z]'. Because the vector carries information about words, phrases, and word order, sentence-level translation that takes context into account becomes possible. The artificial neural network places sentences with similar meanings close to one another in this space.

NMT uses high-dimensional vectors. The artificial neural network is trained on data consisting of source-language sentences paired with target-language sentences, and the trained network then represents sentence information as vectors.
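The idea that sentences with similar meanings lie close together in the vector space can be illustrated with cosine similarity. In the Python sketch below the three-dimensional vectors are hand-written toy values, not the output of any trained network; a real NMT encoder learns vectors with far more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical sentence vectors, chosen by hand for illustration.
sentence_vectors = {
    "Where is the station?": [0.9, 0.1, 0.3],
    "How do I get to the station?": [0.8, 0.2, 0.4],
    "I would like a coffee.": [0.1, 0.9, 0.2],
}

query = sentence_vectors["Where is the station?"]
similar = cosine_similarity(query, sentence_vectors["How do I get to the station?"])
different = cosine_similarity(query, sentence_vectors["I would like a coffee."])
```

With these toy values the two questions about the station score much higher against each other than either does against the unrelated sentence.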

The speech synthesis module 330 outputs synthesized speech corresponding to the speaker's speech according to the translation produced by the translation module 320. When the smart clothing 100 performs the speech synthesis function, it receives TTS data from the interpretation and translation server 300 via the user terminal 200. The smart clothing 100 synthesizes speech from the TTS data using a speech synthesis module. The smart clothing 100 plays back the TTS data transmitted from the user terminal 200, and the result is output through the speaker.

When the interpretation and translation server 300 includes the speech synthesis module 330, the server 300 generates the TTS data itself.

Speech synthesis is also called text-to-speech (TTS) or voice synthesis. A speech-unit concatenation method may be used for synthesis: the sentence is analyzed, the speech units indicated by the analysis are extracted from a speech-unit DB, and the units are joined together. Several candidate synthesized utterances are generated, and the most suitable one is chosen in view of prosody and smoothness. Furthermore, the speaker's timbre can be determined from the sound spectrum of the speaker's voice, and the synthesized speech can be post-processed to match it, so that the output is close to the timbre of the original speaker. The speaker's emotion may also be recognized and conveyed in the synthesized speech.
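The candidate selection step can be sketched as a small search over unit combinations. In the Python sketch below each unit carries only a pitch value; the prosody cost is the distance from a target pitch and the smoothness cost is the pitch jump at each join. The unit DB contents, the cost definitions, and the exhaustive search are simplifying assumptions for illustration, not the method fixed by the specification.

```python
from itertools import product

def pick_best_sequence(unit_db, phonemes, target_pitch):
    """Score every candidate unit sequence and return the cheapest one."""
    best_ids, best_cost = None, float("inf")
    for candidate in product(*(unit_db[p] for p in phonemes)):
        # Prosody cost: how far each unit is from the desired pitch.
        prosody = sum(abs(u["pitch"] - target_pitch) for u in candidate)
        # Smoothness cost: pitch discontinuity at each join between units.
        joins = sum(abs(a["pitch"] - b["pitch"]) for a, b in zip(candidate, candidate[1:]))
        cost = prosody + joins
        if cost < best_cost:
            best_ids, best_cost = [u["id"] for u in candidate], cost
    return best_ids, best_cost

# Hypothetical speech-unit DB with two recorded variants per phoneme.
unit_db = {
    "a": [{"id": "a1", "pitch": 100}, {"id": "a2", "pitch": 140}],
    "n": [{"id": "n1", "pitch": 120}, {"id": "n2", "pitch": 105}],
}
best_ids, best_cost = pick_best_sequence(unit_db, ["a", "n"], target_pitch=110)
```

Exhaustive enumeration grows exponentially with sentence length; practical unit-selection systems typically use dynamic programming for the same minimization.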

The speech recognition DB 315 and the translation DB 325 are databases whose recognition range has been reduced, for each language, starting with the items that occur least frequently. The speech recognition module 310 and the translation module 320 include engines that use these compact speech recognition and translation DBs.

The speech recognition DB 315 is built by training on speech from varied utterances with a deep learning algorithm and by narrowing or widening the recognition range according to how frequently the utterance content occurs. That is, the amount of data is increased for frequently occurring utterances and sharply reduced for rarely occurring ones.

A larger speech recognition DB 315 is advantageous for achieving a high recognition rate, but it brings latency and overload problems. With the method above, a low-capacity DB can be built by reducing the total amount of data.

Likewise, the translation DB 325 can be built by training on varied translation examples with a deep learning algorithm and, according to the frequency of the examples, expanding colloquial expressions and reducing written-style expressions.

Accordingly, the speech recognition module 310 and the translation module 320 according to the present invention can use a speech recognition DB 315 or translation DB 325 of lower capacity than a DB built without regard to frequency.
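One simple way to picture frequency-based reduction is to keep only the most frequent entries until they account for a chosen share of observed usage. The Python sketch below does this with a usage log; the function name, the 90% coverage target, and the sample data are assumptions for illustration and are not taken from the specification.

```python
from collections import Counter

def prune_by_frequency(utterances, coverage=0.9):
    """Keep the most frequent entries until they cover the requested share of usage."""
    counts = Counter(utterances)
    total = sum(counts.values())
    kept, covered = [], 0
    for phrase, count in counts.most_common():
        if covered / total >= coverage:
            break  # remaining entries are rare enough to drop
        kept.append(phrase)
        covered += count
    return kept

# Hypothetical usage log: two everyday phrases and one rare, written-style word.
usage_log = ["hello"] * 6 + ["thank you"] * 3 + ["heretofore"] * 1
compact_db = prune_by_frequency(usage_log, coverage=0.9)
```

The coverage parameter makes the trade-off explicit: a higher value keeps more rare entries and costs more storage and lookup time.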

The network 400 may be any suitable communication network, including wired and wireless networks such as a local area network (LAN), a wide area network (WAN), the Internet, an intranet, and an extranet; mobile networks such as cellular, 3G, LTE, and WiFi networks; ad hoc networks; and combinations thereof.

The network 400 may include connections of network elements such as hubs, bridges, routers, switches, and gateways. The network 400 may include one or more connected networks, for example a multi-network environment, including a public network such as the Internet and a private network such as a secure corporate private network. Access to the network 400 may be provided through one or more wired or wireless access networks.

FIG. 3 is a block diagram of smart clothing according to an embodiment of the present invention.

Referring to FIG. 3, the smart clothing 100 according to an embodiment of the present invention includes an input unit 110, an output unit 120, a PCB 130, a distance sensor 140, and a charging unit 150.

The input unit 110 receives speech from conversations between participants and between a speaker and an audience, and receives letters and numbers entered by the user. The input unit 110 includes a microphone 111 and a keyboard 112.

The output unit 120 outputs the translated content as text and the interpreted content as speech. The output unit 120 includes a display device 121 and a speaker 122.

The input unit 110 and the output unit 120 are located on the exterior of the smart clothing 100.

FIG. 4 is an exemplary diagram of smart clothing according to an embodiment of the present invention.

Referring to FIG. 4, the microphone 111 and the speaker 122, which input and output sound respectively, may be located on the chest area of the smart clothing. The positions shown in the drawings are only one of several possible examples, and the invention is not limited to them.

The keyboard 112 is preferably positioned with the user's dominant hand in mind, for example on the left arm for a right-handed user. The keyboard may be implemented as a touch keyboard using a touch sensor.

디스플레이 디바이스(121)는 사용자 외의 타인이 볼 수 있도록, 스마트 의류의 가슴 및 배 부분의 정면과 등 부분의 후면 중에서 적어도 하나에 위치한다. 또한, 사용자가 볼 수 있도록, 디스플레이 디바이스(121)는 스마트 의류의 팔 부분에 위치한다.The display device 121 is positioned on at least one of the front of the chest and the abdomen and the rear of the back of the smart clothing so that others other than the user can see it. In addition, the display device 121 is positioned on the arm of the smart clothing so that the user can see it.

The PCB 130 is a mechanical component that may house various electronic components. The PCB 130 may be implemented in film form using a flexible flat cable (FFC). The PCB 130 may be placed anywhere on the smart clothing 100 without particular restriction, and in particular may be located at the shoulder or arm.

The distance sensor 140 measures the distance to a talker or an audience using infrared light or the like. Given this role, the distance sensor 140 is preferably located on the front of the smart clothing 100.

The charging unit 150 charges the user terminal 200 by wire or wirelessly using the power supply unit of the smart clothing 100. The user terminal 200 may be charged by wire through a USB port, or may be charged wirelessly, while held in a pocket of the smart clothing 100, by receiving power from a transmitter that transmits wireless power.

The PCB 130, which corresponds to the mechanical configuration, includes a control unit 131, a power supply unit 132, an interpretation and translation unit 133, an audio processing unit 134, and a noise processing unit 135.

The control unit 131 controls each component of the smart clothing 100. For example, it uses the sensing information of the distance sensor 140 to control the volume of the voice and the size of the text output through the output unit 120.
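As a rough illustration of this control, the following sketch maps a distance reading to a speaker volume and a text size. The description states only that the volume and text size are controlled using the sensing information of the distance sensor; the linear mapping, the clamping range, and every name and constant here (`OutputSettings`, `output_settings`, `near_m`, `far_m`, the volume percentages, the point sizes) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputSettings:
    volume_pct: int  # speaker volume, 0-100 (hypothetical scale)
    text_pt: int     # display text size in points (hypothetical scale)


def output_settings(distance_m: float,
                    near_m: float = 0.5,
                    far_m: float = 5.0) -> OutputSettings:
    """Scale volume and text size linearly with listener distance (assumed mapping)."""
    # Clamp the reading so an out-of-range sensor value cannot push the
    # outputs beyond their limits.
    d = min(max(distance_m, near_m), far_m)
    ratio = (d - near_m) / (far_m - near_m)  # 0.0 at near_m, 1.0 at far_m
    volume = round(30 + ratio * (100 - 30))  # 30 % up close, 100 % far away
    text = round(16 + ratio * (72 - 16))     # 16 pt up close, 72 pt far away
    return OutputSettings(volume_pct=volume, text_pt=text)
```

A control unit built along these lines would call `output_settings` with each new distance reading and pass the result to the speaker and the display.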

The power supply unit 132 supplies the power required by the input unit 110, the output unit 120, and the other input, output, and communication functions of the smart clothing 100. The power supply unit 132 includes a secondary battery such as a lithium-ion battery, a charging and discharging circuit, and a protection circuit.

The interpretation and translation unit 133 records the input voice signal and performs voice recognition, translation, interpretation, and speech synthesis using the recorded voice data. The overall interpretation and translation function of the interpretation and translation unit 133 is similar to that performed by the interpretation and translation server 300.

The audio processing unit 134 preprocesses the voice signal on which the interpretation and translation unit 133 performs translation and interpretation. The smart clothing 100 may include a plurality of microphones 111. The volume of the input voice differs depending on the position of each microphone 111. For example, when the user converses with a talker standing in front of the user, a louder voice is input through the microphone 111 located at the front than through the one at the back. Noise contained in the voice, by contrast, may be input at a constant level regardless of whether the microphone is at the front or the back of the smart clothing. In this case, the audio processing unit 134 may recognize the input whose level varies between microphones as the voice signal and filter it accordingly.
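The level comparison described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes two time-aligned sample streams (one front microphone, one rear microphone), fixed-length frames, and an RMS level measure, and the names `rms`, `voice_frames`, `keep_voice`, `frame_len`, and `min_ratio` are hypothetical.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame)) if frame else 0.0


def voice_frames(front: Sequence[float], rear: Sequence[float],
                 frame_len: int = 4, min_ratio: float = 1.5) -> List[bool]:
    """Flag frames whose front level exceeds the rear level by min_ratio."""
    n = min(len(front), len(rear))
    flags = []
    for start in range(0, n - frame_len + 1, frame_len):
        f = rms(front[start:start + frame_len])
        r = rms(rear[start:start + frame_len])
        # Noise is assumed to reach both microphones at a similar level,
        # while a talker in front of the wearer is louder at the front.
        flags.append(f > min_ratio * r)
    return flags


def keep_voice(front: Sequence[float], rear: Sequence[float],
               frame_len: int = 4, min_ratio: float = 1.5) -> List[float]:
    """Zero out the front-microphone frames that were not flagged as voice."""
    out: List[float] = []
    flags = voice_frames(front, rear, frame_len, min_ratio)
    for i, is_voice in enumerate(flags):
        frame = list(front[i * frame_len:(i + 1) * frame_len])
        out.extend(frame if is_voice else [0.0] * frame_len)
    return out
```

Frames in which both microphones report a similar level are treated as noise and silenced; frames that are clearly louder at the front are passed through unchanged.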

The noise processing unit 135 analyzes the noise waveform contained in the voice signal and generates an interference wave that cancels the analyzed waveform. The voice signal separated by the audio processing unit 134 may still contain a noise signal. In this case, analyzing the noise waveform and generating an interference wave opposite in phase to it allows the noise to be actively removed. The noise processing unit 135 may further receive, as feedback through the microphone 111, the composite wave of the noise signal and the interference wave, and then correct the interference wave so that the composite wave containing the remaining noise is canceled in turn.
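The feedback correction can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the noise waveform repeats unchanged from one correction round to the next and that the initial estimate of it is imperfect; the function name `cancel_noise` and the parameters `step` and `rounds` are hypothetical and do not come from the description.

```python
from typing import List


def cancel_noise(noise: List[float], estimate: List[float],
                 step: float = 0.5, rounds: int = 10) -> List[float]:
    """Return the residual left after iteratively corrected anti-phase cancellation."""
    # Initial interference wave: the estimated noise waveform, inverted.
    interference = [-e for e in estimate]
    # Composite wave of noise and interference, as picked up again by the microphone.
    residual = [n + i for n, i in zip(noise, interference)]
    for _ in range(rounds):
        # Correct the interference wave by a fraction of what remains, so the
        # composite wave moves toward zero on the next pass.
        interference = [i - step * r for i, r in zip(interference, residual)]
        residual = [n + i for n, i in zip(noise, interference)]
    return residual
```

Under these assumptions each round multiplies the residual by `1 - step`, so with `step = 0.5` the remaining noise halves on every pass.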

The communication unit 136 communicates with the user terminal 200 and the interpretation and translation server 300. The communication unit 136 includes communication modules corresponding to the various communication networks of the network 400, for example a Bluetooth module, a WiFi module, an Ethernet module, a USB module, and a cellular wireless communication module. In particular, the communication unit 136 may include a USB module for wired communication, and a Bluetooth module, a Zigbee module, an NFC module, and a WiFi module for short-range communication.

In addition, the PCB 130 includes a storage device that stores a voice recognition DB, a translation DB, voice data, and the like. The storage device stores, among other things, a client program for linking the user terminal 200 with the interpretation and translation server 300. The storage device includes volatile RAM, nonvolatile ROM, and flash memory, and stores various digital files according to their function. In particular, by storing a TTS engine, the storage device allows the speech synthesis function of the interpretation and translation server 300 to be performed directly on the smart clothing 100.

There is no guarantee that one party will finish speaking before the other begins. Utterances may therefore overlap, and voices in different languages that are input at the same time need to be distinguished. To address this, the control unit 131 determines timbre characteristics from the sound spectrum of each speaker's voice and uses the determined timbre characteristics to filter the simultaneously spoken voices of different languages. The speakers of the different-language voices are thereby distinguished, and the voices can be separated from one another by the filtering.

Furthermore, the control unit 131 identifies which language each of the simultaneously spoken voices belongs to by using scores assigned to the translation results of sample voices.

Specifically, when English and Korean voices are input together, the voice signals are first filtered according to the timbre characteristics of the English speaker and the Korean speaker. Translation is then attempted in both English and Korean on one voice signal, and in both Korean and English on the other; each attempt is converted into a score, and the translation with the highest score is adopted, which determines the language of each voice signal.
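The score-based selection can be sketched as below. The description does not say how a translation attempt is scored, so the scorer is passed in as a function; `pick_language`, `label_signals`, `toy_score`, and the score table are hypothetical stand-ins used only to make the example runnable.

```python
from typing import Callable, Dict, Sequence

# A scorer takes (signal id, candidate language) and returns a quality score.
Scorer = Callable[[str, str], float]


def pick_language(signal_id: str, candidates: Sequence[str], score: Scorer) -> str:
    """Return the candidate language whose translation attempt scores highest."""
    return max(candidates, key=lambda lang: score(signal_id, lang))


def label_signals(signal_ids: Sequence[str], candidates: Sequence[str],
                  score: Scorer) -> Dict[str, str]:
    """Assign a language to each separated voice signal."""
    return {sid: pick_language(sid, candidates, score) for sid in signal_ids}


# Stand-in scores; a real system would derive them from actual translation attempts.
TOY_SCORES = {
    ("voice_a", "en"): 0.91, ("voice_a", "ko"): 0.12,
    ("voice_b", "en"): 0.08, ("voice_b", "ko"): 0.87,
}


def toy_score(signal_id: str, lang: str) -> float:
    """Look up the made-up score for one translation attempt."""
    return TOY_SCORES[(signal_id, lang)]
```

With the stand-in table, `label_signals(["voice_a", "voice_b"], ["en", "ko"], toy_score)` labels the first signal as English and the second as Korean.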

According to an embodiment of the present invention, the languages to be handled by the smart clothing 100 or the user terminal 200 may be set automatically using the client program. That is, the control unit 131 may automatically set the source language to Korean by referring to the language configured on the user terminal 200.

Additionally, when another user terminal 200 configured for a different language is within a certain distance, the source language and the target language may be set automatically by referring to the language setting of that other user terminal 200.

In addition, the control unit 131, through the interpretation and translation unit 133, scores the completeness of translation using samples of the spoken voice and automatically sets the language with the highest score as the target language. For example, when the other party's spoken voice is in a foreign language other than Korean, the spoken voice is translated as English, Japanese, Chinese, and so on, the translated results are scored, and the language with the highest score is set as the target language paired with Korean, the source language of the user terminal 200.

Hereinafter, an interpretation and translation method (S100) using smart clothing according to an embodiment of the present invention will be described.

FIG. 5 is an exemplary diagram of an interpretation and translation method using smart clothing according to an embodiment of the present invention.

Referring to FIG. 5, the interpretation and translation method (S100) using smart clothing according to an embodiment of the present invention includes steps S110 to S140.

First, the spoken voice is collected through the input unit 110 of the smart clothing 100 and converted into a voice signal (S110). The input voice is converted into a voice signal by the microphone 111 of the input unit 110.

Next, the voice signal is preprocessed using at least one of the audio processing unit 134 and the noise processing unit 135 (S120).

The step of preprocessing the voice signal includes at least one of the following steps.

The voice signal is recognized using the difference in voice volume according to microphone position, and is filtered (S121).

The waveform of the noise signal contained in the voice signal is analyzed, an interference wave that cancels the analyzed waveform is generated, and the composite wave of the noise signal and the interference wave is received as feedback so that the interference wave is corrected to cancel the composite wave (S122).

FIG. 6 is an exemplary diagram of the voice signal preprocessing step according to an embodiment of the present invention.

Referring to FIG. 6, a voice containing noise that is input to the microphone 111 is converted by the microphone 111 into a voice signal and a noise signal. The audio processing unit 134 compares the outputs of the plurality of microphones at different positions on the smart clothing 100, for example the output of the front microphone with the output of the rear microphone, to recognize the voice signal as distinct from the noise and filter it.

The filtered voice signal still contains noise. The noise processing unit 135 generates a primary interference wave capable of primary destructive interference with the noise signal contained in the voice signal. It then separates the noise still present in the output voice signal and corrects the primary interference wave by adding to it a secondary interference wave for secondary destructive interference with the separated noise.

Next, translation and interpretation are performed on the voice signal following a voice recognition process (S130).

Here, translation and interpretation may be performed by the smart clothing 100 alone, or through the user terminal 200 and the interpretation and translation server 300 communicating by wire or wirelessly.

Next, the translation and interpretation results are output by the smart clothing in at least one form of text and synthesized voice (S140). Here, the smart clothing 100 may use the sensing information of the distance sensor 140 to control the volume of the voice and the size of the text output through the output unit 120.

The interpretation and translation method (S100) using smart clothing according to the embodiment described with reference to the drawings may also be implemented in the form of a recording medium containing a computer-executable instruction set, such as a program module executed by a computer. A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and nonvolatile media as well as removable and non-removable media. A computer-readable medium may also include both computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave, or another transmission mechanism, and include any information delivery media.

As described above, according to an embodiment of the present invention, interpretation and translation results can be output as audio and video using smart clothing.

In addition, using smart clothing, interpretation and translation results can be output such that the volume of the voice and the size of the text are adjusted automatically according to the distance between talkers or between a speaker and an audience.

The foregoing description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can be easily modified into other specific forms without changing the technical idea or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in combined form.

The scope of the present invention is indicated by the claims set out below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be interpreted as falling within the scope of the present invention.

100: smart clothing 110: input unit
111: microphone 112: keyboard
120: output unit 121: display device
122: speaker 130: PCB
131: control unit 132: power supply unit
133: interpretation and translation unit 134: audio processing unit
135: noise processing unit 140: distance sensor
150: charging unit 200: user terminal
300: interpretation and translation server 400: network
500: computing device

Claims

In the interpretation and translation system using smart clothing that performs translation and interpretation of spoken voice through connection and communication among smart clothing, a user terminal, and an interpretation and translation server,
The smart clothing,
An input unit for collecting spoken voice and converting it into a voice signal;
An output unit for outputting translated content of the voice content recognized using the voice signal, in at least one form of text and synthesized voice;
A distance sensor for recognizing a distance from a user to a talker or an audience;
A control unit for controlling the volume of a voice and the size of a text output through the output unit by using sensing information of the distance sensor;
A power supply unit for supplying power to the input unit and the output unit; And
A charging unit for charging the user terminal using the power supply unit; Including,
The interpretation and translation server,
A voice recognition module and a voice recognition DB for recognizing voices collected from the input unit;
A translation module and a translation DB for analyzing the talker's speech uttered in the source language, as recognized by the voice recognition module, and translating it into the target language; Including,
The voice recognition DB is a DB constructed by learning from a plurality of speech examples using a deep learning algorithm and reducing or expanding the recognition range according to the frequency of speech content,
The translation DB is a DB constructed by learning a plurality of translation examples using a deep learning algorithm, expanding colloquial expressions and reducing written expressions according to the frequency of the translation examples.
Interpretation and translation system using smart clothing.

delete

The system according to claim 1,
The smart clothing,
Includes a plurality of microphones of the input unit across the front and rear,
An interpretation and translation system using smart clothing, further comprising an audio processing unit for recognizing and filtering a voice signal by using a difference in voice volume according to microphone location.

The system according to claim 1,
The smart clothing,
An interpretation and translation system using smart clothing, further comprising a noise processing unit that analyzes the waveform of the noise signal included in the voice signal, generates an interference wave that cancels the analyzed waveform, and receives feedback of the composite wave of the noise signal and the interference wave to correct the interference wave so that the composite wave is canceled.

delete

The system according to claim 1,
The output unit includes a display device,
The display device,
Is located on at least one of the front of the chest and abdomen and the rear of the back of the smart clothing so that people other than the user can see it,
An interpretation and translation system using smart clothing, characterized in that the display device is also located on the arm of the smart clothing so that the user can see it.

delete