KR20200049404A - System and Method for Providing Simultaneous Interpretation Service for Disabled Person

Info

Publication number: KR20200049404A
Application number: KR1020180132634A
Authority: KR
Inventors: 강병진
Original assignee: 강병진
Priority date: 2018-10-31
Filing date: 2018-10-31
Publication date: 2020-05-08
Also published as: KR102299571B1

Abstract

The present invention provides a system and method for providing a simultaneous interpretation service for persons with disabilities, in which the content a speaker wishes to convey is output as voice, a sign language image, and text so that it can be delivered to a person with a disability, and in which the text is provided alongside to prevent misunderstandings caused by errors in the sign language image. To this end, the system comprises: an interpreter terminal and a listener terminal that are operated by an interpreter and a listener, respectively, and that connect to a simultaneous interpretation server by executing an installed application through the simple act of selecting the simultaneous interpretation application shown on a display window (screen); and the simultaneous interpretation server, which collects the speaker's voice or sign language image for simultaneous interpretation, converts it into digital data, transmits the digital data to the interpreter terminal, collects an interpreted voice or interpreted sign language image from the interpreter terminal, converts it into corresponding second text data, and transmits the interpreted voice, the interpreted sign language image, and the second text data to the listener terminal.

Description

System and Method for Providing Simultaneous Interpretation Service for Disabled Persons

The present invention relates to a system and method for providing a simultaneous interpretation service for persons with disabilities, and in particular to a system and method that link at least one terminal of a person with a disability and at least one interpreter terminal to provide a simultaneous interpretation service when interpretation is needed in an emergency.

As international exchange has grown in recent years, calls and conversations with foreigners who speak other languages have become more frequent, and a means of interpretation for smooth communication with them is accordingly required.

As a two-way means of interpretation that allows communication even with remote foreigners who use a different language, Korean Laid-open Patent Publication No. 2002-54192 (title of invention: Automatic Interpretation System and Method for Telephone Guidance for Foreigners) discloses an automatic telephone-guidance interpretation system. When a foreign user asks a question in his or her native language, the question is automatically interpreted and delivered to a domestic operator; when the domestic operator answers in his or her native language, the answer is automatically interpreted and delivered to the foreign user.

However, hearing-impaired persons cannot hear the voice of the interpreter performing the simultaneous interpretation, and therefore have difficulty using such means of interpretation.

In particular, caption providing systems are currently used as a means of interpretation for persons with disabilities. They are used on television, at large conferences, and in the captioned telephone services (Telecommunications Relay Service; TRS) provided to persons with disabilities in developed countries such as the United States.

However, caption systems provided on television or at large conferences are not intended for everyday conversation between a person with a disability and another party, and captioned calls likewise serve only as an aid for telephone calls with a distant party. In everyday life, persons with disabilities therefore most often rely on sign language or writing to converse with others.

However, sign language is hard to use as a general means of communication because a counterpart with no knowledge of it cannot understand it, and writing is inconvenient because the counterpart must also communicate in writing. Both are obstacles to an environment in which persons with disabilities live alongside others.

Korean Registered Patent Publication No. 10-1454745 (registration date 2014.10.20.)
Korean Registered Patent Publication No. 10-0911717 (registration date 2009.08.04.)

Accordingly, the present invention has been devised to solve the above problems, and an object thereof is to provide a system and method for providing a simultaneous interpretation service for persons with disabilities that link at least one terminal of a person with a disability and at least one interpreter terminal to provide a simultaneous interpretation service when interpretation is needed in an emergency.

Another object of the present invention is to provide a system and method for providing a simultaneous interpretation service that make the content a speaker (a person with a disability) wishes to convey available as voice, a sign language image, and text, so that the content can be delivered to another person with a disability, and that provide the text alongside to prevent misunderstandings caused by errors in the sign language image.

A further object of the present invention is to provide a system and method for providing a simultaneous interpretation service that allow an interpreter to listen to the speaker's voice and provide simultaneous interpretation by installing an application on the interpreter's own smartphone, thereby removing the temporal and spatial constraints on the interpreter.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the above objects, a system for providing a simultaneous interpretation service for persons with disabilities according to the present invention comprises: an interpreter terminal and a listener terminal that are operated by an interpreter and a listener, respectively, and that connect to a simultaneous interpretation server by executing an installed application through the simple act of selecting the simultaneous interpretation application shown on a display window (screen); and the simultaneous interpretation server, which collects a speaker voice or speaker sign language image for simultaneous interpretation, converts it into digital data, and delivers it to the interpreter terminal, and which collects an interpreted voice or interpreted sign language image from the interpreter terminal, converts it into second text data corresponding to the interpreted voice or interpreted sign language image, and delivers the interpreted voice, the interpreted sign language image, and the second text data to the listener terminal.

Preferably, the simultaneous interpretation server converts the collected speaker voice or speaker sign language image into first text data corresponding to the speaker voice or speaker sign language image, and delivers the speaker voice, the speaker sign language image, and the first text data to the listener terminal.

Preferably, the simultaneous interpretation server comprises: a speaker voice collection unit that collects the speaker voice; an audio processing unit that digitizes, encodes, and compresses the speaker voice collected by the speaker voice collection unit; a transmission unit that transmits the speaker voice compressed by the audio processing unit to at least one of the interpreter terminal and the listener terminal, transmits the interpreted voice delivered from the interpreter terminal to the listener terminal, and transmits the speaker sign language image or interpreted sign language image collected by an image collection unit to the listener terminal; an interpretation voice collection unit that collects, from the interpreter terminal, the interpreted voice corresponding to the transmitted speaker voice; a text conversion unit that converts the collected interpreted voice into second text data, converts the speaker voice collected by the speaker voice collection unit into first text data, and converts the interpreted sign language image or speaker sign language image collected by the image collection unit into second text data using a sentence derived from the sign language image; the image collection unit, which collects the speaker sign language image and the interpreted sign language image transmitted from the interpreter terminal; an image segmentation unit that divides the sign language image collected by the image collection unit into time-unit images according to the changing hand movements and non-manual signals (facial expressions and gestures) of the signer (speaker or interpreter); and a character generation unit that arranges the images divided by the image segmentation unit in chronological order, detects matching characters, including vowels, consonants, numbers, and special characters, by comparing the hand movements and non-manual signals with characters stored in advance in a storage unit, and combines the detected characters in chronological order to generate a sentence.
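The image segmentation and character generation units can be illustrated with a minimal Python sketch. It assumes that each time-unit image has already been reduced to a pair of labels (hand shape, non-manual signal); the label names, the `STORED_CHARACTERS` table, and the `Segment` type are hypothetical stand-ins for the characters "stored in advance in a storage unit", not details taken from the specification.

```python
from dataclasses import dataclass

# Hypothetical lookup table standing in for the storage unit.
# Keys are (hand shape, non-manual signal) labels; values are characters.
STORED_CHARACTERS = {
    ("fist", "neutral"): "A",
    ("flat", "neutral"): "B",
    ("curved", "neutral"): "C",
    ("flat", "raised_brows"): "?",
}


@dataclass(frozen=True)
class Segment:
    """One time-unit image, already reduced to labels (an assumption)."""
    start_ms: int
    hand_shape: str
    non_manual: str


def generate_sentence(segments):
    """Order segments by time, match each to a stored character, and join."""
    ordered = sorted(segments, key=lambda s: s.start_ms)
    chars = []
    for seg in ordered:
        ch = STORED_CHARACTERS.get((seg.hand_shape, seg.non_manual))
        if ch is not None:  # segments with no stored match are skipped
            chars.append(ch)
    return "".join(chars)
```

Skipping unmatched segments is one possible policy; a real system might instead flag them for review, since the specification does not say how unmatched images are handled.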

Preferably, the transmission unit transmits to the listener terminal a text file in which at least one of the speaker voice, the interpreted voice, the speaker sign language image, and the interpreted sign language image has been converted into text by the text conversion unit.

Preferably, the interpretation voice collection unit sorts the collected interpreted voice or interpreted sign language image by a first unique code included in the transmitted speaker voice or speaker sign language image and, when there are three or more interpreter terminals, sorts the collected interpreted voice or interpreted sign language image by a second unique code included in the interpreter terminal together with the first unique code.
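The two-level sorting by unique code can be sketched as a grouping step. The following is a minimal Python illustration under assumptions: the `Interpretation` record, its field names, and the string codes are hypothetical, and the payload stands in for an interpreted voice or interpreted sign language image.

```python
from collections import defaultdict
from typing import NamedTuple


class Interpretation(NamedTuple):
    speaker_code: str      # first unique code, carried by the speaker input
    interpreter_code: str  # second unique code, identifying the terminal
    payload: str           # stand-in for interpreted voice or sign video


def sort_interpretations(items, interpreter_count):
    """Group collected interpretations by unique code.

    With fewer than three interpreter terminals only the first code is
    used; with three or more, the second code is used together with it.
    """
    groups = defaultdict(list)
    for item in items:
        if interpreter_count >= 3:
            key = (item.speaker_code, item.interpreter_code)
        else:
            key = (item.speaker_code,)
        groups[key].append(item.payload)
    return dict(groups)
```

Keying on both codes keeps each interpreter's output for a given speaker input separate, which is what a later comparison between interpreters would need.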

Preferably, the simultaneous interpretation server further comprises a matching processing unit that compares the matching degrees of at least three interpreted voices or interpreted sign language images collected by the interpretation voice collection unit with one another, and an interpretation voice selection unit that selects the one interpreted voice or interpreted sign language image with the highest matching degree as determined by the matching processing unit.

Preferably, the matching processing unit compares the interpreted voices or interpreted sign language images with one another syllable by syllable, and compares the matching degrees based on the number of times differing syllables occur.
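A syllable-wise comparison of this kind might be sketched as follows in Python. Several points are assumptions rather than details from the specification: the interpretations are taken to be already transcribed to text, each character is treated as one syllable (which holds for Hangul syllable blocks), syllables are compared position by position, and "highest matching degree" is read as the lowest total number of differing syllables against all other candidates.

```python
from itertools import zip_longest


def mismatch_count(a, b):
    """Count positions where two transcripts differ, ignoring spaces.

    A length difference counts as one mismatch per extra syllable.
    """
    sa = a.replace(" ", "")
    sb = b.replace(" ", "")
    return sum(1 for x, y in zip_longest(sa, sb) if x != y)


def select_best(transcripts):
    """Return (index of best transcript, total mismatches per transcript)."""
    if len(transcripts) < 3:
        raise ValueError("comparison is defined for three or more candidates")
    totals = []
    for i, t in enumerate(transcripts):
        total = sum(
            mismatch_count(t, other)
            for j, other in enumerate(transcripts)
            if j != i
        )
        totals.append(total)
    # Ties are resolved in favor of the earliest candidate.
    best = min(range(len(transcripts)), key=totals.__getitem__)
    return best, totals
```

Positional comparison is sensitive to a single inserted or dropped syllable; an edit-distance measure would be more robust, but it goes beyond the simple count the text describes.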

To achieve the above objects, a method for providing a simultaneous interpretation service for persons with disabilities according to the present invention comprises: (A) collecting, through a speaker voice collection unit, a speaker voice or speaker sign language image delivered in any of multiple languages; (B) digitizing, encoding, and compressing the collected speaker voice or speaker sign language image through an audio/video processing unit, and transmitting the compressed speaker voice or speaker sign language image through a transmission unit to at least one of an interpreter terminal and a listener terminal; (C) converting the collected speaker voice or speaker sign language image, through a text conversion unit, into first text data corresponding to the speaker voice or speaker sign language image, and delivering the first text data to the listener terminal together with the speaker voice or speaker sign language image; (D) collecting, through an interpretation voice collection unit, an interpreted voice or interpreted sign language image transmitted from the interpreter terminal; (E) transmitting the collected interpreted voice or interpreted sign language image to the listener terminal through the transmission unit; (F) converting the collected interpreted voice or interpreted sign language image, through the text conversion unit, into second text data corresponding to the interpreted voice or interpreted sign language image, and delivering the second text data to the listener terminal together with the interpreted voice or interpreted sign language image; and (G) delivering to the listener terminal, through the transmission unit, the collected interpreted voice and its second text data, or the interpreted sign language image and its second text data.
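The order of steps (A) to (G) can be traced with a small Python sketch. It is only an outline under stated assumptions: `interpret`, `to_text`, and `encode` are hypothetical callables standing in for the interpreter terminal, the text conversion unit, and the audio/video processing unit, and the returned list models what the listener terminal receives.

```python
def run_session(speaker_input, interpret, to_text, encode=lambda x: x):
    """Trace one pass through steps (A)-(G) and return the listener inbox."""
    listener_inbox = []

    collected = speaker_input            # (A) collect speaker voice or sign video
    packet = encode(collected)           # (B) digitize, encode, compress, transmit
    first_text = to_text(collected)      # (C) first text data from speaker input
    listener_inbox.append(("speaker", packet, first_text))

    interpreted = interpret(packet)      # (D) collect the interpreter's output
    second_text = to_text(interpreted)   # (F) second text data from interpretation
    # (E) and (G): the interpretation and its text reach the listener together
    listener_inbox.append(("interpreter", interpreted, second_text))
    return listener_inbox
```

For example, `run_session("hello", str.upper, lambda s: f"[{s}]")` yields one speaker entry followed by one interpreter entry, each paired with its text.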

Preferably, in step (D), the collected interpreted voice or interpreted sign language image is sorted by a first unique code included in the speaker voice or speaker sign language image transmitted to the interpreter terminal and, when there are three or more interpreter terminals, is sorted by a second unique code included in the interpreter terminal together with the first unique code.

Preferably, when the interpreted voice or interpreted sign language image is transmitted from three or more interpreter terminals, step (E) further comprises: comparing, through a matching processing unit, the matching degrees of the at least three collected interpreted voices or interpreted sign language images with one another; and selecting, through an interpretation voice selection unit, the one interpreted voice or interpreted sign language image with the highest matching degree and transmitting the selected interpreted voice or interpreted sign language image to the listener terminal through the transmission unit.

Preferably, the matching degree is determined by comparing the interpreted voices or interpreted sign language images with one another syllable by syllable and counting the number of times differing syllables occur.

Preferably, the conversion into the first text data and the second text data comprises: dividing the collected sign language image, through an image segmentation unit, into time-unit images according to the changing hand movements and non-manual signals (facial expressions and gestures) of the signer (speaker or interpreter); and, through a sentence generation unit, arranging the divided images in chronological order, detecting matching characters, including vowels, consonants, numbers, and special characters, by comparing the hand movements and non-manual signals with characters stored in advance in a storage unit, and combining the detected characters in chronological order to generate a sentence.

The system and method for providing a simultaneous interpretation service for persons with disabilities according to the present invention, as described above, have the following effects.

First, instead of the dedicated simultaneous interpretation receivers handed out in conventional settings, listeners can hear the interpreted voice by installing an application on their own smartphones, and can freely select the listening language.

Second, a listener with a disability can view the content the speaker wishes to convey as voice, a sign language image, and text, and can therefore easily check that content; providing the text alongside also prevents misunderstandings caused by errors in the sign language image.

Third, an interpreter can listen to the speaker's voice and provide simultaneous interpretation by installing an application on the interpreter's own smartphone, so the interpreter is free of temporal and spatial constraints. This makes it easy to find an interpreter who can provide sign language for persons with disabilities.

Fourth, providing a low-cost, small-to-medium-scale system and method in place of existing expensive simultaneous interpretation systems can reduce the cost of operating a simultaneous interpretation system.

FIG. 1 is a block diagram showing the configuration of a system for providing a simultaneous interpretation service for persons with disabilities according to an embodiment of the present invention.
FIG. 2 shows a first embodiment of the detailed configuration of the simultaneous interpretation server of FIG. 1.
FIG. 3 shows a second embodiment of the detailed configuration of the simultaneous interpretation server of FIG. 1.
FIG. 4 is a flowchart illustrating a method for providing a simultaneous interpretation service for persons with disabilities according to an embodiment of the present invention.

Other objects, features, and advantages of the present invention will become apparent from the detailed description of the embodiments with reference to the accompanying drawings.

A preferred embodiment of the system and method for providing a simultaneous interpretation service for persons with disabilities according to the present invention is described below with reference to the accompanying drawings. The present invention is not limited to the embodiments disclosed below and may be implemented in various different forms; the embodiments are provided only to make the disclosure complete and to fully inform those of ordinary skill in the art of the scope of the invention. The embodiments described in this specification and the configurations shown in the drawings are therefore merely the most preferred embodiment of the present invention and do not represent all of its technical ideas, and it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing.

FIG. 1 is a block diagram showing the configuration of a system for providing a simultaneous interpretation service for persons with disabilities according to an embodiment of the present invention.

As shown in FIG. 1, the system for providing a simultaneous interpretation service for persons with disabilities of the present invention includes an interpreter terminal 100, a simultaneous interpretation server 200, and a listener terminal 300.

A simultaneous interpretation application for receiving simultaneous interpretation information and the simultaneous interpretation service from the simultaneous interpretation server 200 over a communication network is installed on the interpreter terminal 100 and the listener terminal 300. The application may be provided by the simultaneous interpretation server 200, or may be provided by and installed from a smartphone application market such as the Apple App Store or Google Android Market.

The interpreter terminal 100 and the listener terminal 300 are operated by the interpreter and the listener, respectively, and connect to the simultaneous interpretation server 200 by executing the installed application through the simple act of selecting (by touch or button) the simultaneous interpretation application shown on the display window (screen).

The interpreter terminal 100 and the listener terminal 300 may be, but are not limited to, a smartphone, tablet PC, mobile phone, personal digital assistant (PDA), or other mobile computing device capable of voice recognition and transmission. The interpreter terminal 100 and the listener terminal 300 may also be wearable devices having voice recognition and transmission, communication, and data processing functions.

At least one of the speaker and the listener may be a person with a disability, and when the listener replies to the speaker in response to the interpretation, the roles of speaker and listener are swapped. For this reason, note that the drawings show only the listener terminal 300 and do not show a separate speaker terminal.

As one example, when the speaker is hearing-impaired, the speaker may communicate in sign language. In this case, the interpreter receives the sign language image, interprets its content (the sign language), and delivers the interpretation to the listener terminal 300 through the simultaneous interpretation server 200. The listener thus receives, on the listener terminal 300, the text and the interpreted voice corresponding to the sign language content. The simultaneous interpretation server 200 may also provide the listener terminal 300 with text derived from the speaker sign language image.

As another example, when the listener is hearing-impaired, the interpreter receives the speaker voice, interprets it, and delivers the interpretation as a sign language image to the listener terminal 300 through the simultaneous interpretation server 200. The listener thus receives the speaker voice, delivered in any of multiple languages, as an interpreted sign language image. The simultaneous interpretation server 200 may also provide the listener terminal 300 with text derived from the interpreted sign language image.

To this end, the simultaneous interpretation server 200 collects the speaker voice or sign language image for simultaneous interpretation, converts it into digital data, and delivers it to the interpreter terminal 100. It then collects the interpreted voice or sign language image from the interpreter terminal 100, converts it into second text data corresponding to the interpreted voice or sign language image, and delivers the interpreted voice, the sign language image, and the second text data to the listener terminal 300. The listener selects and checks at least one of the interpreted voice, the sign language image, and the second text data delivered to the listener terminal 300.

The simultaneous interpretation server 200 may also convert the collected speaker voice or sign language image into first text data and deliver the first text data to the listener terminal 300.

In addition, when three or more interpreted voices are collected from interpreter terminals 100, the simultaneous interpretation server 200 compares the interpreted voices collected from the multiple interpreter terminals 100 with one another and selects the interpreted voice with the highest matching degree. The simultaneous interpretation server 200 may then convert the selected interpreted voice into second text data and deliver the interpreted voice and the second text data to the listener terminal 300.

Meanwhile, the simultaneous interpretation server 200 may provide the simultaneous interpretation service by linking at least one interpreter terminal 100 and at least one listener terminal 300. Specifically, the simultaneous interpretation server 200 may provide a cloud instance for simultaneous interpretation to the at least one interpreter terminal 100 and the at least one listener terminal 300, thereby enabling simultaneous interpretation between the interpreter terminal 100 and the listener terminal 300.

In other words, the interpreter and the listener receive the simultaneous interpretation service through the interpreter terminal 100 and the listener terminal 300 linked by the cloud instance. The speaker's voice can thus be transmitted through the cloud instance to at least one interpreter terminal 100, and the speech interpreted at the interpreter terminal 100 can then be transmitted through the cloud instance to at least one listener terminal 300.
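The relay role of the cloud instance can be pictured as a simple fan-out. The Python sketch below is an in-memory illustration only: the `RelaySession` class, its method names, and the terminal identifiers are hypothetical, and no networking, encoding, or concurrency is modeled.

```python
class RelaySession:
    """In-memory stand-in for one cloud instance linking terminals."""

    def __init__(self):
        self.interpreters = {}  # terminal id -> received speaker chunks
        self.listeners = {}     # terminal id -> received interpretations

    def join_interpreter(self, terminal_id):
        self.interpreters[terminal_id] = []

    def join_listener(self, terminal_id):
        self.listeners[terminal_id] = []

    def publish_speaker(self, chunk):
        """Fan the speaker's voice out to every linked interpreter terminal."""
        for inbox in self.interpreters.values():
            inbox.append(chunk)

    def publish_interpretation(self, interpreter_id, chunk):
        """Fan an interpreter's output out to every linked listener terminal."""
        if interpreter_id not in self.interpreters:
            raise KeyError(f"unknown interpreter terminal: {interpreter_id}")
        for inbox in self.listeners.values():
            inbox.append((interpreter_id, chunk))
```

Tagging each forwarded chunk with the interpreter's identifier leaves room for the server to compare outputs from several interpreters before a listener sees them.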

As described above, in the present invention the interpreter terminal and the listener terminal are operated by the interpreter and the listener, respectively, and connect to the simultaneous interpretation server 200 simply by selecting the simultaneous interpretation application shown on the display window (screen), which launches the installed application.

First Embodiment

FIG. 2 shows a first embodiment detailing the configuration of the simultaneous interpretation server of FIG. 1.

As shown in FIG. 2, the simultaneous interpretation server 200 includes a speaker voice collection unit 201, an audio/video processing unit 202, a transmission unit 203, an interpreted voice collection unit 204, a matching processing unit 205, an interpreted voice selection unit 206, and a text conversion unit 207.

The speaker voice collection unit 201 collects the speaker voice, which may be delivered in any of multiple languages.

The audio/video processing unit 202 digitizes, encodes, and compresses the speaker voice collected by the speaker voice collection unit 201. Various compression formats, such as MP3, may be used for the encoding.
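The digitize, encode, and compress sequence can be illustrated with a short sketch. MP3 encoding requires an external encoder, so this example substitutes lossless `zlib` compression purely as a stand-in; the 16-bit PCM representation and all function names are likewise assumptions for illustration, not details taken from the specification.

```python
import struct
import zlib

PEAK = 32767  # largest magnitude used for signed 16-bit samples


def digitize(samples):
    """Quantize float samples (nominally in [-1.0, 1.0]) to 16-bit integers,
    clipping anything outside that range."""
    return [max(-PEAK, min(PEAK, round(s * PEAK))) for s in samples]


def encode(pcm):
    """Pack the integers as little-endian 16-bit PCM bytes."""
    return struct.pack(f"<{len(pcm)}h", *pcm)


def compress(payload):
    """Stand-in compressor (zlib, lossless); a real system might use MP3."""
    return zlib.compress(payload)


def decompress(blob):
    """Inverse of compress(), as a receiving terminal would apply it."""
    return zlib.decompress(blob)
```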

The transmission unit 203 transmits the speaker voice compressed by the audio/video processing unit 202 to the interpreter terminal 100. The transmission unit 203 also transmits the interpreted voice delivered from the interpreter terminal 100 to the listener terminal 300.

Meanwhile, the transmission unit 203 may transmit to the listener terminal 300 a text file in which at least one of the speaker voice and the interpreted voice has been converted into text by the text conversion unit 207.

The interpreted voice collection unit 204 collects, from the interpreter terminal 100, the interpreted voice corresponding to the transmitted speaker voice. The interpreted voice collection unit 204 may sort the collected interpreted voices by a first unique code included in the transmitted speaker voice. When there are three or more interpreter terminals 100, the interpreted voice collection unit 204 may additionally sort the collected interpreted voices by a second unique code assigned to each interpreter terminal 100.
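Sorting by the two unique codes can be sketched as a two-level grouping. The record type `InterpretedVoice`, its field names, and the function `group_by_codes` are hypothetical; the specification does not state how the codes are represented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InterpretedVoice:
    segment_code: str   # first unique code, carried by the speaker voice
    terminal_code: str  # second unique code, identifying the interpreter terminal
    audio: bytes


def group_by_codes(voices):
    """Group interpreted voices by first unique code, then by second unique code."""
    grouped = defaultdict(dict)
    for voice in voices:
        grouped[voice.segment_code][voice.terminal_code] = voice
    return dict(grouped)
```

With this layout, all interpretations of one speaker segment sit under a single key, so the matching step can compare them without scanning unrelated segments.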

The matching processing unit 205 compares the matching degrees of the three or more interpreted voices collected by the interpreted voice collection unit 204. That is, the interpreted voices are compared with one another syllable by syllable, and the matching degree is determined from the number of syllables that differ. No reference syllable is set in advance; instead, the syllables that the interpreted voices have in common during the comparison may serve as the reference syllables.

The interpreted voice selection unit 206 selects the one interpreted voice that the matching processing unit 205 found to have the highest matching degree.
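One possible reading of the matching and selection steps is sketched below. It assumes each interpreted voice has already been transcribed into a list of syllables (the recognition step is not shown), and it takes the most common syllable at each position as the reference syllable; both points, along with the function names, are assumptions for illustration rather than the method fixed by the specification.

```python
from collections import Counter
from itertools import zip_longest


def mismatch_counts(candidates):
    """For each candidate, count positions where its syllable differs from
    the syllable most candidates share at that position."""
    counts = [0] * len(candidates)
    # zip_longest pads shorter candidates with None, which counts as a mismatch
    # whenever the majority still has a syllable at that position.
    for column in zip_longest(*candidates):
        reference, _ = Counter(column).most_common(1)[0]
        for index, syllable in enumerate(column):
            if syllable != reference:
                counts[index] += 1
    return counts


def select_best(candidates):
    """Return the index of the candidate with the fewest differing syllables."""
    if len(candidates) < 3:
        raise ValueError("matching requires at least three interpreted voices")
    counts = mismatch_counts(candidates)
    return counts.index(min(counts))
```

For example, with two candidates that agree on every syllable and a third that differs in two places, the mismatch counts are 0, 0, and 2, and the first candidate is selected. Ties are broken in favor of the earliest candidate, which is a choice of this sketch.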

The text conversion unit 207 converts the interpreted voice selected by the interpreted voice selection unit 206 into second text data corresponding to the interpreted voice. When the interpreted voice is transmitted from only one interpreter terminal 100, that interpreted voice is converted into the second text data without involving the interpreted voice selection unit 206. The text conversion unit 207 also converts the speaker voice collected by the speaker voice collection unit 201 into first text data corresponding to the speaker voice.

Second Embodiment

FIG. 3 shows a second embodiment detailing the configuration of the simultaneous interpretation server of FIG. 1.

As shown in FIG. 3, the simultaneous interpretation server 200 includes a speaker voice collection unit 201, an audio/video processing unit 202, a transmission unit 203, an interpreted voice collection unit 204, a matching processing unit 205, an interpreted voice selection unit 206, a text conversion unit 207, an image collection unit 208, an image division unit 209, a sentence generation unit 210, and a storage unit 211.

The audio/video processing unit 202 digitizes, encodes, and compresses the speaker voice collected by the speaker voice collection unit 201 and the sign language image collected by the image collection unit 208. Various compression formats, such as MP3, may be used for the encoding.

The transmission unit 203 transmits the speaker voice or speaker sign language image compressed by the audio/video processing unit 202 to the interpreter terminal 100. The transmission unit 203 also transmits the interpreted voice or speaker sign language image delivered from the interpreter terminal 100 to the listener terminal 300. Here, the speaker sign language image is a video recording of the sign language performed by the speaker.

Meanwhile, the transmission unit 203 may transmit to the listener terminal 300 a text file in which at least one of the speaker voice and the interpreted voice has been converted into text by the text conversion unit 207. The transmission unit 203 may also transmit the speaker sign language image or the interpreted sign language image collected by the image collection unit 208 to the listener terminal 300.

The text conversion unit 207 converts the interpreted voice selected by the interpreted voice selection unit 206 into second text data corresponding to the interpreted voice. When the interpreted voice is transmitted from only one interpreter terminal 100, that interpreted voice is converted into the second text data without involving the interpreted voice selection unit 206. The text conversion unit 207 also converts the interpreted sign language image collected by the image collection unit 208 into second text data, using the sentence generated from the sign language image.

Meanwhile, the text conversion unit 207 converts the speaker voice collected by the speaker voice collection unit 201 into first text data corresponding to the speaker voice. The text conversion unit 207 also converts the speaker sign language image collected by the image collection unit 208 into first text data corresponding to the sign language image.

The image collection unit 208 collects the speaker sign language image. The image collection unit 208 also collects the interpreted sign language image transmitted from the interpreter terminal 100. Here, the speaker sign language image is a video recording of the sign language performed by the speaker, and the interpreted sign language image is a video recording of the sign language performed by the interpreter.

The image division unit 209 divides the sign language image collected by the image collection unit 208 into time-unit images according to the changing hand movements and non-manual signals (facial expressions and gestures) of the signer (the speaker or the interpreter).
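The division into time-unit images can be sketched as run-length grouping over per-frame observations. The sketch assumes that an upstream recognizer (not shown and not described in the specification) has already labeled each frame with a hand-movement label and a non-manual-signal label; the `Frame` and `Segment` types and their fields are hypothetical.

```python
from itertools import groupby
from typing import NamedTuple


class Frame(NamedTuple):
    t: float          # timestamp in seconds
    hand: str         # label of the hand movement seen in this frame
    non_manual: str   # label of the facial expression or gesture


class Segment(NamedTuple):
    start: float
    end: float
    hand: str
    non_manual: str


def split_into_segments(frames):
    """Start a new time-unit segment whenever the hand movement or the
    non-manual signal changes between consecutive frames."""
    segments = []
    for (hand, non_manual), run in groupby(frames, key=lambda f: (f.hand, f.non_manual)):
        run = list(run)
        segments.append(Segment(run[0].t, run[-1].t, hand, non_manual))
    return segments
```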

The sentence generation unit 210 arranges the images divided by the image division unit 209 in chronological order and, based on the hand movements and non-manual signals, compares them with the characters stored in advance in the storage unit 211 to detect the matching characters, including vowels, consonants, numbers, and special characters. The sentence generation unit 210 then combines the detected characters in chronological order to generate a sentence.
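The lookup-and-combine step can be sketched as follows. The stored characters are modeled as a dictionary keyed by a (hand movement, non-manual signal) pair, and each divided image as a `(start_time, hand, non_manual)` tuple; these representations, the function name, and the choice to skip unmatched images are assumptions for illustration.

```python
def generate_sentence(segments, stored_characters):
    """Order segments by time, look each one up in the stored character
    table, and join the detected characters into a sentence."""
    ordered = sorted(segments, key=lambda segment: segment[0])
    detected = []
    for _start, hand, non_manual in ordered:
        character = stored_characters.get((hand, non_manual))
        if character is not None:  # segments with no stored match are skipped
            detected.append(character)
    return "".join(detected)
```

A production system would also need to compose detected vowels and consonants into full syllable blocks for Korean text; that step is outside this sketch.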

The operation of the simultaneous interpretation service providing system according to the present invention, configured as described above, is explained in detail below with reference to the accompanying drawings. Reference numerals identical to those in FIGS. 1 to 3 denote the same members performing the same functions.

FIG. 4 is a flowchart illustrating a method for providing a simultaneous interpretation service according to an embodiment of the present invention.

Referring to FIG. 4, a speaker voice or speaker sign language image, which may be delivered in any of multiple languages, is first collected through the speaker voice collection unit 201 (S10).

The collected speaker voice or speaker sign language image is then digitized, encoded, and compressed by the audio/video processing unit 202, and the compressed speaker voice or speaker sign language image is transmitted through the transmission unit 203 to the interpreter terminal 100 and the listener terminal 300 (S20). Various compression formats, such as MP3, may be used for the encoding. There may be one or more interpreter terminals 100.

Meanwhile, the collected speaker voice or speaker sign language image may be converted by the text conversion unit 207 into first text data corresponding to the speaker voice, and the first text data may be delivered to the listener terminal 300 together with the speaker voice (S30). In one embodiment, the first text data may also be delivered to the interpreter terminal 100.

When the speaker voice or speaker sign language image is input to the interpreter's terminal 100, the interpreter listens to the speaker voice or watches the speaker sign language image and transmits the resulting interpreted voice or interpreted sign language image to the simultaneous interpretation server 200. The interpreter selects the spoken language for the interpretation and transmits the interpreted voice or interpreted sign language image in the selected language to the simultaneous interpretation server 200. At this time, the simultaneous interpretation server 200 may transmit the first text data, produced by the text conversion unit 207 from the speaker voice or speaker sign language image, to the interpreter terminal 100 together with the speaker voice or speaker sign language image. Because the interpreter can check the first text data as well, errors that arise from interpreting by listening or watching alone are reduced, and more accurate interpretation becomes possible.

The simultaneous interpretation server 200 then collects, through the interpreted voice collection unit 204, the interpreted voice or interpreted sign language image transmitted from the interpreter terminal 100 (S40). The collected interpreted voices or interpreted sign language images may be sorted by the first unique code included in the speaker voice or speaker sign language image that was transmitted to the interpreter terminal 100. When there are three or more interpreter terminals 100, the interpreted voice collection unit 204 additionally sorts the collected interpreted voices by the second unique code assigned to each interpreter terminal 100.

Next, the collected interpreted voice or interpreted sign language image is transmitted to the listener terminal 300 through the transmission unit 203 (S80). The interpreted voice transmitted to the listener terminal 300 may be digitized, encoded, and compressed by the audio/video processing unit 202. Various compression formats, such as MP3, may be used for the encoding.

Meanwhile, the collected interpreted voice or interpreted sign language image may be converted by the text conversion unit 207 into second text data corresponding to the interpreted voice or interpreted sign language image, and the second text data may be delivered to the listener terminal 300 together with the interpreted voice or interpreted sign language image.

When the interpreted voice is transmitted from three or more interpreter terminals 100 (S50), the matching processing unit 205 compares the matching degrees of the three or more collected interpreted voices (S60). That is, the interpreted voices are compared with one another syllable by syllable, and the matching degree is determined from the number of syllables that differ. No reference syllable is set in advance; instead, the syllables that the interpreted voices have in common during the comparison may serve as the reference syllables.

The interpreted voice selection unit 206 then selects the one interpreted voice with the highest matching degree (S70), and the selected interpreted voice or the interpreted sign language image is transmitted to the listener terminal 300 through the transmission unit 203 (S80).

Meanwhile, the selected (collected) interpreted voice or interpreted sign language image may be converted by the text conversion unit 207 into second text data corresponding to the interpreted voice or interpreted sign language image, and the second text data may be delivered to the listener terminal 300 together with the interpreted voice or interpreted sign language image (S90).
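The branch between steps S50 and S90 can be summarized in a short control-flow sketch. The selection, text conversion, and transmission steps are passed in as callables so the sketch stays independent of how they are implemented; the function name and its parameters are hypothetical, and forwarding the first voice when fewer than three are collected is an assumption, since the text describes only the single-terminal and three-or-more cases.

```python
def deliver_interpretation(voices, select, to_text, send):
    """Pick the interpreted voice to forward, then send it with its text.

    voices  -- interpreted voices collected in S40 (non-empty list)
    select  -- callable returning the index of the best-matching voice
    to_text -- callable converting a voice into second text data
    send    -- callable(kind, payload) delivering to the listener terminal
    """
    if len(voices) >= 3:                 # S50: three or more interpreter terminals
        chosen = voices[select(voices)]  # S60 to S70: compare and select
    else:
        chosen = voices[0]               # selection step is bypassed
    send("voice", chosen)                # S80
    send("text", to_text(chosen))        # S90
    return chosen
```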

Here, the image division unit 209 divides the collected interpreted sign language image into time-unit images according to the changing hand movements and non-manual signals (facial expressions and gestures) of the signer (the speaker or the interpreter).

The sentence generation unit 210 then arranges the divided images in chronological order and, based on the hand movements and non-manual signals, compares them with the characters stored in advance in the storage unit 211 to detect the matching characters, including vowels, consonants, numbers, and special characters. The detected characters are combined in chronological order to generate a sentence, which is converted into the second text data.

The listener then listens to the interpreted voice provided by the simultaneous interpretation server 200, or watches the interpreted sign language image, through the listener terminal 300. At the listener's selection, the listener terminal 300 can present, together with the interpreted voice or interpreted sign language image, at least one of the following at the same time: the speaker voice, the first text data corresponding to the speaker voice, the speaker sign language image, the first text data corresponding to the speaker sign language image, the second text data corresponding to the interpreted voice, and the second text data corresponding to the interpreted sign language image.
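The listener's choice of what to present alongside the interpretation can be modeled as a set of flags. The `Stream` enumeration and the `streams_to_render` function are hypothetical names introduced for this sketch; following the definitions of the text conversion unit 207, `FIRST_TEXT` stands for text derived from the speaker and `SECOND_TEXT` for text derived from the interpreter.

```python
from enum import Flag, auto


class Stream(Flag):
    SPEAKER_VOICE = auto()
    SPEAKER_SIGN = auto()
    FIRST_TEXT = auto()         # text derived from the speaker
    INTERPRETED_VOICE = auto()
    INTERPRETED_SIGN = auto()
    SECOND_TEXT = auto()        # text derived from the interpreter


def streams_to_render(primary, extras=Stream(0)):
    """Combine the mandatory interpretation stream with optional extras."""
    if primary not in (Stream.INTERPRETED_VOICE, Stream.INTERPRETED_SIGN):
        raise ValueError("primary must be an interpreted voice or sign stream")
    return primary | extras
```

A listener who is deaf might, for instance, choose the interpreted sign language image as the primary stream and add both text streams as extras.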

Accordingly, because the listener can receive what the speaker intends to convey as voice, sign language image, and text together, the listener can more easily catch the points the speaker emphasizes, and misunderstandings caused by an interpreter's mistranslation can be prevented.

Although the technical idea of the present invention has been described in detail with reference to the preferred embodiments, it should be noted that the embodiments are intended to illustrate the invention and not to limit it. Those of ordinary skill in the technical field of the present invention will understand that various embodiments are possible within the scope of its technical idea. The true scope of technical protection of the present invention should therefore be determined by the technical idea of the appended claims.

100: interpreter terminal 200: simultaneous interpretation server
201: speaker voice collection unit 202: audio/video processing unit
203: transmission unit 204: interpreted voice collection unit
205: matching processing unit 206: interpreted voice selection unit
207: text conversion unit 208: image collection unit
209: image division unit 210: language generation unit
211: storage unit 300: listener terminal

Claims

A system for providing a simultaneous interpretation service for persons with disabilities, comprising:
an interpreter terminal and a listener terminal that are operated by an interpreter and a listener, respectively, and that run a simultaneous interpretation application displayed on a display window (screen) to connect to a simultaneous interpretation server; and
the simultaneous interpretation server, which collects a speaker voice or speaker sign language image for simultaneous interpretation, converts it into digital data, delivers the digital data to the interpreter terminal, collects an interpreted voice or interpreted sign language image from the interpreter terminal, converts the interpreted voice or interpreted sign language image into corresponding second text data, and delivers the interpreted voice, the interpreted sign language image, and the second text data to the listener terminal.

The system of claim 1,
wherein the simultaneous interpretation server converts the collected speaker voice or speaker sign language image into first text data corresponding to the speaker voice or speaker sign language image, and delivers the speaker voice, the speaker sign language image, and the first text data to the listener terminal.

The system of claim 1,
wherein the simultaneous interpretation server comprises:
a speaker voice collection unit that collects the speaker voice;
an audio processing unit that digitizes, encodes, and compresses the speaker voice collected by the speaker voice collection unit;
a transmission unit that transmits the speaker voice compressed by the audio processing unit to at least one of the interpreter terminal and the listener terminal, transmits the interpreted voice delivered from the interpreter terminal to the listener terminal, and transmits the speaker sign language image or interpreted sign language image collected by an image collection unit to the listener terminal;
an interpreted voice collection unit that collects, from the interpreter terminal, the interpreted voice corresponding to the transmitted speaker voice;
a text conversion unit that converts the collected interpreted voice into second text data corresponding to the interpreted voice, converts the speaker voice collected by the speaker voice collection unit into first text data corresponding to the speaker voice, and converts the interpreted sign language image or speaker sign language image collected by the image collection unit into second text data using a sentence generated from the sign language image;
the image collection unit, which collects the speaker sign language image and the interpreted sign language image transmitted from the interpreter terminal;
an image division unit that divides the sign language image collected by the image collection unit into time-unit images according to the changing hand movements and non-manual signals (facial expressions and gestures) of the signer (the speaker or the interpreter); and
a sentence generation unit that arranges the images divided by the image division unit in chronological order, compares them, based on the hand movements and non-manual signals, with characters stored in advance in a storage unit to detect matching characters including vowels, consonants, numbers, and special characters, and combines the detected characters in chronological order to generate a sentence.

The system of claim 3,
wherein the transmission unit transmits to the listener terminal a text file in which at least one of the speaker voice, the interpreted voice, the speaker sign language image, and the interpreted sign language image has been converted into text by the text conversion unit.

The system of claim 3,
wherein the interpreted voice collection unit sorts the collected interpreted voices or interpreted sign language images by a first unique code included in the transmitted speaker voice or speaker sign language image, and,
when there are three or more interpreter terminals, additionally sorts the collected interpreted voices or interpreted sign language images by a second unique code assigned to each interpreter terminal.

The system of claim 3,
wherein the simultaneous interpretation server further comprises:
a matching processing unit that compares the matching degrees of three or more interpreted voices or interpreted sign language images collected by the interpreted voice collection unit; and
an interpreted voice selection unit that selects the one interpreted voice or interpreted sign language image found by the matching processing unit to have the highest matching degree.

The system of claim 6,
wherein the matching processing unit compares the interpreted voices or interpreted sign language images with one another syllable by syllable and determines the matching degree from the number of syllables that differ.

A method for providing a simultaneous interpretation service for persons with disabilities, the method comprising:
(A) collecting, through a speaker voice collection unit, a speaker voice or speaker sign language image delivered in any of multiple languages;
(B) digitizing, encoding, and compressing the collected speaker voice or speaker sign language image through an audio/video processing unit, and transmitting the compressed speaker voice or speaker sign language image to at least one of an interpreter terminal and a listener terminal through a transmission unit;
(C) converting, through a text conversion unit, the collected speaker voice or speaker sign language image into first text data corresponding to the speaker voice or speaker sign language image, and delivering the first text data to the listener terminal together with the speaker voice or speaker sign language image;
(D) collecting, through an interpreted voice collection unit, an interpreted voice or interpreted sign language image transmitted from the interpreter terminal;
(E) transmitting the collected interpreted voice or interpreted sign language image to the listener terminal through the transmission unit;
(F) converting, through the text conversion unit, the collected interpreted voice or interpreted sign language image into second text data corresponding to the interpreted voice or interpreted sign language image, and delivering the second text data to the listener terminal together with the interpreted voice or interpreted sign language image; and
(G) delivering the collected interpreted voice and the second text data corresponding to the interpreted voice to the listener terminal through the transmission unit.