KR20210029636A

KR20210029636A - Real-time interpretation service system that hybridizes translation through artificial intelligence and interpretation by interpreter

Info

Publication number: KR20210029636A
Application number: KR1020190131881A
Authority: KR
Inventors: 백민호
Original assignee: 백민호
Priority date: 2019-09-06
Filing date: 2019-10-23
Publication date: 2021-03-16
Also published as: WO2021080074A1

Abstract

The present invention relates to a real-time interpretation service system that hybridizes translation through artificial intelligence with interpretation by an expert interpreter. More particularly, either or both of real-time AI translation and real-time translation by an expert interpreter can be selected, so that the interpretation service suits the conversational environment, and meeting minutes are provided in text form, so that no separate work to record the conversation is needed. To this end, the present invention includes: a plurality of user terminals that request translation into a user's requested language and carry out voice or video calls with one or more other users as conversation partners; an APP server that is connected to the user terminals for data communication and relays between them so that the voice or video calls take place, receives voice or video information from the user terminals and generates text information through voice recognition, generates translated text information by translating that text information into a preset language using artificial intelligence, and transmits the translated text information to the user terminals upon request; and at least one expert terminal that, upon request of a user terminal, is included by the APP server in the voice or video call between the users, receives from the APP server the voice or video information of the counterpart terminal of the requesting user terminal and provides it to the expert interpreter, and provides translated voice information or translated video information to the requesting user terminal in the language set on that terminal.

Description

Real-time interpretation service system that hybridizes translation through artificial intelligence and interpretation by interpreter

The present invention relates to a real-time interpretation service system that hybridizes translation through artificial intelligence with interpretation by an expert interpreter. More specifically, it relates to such a system in which either or both of real-time AI translation and real-time translation by an expert interpreter can be selected, so that an interpretation service suited to the conversation environment is provided, and in which meeting-minutes information is provided in text form, so that no separate work is needed to record the conversation.

In general, people who do not speak a foreign language well often experience inconvenience and difficulty when traveling abroad or when talking or speaking on the phone with foreigners, because they cannot communicate well. Without the help of someone who knows the foreign language, they have to consult dictionaries or phrase books on their own, which makes real-time response difficult and hinders proper communication.

Recently, with the development of smartphone technology, automatic translators have been developed. For Korean and Japanese, which have similar language systems, they are reliable enough to be in practical use. For English, whose language system differs from that of Korean, translation is less reliable than for Japanese, but translators and translation programs that are practical for simple sentences and simple complex sentences are widely known.

Meanwhile, speech recognition, by which a machine recognizes human speech, is a field receiving much research effort. With advances in sampling technology and in self-learning functions using neural networks, the automatic recognition rate has risen and the technology is in practical use in some fields.

In addition, as for technology that outputs text as speech, voice output devices that build a database (DB) of the pronunciation combinations of individual characters, the pronunciation of words, and the pronunciation of sentences, and output these as speech, are already widely used.

In addition, interpretation services that convert other countries' languages into the user's own language are provided to users through the smartphones that people today always carry.

However, existing translation or interpretation systems, offered in various forms, rely on simple machine translation, so the reliability and accuracy of the interpretation are reduced. Smooth translation or interpretation is difficult during multi-party conversations. The systems can be used only in an online environment in which the terminal is directly connected to the server, so use of the service is restricted by environmental conditions. And the conversation is not stored, so a separate recording process must be carried out to keep a record.

Korean Patent Registration No. 10-1753649

The present invention was devised to solve the above problems. Its object is to provide a real-time interpretation service system that hybridizes translation through artificial intelligence with interpretation by an expert interpreter, in which translation through an artificial intelligence (AI) translation unit and professional translation by an expert interpreter are provided alternatively or together at the user's request, so that the user can easily choose according to the conversation situation and the required interpretation quality, and the reliability of the interpretation quality is improved.

To achieve the above object, the present invention has the following features.

The present invention includes: a plurality of user terminals that request translation into a user's requested language and carry out a voice or video call with one or more other users who are conversation partners; an APP server that is connected to the plurality of user terminals for data communication and relays between them so that the voice or video call takes place, receives voice or video information from the user terminals and generates text information through voice recognition, generates translated text information by translating the text information into a preset language using artificial intelligence, and transmits it to the plurality of user terminals upon request; and at least one expert terminal that, at the request of a user terminal, is included by the APP server in the voice or video call between the users, receives from the APP server the voice information or video information of the counterpart user terminal of the requesting user terminal and provides it to an expert interpreter, and provides translated voice information or translated video information to the requesting user terminal in the language set on that user terminal.

The system further includes voice input means that receives the voice information of a user or an expert interpreter and transmits it to the user terminal or the expert terminal, and voice output means that receives voice information or translated voice information from the user terminal and provides it to the expert interpreter or the user.

In addition, the user terminal includes: a communication module unit that enables data communication with the APP server; a voice receiving unit that receives the user's voice information; a voice transmitting unit that provides translated voice information; an interface unit that receives user input information including the user's translation request information, translation language setting information, expert interpreter selection information, conversation topic selection information, or AI translation request information; and a user terminal control unit that, according to the user input information from the interface unit, transmits the voice information received from the voice receiving unit and the user input information to the APP server so that the expert terminal of the selected expert interpreter is included in the voice or video call between the users, requests transmission of AI-translated text information when so requested, and provides the translated voice information of the expert terminal or the AI-translated text information received from the APP server to the user through the voice transmitting unit or display means.

In addition, the APP server includes: a communication unit that enables data communication with the plurality of user terminals or the at least one expert terminal; a matching management unit that, when translation request information from a user terminal is received through the communication unit, provides conversation topic domain list information, provides expert interpreter list information corresponding to the selected domain to the user terminal, and then, according to the expert interpreter selected by the user from that list, transmits translation availability request information to the expert terminal of that interpreter, thereby managing the translation connection between the user terminal and the expert terminal; a real-time translation relay unit that, once matching by the matching management unit is complete, repeats a series of steps of transmitting to the expert terminal the voice or video information to be translated, provided by the counterpart user terminal of the user terminal that sent the translation request information, receiving the translated voice or video information from the expert terminal, and transmitting it in real time to the user terminal that sent the translation request information; an STT conversion unit that converts the voice or video information exchanged between the user terminals and the expert terminal into text information through voice recognition; and an artificial intelligence (AI) translation unit that, when a user terminal requests transmission of AI-translated text information, receives the text information of the conversation partner's user terminal from the STT conversion unit, generates translated text information by translating it into a preset language through artificial intelligence, and transmits it to the requesting user terminal.

In addition, the expert terminal includes: a communication module unit that enables data communication with the APP server; an expert voice receiving unit that receives the expert interpreter's translated voice information from the voice input means; an expert voice transmitting unit that provides the user's voice information to the expert interpreter through the voice output means; an expert interface unit that, when translation availability request information is received from the APP server, presents it to the expert interpreter and receives the interpreter's selection as to availability; and an expert terminal control unit that provides the user's voice or video information from the APP server to the expert interpreter through the expert voice transmitting unit or display means, and transmits the expert interpreter's translated voice information received through the expert voice receiving unit to the APP server.

In addition, the APP server includes a minutes generation unit that receives from the STT conversion unit the text information of the voice information and translated voice information of the plurality of user terminals and the expert terminal, and lists it in chronological order according to the identification information of each user terminal and expert terminal.
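The chronological listing described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Utterance` record, its field names, the terminal identifiers, and the `build_minutes` function are hypothetical names introduced here, and timestamps are assumed to be seconds from the start of the call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One recognized piece of speech (hypothetical record, not from the patent)."""
    timestamp: float   # assumed unit: seconds since the call started
    terminal_id: str   # identification information of the sending terminal
    text: str          # text information produced by voice recognition


def build_minutes(utterances):
    """Return (timestamp, terminal_id, text) rows sorted in chronological order."""
    ordered = sorted(utterances, key=lambda u: u.timestamp)
    return [(u.timestamp, u.terminal_id, u.text) for u in ordered]


# Utterances may arrive out of order; the minutes list them by time.
minutes = build_minutes([
    Utterance(4.2, "expert-200", "Hello, nice to meet you."),
    Utterance(1.0, "user-100a", "Annyeonghaseyo."),
    Utterance(7.5, "user-100b", "Nice to meet you too."),
])
for row in minutes:
    print(row)
```

Each row keeps the terminal identifier next to the text, so a reader of the minutes can tell which party, or which interpreter, produced each line.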

In addition, a translation database is included that receives and stores the text information and translated text information of the user terminals and the expert terminal from the STT conversion unit and the AI translation unit of the APP server, stored by conversation topic domain.
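As a rough illustration of storage keyed by conversation topic domain, the following Python sketch uses an in-memory dictionary. The class name `TranslationDatabase`, its methods, the record fields, and the example domains ("medical", "legal") are all assumptions made for this sketch; the source does not specify a schema or storage technology.

```python
from collections import defaultdict


class TranslationDatabase:
    """In-memory stand-in for the translation database (hypothetical API)."""

    def __init__(self):
        self._records = defaultdict(list)  # domain -> list of stored records

    def store(self, domain, source_text, translated_text, translator):
        """Save one source/translation pair under its conversation topic domain."""
        self._records[domain].append(
            {"source": source_text, "translation": translated_text, "translator": translator}
        )

    def records_for(self, domain):
        """Return a copy of the records stored for one domain."""
        return list(self._records.get(domain, []))


db = TranslationDatabase()
db.store("medical", "Where does it hurt?", "어디가 아프세요?", translator="ai")
db.store("medical", "Where does it hurt?", "어디가 아프십니까?", translator="expert")
db.store("legal", "Please sign here.", "여기에 서명해 주세요.", translator="ai")
print(len(db.records_for("medical")))
```

Keeping AI and expert output for the same source text side by side in one domain is one possible layout; it is chosen here only because it makes later comparison of the two translations straightforward.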

In addition, a payment server is included that, when a user terminal requests translation by an expert interpreter or by artificial intelligence, calculates billing information according to the settings and processes payment.

In addition, when any one of the plurality of user terminals is selected as the main speaker terminal, the remaining user terminals are designated as listener terminals, and the expert terminal that translates the voice or video information of each listener terminal translates that information into the language set on the main speaker terminal.
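The role assignment in this conference arrangement can be sketched as follows. This is only an illustration under stated assumptions: the function names `assign_roles` and `translation_target`, the role labels, and the language codes are hypothetical and do not come from the source.

```python
def assign_roles(terminal_ids, main_speaker_id):
    """Mark one terminal as the main speaker and all others as listeners."""
    if main_speaker_id not in terminal_ids:
        raise ValueError("main speaker must be one of the participating terminals")
    return {
        tid: "main_speaker" if tid == main_speaker_id else "listener"
        for tid in terminal_ids
    }


def translation_target(roles, set_languages, source_terminal_id):
    """Language into which a listener's speech is translated.

    Listener speech is translated into the language set on the main speaker
    terminal; for the main speaker itself this sketch returns None.
    """
    if roles[source_terminal_id] != "listener":
        return None
    main_speaker = next(t for t, r in roles.items() if r == "main_speaker")
    return set_languages[main_speaker]


roles = assign_roles(["term-a", "term-b", "term-c"], main_speaker_id="term-a")
languages = {"term-a": "ko", "term-b": "en", "term-c": "ja"}
print(translation_target(roles, languages, "term-b"))
```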

According to the present invention, translation through the AI translation unit and professional translation by an expert interpreter are provided alternatively or together at the user's request, so the user can easily choose according to the conversation situation and the required interpretation quality, and the reliability of the interpretation quality can be improved.

In addition, the AI translation unit compares each AI translation result with the corresponding translation result of the expert interpreter and performs AI learning, which can lead to improved AI translation quality.

In addition, through the minutes generation unit of the APP server, the user can receive meeting-minutes information without having to record the conversation separately.

In addition, the interpretation service can be provided by using a mobile router during multi-party conversations and even in a local offline environment rather than an online environment.

FIG. 1 is a conceptual diagram showing a schematic configuration of a real-time interpretation service system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the internal configuration of a real-time interpretation service system according to an embodiment of the present invention.
FIG. 3 is a use-state diagram schematically showing the operation of a real-time interpretation service system according to an embodiment of the present invention.
FIGS. 4 and 5 are diagrams showing interface screens provided when a user terminal accesses the APP server according to an embodiment of the present invention.
FIG. 6 is a conceptual diagram showing a schematic configuration of a real-time interpretation service system according to another embodiment of the present invention.
FIG. 7 is a conceptual diagram showing a schematic configuration of a real-time interpretation service system according to yet another embodiment of the present invention.
FIG. 8 is a conceptual diagram showing a schematic configuration of a real-time interpretation service system according to a modified embodiment of the present invention.
FIG. 9 is a conceptual diagram showing a schematic configuration of a real-time interpretation service system in conference mode according to a modified embodiment of the present invention.
FIG. 10 is a diagram showing an interface screen of a user terminal on which the conference mode of FIG. 9 is displayed.
FIG. 11 is a diagram showing the conversation screen of each user terminal in the conference mode of FIG. 9.

To explain the present invention, its operational advantages, and the objects achieved by practicing it, preferred embodiments of the present invention are illustrated and described below.

First, the terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention; a singular expression may include the plural unless the context clearly indicates otherwise. Also, in this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood as not precluding the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they are judged likely to obscure the gist of the present invention.

FIG. 1 is a conceptual diagram showing a schematic configuration of a real-time interpretation service system according to an embodiment of the present invention.

Referring to the drawing, the real-time interpretation service system 1000 according to an embodiment of the present invention comprises: a plurality of user terminals 100 that request translation into a user's requested language and carry out a voice or video call with one or more other users who are conversation partners; an APP server 300 that is connected to the plurality of user terminals 100 for data communication and relays between them so that the voice or video call takes place, receives voice or video information from the user terminals 100 and generates text information through voice recognition, generates translated text information by translating the text information into a preset language using artificial intelligence, and transmits it to the plurality of user terminals 100 upon request; and at least one expert terminal 200 that, at the request of a user terminal 100, is included by the APP server 300 in the voice or video call between the users, receives from the APP server 300 the voice information or video information of the counterpart user terminal 100 of the requesting user terminal 100 and provides it to an expert interpreter, and provides translated voice information or translated video information to the requesting user terminal 100 in the language set on that user terminal 100.

Here, the user terminals 100 are terminals carried by each of two or more participants in a multi-party conversation who speak different languages. When AI translation or translation by an expert interpreter is needed, the interpretation service is provided through the real-time interpretation service system 1000 according to the present invention.

The user terminal 100 may be a portable terminal such as a smartphone, or any other terminal, such as a desktop, laptop, or tablet, that can connect to and exchange data with the APP server 300 described later.

In addition, when translation request information is received in a multi-party conversation conducted in different languages, the APP server 300 relays between the parties so that a voice or video call takes place. At the request of a user terminal 100, it converts the voice or video information received from each user terminal 100 into text information through speech-to-text (STT) voice recognition, translates the converted text information using artificial intelligence, and can transmit the resulting translated text information to each user terminal 100.

Also, when one or all of the parties to the multi-party conversation request translation by an expert interpreter, the APP server 300 transmits translation availability request information to the expert terminal 200 carried by the expert interpreter, checks whether the interpreter is available, and, if so, includes that expert terminal 200 in the multi-party voice or video call.

At this time, when a request for translation by an expert interpreter is received from a particular user terminal 100, the APP server 300 provides the user's language information and conversation topic domain information to that user terminal 100, extracts the pre-stored expert interpreter list information for the language and domain given by the user's input, and provides it to the user terminal 100 to request a selection.

In addition to the personal information of each expert interpreter, this list information may include existing users' reviews of interpretation quality, cost information, and the like.
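The extraction of an interpreter list by language and domain can be sketched in Python as follows. The `Expert` record, its field names, and the choice to sort the result by review score are assumptions made for this illustration; the source states only that the list is filtered by language and domain and may carry review and cost information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    """Pre-stored interpreter entry (hypothetical fields)."""
    name: str
    languages: frozenset    # languages the interpreter works in
    domains: frozenset      # conversation topic domains covered
    review_score: float     # assumed: average rating from earlier users
    fee_per_minute: float   # assumed: cost information shown in the list


def list_experts(experts, language, domain):
    """Return experts matching the language and domain, best reviewed first."""
    matches = [e for e in experts if language in e.languages and domain in e.domains]
    return sorted(matches, key=lambda e: e.review_score, reverse=True)


experts = [
    Expert("Kim", frozenset({"en", "ko"}), frozenset({"medical"}), 4.6, 1.5),
    Expert("Lee", frozenset({"en", "ko"}), frozenset({"legal", "medical"}), 4.9, 2.0),
    Expert("Park", frozenset({"ja", "ko"}), frozenset({"medical"}), 4.8, 1.2),
]
shortlist = list_experts(experts, language="en", domain="medical")
print([e.name for e in shortlist])
```

The user terminal would then show the shortlist, with its review and fee fields, and return the user's selection to the server.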

Accordingly, when some or all of the parties to the multi-party conversation transmit translation request information and enter the series of user input information, so that an expert interpreter is matched and connected to the voice or video call, the expert terminal 200 selected by each user terminal 100 receives from the APP server 300 the voice or video information of the counterpart user terminal 100 of that user terminal 100, and delivers the translated voice or video information to that user terminal 100 through the APP server 300.

Of course, when there are two parties to the conversation and they select a single expert interpreter, the expert terminal 200 of that interpreter continuously receives voice or video information from both user terminals 100 and provides translated voice information and translated video information.
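The relay path described in the two preceding paragraphs (counterpart's speech to the chosen interpreter, translated speech back to the requester) can be sketched as a list of routing triples. The function `relay_routes` and the terminal identifiers are hypothetical; the sketch also assumes that every other participant counts as a conversation partner of the requester.

```python
def relay_routes(selected_expert, participants):
    """Return (source_terminal, expert_terminal, destination_terminal) triples.

    selected_expert maps each requesting user terminal to the expert terminal
    it chose. Speech from every other participant goes to that expert, and the
    translated result is delivered to the requester.
    """
    routes = []
    for requester, expert in selected_expert.items():
        for source in participants:
            if source != requester:
                routes.append((source, expert, requester))
    return routes


# Two parties sharing one expert interpreter: the expert translates both ways.
routes = relay_routes(
    {"user-A": "expert-1", "user-B": "expert-1"},
    ["user-A", "user-B"],
)
for route in routes:
    print(route)
```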

In addition, the APP server 300 includes an STT conversion unit 340, which generates text information through voice recognition from the voice or video information of each user terminal 100, and translated text information through voice recognition from the expert interpreter's voice or video information, and can transmit these to the user terminals 100.

Accordingly, the user of a user terminal 100 listens to the translated voice or video information while also receiving the text information on the display means of the user terminal 100, and can thus concentrate better on the content of the conversation.

In addition, as described above, when AI translation by the APP server 300 is requested, the translated text information from the AI translation is provided as well, so that it can be compared with the expert interpreter's translated text information.

In the hybrid real-time interpretation service system according to the present invention, the translation service by an expert interpreter and the translation service by artificial intelligence can be provided alternatively or simultaneously. The conversation can therefore continue without interruption despite changes in circumstances such as the place or time of the conversation, and, where AI translation alone delivers adequate quality given the degree of specialization of the topic, the user's cost burden can be reduced.

For example, if the expert interpreter's call connection goes offline because of network conditions, the conversation can continue by switching to the AI translation service. And if the user, receiving both the expert interpreter's translation service and the AI translation service at the same time and comparing them, judges that there is no great difference in quality, the user can stop the expert interpreter's service midway and continue with the AI translation service alone.
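The two switching cases in this example can be expressed as a small mode-selection function. This is a sketch under stated assumptions, not the system's actual logic: the `Mode` enumeration, the function `next_mode`, and its parameters are hypothetical names introduced here.

```python
from enum import Enum


class Mode(Enum):
    EXPERT = "expert"   # translation by the expert interpreter only
    AI = "ai"           # AI translation only
    BOTH = "both"       # both services provided at the same time


def next_mode(current, expert_online, user_keeps_expert=True):
    """Pick the translation mode for the next stretch of the conversation.

    Falls back to AI translation when the interpreter's connection is lost,
    or when the user decides to drop the interpreter after comparing results.
    """
    uses_expert = current in (Mode.EXPERT, Mode.BOTH)
    if uses_expert and (not expert_online or not user_keeps_expert):
        return Mode.AI
    return current


print(next_mode(Mode.EXPERT, expert_online=False))
print(next_mode(Mode.BOTH, expert_online=True, user_keeps_expert=False))
```

Both calls select AI-only translation: the first because the interpreter is offline, the second because the user chose to stop the interpreter's service.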

FIG. 2 is a block diagram showing the internal configuration of a real-time interpretation service system according to an embodiment of the present invention.

Referring to the drawing, in the internal configuration of the real-time interpretation service system 1000 according to an embodiment of the present invention, the user terminal 100 comprises: a communication module unit 110 that enables data communication with the APP server 300; a voice receiving unit 120 that receives the user's voice information; a voice transmission unit 130 that provides translated voice information; an interface unit 140 that receives user input information including the user's translation request information, translation language setting information, interpreter expert selection information, conversation topic selection information, or artificial intelligence translation request information; and a user terminal control unit 150. In accordance with the user input information from the interface unit 140, the user terminal control unit 150 transmits the voice information received from the voice receiving unit 120 and the user input information to the APP server 300 so that the expert terminal 200 of the selected interpreter expert is included in the voice or video call between users, requests transmission of translated text information using artificial intelligence when so requested, and provides the user, through the voice transmission unit 130 or a display means, with the translated voice information of the expert terminal 200 or the artificial intelligence translated text information received from the APP server 300.

Here, a separate voice input means 400 may be provided so that the voice receiving unit 120 can receive the user's voice information. Alternatively, as needed, the voice receiving unit 120 may receive the user's voice information through a voice input means ordinarily built into the user terminal 100, such as a microphone.

Likewise, a separate voice output means 410 may be provided so that the voice transmission unit 130 can provide the translated voice information to the user. Alternatively, as needed, the user may receive the translated voice information of the voice transmission unit 130 through a voice output means ordinarily built into the user terminal 100, such as a speaker.

In addition, when the user terminal control unit 150 receives user input information through the interface unit 140, for example the user's translation request information, translation language setting information, interpreter expert selection information, conversation topic selection information, or artificial intelligence translation request information, it transmits that information to the APP server 300, and it presents the information that the APP server 300 provides in response to each request through the display means of the user terminal 100.

Meanwhile, the APP server 300 of the real-time interpretation service system 1000 comprises a communication unit 310, a matching management unit 320, a real-time translation relay unit 330, an STT conversion unit 340, an artificial intelligence (AI) translation unit 350, and a minutes generation unit 360.

The communication unit 310 enables data communication with the plurality of user terminals 100 or the at least one expert terminal 200.

When translation request information from a user terminal 100 is received through the communication unit 310, the matching management unit 320 provides conversation topic domain list information, provides that user terminal 100 with interpreter expert list information corresponding to the selected domain, and then, in accordance with the interpreter expert selection information chosen by the user from that list, transmits translation availability request information to the expert terminal 200 of the selected interpreter expert. In this way it manages the translation connection between the user terminal 100 and the expert terminal 200.

Once the matching management unit 320 has completed the matching, the real-time translation relay unit 330 repeats the following sequence: it transmits to the expert terminal 200 the voice or video information to be translated, received from the counterpart user terminal 100 of the user terminal 100 that sent the translation request information; it receives the translated voice or video information from the expert terminal 200; and it transmits that information in real time to the user terminal 100 that sent the translation request information.

The STT conversion unit 340 converts the voice or video information exchanged between the user terminals 100 and the expert terminal 200 into text information through speech recognition.

When a user terminal 100 requests transmission of translated text information using artificial intelligence, the AI translation unit 350 receives the text information of the conversation counterpart's user terminal 100 from the STT conversion unit 340, translates that text into a preset language through artificial intelligence to generate translated text information, and transmits it to the user terminal 100 that made the request.

The minutes generation unit 360 receives from the STT conversion unit 340 the text information of the voice information and translated voice information of the plurality of user terminals 100 and the expert terminal 200, and lists it in chronological order according to the identification information of each user terminal 100 and expert terminal 200.
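The chronological listing performed by the minutes generation unit 360 can be illustrated with a short sketch. The `Utterance` record, its field names, and the output line format are assumptions made for illustration; the specification states only that text is listed in time order according to each terminal's identification information.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    timestamp: float   # seconds since the call started (assumed unit)
    terminal_id: str   # identification information of the originating terminal
    text: str          # STT text or translated text


def build_minutes(utterances: List[Utterance]) -> List[str]:
    """List recognized text in time order, labeled with the terminal ID."""
    ordered = sorted(utterances, key=lambda u: u.timestamp)
    return [f"{u.timestamp:.1f}s {u.terminal_id}: {u.text}" for u in ordered]
```

Because `sorted` is stable, entries with identical timestamps keep the order in which they arrived.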

Here, the STT conversion unit 340 and the AI translation unit 350 may be provided as a separate artificial intelligence-based STT server 700, as shown in FIG. 7 described later. In that case the APP server 300 relays the voice or video call between the user terminals 100 and the expert terminals 200 through the matching management unit 320 and the real-time translation relay unit 330, and at the same time transmits the voice or video information to the artificial intelligence-based STT server 700, which generates STT text information through speech recognition and, on request, generates translated text information through artificial intelligence and provides it to the user terminal 100.

When the artificial intelligence-based STT server 700 is configured separately in this way, the minutes generation unit 360 is included in that server and generates the minutes list information there. The translation database 500 described later also works with the artificial intelligence-based STT server 700, receiving from it the voice or video information, text information, translated text information, and the like, and storing them.

Meanwhile, the expert terminal 200 comprises: a communication module unit 210 that enables data communication with the APP server 300; an expert voice receiving unit 220 that receives the interpreter expert's translated voice information from the voice input means 400; an expert voice transmission unit 230 that provides the user's voice information to the interpreter expert through the voice output means 410; an expert interface unit 240 that, when translation availability request information is received from the APP server 300, presents it to the interpreter expert and receives the interpreter expert's selection information on availability; and an expert terminal control unit 250 that provides the user's voice or video information from the APP server 300 to the interpreter expert through the expert voice transmission unit 230 or a display means 260, and transmits the interpreter expert's translated voice information received through the expert voice receiving unit 220 to the APP server 300.

FIG. 3 is a use-state diagram schematically showing the operation of the real-time interpretation service system according to an embodiment of the present invention.

The operation of the real-time interpretation service system according to the present invention is described with reference to the drawing. FIG. 3 shows an example in which a conversation party whose set language is Korean and a conversation party whose set language is English conduct a voice call with each other.

Accordingly, the party whose set language is Korean has selected an interpreter expert who translates English into Korean, and the party whose set language is English has selected an interpreter expert who translates Korean into English.

Of course, a single interpreter expert may instead be designated to provide the translation service to both conversation parties.

When the party whose set language is Korean says "안녕! 다시 만나서 좋습니다." ("Hello! It's good to see you again."), that voice information is transmitted to the expert terminal 200 of the interpreter expert selected by the party whose set language is English, and the interpreter expert's translated voice information, "Hello, we bumped up here!", is delivered through the APP server 300 to the party whose set language is English.

Because translation through artificial intelligence has also been selected, the artificial intelligence translated text information, "Hi, we catch up!", is provided as text as well.

FIGS. 4 and 5 are diagrams showing the interface screens provided when a user terminal connects to the APP server according to an embodiment of the present invention.

Referring to the drawings, when the user terminal 100 connects to the APP server 300, a menu screen is first provided for selecting the user's set language and the conversation topic domain (see FIG. 4(a)).

When the user then selects a set language and a domain (see FIG. 4(b) and (c)), interpreter expert list information corresponding to that set language and domain is provided from the APP server 300 to the user terminal 100 (see FIG. 5(a)), and when the user selects one of the listed experts, that interpreter expert is included in the conversation (see FIG. 5(b)).
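The selection flow above amounts to filtering the registered interpreter experts by set language and topic domain. A minimal sketch follows; the `Expert` record and `list_experts` function are hypothetical, and the specification does not say how experts' languages or domains are stored.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Expert:
    name: str
    languages: FrozenSet[str]  # languages the expert works in (assumed field)
    domains: FrozenSet[str]    # conversation topic domains the expert covers


def list_experts(experts: List[Expert], set_language: str, domain: str) -> List[str]:
    """Return the names of experts matching the user's set language and domain."""
    return [
        e.name
        for e in experts
        if set_language in e.languages and domain in e.domains
    ]
```

The user would then pick one name from the returned list, after which the server sends the availability request to that expert's terminal.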

In the conversation window (see FIG. 5(b)), ① is the overall conversation menu and ② is the menu for ending the conversation. When the end-conversation menu is selected, a further menu may be provided for choosing whether to receive the minutes list information or a recording file of the entire voice or video call.

③ shows the voice information of the user of the user terminal 100 as text information, and ④ shows the voice information of the conversation counterpart as text information.

⑤ shows the interpreter expert's voice information as text information, and ⑥ and ⑦ are menus for a whisper function, used to convey requests to the interpreter expert or to speak so that only the interpreter expert can hear.

⑧ is a menu for speaking while holding a button down in order to transmit the user's own voice information during a voice call, ⑨ shows the conversation time, and ⑩ is a menu for selecting between a video call and a voice call.

Meanwhile, as shown in FIG. 5(c), a menu is provided for selecting whether to receive the translation service through artificial intelligence.

FIG. 6 is a conceptual diagram showing the schematic configuration of a real-time interpretation service system according to another embodiment of the present invention.

Referring to the drawing, the real-time interpretation service system 1000 according to this embodiment includes a translation database 500 that receives the text information and translated text information of the user terminals 100 and the expert terminal 200 from the STT conversion unit 340 and the AI translation unit 350 of the APP server 300 and stores it by conversation topic domain.

At the request of the AI translation unit 350, the translation database 500 can supply its stored data for artificial intelligence training.
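One way to picture domain-keyed storage is the in-memory sketch below. The `TranslationDatabase` class and its methods are hypothetical; the specification does not describe the storage engine, schema, or training procedure.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class TranslationDatabase:
    """Stores (source text, translated text) pairs grouped by topic domain."""

    def __init__(self) -> None:
        self._by_domain: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def store(self, domain: str, source_text: str, translated_text: str) -> None:
        self._by_domain[domain].append((source_text, translated_text))

    def training_pairs(self, domain: str) -> List[Tuple[str, str]]:
        # Return a copy so callers cannot mutate the stored data.
        return list(self._by_domain.get(domain, []))
```

Grouping by domain lets a training request pull only the pairs relevant to one topic, which matches the per-domain storage the embodiment describes.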

This embodiment also includes a payment server 600 that, when the user terminal 100 requests translation by an interpreter expert or by artificial intelligence, calculates billing information according to the configured settings and processes the payment.
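The specification does not define the billing rule, so the following is purely an illustrative sketch of "calculating billing information according to the configured settings". It assumes per-minute rates, billing by started minute, and integer currency units; `RateSetting` and `compute_charge` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RateSetting:
    expert_per_minute: int  # hypothetical rate for interpreter expert service
    ai_per_minute: int      # hypothetical rate for AI translation service


def compute_charge(duration_seconds: int, use_expert: bool, use_ai: bool,
                   rates: RateSetting) -> int:
    """Charge each selected service per started minute of the call."""
    minutes = math.ceil(duration_seconds / 60)
    per_minute = 0
    if use_expert:
        per_minute += rates.expert_per_minute
    if use_ai:
        per_minute += rates.ai_per_minute
    return minutes * per_minute
```

With a lower AI rate than expert rate, dropping the expert mid-call reduces the charge, which is the cost benefit the description attributes to the hybrid arrangement.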

FIG. 7 is a conceptual diagram showing the schematic configuration of a real-time interpretation service system according to yet another embodiment of the present invention.

Referring to the drawing, in the real-time interpretation service system 1000 according to this embodiment, the artificial intelligence-based STT server 700 is configured separately from the APP server 300 of the embodiments described above, and a database 500 is provided that receives from the artificial intelligence-based STT server 700 the voice or video information, text information, translated text information, and the like of multi-party conversations already conducted, and stores them.

Here, while the APP server 300 relays the voice or video call between the user terminals 100 and the expert terminals 200 through the matching management unit 320 and the real-time translation relay unit 330, the artificial intelligence-based STT server 700 receives the voice or video information in real time and generates STT text information from it through speech recognition.

In addition, when the user terminal 100 so requests, the server can generate translated text information through artificial intelligence and provide it to the user terminal 100.

When the artificial intelligence-based STT server 700 is configured separately in this way, the minutes generation unit 360, which belonged to the APP server 300 in the embodiment described above, is no longer included in the APP server 300. It is instead included in the artificial intelligence-based STT server 700, where it generates the minutes list information and transmits it at the request of the APP server 300.

FIG. 8 is a conceptual diagram showing the schematic configuration of a real-time interpretation service system according to a modified embodiment of the present invention.

Referring to the drawing, the real-time interpretation service system 1000 according to this embodiment comprises a mobile router 800, at least one expert terminal 200, and an APP server 300.

The mobile router 800 receives voice information from each of a plurality of users, converts it into text information, and transmits the voice information or text information externally.

The at least one expert terminal 200 receives text information or voice information transmitted from outside and provides it to an interpreter expert. It receives from the interpreter expert translated text information or translated voice information in which the text or voice information has been translated from its original language into a preset language, and when translated voice information is input, it converts it into translated text information.

The APP server 300 is connected to the mobile router 800 and the expert terminal 200 for data communication. It receives the per-user voice information or text information from the mobile router 800 and transmits it to the expert terminal 200, requesting translated voice information in which the language of the per-user voice information is translated into another preset language. It also translates the per-user voice information or text information into the preset language through the artificial intelligence (AI) translation unit 350 to generate AI translated voice information or AI translated text information, and it transmits the translated voice information or translated text information received from the expert terminal 200 to the mobile router 800 that requested the translation.

Here, the mobile router 800 remains connected to the APP server 300 for data communication even in an offline situation in which the user terminal 100 cannot communicate with the APP server 300. To this end, the mobile router 800 supports not only ordinary Ethernet and Wi-Fi communication with the APP server 300 but also satellite communication, so that the real-time interpretation service can be provided even when an ordinary user terminal 100 is offline.

In addition, in this embodiment, devices other than the user terminal 100 that have their own identification number and include a voice input means 400 and a voice output means 410, for example an earset or a headset, can also be connected, so the system is applicable to multi-party meetings as well as multi-party calls.

FIG. 9 is a conceptual diagram showing the schematic configuration of a conference-mode real-time interpretation service system according to a modified embodiment of the present invention, FIG. 10 is a diagram showing the interface screen of a user terminal on which the conference mode of FIG. 9 is displayed, and FIG. 11 is a diagram showing the conversation screen of each user terminal in the conference mode of FIG. 9.

Referring to the drawings, the real-time interpretation service system 1000 according to this embodiment is applied in a conference mode: when any one of the plurality of user terminals 100 is selected as the main speaker terminal 100a, the remaining user terminals 100 are designated as listener terminals 100b.

Accordingly, the main speaker terminal 100a may be assigned to the presenter of a remote seminar or lecture, the chair of a meeting, or the like, and it is preferable to designate the user terminal that accounts for the largest share of the conversation.

The listener terminals 100b are the user terminals 100 other than the main speaker terminal 100a. An expert terminal 200 in charge of a listener terminal 100b receives voice or video information from the main speaker terminal 100a and transmits the translated voice or video information to the APP server 300 so that it is delivered to that listener terminal 100b. It also translates the voice or video information of the listener terminal 100b's user, received from the listener terminal 100b, into the set language of the main speaker terminal 100a so that it is delivered to the main speaker terminal 100a through the APP server 300.

Accordingly, as in FIG. 11, when a main speaker terminal 100a whose set language is Korean is designated together with one listener terminal 100b whose set language is Chinese and another listener terminal 100b whose set language is English, the main speaker terminal 100a receives the translated voice or video information for each listener terminal 100b's voice or video information in Korean.

In other words, no translation through a separate expert terminal 200 is performed between the listener terminals 100b, so the translation service is not available for the voice or video information exchanged among the listener terminals 100b themselves.

This is because the conference mode centers on the user acting as the main speaker, so conversations among the listeners (the audience) do not need translation.
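The conference-mode routing rule can be summarized in a short sketch: speech from the main speaker is translated for every listener, speech from a listener is translated only for the main speaker, and nothing is translated between listeners. The function name and the dictionary of terminal IDs to set languages are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def translation_targets(speaker_id: str, main_id: str,
                        set_languages: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (terminal ID, target language) pairs that need a translation."""
    if speaker_id == main_id:
        # Main speaker: translate into each listener's set language.
        return [(tid, lang) for tid, lang in set_languages.items() if tid != main_id]
    # Listener: translate only into the main speaker's set language.
    return [(main_id, set_languages[main_id])]
```

For the FIG. 11 example, a Korean main speaker with Chinese and English listeners yields two targets when the main speaker talks and a single Korean target when either listener talks.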

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely illustrative, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible.

Therefore, the true scope of technical protection of the present invention should be determined by the technical spirit of the appended claims.

100: user terminal 200: expert terminal
300: APP server 400: voice input means
410: voice output means 500: database
600: payment server 700: artificial intelligence-based STT server
800: mobile router 1000: interpretation service system

Claims

A real-time interpretation service system that hybridizes translation through artificial intelligence and interpretation by an interpreter expert, comprising:
a plurality of user terminals 100 that request translation into a language requested by the user and conduct a voice or video call with one or more other users who are conversation partners;
an APP server 300 that is connected to the plurality of user terminals 100 for data communication and relays between them so that the voice or video call can take place, receives the voice or video information of the user terminals 100 and generates text information through speech recognition, generates translated text information in a preset language using artificial intelligence, and transmits the translated text information to the user terminals 100 on request; and
at least one expert terminal 200 that, at the request of a user terminal 100, is included by the APP server 300 in the voice or video call between users, receives from the APP server 300 the voice or video information of the counterpart user terminal 100 of the requesting user terminal 100 and provides it to an interpreter expert, and provides translated voice information or translated video information in the language set on the requesting user terminal 100 to that user terminal 100.

The system of claim 1, further comprising:
a voice input means 400 that receives voice information of a user or an interpreter expert and transmits it to the user terminal 100 or the expert terminal 200; and
a voice output means 410 that receives voice information or translated voice information from the user terminal 100 and provides it to the interpreter expert or the user.

The system of claim 1, wherein the user terminal 100 comprises:
a communication module unit 110 that enables data communication with the APP server 300;
a voice receiving unit 120 that receives the user's voice information;
a voice transmission unit 130 that provides translated voice information;
an interface unit 140 that receives user input information including the user's translation request information, translation language setting information, interpreter expert selection information, conversation topic selection information, or artificial intelligence translation request information; and
a user terminal control unit 150 that, in accordance with the user input information from the interface unit 140, transmits the voice information received from the voice receiving unit 120 and the user input information to the APP server 300 so that the expert terminal 200 of the selected interpreter expert is included in the voice or video call between users, requests transmission of translated text information using artificial intelligence when so requested, and provides the user, through the voice transmission unit 130 or a display means, with the translated voice information of the expert terminal 200 or the artificial intelligence translated text information received from the APP server 300.

The real-time interpretation service system of claim 3, in which translation through artificial intelligence and interpretation by an interpreter expert are hybridized,
Wherein the APP server 300 comprises
A communication unit 310 that enables data communication with the plurality of user terminals 100 or the at least one expert terminal 200,
A matching management unit 320 that, when the translation request information of the user terminal 100 is received through the communication unit 310, provides conversation topic domain list information, provides the user terminal 100 with interpreter expert list information according to the selected domain, transmits translation availability request information to the expert terminal 200 of the interpreter expert selected by the user from the interpreter expert list, and manages the connection between the user terminal 100 and the expert terminal 200,
A real-time translation relay unit 330 that, once the matching management unit 320 has completed matching, repeats a series of processes of transmitting the voice information or video information to be translated, provided from the counterpart user terminal 100 of the user terminal 100 that transmitted the translation request information, to the expert terminal 200, receiving the translated voice or video information from the expert terminal 200, and transmitting it in real time to the user terminal 100 that transmitted the translation request information,
An STT conversion unit 340 that converts voice information or video information exchanged between the user terminals 100 and the expert terminals 200 into text information through voice recognition, and
An artificial intelligence (AI) translation unit 350 that, when the user terminal 100 requests transmission of translated text information using artificial intelligence, receives the text information of the conversation partner's user terminal 100 from the STT conversion unit 340, generates translated text information by translating that text into the set language through artificial intelligence, and transmits it to the user terminal 100 that made the transmission request.
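
The sequence performed by the matching management unit 320 (domain list, expert list for the selected domain, availability request, connection) can be sketched as below. This is an illustrative sketch under stated assumptions, not the claimed implementation: the class and field names are hypothetical, and the `available` flag merely stands in for the expert's reply to the availability request.

```python
from dataclasses import dataclass, field


@dataclass
class Expert:
    expert_id: str
    domains: set[str]
    available: bool = True  # stands in for the expert's availability reply


@dataclass
class MatchingManager:
    experts: list[Expert]
    # user terminal id -> expert terminal id for active connections
    connections: dict[str, str] = field(default_factory=dict)

    def domain_list(self) -> list[str]:
        """Topic domains offered to the user after a translation request."""
        return sorted({d for e in self.experts for d in e.domains})

    def experts_for(self, domain: str) -> list[str]:
        """Interpreter experts listed for the selected domain."""
        return [e.expert_id for e in self.experts if domain in e.domains]

    def request_match(self, user_id: str, expert_id: str) -> bool:
        """Ask the chosen expert; record the connection only if accepted."""
        expert = next((e for e in self.experts if e.expert_id == expert_id), None)
        if expert is None or not expert.available:
            return False
        self.connections[user_id] = expert_id
        return True
```

Keeping the connection table inside the manager mirrors the claim's statement that the same unit both matches and manages the connection between the user terminal 100 and the expert terminal 200.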

The real-time interpretation service system of claim 4, in which translation through artificial intelligence and interpretation by an interpreter expert are hybridized,
Wherein the expert terminal 200 comprises
A communication module unit 210 that enables data communication with the APP server 300,
An expert voice receiving unit 220 that receives the translated voice information of the interpreter expert from the voice input means 400,
An expert voice transmission unit 230 that provides the user's voice information to the interpreter expert through the voice output means 410,
An expert interface unit 240 that, when the translation availability request information is received from the APP server 300, presents it to the interpreter expert and receives selection information on whether the interpreter expert is available, and
An expert terminal control unit 250 that provides the voice information or video information of the user from the APP server 300 to the interpreter expert through the expert voice transmission unit 230 or the display means 260, and transmits the translated voice information of the interpreter expert received through the expert voice receiving unit 220 to the APP server 300.

The real-time interpretation service system of claim 4, in which translation through artificial intelligence and interpretation by an interpreter expert are hybridized,
Wherein the APP server 300
Further comprises a minutes generating unit 360 that receives, from the STT conversion unit 340, text information of the voice information and translated voice information of the plurality of user terminals 100 and expert terminals 200, and lists it in chronological order according to the identification information of each user terminal 100 and expert terminal 200.
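
The chronological listing performed by the minutes generating unit 360 can be sketched as a sort over timestamped utterances labelled with terminal identifiers. This is only an illustration: the `Utterance` record, its timestamp field, and the output line format are assumptions, since the claim does not define how timing or identification information is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    timestamp: float   # seconds from the start of the call (assumed unit)
    terminal_id: str   # identification of the user or expert terminal
    text: str          # text produced by speech-to-text conversion


def build_minutes(utterances: list[Utterance]) -> list[str]:
    """Return one minutes line per utterance, ordered by time."""
    ordered = sorted(utterances, key=lambda u: u.timestamp)
    return [f"[{u.timestamp:.1f}s] {u.terminal_id}: {u.text}" for u in ordered]
```

Because `sorted` is stable, utterances that share a timestamp keep the order in which they arrived.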

The real-time interpretation service system of claim 4, in which translation through artificial intelligence and interpretation by an interpreter expert are hybridized,
Further comprising a translation database 500 that receives the text information and translated text information of the user terminals 100 and expert terminals 200 from the STT conversion unit 340 and the artificial intelligence translation unit 350 of the APP server 300,
And stores it for each item of conversation topic domain information.
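
Storage keyed by conversation topic domain, as described for the translation database 500, can be sketched with an in-memory mapping. The class and method names are hypothetical, and a real deployment would presumably use persistent storage, which the claim does not specify.

```python
from collections import defaultdict


class TranslationDatabase:
    """In-memory stand-in for a per-domain translation store."""

    def __init__(self) -> None:
        # domain -> list of (source text, translated text) pairs
        self._by_domain: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def store(self, domain: str, source_text: str, translated_text: str) -> None:
        self._by_domain[domain].append((source_text, translated_text))

    def lookup(self, domain: str) -> list[tuple[str, str]]:
        # .get avoids creating an empty entry for an unknown domain
        return list(self._by_domain.get(domain, []))
```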

The real-time interpretation service system of claim 4, in which translation through artificial intelligence and interpretation by an interpreter expert are hybridized,
Further comprising a payment server 600 that, when there is a translation request from the user terminal 100 using an interpreter expert or artificial intelligence, calculates billing information according to the setting and processes payment.
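
One way the payment server 600 might calculate billing information according to a setting is per-minute pricing by translation mode. Everything specific here is an assumption made for illustration: the rate values, the mode names, and the choice to round partial minutes up are not stated in the claims.

```python
# Hypothetical per-minute rates in the smallest currency unit.
RATES_PER_MINUTE = {"expert": 1000, "ai": 100}


def billing_amount(mode: str, seconds: int, rates: dict[str, int] = RATES_PER_MINUTE) -> int:
    """Charge for one session, rounding partial minutes up (assumed policy)."""
    if mode not in rates:
        raise ValueError(f"unknown translation mode: {mode}")
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    minutes = -(-seconds // 60)  # ceiling division on integers
    return minutes * rates[mode]
```

A hybrid session that uses both modes would be billed as the sum of the two calls, one per mode.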

The real-time interpretation service system of claim 2, in which translation through artificial intelligence and interpretation by an interpreter expert are hybridized,
Wherein, when any one user terminal 100 of the plurality of user terminals 100 is selected as the main speaker terminal 100a, the remaining user terminals 100 are designated as listener terminals 100b, and the expert terminal 200 performing translation of voice or video information translates the voice or video information of the corresponding listener terminal 100b into the language set on the main speaker terminal 100a.
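
The role assignment and translation direction in this claim can be sketched as follows. The function names and role labels are hypothetical. The claim states only the listener-to-main-speaker direction, so the sketch returns `None` for audio originating from the main speaker terminal rather than guessing a rule.

```python
from typing import Optional


def assign_roles(terminal_ids: list[str], main_speaker_id: str) -> dict[str, str]:
    """Mark one terminal as main speaker (100a); all others become listeners (100b)."""
    if main_speaker_id not in terminal_ids:
        raise ValueError("main speaker must be one of the terminals")
    return {
        t: "main_speaker" if t == main_speaker_id else "listener"
        for t in terminal_ids
    }


def target_language_for(
    source_id: str, roles: dict[str, str], languages: dict[str, str]
) -> Optional[str]:
    """Language the expert translates source_id's audio or video into."""
    main = next(t for t, role in roles.items() if role == "main_speaker")
    if roles[source_id] == "listener":
        # Listener input is translated into the main speaker's set language.
        return languages[main]
    return None  # direction for main speaker input is not specified in this claim
```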