KR20210057435A

KR20210057435A - Electronic Device and Method for Neural Network-Based Machine Translation

Info

Publication number: KR20210057435A
Application number: KR1020190144186A
Authority: KR
Inventors: 이기영
Original assignee: 한국전자통신연구원
Priority date: 2019-11-12
Filing date: 2019-11-12
Publication date: 2021-05-21

Abstract

The present invention relates to an electronic device and a method for neural network-based machine translation. According to an embodiment of the present invention, the electronic device comprises: a communication circuit capable of communicating with an external electronic device; a memory which stores at least one instruction, a source vocabulary dictionary, a target vocabulary dictionary, and an extended dictionary for neural network-based machine translation (the source vocabulary dictionary and the target vocabulary dictionary relate to source words and translated words, respectively); and a processor. By executing the at least one instruction, the processor acquires a source sentence from the external electronic device through the communication circuit; determines, for at least one word in the source sentence that is an unregistered word or a word related to a user dictionary, a translated word based on an attention probability distribution corresponding to that word and a probability distribution of the target vocabulary dictionary; generates a translated sentence corresponding to the source sentence so that it includes the determined translated word; and provides the generated translated sentence to the external electronic device through the communication circuit. The present invention aims to provide an electronic device and a method capable of performing neural network-based machine translation.

Description

Electronic Device and Method for Neural Network-Based Machine Translation

Various embodiments disclosed in this document relate to neural network-based machine translation technology.

Attempts to automatically translate human language began with the advent of computers. Machine translation techniques have developed continuously, from rule-based methods to pattern-based and statistics-based methods.

Recently, neural network-based machine learning has been introduced into machine translation, bringing a breakthrough improvement in performance. A neural network-based machine translation system can readily produce a machine translation model once a large training corpus is prepared. In addition, such a system can provide not only text-based document translation but also speech-based automatic interpretation services.

A neural network-based machine translation system may encode an input source sentence into an abstracted context vector and, in the decoding process, generate a translated sentence (a combination of translated words) based on the context vector. Because such a system generates translated words from a hidden state (or context vector) in which the meaning of the entire source sentence is condensed, it may be difficult to identify the mapping between source words and translated words, and therefore difficult to apply a user dictionary.

A user dictionary defines the translated word for a source word that must be translated differently than in other fields. Because of the characteristic described above, however, a neural network-based machine translation system may have difficulty providing word-to-word conversion.

Various embodiments disclosed in this document may provide an electronic device and a neural network-based machine translation method to which a user dictionary can be applied.

An electronic device according to an embodiment disclosed in this document includes: a communication circuit capable of communicating with an external electronic device; a memory storing at least one instruction, a source vocabulary dictionary, a target vocabulary dictionary, and an extended dictionary for neural network-based machine translation, the source vocabulary dictionary and the target vocabulary dictionary being related to source words and translated words, respectively; and a processor. By executing the at least one instruction, the processor may obtain a source sentence from the external electronic device through the communication circuit; determine, for at least one word in the source sentence that is an unregistered word or a word related to a user dictionary, a translated word based on an attention probability distribution corresponding to the at least one word and a probability distribution of the target vocabulary dictionary; generate a translated sentence corresponding to the source sentence so as to include the determined translated word; and provide the generated translated sentence to the external electronic device through the communication circuit.

According to various embodiments disclosed in this document, neural network-based machine translation may be performed. In addition, various effects identified directly or indirectly through this document may be provided.

FIG. 1 shows a neural network-based machine translation technique according to an embodiment.
FIG. 2 is a block diagram of an electronic device according to an embodiment.
FIG. 3 illustrates an example of unregistered-word learning by an electronic device according to an embodiment.
FIG. 4 illustrates an example of using an extended dictionary related to unregistered words by an electronic device according to an embodiment.
FIG. 5 illustrates an example of a translation process performed by an electronic device according to an embodiment.
FIG. 6 is a flowchart of a neural network-based machine translation method according to an embodiment.
In the description of the drawings, the same or similar reference numerals may be used for the same or similar components.

FIG. 1 shows a neural network-based machine translation technique according to an embodiment.

Referring to FIG. 1, the encoder 110 of the neural network-based machine translation apparatus 100 may encode the input sentence “I like you” and thereby convert it into an abstracted context vector. In the decoding process, the decoder 120 of the apparatus 100 may generate the translated sentence “나는 당신을 좋아합니다” (Korean for “I like you”) based on the context vector. The apparatus 100 generates the translated words that make up the translated sentence from a hidden state (or context vector) in which the meaning of the entire source sentence is condensed. As a result, a neural network-based machine translation apparatus may have difficulty identifying the mapping between source words and translated words and difficulty applying a user dictionary. However, the neural network-based machine translation apparatus 100 according to an embodiment disclosed in this document may apply a user dictionary based on an extended dictionary.

FIG. 2 is a block diagram of an electronic device (e.g., the neural network-based machine translation apparatus 100 of FIG. 1) according to an embodiment.

Referring to FIG. 2, an electronic device 201 according to an embodiment may include a communication circuit 210, a memory 240, and a processor 250. In an embodiment, some components of the electronic device 201 may be omitted, or the device may include additional components. For example, the electronic device 201 may further include an input device 220 and an output device 230. In addition, some components of the electronic device 201 may be combined into a single entity that performs the same functions as the components did before being combined. The electronic device 201 according to an embodiment may be a web server that provides a neural network-based machine translation function.

The communication circuit 210 may support establishing a communication channel or a wireless communication channel between the electronic device 201 and another device (e.g., an external electronic device 202), and performing communication through the established channel. The communication channel may use, for example, a communication scheme such as LAN (local area network), FTTH (fiber to the home), xDSL (x-digital subscriber line), Wi-Fi, WiBro, 3G, or 4G. The external electronic device 202 may be any terminal device, such as a PC, a notebook computer, a smartphone, a tablet, or a wearable computer, that can access a web or mobile site related to the electronic device 201 or install and run a dedicated application for the service (the machine translation service).

The input device 220 may detect or receive a user input. For example, the input device 220 may include at least one of a mouse, a keyboard, or a touch pad.

The output device 230 may include at least one of a speaker or a display that outputs a translated sentence (or output sentence) under the control of the processor 250. The display may include, for example, a liquid crystal display (LCD), a light-emitting diode (LED) display, an organic light-emitting diode (OLED) display, or an electronic paper display.

The memory 240 may store various data used by at least one component of the electronic device 201 (e.g., the processor 250). The data may include, for example, software and input or output data for instructions related to the software. For example, the memory 240 may store at least one instruction for neural network-based machine translation. The memory 240 may include a source vocabulary dictionary, a target vocabulary dictionary, and an extended dictionary. The source vocabulary dictionary relates to source words, and the target vocabulary dictionary may be a dictionary related to a specified number of target words (or a dictionary that stores the target words or a probability distribution over them). The extended dictionary may be a dictionary related to unregistered words or words related to a user dictionary (or a dictionary that stores the unregistered words or a probability distribution over them). The sizes of the source vocabulary dictionary and the target vocabulary dictionary (e.g., the number of words each can store) may be fixed. The size of the extended dictionary may vary under the control of the processor 250. The memory 240 may include volatile memory or nonvolatile memory.

The processor 250 may control at least one other component (e.g., a hardware or software component) of the electronic device 201 and may perform various data processing or computation. The processor 250 may perform neural network-based machine translation by executing at least one instruction. The processor 250 may include, for example, at least one of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, an application processor, an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA), and may have a plurality of cores.

According to an embodiment, the processor 250 may include an encoder 251, a decoder 253, an attention module 255, and a softmax layer 257. The encoder 251 reads the words (source words) that make up the input sentence (or source sentence) one at a time and, through an embedding process (e.g., word embedding), calculates a hidden state for each word. The encoder 251 may compress all the word information in the source sentence into a single context vector (the last hidden state of the encoder 251). The context vector may be, for example, a vector of the same size as the source vocabulary dictionary. The decoder 253 may receive the context vector and calculate an output vector (a hidden state of the decoder 253) corresponding to it. The output vector may be, for example, a vector of the same size as the target vocabulary dictionary. For example, the decoder 253 may calculate the output vector from the output word at time t-1 and the attention value at time t. The attention value at time t may be, for example, a value indicating the source word related to the word to be predicted at time t. As another example, the attention value may be the sum of all hidden states of the encoder 251 (or all embedded source words) weighted by the attention weights.

The softmax layer 257 may normalize the hidden state (output vector) of the decoder 253 to calculate a first dictionary (e.g., target vocabulary dictionary) probability distribution corresponding to that hidden state. Based on the first dictionary probability distribution, the decoder 253 may determine the translated word with the highest probability as the output word (a translated word making up the translated sentence) and output the words sequentially, one at a time. Alternatively, for an unregistered word in the source sentence, the decoder 253 may determine as the output word the translated word with the highest probability (or attention weight), based on the first dictionary probability distribution and the second dictionary probability distribution corresponding to the unregistered word. In this way, the decoder 253 may output a translated sentence made up of the sequentially output words.

According to the attention mechanism, the attention module 255 may calculate the set of attention scores at time t from the hidden state of the decoder 253 at time t-1 and all hidden states of the encoder 251 at time t, and may have the softmax layer 257 normalize that set of scores to obtain the attention probability distribution at time t (hereinafter also referred to as the “second dictionary probability distribution”). The set of attention scores may be calculated, for example, by transposing the hidden state of the decoder 253 at time t-1 and multiplying it by all hidden states of the encoder 251 at time t. Each value of the attention probability distribution may be an attention weight. For convenience, the remainder of this document describes the configuration and functions of the encoder 251, the decoder 253, the attention module 255, and the softmax layer 257 with the processor 250 as the acting subject.
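The attention computation described above (scores from the previous decoder state and the encoder states, softmax normalization, then a weighted sum) can be sketched in a few lines. This is only an illustrative reading of the paragraph, not the patent's implementation: the function names, the two-dimensional toy vectors, and the use of plain Python lists are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Normalize raw scores into a probability distribution."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention(decoder_prev_hidden, encoder_hiddens):
    """Return (attention_weights, attention_value) for one decoding step."""
    # Attention scores: decoder state at t-1 against every encoder hidden state.
    scores = [dot(decoder_prev_hidden, h) for h in encoder_hiddens]
    # Attention probability distribution: softmax over the scores.
    weights = softmax(scores)
    # Attention value: encoder hidden states summed with the attention weights.
    dim = len(encoder_hiddens[0])
    value = [sum(w * h[i] for w, h in zip(weights, encoder_hiddens))
             for i in range(dim)]
    return weights, value

# Toy input (hypothetical numbers): three source words, 2-dimensional states.
encoder_hiddens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_prev_hidden = [0.0, 2.0]
weights, value = attention(decoder_prev_hidden, encoder_hiddens)
```

Each entry of `weights` corresponds to one source position, which is what later allows an output word to be tied back to a specific source word.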

According to an embodiment, the processor 250 may determine the probability distribution of the target vocabulary dictionary (hereinafter, the “first dictionary probability distribution”) related to pairs of registered source words and translated words, through supervised learning on pairs of a source sentence and a translated sentence that respectively contain a registered word (a registered source word) and the translated word for that registered word.

According to an embodiment, the processor 250 may determine the probability distribution of the extended dictionary related to unregistered words (hereinafter, the “second dictionary probability distribution”) through supervised learning on pairs of a source sentence and a translated sentence that each contain an unregistered word (a source word not registered in the target vocabulary dictionary). For example, when the processor 250 obtains from the external electronic device 202 a source sentence containing an unregistered word, it may, after encoding and decoding, take the highest probability in the attention probability distribution corresponding to the unregistered word as the generation probability of that word. In this case, the generation probability of the unregistered word may be the generation probability of a word that appears in the translated sentence in its source-language form. When the processor 250 identifies the unregistered word in the translated sentence, it may increase the number of entries in the extended dictionary and store the determined generation probability in the added entry. In this regard, the processor 250 may increase the size of the extended dictionary according to the number of words (e.g., unregistered words) related to it. For example, the processor 250 may increase the size (or storage space) of the extended dictionary as the number of related words grows.

According to an embodiment, the processor 250 may obtain a source sentence to be translated from the external electronic device 202 through the communication circuit 210 during the translation process. For example, the processor 250 may receive, through a designated website, a source sentence for which the external electronic device 202 has requested translation.

According to an embodiment, upon obtaining the source sentence, the processor 250 may calculate a hidden state of the encoder 251 for each source word in the source sentence. The processor 250 may calculate the hidden state (output vector) of the decoder 253 at the current time step based on the output word (translated word) of the decoder 253 at the previous time step and the attention value. The processor 250 may determine the translated word corresponding to the output vector based on at least one of the first dictionary probability distribution and the second dictionary probability distribution, and may generate a translated sentence that contains the determined translated words in sequence. The processor 250 may provide the generated translated sentence to the external electronic device 202 through the communication circuit 210.

According to an embodiment, the processor 250 may check whether the obtained source sentence contains an unregistered word and, if it does not, perform machine translation based on the first dictionary probability distribution. For example, if the obtained source sentence contains no unregistered word, the processor 250 may generate the translated sentence by determining, one at a time, the translated word with the highest probability under a seq-to-seq machine translation scheme, based on the first dictionary probability distribution corresponding to the current hidden state of the decoder 253. An unregistered word may include, for example, a proper noun with a low frequency of use.

According to an embodiment, when the source sentence contains an unregistered word, the processor 250 may perform machine translation based on the first dictionary probability distribution and the second dictionary probability distribution. For example, if the obtained source sentence contains an unregistered word, the processor 250 may, after encoding and decoding the source sentence, determine the translated word with the highest generation probability based on the first and second dictionary probability distributions corresponding to the hidden state of the decoder 253, and may generate a translated sentence that contains the determined translated words in sequence.
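The two cases above (target-dictionary distribution only, or both distributions when an unregistered word is present) can be illustrated with a small selection function. The text does not specify how the two distributions are combined, so this sketch assumes the simplest reading: the candidate with the highest probability across both is chosen. The function name and the toy probabilities are hypothetical.

```python
def pick_output_word(first_dist, second_dist=None):
    """Pick the candidate with the highest probability.

    first_dist:  {word: probability} over the fixed target vocabulary dictionary
    second_dist: {word: probability} over the extended dictionary, or None
                 when the source sentence contains no unregistered word
    """
    candidates = dict(first_dist)
    if second_dist:
        # Assumption: both distributions are compared on a single scale.
        candidates.update(second_dist)
    return max(candidates, key=candidates.get)

# Toy distributions (made-up numbers, for illustration only).
first_dist = {"나는": 0.2, "사랑한다": 0.5, "와": 0.3}
second_dist = {"Susan": 0.7, "Sandra": 0.3}

word_without_oov = pick_output_word(first_dist)
word_with_oov = pick_output_word(first_dist, second_dist)
```

A real system would more likely mix the two distributions with a learned gate, but that design is outside what the text states.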

According to an embodiment, when the processor 250 identifies a word related to the user dictionary in the source sentence, it may generate the translated sentence so as to include the word related to the user dictionary. A word related to the user dictionary may include, for example, a pair of a source word and a translated word delimited by designated symbols. For example, the processor 250 may determine, based on the first and second dictionary probability distributions, that the translated word corresponding to the output vector of the decoder 253 is a word related to the user dictionary. In this case, the processor 250 may output the translated word contained in that user-dictionary word as the translated word to be included in the translated sentence.

According to various embodiments, the processor 250 may identify a source sentence entered through the input device 220, generate a translated sentence corresponding to the source sentence, and output the generated translated sentence through the output device 230.

According to the above-described embodiment, when performing neural network-based machine translation, the electronic device 201 may generate and output a translated sentence that includes the translated word corresponding to a source word in the source sentence or to the user dictionary.

Also, according to the above-described embodiment, the electronic device 201 configures an extended dictionary separately from the target vocabulary dictionary and, based on the attention mechanism, identifies the position of the source word that corresponds to the translated word to be predicted at each time step. When the identified source word is a word related to the user dictionary or an unregistered word, the electronic device 201 may retrieve the translated word from the extended dictionary and generate it.
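As a rough sketch of this lookup: the source position with the largest attention weight is taken as the source word aligned with the current output step, and if that word has an entry in the extended dictionary, the entry is emitted instead of the decoder's own choice. The dictionary layout (source word mapped to the word to emit), the function name, and the attention numbers are assumptions for illustration.

```python
def resolve_word(attention_weights, source_words, extended_dict, decoder_choice):
    """Return the word to emit at the current decoding step."""
    # Source position the decoder attends to most at this step.
    pos = max(range(len(attention_weights)), key=lambda i: attention_weights[i])
    src = source_words[pos]
    # Unregistered or user-dictionary word: take the entry from the extended dictionary.
    if src in extended_dict:
        return extended_dict[src]
    # Otherwise keep the word chosen from the target vocabulary dictionary.
    return decoder_choice

source_words = ["I", "love", "Susan", "and", "Sandra"]
# Hypothetical layout: unregistered proper nouns are emitted in their source form.
extended_dict = {"Susan": "Susan", "Sandra": "Sandra"}

copied = resolve_word([0.05, 0.05, 0.80, 0.05, 0.05],
                      source_words, extended_dict, "사랑한다")
kept = resolve_word([0.10, 0.70, 0.10, 0.05, 0.05],
                    source_words, extended_dict, "사랑한다")
```

Here `copied` comes from the extended dictionary because attention peaks on “Susan”, while `kept` falls back to the decoder's word because attention peaks on a registered word.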

FIG. 3 is a diagram for describing unregistered-word learning by an electronic device according to an embodiment.

Referring to FIG. 3, the electronic device 201 may determine the second dictionary (extended dictionary) probability distribution related to unregistered words through supervised learning on pairs of a source sentence containing an unregistered word and a translated sentence containing the unregistered word. The source vocabulary dictionary and the target vocabulary dictionary 310 stored in the memory 240 may each be able to store 10,000 words. For example, the source vocabulary dictionary and the target vocabulary dictionary 310 may store source words and target words identified by indexes (IDs) from 0 to 9999.

The pair 320 of a source sentence (training corpus source text) and a translated sentence (training corpus translation) to be learned by the electronic device 201 may be “I love Susan and Sandra” and “나는 Susan과 Sandra를 사랑한다.”, respectively. This document assumes that “Susan” and “Sandra”, being proper nouns and rare words with a low frequency of use, are identified by the electronic device 201 as unregistered words. Accordingly, “Susan” and “Sandra” may not be included in the source vocabulary dictionary, and the target vocabulary dictionary may not include translated words corresponding to them.

When the electronic device 201 identifies “Susan” and “Sandra” as unregistered words, it may assign them indexes (IDs) starting from 10000 and store “Susan” and “Sandra” in the unregistered word list 330 in association with the assigned indexes (IDs). For example, the electronic device 201 may assign the indexes (IDs) 10000 and 10001 to “Susan” and “Sandra”, respectively, so that they are distinguished from the indexes (IDs) of the target vocabulary dictionary. When the electronic device 201 identifies a translated sentence containing the unregistered words “Susan” and “Sandra”, it may create space in the extended dictionary 340 for storing the generation probability values of “Susan” and “Sandra”. For example, the electronic device 201 may add entries to the extended dictionary 340 to store the generation probability values of “Susan” and “Sandra”.
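
The index allocation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `register_unregistered_words`, the toy three-word source dictionary, and the placeholder value 0.0 for a not-yet-computed generation probability are all assumptions made for the example.

```python
from typing import Dict, List

BASE_VOCAB_SIZE = 10000  # IDs 0..9999 are taken by the target vocabulary dictionary


def register_unregistered_words(
    tokens: List[str],
    source_vocab: Dict[str, int],
    oov_list: Dict[str, int],
    extended_dict: Dict[int, float],
) -> Dict[str, int]:
    """Give each word missing from source_vocab an ID from BASE_VOCAB_SIZE upward."""
    for token in tokens:
        if token in source_vocab or token in oov_list:
            continue  # already known, nothing to allocate
        new_id = BASE_VOCAB_SIZE + len(oov_list)
        oov_list[token] = new_id
        # Reserve an entry in the extended dictionary (hypothetical placeholder value).
        extended_dict[new_id] = 0.0
    return oov_list


# Toy stand-in for the 10,000-word source vocabulary dictionary (assumption).
source_vocab = {"I": 0, "love": 1, "and": 2}
oov_list: Dict[str, int] = {}
extended_dict: Dict[int, float] = {}

register_unregistered_words(
    "I love Susan and Sandra".split(), source_vocab, oov_list, extended_dict
)
print(oov_list)  # {'Susan': 10000, 'Sandra': 10001}
```

Starting the IDs at the size of the base dictionary keeps the two ID ranges disjoint, which is the property the text relies on.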

Thereafter, the electronic device 201 may determine the generation probability values of the unregistered words based on the attention probability distribution of each unregistered word, and store the determined values in the extended dictionary 340. For example, the electronic device 201 may take the highest probability value in the attention probability distribution corresponding to an unregistered word as that word's generation probability value. In this case, the generation probability value of an unregistered word may be the generation probability value of a word that appears in the translated sentence in its source-language form.

FIG. 4 is a diagram illustrating an example of using the extended dictionary associated with unregistered words by an electronic device according to an embodiment. This document assumes that the electronic device 201 performs machine translation based on an RNN-based end-to-end (sequence-to-sequence) neural network.

Referring to FIG. 4, in operation 410, the electronic device 201 may compute, through the decoder 253, an output vector having the same size as the target vocabulary dictionary. In operation 420, the electronic device 201 may input the output vector to the softmax layer 257 and normalize it there, thereby computing the probability distribution 425 with which each word of the target vocabulary dictionary will be generated.
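
For reference, the normalization performed by a softmax layer can be written in a few lines. The sketch below uses only the standard library and a three-element toy vector in place of the dictionary-sized decoder output; the function name `softmax` and the input values are illustrative assumptions.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Normalize a decoder output vector into a probability distribution."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy output vector; a real one would have one element per target-dictionary word.
probs = softmax([2.0, 1.0, 0.1])
print(round(sum(probs), 6))  # 1.0
```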

In operation 430, the electronic device 201 may generate, through the attention module 255, an attention probability distribution 435 for the word to be predicted at time t. When the electronic device 201 identifies an unregistered word, it may determine the generation probability value of the unregistered word based on the attention probability distribution 435 and store the determined value in the extended dictionary. For example, the electronic device 201 may take the highest probability value in the attention probability distribution 435 as the generation probability value of the unregistered word. Thereafter, the electronic device 201 may generate a translated sentence corresponding to the original sentence based on the first dictionary probability distribution and the second dictionary (extended dictionary) probability distribution.
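
The rule "take the highest attention value as the generation probability" can be sketched as below. The function name `store_generation_probability`, the dictionary layout keyed by word ID, and the attention weights are assumptions chosen for illustration.

```python
from typing import Dict, Sequence


def store_generation_probability(
    attention: Sequence[float],
    word_id: int,
    extended_dict: Dict[int, float],
) -> float:
    """Use the largest attention weight as the word's generation probability."""
    if not attention:
        raise ValueError("attention distribution must not be empty")
    probability = max(attention)
    extended_dict[word_id] = probability  # keyed by the unregistered word's ID
    return probability


extended: Dict[int, float] = {}
# Hypothetical attention weights over three source positions.
store_generation_probability([0.2, 0.5, 0.3], 10000, extended)
print(extended)  # {10000: 0.5}
```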

FIG. 5 is a diagram illustrating an example of a translation process performed by an electronic device according to an embodiment.

Referring to FIG. 5, the electronic device 201 may identify an original sentence (or input sentence) “I like <tr>Susan|수쟨<tr>” 510 from the external electronic device 202 or the input device 220. Upon identifying the original sentence 510, the electronic device 201 may identify “Susan|수쟨<tr>”, delimited by the first designated symbol <tr>, as a word related to the user dictionary. In addition, the electronic device 201 may identify the word “Susan” preceding the second designated symbol “|” as the source word, and “수쟨” following “|” as the substitute word. The electronic device 201 may store the substitute word “수쟨” in the unregistered word list 530 in association with an index (ID), and may add to the extended dictionary an entry for storing the generation probability value of “수쟨”, the substitute word of the unregistered word.
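
One way to pull user-dictionary pairs out of such an input is a regular expression over the two designated symbols. This is only a sketch: the pattern, the function name `extract_user_entries`, and the choice to leave the source word in the cleaned sentence are assumptions, not details given in the text.

```python
import re
from typing import List, Tuple

# "<tr>" delimits a user-dictionary entry; "|" separates source and substitute word.
USER_ENTRY = re.compile(r"<tr>([^|<]+)\|([^|<]+)<tr>")


def extract_user_entries(sentence: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the sentence with markup removed and the (source, substitute) pairs."""
    pairs = [
        (match.group(1).strip(), match.group(2).strip())
        for match in USER_ENTRY.finditer(sentence)
    ]
    # Keep only the source word in the text that would be fed to the encoder.
    plain = USER_ENTRY.sub(lambda match: match.group(1).strip(), sentence)
    return plain, pairs


plain, pairs = extract_user_entries("I like <tr>Susan|수쟨<tr>")
print(plain)  # I like Susan
print(pairs)  # [('Susan', '수쟨')]
```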

The electronic device 201 may generate the translated sentence 520 up to “나는” and, at the point of generating the next substitute word “수쟨”, determine through the decoder 253 and the softmax layer 257 the target vocabulary probability distribution 560 for “수쟨”, the word to be output at the current time step. In addition, the electronic device 201 may estimate, through the attention module 255, the source word that corresponds to the substitute word “수쟨” about to be generated. The electronic device 201 may thereby confirm that the source word corresponding to the substitute word to be generated is “Susan|수쟨<tr>”. The electronic device 201 may confirm that the value (attention weight) of the attention probability distribution 540 at the position of “Susan|수쟨<tr>” is 0.7, the highest value. The electronic device 201 may store the generation probability value 0.7 of “수쟨” at the position in the extended dictionary 570 corresponding to the index (ID) of “수쟨”. Thereafter, the electronic device 201 may output the translated sentence 550 up to “나는 수쟨”.

FIG. 6 is a flowchart of a neural network-based machine translation method according to an embodiment.

Referring to FIG. 6, in operation 610, the electronic device 201 may obtain, from the external electronic device 202, an original sentence containing at least one word among an unregistered word or a word related to a user dictionary.

In operation 620, the electronic device 201 may determine, for the at least one word in the original sentence, a substitute word based on the attention probability distribution corresponding to the at least one word and the probability distribution of the target vocabulary dictionary.

In operation 630, the electronic device 201 may generate a translated sentence corresponding to the original sentence so that it includes the determined substitute word.

In operation 640, the electronic device 201 may provide the generated translated sentence to the external electronic device 202.

According to an embodiment, an electronic device includes: a communication circuit capable of communicating with an external electronic device; a memory storing at least one instruction for neural network-based machine translation, a source vocabulary dictionary, a target vocabulary dictionary, and an extended dictionary, the source vocabulary dictionary and the target vocabulary dictionary being associated with source words and substitute words, respectively; and a processor. By executing the at least one instruction, the processor may obtain an original sentence from the external electronic device through the communication circuit; determine, for at least one word among an unregistered word or a word related to a user dictionary in the original sentence, a substitute word based on an attention probability distribution corresponding to the at least one word and a probability distribution of the target vocabulary dictionary; generate a translated sentence corresponding to the original sentence so that it includes the determined substitute word; and provide the generated translated sentence to the external electronic device through the communication circuit.

The processor may generate the translated sentence corresponding to the original sentence using the substitute word that has the highest generation probability value across the probability distribution of the words of the target vocabulary dictionary and the probability distribution of the words of the extended dictionary.
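
Selecting the word with the highest generation probability across both dictionaries can be sketched as a single maximum over the two distributions. All names and values below are illustrative assumptions except the 0.7 stored for “수쟨” in the FIG. 5 example; in particular, the sketch compares the raw values without renormalizing them, which the text does not specify either way.

```python
from typing import Dict, Tuple


def pick_next_word(
    target_probs: Dict[int, float],
    extended_probs: Dict[int, float],
    id_to_word: Dict[int, str],
) -> Tuple[str, float]:
    """Pick the word whose generation probability is highest in either dictionary."""
    # IDs never collide because extended-dictionary IDs start at 10000.
    merged = {**target_probs, **extended_probs}
    best_id = max(merged, key=lambda word_id: merged[word_id])
    return id_to_word[best_id], merged[best_id]


# Hypothetical target-dictionary probabilities for the current time step.
target_probs = {0: 0.05, 1: 0.15, 2: 0.10}
extended_probs = {10000: 0.7}  # value stored for "수쟨" in the FIG. 5 example
id_to_word = {0: "나는", 1: "사랑한다", 2: "<unk>", 10000: "수쟨"}

print(pick_next_word(target_probs, extended_probs, id_to_word))  # ('수쟨', 0.7)
```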

The substitute word corresponding to the unregistered word may be the unregistered word itself.

The processor may learn pairs of original sentences, each containing one or more unregistered words, and the translated sentences corresponding to them, thereby determining the attention probability distributions corresponding to the unregistered words contained in the original sentences; determine generation probability values of the one or more unregistered words based on the attention probability distributions; and store the determined generation probability values in the extended dictionary in association with the one or more unregistered words.

The processor may vary the size of the extended dictionary according to the number of words associated with the extended dictionary.

When the processor identifies, in the original sentence, words delimited by designated symbols, it may determine the delimited words to be words related to the user dictionary.

A word related to the user dictionary may include a pair of a source word and a substitute word separated by the designated symbols.

While generating the translated sentence, the processor may check the position-wise attention probability distribution corresponding to the word related to the user dictionary, determine the position with the highest probability value in that distribution to be the position of the word related to the user dictionary, and generate the translated sentence so that it includes the word related to the user dictionary at the determined position.
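
Under one possible reading of this step, the device looks for the output position whose attention weight on the user-dictionary source word is largest and emits the substitute word there. The sketch below follows that reading only; the function names, the draft sentence with an `<unk>` slot, the word “좋아한다”, and every attention weight other than 0.7 are assumptions for illustration.

```python
from typing import List, Sequence


def step_attending_most(attention_rows: Sequence[Sequence[float]], source_pos: int) -> int:
    """Return the output step whose attention weight on source_pos is largest."""
    return max(range(len(attention_rows)), key=lambda step: attention_rows[step][source_pos])


def place_user_word(
    draft: Sequence[str],
    attention_rows: Sequence[Sequence[float]],
    source_pos: int,
    substitute: str,
) -> List[str]:
    """Put the substitute word at the output step that attends most to source_pos."""
    step = step_attending_most(attention_rows, source_pos)
    result = list(draft)
    result[step] = substitute
    return result


# Source positions: 0 = "I", 1 = "like", 2 = "Susan". One row per output step.
attention_rows = [
    [0.8, 0.1, 0.1],  # step 0
    [0.1, 0.2, 0.7],  # step 1: highest weight on "Susan"
    [0.1, 0.8, 0.1],  # step 2
]
draft = ["나는", "<unk>", "좋아한다"]  # hypothetical draft output
print(place_user_word(draft, attention_rows, 2, "수쟨"))  # ['나는', '수쟨', '좋아한다']
```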

According to an embodiment, an electronic device includes: an input device; an output device; a memory storing a target vocabulary dictionary and an extended dictionary; and a processor. The processor may obtain an original sentence from the external electronic device through the input device; upon identifying, in the original sentence, an unregistered word not included in the target vocabulary dictionary, determine a generation probability value of the unregistered word based on the attention probability distribution corresponding to the unregistered word; store the unregistered word in the extended dictionary in association with the determined probability value; generate a translated sentence corresponding to the original sentence so that it includes the unregistered word; and output the generated translated sentence through the output device.

According to an embodiment, a neural network-based machine translation method may include: obtaining, from an external electronic device, an original sentence containing at least one word among an unregistered word or a word related to a user dictionary; determining, for the at least one word in the original sentence, a substitute word based on an attention probability distribution corresponding to the at least one word and a probability distribution of the target vocabulary dictionary; generating a translated sentence corresponding to the original sentence so that it includes the determined substitute word; and providing the generated translated sentence to the external electronic device.

The various embodiments of this document and the terms used in them are not intended to limit the technical features described herein to specific embodiments, and should be understood to include various modifications, equivalents, or substitutes of those embodiments. In the description of the drawings, similar reference numerals may be used for similar or related components. The singular form of a noun corresponding to an item may include one or more of the item unless the relevant context clearly indicates otherwise. In this document, each of the phrases "A or B", "at least one of A and B", "at least one of A or B", "A, B or C", "at least one of A, B and C", and "at least one of A, B, or C" may include any one of the items listed together in that phrase, or all possible combinations of them. Terms such as "1st", "2nd", "first", or "second" may be used simply to distinguish one component from another and do not limit the components in other respects (e.g., importance or order). When a (e.g., first) component is referred to as "coupled" or "connected" to another (e.g., second) component, with or without the term "functionally" or "communicatively", it means that the component may be connected to the other component directly (e.g., by wire), wirelessly, or through a third component.

The term "module" as used in this document may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed component, or a minimum unit of such a component, or a part thereof, that performs one or more functions. For example, according to an embodiment, a module may be implemented in the form of an application-specific integrated circuit (ASIC).

The various embodiments of this document may be implemented as software (e.g., a program) including one or more instructions stored in a storage medium (e.g., internal memory or external memory, such as the memory 240) readable by a machine (e.g., the electronic device 201). For example, a processor (e.g., the processor 250) of the machine (e.g., the electronic device 201) may call at least one of the one or more instructions stored in the storage medium and execute it. This enables the machine to be operated to perform at least one function according to the at least one called instruction. The one or more instructions may include code generated by a compiler or code executable by an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" only means that the storage medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between data being stored semi-permanently in the storage medium and data being stored temporarily.

According to an embodiment, a method according to the various embodiments disclosed in this document may be provided as part of a computer program product. A computer program product may be traded between a seller and a buyer as a commodity. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least part of the computer program product may be at least temporarily stored, or temporarily generated, in a machine-readable storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

According to various embodiments, each of the components described above (e.g., a module or a program) may include a single entity or multiple entities. According to various embodiments, one or more of the components or operations described above may be omitted, or one or more other components or operations may be added. Alternatively or additionally, multiple components (e.g., modules or programs) may be integrated into a single component. In that case, the integrated component may perform one or more functions of each of the multiple components in the same or a similar manner as they were performed by the corresponding component before the integration. According to various embodiments, operations performed by a module, a program, or another component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the operations may be executed in a different order or omitted; or one or more other operations may be added.

Claims

An electronic device comprising:
a communication circuit capable of communicating with an external electronic device;
a memory storing at least one instruction for neural network-based machine translation, a source vocabulary dictionary, a target vocabulary dictionary, and an extended dictionary, the source vocabulary dictionary and the target vocabulary dictionary being associated with source words and substitute words, respectively; and
a processor,
wherein the processor, by executing the at least one instruction:
obtains an original sentence from the external electronic device through the communication circuit;
determines, for at least one word among an unregistered word or a word related to a user dictionary in the original sentence, a substitute word based on an attention probability distribution corresponding to the at least one word and a probability distribution of the target vocabulary dictionary;
generates a translated sentence corresponding to the original sentence so that it includes the determined substitute word; and
provides the generated translated sentence to the external electronic device through the communication circuit.