KR102065083B1

KR102065083B1 - An English sentence translation system with machine learning-based direct-reading symbols

Info

Publication number: KR102065083B1
Application number: KR1020190052135A
Authority: KR
Inventors: 손민석
Original assignee: 손민석
Priority date: 2019-05-03
Filing date: 2019-05-03
Publication date: 2020-01-10

Abstract

The present invention relates to an English sentence translation providing system with direct-reading symbols based on machine learning. The English sentence translation providing system with direct-reading symbols based on machine learning according to the present invention comprises: a content input part receiving English content from a user terminal; a direct-reading symbol generation part adding a direct-reading symbol to the English content received in the content input part; a translation part receiving an English sentence including the direct-reading symbol generated in the direct-reading symbol generation part and performing direct-reading translation to generate a direct-reading translation; and a content transmission part providing the English content with the direct-reading symbol generated from the direct-reading symbol generation part and the direct-reading translation to the user terminal.

Description

An English sentence translation system with machine learning-based direct-reading symbols

The present invention relates to a machine learning-based system for providing English sentence translation with direct-reading symbols.

To date, English-Korean machine translation systems have generally consisted of three stages: English analysis, transfer, and Korean generation. In particular, the English analysis stage, which determines the performance of a machine translation system, goes through three processes: English morphological analysis, syntactic analysis, and meaning generation.

In general, the part of speech of each word is assigned during morphological analysis, and the resulting part-of-speech sequence is taken as input during syntactic analysis to analyze the structure of the sentence. The success rate is lower for compound or complex sentences than for simple sentences, and the longer and more complicated a sentence is, as with compound-complex sentences, the higher the probability of failure.

This is because, when the whole sentence is analyzed at once, the rules are often insufficient, or errors often arise in handling the various ambiguities present in the syntax.

As a solution to this problem, a long sentence is sometimes divided into clause-level simple sentences before the structure of the English sentence is analyzed. The role of each clause can be made clear, and ambiguity is lower at the word level, so a more accurate translation is possible.

Although such efforts are said to have improved existing English-Korean translators considerably, reducing errors and shortening translation time, these translators still fall short. Most of these systems have not escaped their own structural limitations. That is, for a long sentence, a step that divides it into clauses must be included to reduce failures in structural analysis, and to determine a single sentence component within a clause, all possible candidate forms must be verified sequentially from the word level up to the phrase and clause levels. The same process must be repeated for each sentence component, after which sentence-pattern rules are applied to parse the sentence.

This is unavoidable because syntactic analysis, the backbone of existing translators, is based on the conventional English grammar system. In that system, sentence components are divided into grammatical elements such as subject, verb, object, complement, and modifier (place, time, reason, manner); the five Ws and one H (who, what, where, when, why, how) are applied to the English sentence; and sentence patterns are classified on that basis.

To interpret an English sentence using this grammar, the clauses must first be separated, and then the sentence pattern of each clause must be classified and meaning assigned according to its sentence components. As many grammar rules as there are basic sentence patterns come into play, and their application is repeated for every clause. When modified rules are added on top of this, the interpretation process inevitably becomes long and complicated.

In addition, even after syntactic analysis is completed and the sentence components are assigned, the detailed structure must be reviewed again as a whole in the meaning-assignment stage. Otherwise, semantic processing is frequently impossible or incomplete, which is a serious defect and a major cause of translation errors. In particular, because the complex structure of verb phrases, which are highly developed in English, cannot be analyzed in detail, an accurate translation cannot be produced unless the original text is re-examined when meaning is assigned.

Korean Registered Patent No. 10-1302875

The present invention has been devised to solve the above problems. Its object is to provide a machine learning-based system for providing English sentence translation with direct-reading symbols. The system analyzes an English sentence by sentence component or part of speech and marks the sentence data with a symbol specific to each sentence component or part of speech, so that the sentence structure can be grasped quickly and easily. Translation then proceeds using the identified sentence structure, which enables fast and accurate translation.

The present invention comprises: a content input unit that receives English content from a user terminal; a direct-reading symbol generation unit that adds direct-reading symbols to the English content received by the content input unit; a translation unit that receives an English sentence including the direct-reading symbols generated by the direct-reading symbol generation unit and performs direct-reading translation to generate a direct-reading translation; and a content transmission unit that provides the user terminal with the direct-reading translation and with the English content to which the direct-reading symbols generated by the direct-reading symbol generation unit have been added.

According to the present invention, an English sentence is analyzed by sentence component or part of speech, and the sentence data is marked with a symbol specific to each sentence component or part of speech, so that the sentence structure can be grasped quickly and easily. Translation proceeds using the identified sentence structure, which enables fast and accurate translation.

FIG. 1 is a block diagram of the machine learning-based system for providing English sentence translation with direct-reading symbols proposed by the present invention.
FIG. 2 is an example of sentence data on which symbol marking has been performed according to the present invention.
FIG. 3 is an example in which a direct-reading translation is provided.
FIG. 4 is an example in which a Korean word-order translation is provided.
FIG. 5 is a diagram showing the concept image generation process.
FIG. 6 is a diagram showing the comparison image generation process.
FIG. 7 is a detailed block diagram of the translation unit of FIG. 1.

Further features and effects of the present invention will be clearly understood from the best mode for carrying out the invention described below.

Embodiment

Hereinafter, the present invention is described in detail according to a preferred embodiment, with reference to the accompanying drawings.

FIG. 1 is a block diagram of the machine learning-based system for providing English sentence translation with direct-reading symbols proposed by the present invention.

Referring to FIG. 1, the machine learning-based system (1) for providing English sentence translation with direct-reading symbols proposed by the present invention may be divided into a content input unit (2), a direct-reading symbol generation unit (3), a translation unit (3-1), a concept image generation unit (3-2), a comparison image generation unit (3-3), and a content transmission unit (4).

The content input unit (2) receives English content such as text, images, and videos from the user terminal. Here, English content means content that includes English sentences. When text is received, the content input unit (2) extracts the English sentences and provides them to the direct-reading symbol generation unit (3). When an image or video is received, it extracts the English sentences contained in the image or video and provides the extracted sentences to the direct-reading symbol generation unit (3).

The content input unit (2) includes a mobile phone camera. It may receive English sentences photographed with the mobile phone camera, and it may also receive English sentences from the web.

Next, the direct-reading symbol generation unit (3) identifies the context from the English content received through the content input unit (2) and adds the most appropriate direct-reading symbols.

In addition, the translation unit (3-1) receives an English sentence including the direct-reading symbols generated by the direct-reading symbol generation unit (3) and performs direct-reading translation and Korean word-order translation to generate a direct-reading translation and a Korean word-order translation.

Here, the translation unit (3-1) computes the meaning units indicated by the direct-reading symbols in the marked English sentence and, on that basis, performs direct reading meaning unit by meaning unit.

Next, the concept image generation unit (3-2) generates a concept image that follows the direct-reading translation.

Here, a concept image means an image produced so that concepts are evoked one after another in the order of the direct-reading translation.

In addition, the concept image generation unit (3-2) may generate and provide a direct-reading symbol explanation image, a claim (negation) meaning-unit display image, a connective display image, a negative-phrase display image, and the like.

The comparison image generation unit (3-3) generates a comparison image produced so that concepts are evoked one after another in the order of the Korean word-order translation.

In addition, the content transmission unit (4) provides the user terminal with the content to which direct-reading symbols have been added by the direct-reading symbol generation unit (3), the direct-reading translation, the Korean word-order translation, the concept image, and the comparison image.
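The data flow among the units described above can be sketched as follows. This is a minimal illustration only: the names `Delivery` and `run_pipeline`, and the idea of passing each unit in as a callable, are assumptions made for the sketch and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Delivery:
    """What the content transmission unit (4) returns to the user terminal."""
    marked_content: str
    direct_reading_translation: str
    korean_word_order_translation: str
    concept_image: Optional[bytes] = None
    comparison_image: Optional[bytes] = None


def run_pipeline(
    english_text: str,
    add_symbols: Callable[[str], str],            # stands in for unit (3)
    translate: Callable[[str], Tuple[str, str]],  # stands in for unit (3-1)
    make_concept_image: Optional[Callable[[str], bytes]] = None,     # unit (3-2)
    make_comparison_image: Optional[Callable[[str], bytes]] = None,  # unit (3-3)
) -> Delivery:
    # Unit (3): add direct-reading symbols to the extracted English sentence.
    marked = add_symbols(english_text)
    # Unit (3-1): produce both the direct-reading and Korean word-order translations.
    direct, korean = translate(marked)
    # Units (3-2) and (3-3): the images follow the respective translations.
    concept = make_concept_image(direct) if make_concept_image else None
    comparison = make_comparison_image(korean) if make_comparison_image else None
    return Delivery(marked, direct, korean, concept, comparison)
```

Passing the units in as callables keeps the sketch independent of how each one is implemented, which the specification leaves open apart from stating that they may be machine learning-based.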

The machine learning-based system (1) for providing English sentence translation with direct-reading symbols and the user terminal may exchange data with each other using short-range or long-range communication.

The short-range communication applied here may include Bluetooth, RFID (Radio Frequency Identification), infrared communication (IrDA, Infrared Data Association), UWB (Ultra Wideband), ZigBee, and Wi-Fi (Wireless Fidelity) technologies.

The long-range communication applied may include CDMA (code division multiple access), FDMA (frequency division multiple access), TDMA (time division multiple access), OFDMA (orthogonal frequency division multiple access), and SC-FDMA (single carrier frequency division multiple access) technologies.

The direct-reading symbol generation unit (3), translation unit (3-1), concept image generation unit (3-2), and comparison image generation unit (3-3) described above may be implemented based on machine learning. The specific functions of the present invention are described below in terms of the elements that make up the system (1): the content input unit (2), direct-reading symbol generation unit (3), translation unit (3-1), concept image generation unit (3-2), comparison image generation unit (3-3), and content transmission unit (4).

The content input unit (2) according to the present invention receives English content such as text, voice, images, and videos from the user terminal.

The direct-reading symbol generation unit (3) then performs morphological analysis on the input English content.

In more detail, the direct-reading symbol generation unit (3) analyzes the sentence components, which consist of a combination of two or more of subject, predicate, object, complement, and other modifiers, determines which of the first to fifth sentence patterns applies, and generates first result data containing the resulting sentence components and sentence pattern.

The direct-reading symbol generation unit (3) then takes the first result data, in which the sentence components and sentence pattern of the sentence data have been analyzed, and analyzes each sentence component against a number of part-of-speech groups, including nouns, verbs, modifiers, infinitives, gerunds, participles, comparatives, coordinating conjunctions, conjunctions, relative pronouns, and relative adverbs, used singly or in combination. This produces second result data that classifies the sentence data by part of speech.

In addition, the direct-reading symbol generation unit (3) takes the first result data and the second result data and marks the sentence data with a specific symbol for each part of speech to generate symbol-processed sentence data. Such a symbol may be an underline beneath the corresponding word or phrase, or a circle, triangle, arrow, or the like. The symbols according to this embodiment are only one example, and a variety of symbol markings may be adopted.

The content transmission unit (4) receives the symbol-processed sentence data from the direct-reading symbol generation unit (3) and provides it to the user terminal.

FIG. 2 is an example of sentence data on which symbol marking has been performed according to the present invention.

Referring to the drawing, according to an embodiment of the present invention, the direct-reading symbol generation unit (3) classifies the sentence by sentence component or part of speech and applies specific symbol processing. In detail, when the classified sentence component or part of speech is a verb, the corresponding word or phrase is underlined.

For a modifier phrase (preposition + noun), a '/' symbol is placed in front of the corresponding word or phrase. For an infinitive, a circle is drawn around 'to' and the part following 'to' is underlined.

For a participial construction (V-ing noun (S), pp noun), the underline and circle described above are not applied. Instead, a '[symbol not reproduced]' mark is placed above the corresponding word or phrase.

When the classified sentence component or part of speech is a participle (V-ing noun, pp), an arrow is drawn above the text from the corresponding word (V-ing, pp) or phrase toward the noun, in front of or behind it, that the word or phrase modifies.

When the classified sentence component or part of speech is a comparative, a '△' symbol enclosing the 'as' or 'than' part of the corresponding word or phrase is marked. When it is a conjunction, a '△' or '○' symbol enclosing the corresponding word or phrase is marked: '△' for a coordinating conjunction and '○' for a subordinating conjunction.

When the classified sentence component or part of speech is a relative pronoun or relative adverb, a '○' symbol enclosing the corresponding word or phrase is marked, together with a '[symbol not reproduced]' mark in front of the relative pronoun or relative adverb.

When the first result data and the second result data indicate that a conjunction, relative pronoun, or relative adverb has been omitted, a mark is placed at the position of the omission in the sentence data: a '∨' symbol for a conjunction, and a '[symbol not reproduced]' mark for a relative pronoun or relative adverb.
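The marking rules above amount to a mapping from a category to a mark. The sketch below renders that mapping as plain text purely for illustration. The category names, the `render` function, and the plain-text conventions (`_..._` for an underline, `△(...)` and `○(...)` for enclosing shapes) are assumptions. The marks whose glyphs are not reproduced in this text, along with the infinitive and participle-arrow marks, are deliberately left out.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical category labels; the specification does not name them this way.
VERB = "verb"
MODIFIER_PHRASE = "modifier_phrase"          # preposition + noun
COMPARATIVE = "comparative"                  # the 'as' / 'than' part
COORD_CONJ = "coordinating_conjunction"
SUBORD_CONJ = "subordinating_conjunction"
RELATIVE = "relative"                        # relative pronoun or adverb
OMITTED_CONJ = "omitted_conjunction"


def render(tokens: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Render (text, category) pairs with plain-text stand-ins for the marks."""
    out = []
    for text, cat in tokens:
        if cat == VERB:
            out.append(f"_{text}_")        # underline
        elif cat == MODIFIER_PHRASE:
            out.append(f"/ {text}")        # '/' in front of the phrase
        elif cat in (COMPARATIVE, COORD_CONJ):
            out.append(f"△({text})")       # enclosing triangle
        elif cat in (SUBORD_CONJ, RELATIVE):
            # For relatives the source adds a second mark whose glyph is
            # not reproduced in this text, so only the circle is shown.
            out.append(f"○({text})")
        elif cat == OMITTED_CONJ:
            out.append("∨")                # marks the position of the omission
        else:
            out.append(text)               # unmarked text passes through
    return " ".join(out)
```

For example, `render([("A black flag", None), ("is flying", VERB), ("on its mast", MODIFIER_PHRASE)])` yields `A black flag _is flying_ / on its mast`.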

Meanwhile, the translation unit (3-1) reads the input sentence from beginning to end, extracts all possible split-point candidates, and splits the sentence. The split-point candidates used are punctuation marks, modifier phrases, and modifier clauses.

Here, the punctuation marks include the period (.), question mark (?), exclamation mark (!), comma (,), middle dot (·), colon (:), semicolon (;), slash (/), double quotation marks (“ ”), single quotation marks (‘ ’), parentheses (( )), braces ({ }), square brackets ([ ]), hyphen (-), dash (--), tilde (~), and ellipsis (......).

Modifier phrases and modifier clauses are identified by the direct-reading symbols '/', '△', or '○' with which they are marked.

However, in the case of a comma, if a noun or noun clause follows, the translation unit (3-1) excludes the comma from the split-point candidates. If the word following the comma is 'including' used as a preposition, the comma is not excluded from the split-point candidates.

In addition, the translation unit (3-1) excludes a modifier phrase from the split-point candidates if the preceding word belongs to a participial construction.

That is, for a '/' symbol, if the preceding word is a participial construction marked with the '[symbol not reproduced]' mark, the translation unit (3-1) excludes it from the split-point candidates. However, when a gerund preceded by a preposition follows the '/' symbol (/preposition + gerund), it is not excluded from the split-point candidates.

In addition, for a verb, the translation unit (3-1) adds virtual split points before and after it so that the sentence is split there.

That is, because a verb is underlined, the translation unit (3-1) adds virtual split points before and after any underlined span.

For a coordinating conjunction, if what follows is not a verb (for example, a noun), the translation unit (3-1) excludes it from the split-point candidates.

However, for the coordinating conjunction 'but', if a verb follows, the translation unit (3-1) adds virtual split points before and after it.

That is, a coordinating conjunction is marked with the '△' symbol. If no underline indicating a verb follows the '△' symbol, it is excluded from the split-point candidates. For the coordinating conjunction 'but', if an underline follows, virtual split points are added before and after it.
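The split-point rules above can be sketched as a single pass over pre-annotated tokens. This is an illustrative reading of the rules, not the patented implementation. The `Tok` type, its `kind` labels, and the `prep_gerund` flag are hypothetical. Treating a coordinating conjunction other than 'but' that is followed by a verb as a split before the conjunction is also an assumption, since the text only states when such conjunctions are excluded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tok:
    text: str
    # Hypothetical labels: "noun", "verb", "punct", "mod_phrase",
    # "participial", "coord_conj", "other"
    kind: str
    prep_gerund: bool = False  # True for a '/' phrase of the form preposition + gerund


def split_points(toks: List[Tok]) -> List[int]:
    """Return boundary indices; boundary i lies between toks[i-1] and toks[i]."""
    cuts = set()
    for i, t in enumerate(toks):
        prev = toks[i - 1] if i > 0 else None
        nxt = toks[i + 1] if i + 1 < len(toks) else None
        if t.kind == "punct":
            # A comma followed by a noun is not a split point,
            # unless the following word is 'including'.
            if (t.text == "," and nxt is not None and nxt.kind == "noun"
                    and nxt.text.lower() != "including"):
                continue
            cuts.add(i + 1)  # split after the punctuation mark
        elif t.kind == "mod_phrase":
            # A '/' phrase right after a participial construction is skipped,
            # except for preposition + gerund.
            if prev is not None and prev.kind == "participial" and not t.prep_gerund:
                continue
            cuts.add(i)
        elif t.kind == "verb":
            cuts.update((i, i + 1))  # virtual split points before and after
        elif t.kind == "coord_conj":
            if nxt is None or nxt.kind != "verb":
                continue             # e.g. "bread and butter"
            cuts.add(i)
            if t.text.lower() == "but":
                cuts.add(i + 1)      # 'but' + verb: split on both sides
    # Boundaries at the very start or end of the sentence split nothing.
    return sorted(c for c in cuts if 0 < c < len(toks))
```

Returning boundary indices rather than substrings keeps the sketch independent of how the annotated sentence is stored.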

When a modifier clause is introduced by a relative pronoun or relative adverb, the translation unit (3-1) translates by substituting the restricted portion, that is, the portion that the relative pronoun or relative adverb restricts, for the verb of the modifier clause.

That is, when a modifier clause is introduced by a relative pronoun or relative adverb, it is marked with a '○' symbol together with a '[symbol not reproduced]' mark in front of the relative pronoun or relative adverb. When the '○' symbol is accompanied by that mark, the translation unit (3-1) translates by replacing the verb of the modifier clause with the restricted portion. FIG. 3 is an example in which a direct-reading translation is provided in this way.

Meanwhile, the translation unit (3-1) provides not only the direct-reading translation but also a translation suited to Korean word order.

In doing so, the translation unit (3-1) may reverse the order of modifiers related to the subject.

For example, the English sentence "On a clear day, a crewmember on a merchant ship sailing across the Caribbean Sea peers out at the horizon through his telescope" is often translated as "맑은 날에는 승무원이 카리브 해를 가로 지르는 상선에서 그의 망원경을 통해 수평선을 응시한다" (roughly, "On a clear day, a crew member, on a merchant ship crossing the Caribbean Sea, stares at the horizon through his telescope").

To prevent such a mistranslation, the translation unit (3-1) extracts the nouns that precede the verb.

Here, the translation unit (3-1) extracts the nouns that carry strong semantic content; in this example it selects 승무원 (crew member) and 상선 (merchant ship). The translation unit (3-1) carries out the following operation when at least two nouns precede the verb.

The translation unit (3-1) queries big data with 승무원 and 상선 and extracts the sentences that contain those words. Among the extracted sentences, it calculates the proportion in which 승무원 is followed by 상선 and, conversely, the proportion in which 상선 is followed by 승무원. It then composes a Korean word-order translation arranged in the order with the larger sentence proportion and provides it to the user through the content transmission unit (4).

For example, if among the extracted sentences the proportion in which 승무원 is followed by 상선 is 30% and the proportion in which 상선 is followed by 승무원 is 70%, the translation unit (3-1) generates the Korean word-order translation "맑은 날에는 카리브 해를 가로 지르는 상선에서 승무원이 수평선을 그의 망원경을 통해 응시한다" (roughly, "On a clear day, on a merchant ship crossing the Caribbean Sea, a crew member stares at the horizon through his telescope").
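The ordering decision described above can be sketched as a count over retrieved sentences. The function names and the use of plain substring matching are assumptions for illustration. A real system would need Korean morphological analysis to match inflected forms, and the specification does not say how the big-data search is performed.

```python
from typing import List, Tuple


def order_ratios(sentences: List[str], a: str, b: str) -> Tuple[float, float]:
    """Among sentences containing both nouns, return
    (share with a before b, share with b before a)."""
    both = [s for s in sentences if a in s and b in s]
    if not both:
        return 0.0, 0.0
    a_first = sum(1 for s in both if s.index(a) < s.index(b))
    return a_first / len(both), (len(both) - a_first) / len(both)


def preferred_order(sentences: List[str], a: str, b: str) -> Tuple[str, str]:
    """Pick the noun order that occurs in the larger share of sentences."""
    ratio_ab, ratio_ba = order_ratios(sentences, a, b)
    return (a, b) if ratio_ab >= ratio_ba else (b, a)


# Toy corpus invented for this example only.
corpus = [
    "상선에서 승무원이 일한다",
    "상선의 승무원",
    "승무원이 상선에 탔다",
]
order = preferred_order(corpus, "승무원", "상선")  # 상선 comes first in 2 of 3
```

With the 30% and 70% figures from the example in the text, the same comparison would place 상선 before 승무원.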

That is, through this process the translation unit (3-1) arranges for the meaning unit of the subject to be read first, then the meaning units of the modifiers, and then the meaning unit of the verb, so that a natural sentence results.

Meanwhile, the translation unit (3-1) performs a verb post-processing step to correct translation errors related to verbs.

To this end, the translation unit (3-1) extracts the search nouns before and after the verb and retrieves from big data the sentences that contain those search nouns.

The translation unit (3-1) then extracts verb expressions whose meaning is similar to that of the verb as verb replacement candidates, calculates a sentence proportion for each extracted candidate, and replaces the verb expression with the candidate that has the largest sentence proportion.

For example, the translation unit (3-1) first translates "A black flag is flying high on its mast." as "검은 깃발이 그 돛대에서 높이 달려 있다." (roughly, "A black flag hangs high on its mast."). It extracts 깃발 (flag) and 돛대 (mast) as the representative nouns of this first-pass translation and searches big data for them to collect sentences containing these representative nouns.

Sentences that the translation unit (3-1) might collect include, for example, "돛대에는 바람이 잘게 찢어놓은 깃발들 찢어진 깃발들이 슬픈 춤을 춘다" ("On the mast, flags torn to shreds by the wind dance a sad dance"), "높게 솟은 돛대 사이로 깃발이 펄럭입니다" ("Flags flutter between the tall masts"), and "부러진 돛대 끝엔 처참하게 찢긴 깃발이 늘어져 있었다" ("At the end of the broken mast a miserably torn flag hung limp"). The translation unit (3-1) extracts verb expressions whose meaning is similar to that of the verb as verb replacement candidates and calculates their sentence proportions.

Suppose the calculation gives the following sentence proportions: 2% for sentences containing "슬픈 춤을 춘다" (dance a sad dance), the candidate extracted from "돛대에는 바람이 잘게 찢어놓은 깃발들 찢어진 깃발들이 슬픈 춤을 춘다"; 25% for sentences containing "펄럭이다" (flutter), the candidate extracted from "높게 솟은 돛대 사이로 깃발이 펄럭입니다"; and 8% for sentences containing "늘어져 있었다" (hung limp), the candidate extracted from "돛대 끝엔 처참하게 찢긴 깃발이 늘어져 있었다". In that case the translation unit (3-1) replaces "달려 있다" (hangs) with "펄럭이다", which has the largest sentence proportion.
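The selection step in this example reduces to taking the candidate with the largest sentence proportion. A minimal sketch follows. `sentence_shares` and `pick_replacement` are hypothetical names, and substring matching stands in for whatever matching the actual system uses.

```python
from typing import Dict, List


def sentence_shares(sentences: List[str], candidates: List[str]) -> Dict[str, float]:
    """Share of the collected sentences that contain each candidate verb expression."""
    total = len(sentences)
    return {
        c: (sum(1 for s in sentences if c in s) / total if total else 0.0)
        for c in candidates
    }


def pick_replacement(shares: Dict[str, float]) -> str:
    """Return the candidate with the largest sentence proportion."""
    return max(shares, key=shares.get)


# Proportions taken from the example in the text.
example_shares = {"슬픈 춤을 춘다": 0.02, "펄럭이다": 0.25, "늘어져 있었다": 0.08}
chosen = pick_replacement(example_shares)  # "펄럭이다"
```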

Alternatively, the translation unit (3-1) first translates "A black flag is flying high on its mast." as "검은 깃발이 그 돛대에서 높이 달려 있다.", extracts 깃발 (flag) and 돛대 (mast) as the search nouns of the first-pass translation, searches big data for them to collect sentences containing these search nouns, and classifies each collected sentence as one of the first to n-th replacement candidate translations.

Thereafter, the translation unit (3-1) extracts a plurality of reference representative words from the primary translation, measures the similarity between the reference representative words and the first to n-th sets of replacement representative words, infers from the result the similarity between the verb in question and the first to n-th verb replacement candidates, and replaces the verb with the verb replacement candidate having the highest similarity.

The similarity between the verb and the first to n-th verb replacement candidates may be measured by checking whether words common to the reference representative words and the first to n-th replacement representative words exist, and may be computed according to Equation (1) below.

(Equation 1)

Here, n denotes the number of replacement representative words extracted from any one replacement candidate translation. S_i has the value 1 when the i-th replacement representative word is present in both the reference representative words and the replacement representative words.

S_i has the value 0 when the i-th reference representative word is not present among the replacement representative words.
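Equation (1) itself is not reproduced in this text, so the following is only a sketch under an explicit assumption: that the similarity is the sum of the indicators S_i over the n replacement representative words, optionally divided by n. The function names `overlap_similarity` and `best_candidate`, the `normalize` flag, and the sample words are hypothetical.

```python
from typing import List, Sequence


def overlap_similarity(reference_words: Sequence[str],
                       replacement_words: Sequence[str],
                       normalize: bool = False) -> float:
    """Sum of S_i, where S_i = 1 if the i-th replacement representative word
    also appears among the reference representative words, else 0.

    Assumption: whether the source divides by n is unknown, so both
    variants are offered through `normalize`.
    """
    reference = set(reference_words)
    indicators = [1 if word in reference else 0 for word in replacement_words]
    total = sum(indicators)
    n = len(replacement_words)
    if normalize:
        return total / n if n else 0.0
    return float(total)


def best_candidate(reference_words: Sequence[str],
                   candidates: List[Sequence[str]]) -> int:
    """Index of the replacement candidate translation with the highest similarity."""
    scores = [overlap_similarity(reference_words, words) for words in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical representative words, for illustration only.
reference = ["깃발", "돛대", "높이"]
candidate_words = [["깃발", "바람", "춤"], ["깃발", "돛대", "사이"]]
winner = best_candidate(reference, candidate_words)
```

Here the second candidate shares two words with the reference set and the first shares one, so `winner` is index 1; the verb of that candidate translation would then be the one substituted.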

As an example, referring to FIG. 4, when the direct-reading translation is "깃발이/펄럭인다/ 돛대에서/" ("the flag / flutters / on the mast /"), the concept image first shows a flag, then shows the flag fluttering, and then adds the mast. By showing the order in which the English sentence is composed, the concept image lets the user understand the composition of the English sentence conceptually.

In addition, the concept image generation unit (3-2) may provide a direct-reading symbol description image, and may additionally provide an assertion (negation) meaning-unit display image, a connective display image, a negative-phrase display image, and the like.

The comparison image generation unit (3-3) generates a comparison image produced so that concepts are evoked sequentially according to the Korean word-order translation.

As an example, referring to FIG. 6, when the Korean word-order translation is "깃발이 돛대에서 펄럭이다" ("the flag, on the mast, flutters"), the comparison image first shows a flag, then shows the flag on the mast, and then adds the fluttering motion. By showing the order in which the Korean sentence is composed, the comparison image lets the user understand the difference between the composition of the English sentence and that of the Korean sentence.

FIG. 7 is a diagram showing the configuration of the translation unit according to the present invention.

Referring to FIG. 7, the translation unit according to the present invention includes a sentence divider (10), a direct-reading translator (12), a Korean word-order translator (14), a word-order sorter (16), and a post-processing performer (18).

First, the sentence divider (10) reads the input sentence from beginning to end and extracts all possible split point candidates. The split point candidates used here are punctuation marks, modifier phrases, and modifier clauses.

However, in the case of a comma, the sentence divider (10) excludes it from the split point candidates if a noun or noun clause follows. The comma is not excluded when the word following it is "including" functioning as a preposition.

In addition, in the case of a modifier phrase, the sentence divider (10) excludes it from the split point candidates when the preceding word is a participial construction.

That is, in the case of a '/' symbol, the sentence divider (10) excludes it from the split point candidates when the preceding word is a participial construction and is marked with the corresponding participial-construction symbol (that symbol is not reproduced in this text).

In addition, in the case of a verb, the sentence divider (10) adds virtual split points before and after it so that the verb is set apart.

That is, because a verb is marked with an underline, the sentence divider (10) adds virtual split points before and after any underlined span.

In the case of a coordinating conjunction, the sentence divider (10) excludes it from the split point candidates when what follows is not a verb, for example when it is a noun.

However, for the coordinating conjunction "but", the sentence divider (10) adds virtual split points before and after it when a verb follows.

That is, a coordinating conjunction is marked with the '△' symbol. The sentence divider (10) excludes it from the split point candidates when no underline symbol indicating a verb follows the '△' symbol, and, when the coordinating conjunction is "but" and an underline symbol follows, adds virtual split points before and after it.
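The split point rules above can be sketched as a single pass over tagged tokens. This is a hedged illustration only: the `Token` type, the tag names ("comma", "mod_phrase", "coord_conj", "verb", "noun", "participle", "other"), and the convention that a split point is the index of the boundary before a token are all assumptions introduced here; the source describes the rules in terms of displayed symbols ('/', '△', underline) rather than tags.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Token:
    text: str
    tag: str  # hypothetical tag set, see the lead-in above


def split_points(tokens: List[Token]) -> List[int]:
    """Return boundary indices; index i means 'split before tokens[i]'."""
    points: Set[int] = set()
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else None
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok.tag == "comma":
            # A comma followed by a noun is dropped, unless the next word
            # is "including" used as a preposition.
            is_including = nxt is not None and nxt.text.lower() == "including"
            if is_including or nxt is None or nxt.tag != "noun":
                points.add(i)
        elif tok.tag == "mod_phrase":
            # A modifier phrase is dropped when preceded by a participial construction.
            if prev is None or prev.tag != "participle":
                points.add(i)
        elif tok.tag == "verb":
            # Virtual split points before and after the (underlined) verb.
            points.update((i, i + 1))
        elif tok.tag == "coord_conj":
            # A coordinating conjunction is kept only when a verb follows;
            # "but" followed by a verb gets virtual points on both sides.
            if nxt is not None and nxt.tag == "verb":
                if tok.text.lower() == "but":
                    points.update((i, i + 1))
                else:
                    points.add(i)
    return sorted(points)


tokens = [Token("it", "noun"), Token("but", "coord_conj"), Token("sank", "verb")]
result = split_points(tokens)
```

Representing boundaries as integer indices keeps the virtual split points around verbs and the ordinary candidates in one structure, so later stages can cut the sentence in a single step.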

Next, the direct-reading translator (12) receives the split English sentence and translates it in the order in which it was split.

Here, the direct-reading translator (12) performs a meaning-unit operation on the direct-reading symbols in the English sentence that contains them and, on that basis, performs direct reading meaning unit by meaning unit.

The Korean word-order translator (14) receives the direct-reading translation produced by the direct-reading translator (12) and rearranges it into Korean word order to generate a Korean word-order translation.

Here, when the modifier clause is a relative pronoun or relative adverb, the direct-reading translator (12) translates by replacing the restricted part, that is, the part that the relative pronoun or relative adverb restricts, with the verb of the modifier clause.

That is, when the modifier clause is a relative pronoun or relative adverb, the sentence divider (10) displays the '○' symbol together with a further symbol extending toward the front of the relative pronoun or relative adverb (that symbol is not reproduced in this text). When the '○' symbol is accompanied by that symbol, the direct-reading translator (12) translates by replacing the verb of the modifier clause with the restricted part.

Meanwhile, the Korean word-order translator (14) may place modifiers related to the subject in reversed order.

To prevent such mistranslation, the word-order sorter (16) extracts the nouns that precede the verb.

Here, the word-order sorter (16) extracts nouns with strong semantic content; as an example, it selects "승무원" ("crew member") and "상선" ("merchant ship"). The word-order sorter (16) performs the following operation when at least two nouns precede the verb.

The word-order sorter (16) then queries the big data with "승무원" and "상선" and extracts the sentences containing those words. From the extracted sentences, it calculates the ratio of sentences in which "승무원" is followed by "상선" and, conversely, the ratio of sentences in which "상선" is followed by "승무원". It then composes a Korean word-order translation arranged according to the order with the larger sentence ratio and provides it to the user through the content transmission unit (4).

For example, if, among the extracted sentences, 30% have "승무원" followed by "상선" and 70% have "상선" followed by "승무원", the word-order sorter (16) generates the Korean word-order translation "맑은 날에는 카리브 해를 가로 지르는 상선에서 승무원이 수평선을 그의 망원경을 통해 응시한다" ("On a clear day, on a merchant ship crossing the Caribbean Sea, a crew member gazes at the horizon through his telescope").
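The noun-order decision described above can be sketched as follows. This is an illustrative sketch rather than the patented method: `order_ratios` and `preferred_order` are hypothetical names, the ratios are assumed to be taken over sentences that contain both nouns, positions are compared by plain substring search, and the three sample sentences are invented for the example.

```python
from typing import List, Tuple


def order_ratios(sentences: List[str], a: str, b: str) -> Tuple[float, float]:
    """Return (ratio with a before b, ratio with b before a).

    Assumption: only sentences containing both nouns are counted.
    """
    a_first = b_first = 0
    for sentence in sentences:
        pos_a, pos_b = sentence.find(a), sentence.find(b)
        if pos_a < 0 or pos_b < 0:
            continue  # one of the nouns is missing; skip this sentence
        if pos_a < pos_b:
            a_first += 1
        else:
            b_first += 1
    total = a_first + b_first
    if total == 0:
        return (0.0, 0.0)
    return (a_first / total, b_first / total)


def preferred_order(sentences: List[str], a: str, b: str) -> Tuple[str, str]:
    """Noun order with the larger sentence ratio (ties keep the given order)."""
    ratio_ab, ratio_ba = order_ratios(sentences, a, b)
    return (a, b) if ratio_ab >= ratio_ba else (b, a)


# Invented sample sentences, for illustration only.
sample = ["상선에서 승무원이 본다", "상선의 승무원", "승무원이 상선에 탄다"]
order = preferred_order(sample, "승무원", "상선")
```

In this sample, "상선" precedes "승무원" in two of three sentences, so `order` puts "상선" first, mirroring the 30% versus 70% example in which the merchant ship is placed before the crew member.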

Meanwhile, the post-processing performer (18) performs a verb post-processing step to correct translation errors relating to verbs.

To this end, the post-processing performer (18) extracts the search nouns before and after the verb, and searches the big data to extract sentences containing those search nouns.

The post-processing performer (18) then extracts, as verb replacement candidates, verb expressions whose meaning is similar to that of the verb, calculates a sentence ratio for each extracted candidate, and replaces the verb expression with the candidate having the largest sentence ratio.

As an example, the post-processing performer (18) first translates "A black flag is flying high on its mast." as "검은 깃발이 그 돛대에서 높이 달려있다." ("A black flag hangs high on its mast."), extracts "깃발" ("flag") and "돛대" ("mast") as the plurality of representative nouns of the primary translation, and searches the big data to collect the sentences containing those representative nouns.

Here, the sentences that the post-processing performer (18) may collect include, for example, "돛대에는 바람이 잘게 찢어놓은 깃발들 찢어진 깃발들이 슬픈 춤을 춘다" ("On the mast, the flags shredded by the wind, the torn flags, dance a sad dance"), "높게 솟은 돛대 사이로 깃발이 펄럭입니다" ("Flags flutter between the towering masts"), and "부러진 돛대 끝엔 처참하게 찢긴 깃발이 늘어져 있었다" ("A miserably torn flag hung limp at the end of the broken mast"). The post-processing performer (18) extracts, as verb replacement candidates, verb expressions whose meaning is similar to that of the verb in question, and calculates a sentence ratio for each candidate.

Suppose, for example, that the calculation gives the following sentence ratios: 2% for sentences containing "슬픈 춤을 춘다" ("dances a sad dance"), the replacement verb candidate extracted from "돛대에는 바람이 잘게 찢어놓은 깃발들 찢어진 깃발들이 슬픈 춤을 춘다"; 25% for sentences containing "펄럭이다" ("flutters"), the candidate extracted from "높게 솟은 돛대 사이로 깃발이 펄럭입니다"; and 8% for expressions containing "늘어져 있었다" ("hung limp"), from "돛대 끝엔 처참하게 찢긴 깃발이 늘어져 있었다". In that case, the post-processing performer (18) replaces "달려있다" ("hangs") with "펄럭이다", the candidate with the largest sentence ratio.

Alternatively, the post-processing performer (18) first translates "A black flag is flying high on its mast." as "검은 깃발이 그 돛대에서 높이 달려있다." ("A black flag hangs high on its mast."), extracts "깃발" ("flag") and "돛대" ("mast") as the plurality of search nouns of the primary translation, searches the big data for them, collects the sentences containing those search nouns, and classifies the collected sentences as first to n-th replacement candidate translations.

Thereafter, the post-processing performer (18) extracts a plurality of reference representative words from the primary translation, measures the similarity between the reference representative words and the first to n-th sets of replacement representative words, infers from the result the similarity between the verb in question and the first to n-th verb replacement candidates, and replaces the verb with the verb replacement candidate having the highest similarity.

(Equation 1)

Here, n denotes the number of replacement representative words extracted from any one replacement candidate translation. S_i has the value 1 when the i-th replacement representative word is present in both the reference representative words and the replacement representative words.

The features, structures, effects, and the like described in the above embodiments are included in at least one embodiment of the present invention and are not necessarily limited to a single embodiment. Furthermore, the features, structures, effects, and the like illustrated in each embodiment may be combined or modified for other embodiments by a person having ordinary skill in the art to which the embodiments belong.

Accordingly, matters relating to such combinations and modifications should be construed as falling within the scope of the present invention. In addition, although the description above has focused on embodiments, these are merely examples and do not limit the present invention; a person having ordinary skill in the art to which the present invention belongs will appreciate that various modifications and applications not illustrated above are possible without departing from the essential characteristics of the embodiments. For example, each component specifically shown in the embodiments may be modified in practice. Differences relating to such modifications and applications should be construed as falling within the scope of the present invention as defined in the appended claims.

1: Translation providing system
2: Content input unit
3: Direct-reading symbol generation unit
3-1: Translation unit
3-2: Concept image generation unit (including a direct-reading symbol description image generation unit and an assertion (negation) meaning-unit display image generation unit)
3-3: Comparison image generation unit
4: Content transmission unit

Claims

An English sentence translation providing system with machine learning-based direct-reading symbols, comprising:
a content input unit that receives English content from a user terminal;
a direct-reading symbol generation unit that adds direct-reading symbols to the English content received by the content input unit;
a translation unit that receives an English sentence including the direct-reading symbols generated by the direct-reading symbol generation unit and performs direct reading to generate a direct-reading translation; and
a content transmission unit that provides, to the user terminal, the English content to which the direct-reading symbols generated by the direct-reading symbol generation unit have been added, together with the direct-reading translation,
wherein the translation unit receives the English sentence including the direct-reading symbols generated by the direct-reading symbol generation unit, performs Korean word-order translation, and generates a Korean word-order translation, and
wherein the translation unit comprises:
a sentence divider that divides a sentence by extracting all possible split point candidates while reading the input sentence from beginning to end;
a direct-reading translator that receives the English sentence divided by the sentence divider and translates it in the order of division to generate the direct-reading translation;
a Korean word-order translator that receives the direct-reading translation produced by the direct-reading translator and rearranges it into Korean word order to generate the Korean word-order translation; and
a post-processing performer that performs verb post-processing to correct translation errors relating to verbs in the Korean word-order translation produced by the Korean word-order translator.

(Deleted)

The system according to claim 1,
further comprising a concept image generation unit that generates a concept image produced so that concepts are evoked sequentially according to the direct-reading translation generated by the translation unit,
wherein the content transmission unit provides the concept image to the user terminal.

The system according to claim 3,
further comprising a comparison image generation unit that generates a comparison image produced so that concepts are evoked sequentially according to the Korean word-order translation generated by the translation unit,
wherein the content transmission unit provides the comparison image to the user terminal.

The system according to claim 1,
wherein the content input unit, when it receives text, extracts the English sentences and provides them to the direct-reading symbol generation unit, and, when it receives an image or a video, extracts the English sentences contained in the image or video and provides them to the direct-reading symbol generation unit.

The system according to claim 1,
wherein the direct-reading symbol generation unit performs morphological analysis on the input English content and generates symbol-processed sentence data by displaying a specific symbol for each part of speech.

The system according to claim 6,
wherein the direct-reading symbol generation unit analyzes sentence components consisting of a combination of two or more of a subject, a predicate, an object, a complement, and other modifiers to determine which of the first to fifth sentence patterns applies, and accordingly generates first result data comprising the sentence components and the sentence pattern; receives the first result data, in which the sentence components and sentence pattern of the sentence data have been analyzed, and analyzes the sentence components using one or a combination of part-of-speech groups including nouns, verbs, modifiers, infinitives, gerunds, participles, comparatives, coordinating conjunctions, conjunctions, relative pronouns, and relative adverbs to generate second result data that classifies the sentence data by part of speech; and receives the first result data and the second result data and generates symbol-processed sentence data by displaying a specific symbol for each part of speech.

The system according to claim 7,
wherein the specific symbols generated by the direct-reading symbol generation unit are displayed by underlining the corresponding word or phrase, or by a circled character, a triangle, or an arrow.

(Deleted)

The system according to claim 1,
wherein the split point candidates are punctuation marks, modifier phrases, and modifier clauses.

The system according to claim 1,
further comprising a word-order sorter that aligns the position of modifiers related to the subject in the Korean word-order translation.

An English sentence translation providing system with machine learning-based direct-reading symbols, comprising:
a content input unit that receives English content from a user terminal;
a direct-reading symbol generation unit that adds direct-reading symbols to the English content received by the content input unit;
a translation unit that receives an English sentence including the direct-reading symbols generated by the direct-reading symbol generation unit and performs direct reading to generate a direct-reading translation; and
a content transmission unit that provides, to the user terminal, the English content to which the direct-reading symbols generated by the direct-reading symbol generation unit have been added, together with the direct-reading translation,
wherein the translation unit receives the English sentence including the direct-reading symbols generated by the direct-reading symbol generation unit, performs Korean word-order translation, and generates a Korean word-order translation,
wherein the translation unit comprises:
a sentence divider that divides a sentence by extracting all possible split point candidates while reading the input sentence from beginning to end;
a direct-reading translator that receives the English sentence divided by the sentence divider and translates it in the order of division to generate the direct-reading translation;
a Korean word-order translator that receives the direct-reading translation produced by the direct-reading translator and rearranges it into Korean word order to generate the Korean word-order translation; and
a word-order sorter that aligns the position of modifiers related to the subject in the Korean word-order translation, and
wherein the word-order sorter extracts the nouns preceding the verb, extracts from big data the sentences containing those words, calculates from the extracted sentences a sentence ratio for each noun order, composes a Korean word-order translation arranged according to the calculated sentence ratios, and provides it to the user through the content transmission unit.

(Deleted)

The system according to claim 1,
wherein the post-processing performer extracts a plurality of search nouns, searches big data to extract sentences containing the search nouns, extracts as verb replacement candidates verb expressions whose meaning is similar to that of the verb in question, calculates a sentence ratio for each extracted verb replacement candidate, and replaces the verb expression with the verb replacement candidate having the largest sentence ratio.

The system according to claim 1,
wherein the post-processing performer extracts a plurality of search nouns from the primary translation, searches big data for them, collects the sentences containing the search nouns, and classifies the collected sentences as first to n-th replacement candidate translations; extracts a plurality of reference representative words from the primary translation and measures the similarity between the reference representative words and the first to n-th sets of replacement representative words; and, based on the result, infers the similarity between the verb in question and the first to n-th verb replacement candidates and replaces the verb with the verb replacement candidate having the highest similarity.