KR102566899B1

KR102566899B1 - Electronic terminal apparatus that can perform individually customized automatic typo correction and operating method thereof

Info

Publication number: KR102566899B1
Application number: KR1020200022099A
Authority: KR
Inventors: 박현
Original assignee: 주식회사 한글과컴퓨터
Priority date: 2020-02-24
Filing date: 2020-02-24
Publication date: 2023-08-14
Also published as: KR20210107301A

Abstract

Disclosed are an electronic terminal device capable of performing personalized automatic typo correction and an operating method thereof. With pre-registered user-specified words stored, when a user enters a specific input word composed of at least two letters in an electronic document and then issues a delete command for that input word, the device deletes the input word from the electronic document, selects the user-specified word with the highest similarity to the input word, and displays the selected user-specified word in place of the deleted input word as its correction, thereby supporting personalized automatic typo correction.

Description

Electronic terminal device capable of performing automatic typo correction tailored to each individual and its operation method

The present invention relates to an electronic terminal device capable of performing personalized automatic typo correction and an operating method thereof.

Recently, as computers, smartphones, and tablet PCs have become widespread, various electronic document programs have been released that let users view, create, and edit electronic documents on such electronic terminal devices.

The electronic document programs available on electronic terminal devices include word processors, which support basic document creation and editing; spreadsheets, which assist with data entry, arithmetic operations, and data management; and presentation programs, which assist presenters in giving their presentations.

Users often make typos in electronic documents when using these programs.

To address this, electronic document programs provide spelling and grammar checking so that users can correct typos more easily.

However, the words that are frequently mistyped when writing an electronic document can differ from user to user. It is therefore worth considering a user-customized automatic typo correction technique in which user-specified words that a given user is expected to mistype often are registered in advance and, when the user makes a typo in an electronic document, the typo is corrected automatically on the basis of those user-specified words.

Accordingly, research is needed on automatic typo correction technology that can be customized for each user.

The electronic terminal device and its operating method according to the present invention aim to support personalized automatic typo correction as follows. With pre-registered user-specified words stored, when a user enters a specific input word composed of at least two letters in an electronic document and then issues a delete command for that input word, the input word is deleted from the electronic document, the user-specified word with the highest similarity to the input word is selected, and the selected user-specified word is displayed in place of the deleted input word as its correction.

According to an embodiment of the present invention, an electronic terminal device capable of performing personalized automatic typo correction includes: a letter vector storage unit storing a different predetermined one-hot vector for each of a plurality of letters, the plurality of letters consisting of a plurality of consonants and a plurality of vowels; a word database storing a plurality of different user-specified words designated in advance for typo correction together with a word vector corresponding to each user-specified word, each word vector being generated by summing the one-hot vectors of the at least two letters that make up the word; a typo vector generation unit that, when a first input word composed of at least two letters is entered by a user in an electronic document and a delete command for the first input word is then received from the user, deletes the first input word from the electronic document, looks up the one-hot vector of each letter of the first input word in the letter vector storage unit, and sums those one-hot vectors to generate a typo vector; a vector similarity calculation unit that calculates the vector similarity between the typo vector and the word vector of each user-specified word stored in the word database; and a typo correction processing unit that selects, from among the user-specified words, a first user-specified word whose word vector yields the highest vector similarity, and displays the first user-specified word in place of the first input word deleted from the electronic document as its correction.

In addition, according to an embodiment of the present invention, an operating method of an electronic terminal device capable of performing personalized automatic typo correction includes: maintaining a letter vector storage unit storing a different predetermined one-hot vector for each of a plurality of letters, the plurality of letters consisting of a plurality of consonants and a plurality of vowels; maintaining a word database storing a plurality of different user-specified words designated in advance for typo correction together with a word vector corresponding to each user-specified word, each word vector being generated by summing the one-hot vectors of the at least two letters that make up the word; when a first input word composed of at least two letters is entered by a user in an electronic document and a delete command for the first input word is then received from the user, deleting the first input word from the electronic document, looking up the one-hot vector of each letter of the first input word in the letter vector storage unit, and summing those one-hot vectors to generate a typo vector; calculating the vector similarity between the typo vector and the word vector of each user-specified word stored in the word database; and selecting, from among the user-specified words, a first user-specified word whose word vector yields the highest vector similarity, and displaying the first user-specified word in place of the first input word deleted from the electronic document as its correction.

With pre-registered user-specified words stored, when a user enters a specific input word composed of at least two letters in an electronic document and then issues a delete command for that input word, the electronic terminal device and its operating method according to the present invention delete the input word from the electronic document, select the user-specified word with the highest similarity to the input word, and display the selected user-specified word in place of the deleted input word as its correction. Personalized automatic typo correction can thereby be supported.

FIG. 1 is a diagram showing the structure of an electronic terminal device capable of performing personalized automatic typo correction according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating an operating method of an electronic terminal device capable of performing personalized automatic typo correction according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. This description is not intended to limit the present invention to specific embodiments and should be understood to cover all modifications, equivalents, and substitutes falling within the spirit and technical scope of the invention. Similar reference numerals are used for similar components throughout the drawings, and unless defined otherwise, all terms used in this specification, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs.

In this document, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, in various embodiments of the present invention, each component, functional block, or means may consist of one or more sub-components. The electrical, electronic, and mechanical functions performed by each component may be implemented with various known elements or mechanical parts, such as electronic circuits, integrated circuits, and ASICs (Application Specific Integrated Circuits), and the components may be implemented separately or with two or more integrated into one.

Meanwhile, the blocks of the accompanying block diagram and the steps of the flowchart may be interpreted as computer program instructions that are loaded into a processor or memory of equipment capable of data processing, such as a general-purpose computer, a special-purpose computer, a portable notebook computer, or a network computer, and that perform designated functions. Because these computer program instructions may be stored in a memory provided in a computer device or in a computer-readable memory, the functions described in the blocks of the block diagram or the steps of the flowchart may also be produced as an article of manufacture containing instruction means for performing them. Further, each block or step may represent a module, segment, or portion of code that includes one or more executable instructions for executing the specified logical function(s). It should also be noted that in some alternative embodiments the functions mentioned in the blocks or steps may be executed out of the stated order. For example, two blocks or steps shown in succession may be performed substantially simultaneously or in reverse order, and in some cases some blocks or steps may be omitted.

FIG. 1 is a diagram showing the structure of an electronic terminal device capable of performing personalized automatic typo correction according to an embodiment of the present invention.

Referring to FIG. 1, the electronic terminal device 110 capable of performing personalized automatic typo correction according to the present invention includes a letter vector storage unit 111, a word database 112, a typo vector generation unit 113, a vector similarity calculation unit 114, and a typo correction processing unit 115.

The letter vector storage unit 111 stores a different predetermined one-hot vector for each of a plurality of letters, where the plurality of letters consists of a plurality of consonants and a plurality of vowels.

For example, the letter vector storage unit 111 may store information as shown in Table 1 below.

Table 1
Letter    One-hot vector
ㄱ        [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅏ        [0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㄴ        [0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅑ        [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㄷ        [0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅓ        [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㄹ        [0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅕ        [0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅁ        [0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅗ        [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅂ        [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
ㅛ        [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0]
ㅅ        [0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]
ㅜ        [0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
ㅇ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]
ㅠ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
ㅈ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
ㅡ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0]
ㅊ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0]
ㅣ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0]
ㅋ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0]
ㅌ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0]
ㅍ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]
ㅎ        [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]
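A table like Table 1 can be generated rather than typed by hand. The following is a minimal sketch, assuming the letter order shown in Table 1; the names `LETTERS`, `build_one_hot_table`, and `ONE_HOT` are illustrative and do not come from the patent.

```python
# The 24 basic Hangul letters in the order listed in Table 1
# (consonants and vowels alternate). The ordering is taken from the table;
# the identifiers are hypothetical.
LETTERS = "ㄱㅏㄴㅑㄷㅓㄹㅕㅁㅗㅂㅛㅅㅜㅇㅠㅈㅡㅊㅣㅋㅌㅍㅎ"


def build_one_hot_table(letters):
    """Map each letter to a one-hot list whose length equals len(letters)."""
    size = len(letters)
    return {
        letter: [1 if i == index else 0 for i in range(size)]
        for index, letter in enumerate(letters)
    }


ONE_HOT = build_one_hot_table(LETTERS)

# Each letter gets a distinct vector with exactly one component set to 1.
print(ONE_HOT["ㄷ"])
```

Because every vector differs only in the position of its single 1, storing just the letter order would carry the same information as storing the full table.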

The word database 112 stores a plurality of different user-specified words designated in advance for typo correction, together with a word vector corresponding to each user-specified word. Each word vector is generated by summing the one-hot vectors of the at least two letters that make up the word.

Here, the user-specified words are words set in advance by the user. For example, if the electronic terminal device 110 according to the present invention is used mainly in the music industry, the user may define the user-specified words around terms frequently used in that industry.

In this regard, the word database 112 may store information as shown in Table 2 below.

Table 2
User-specified word    Word vector
작곡 (composition)     [3 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
녹음 (recording)       [1 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0]
공연 (performance)     [1 0 1 0 0 0 0 1 0 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0]
...                    ...
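The word vectors in Table 2 can be reproduced by summing one-hot vectors, which amounts to counting how often each letter occurs in the word. The sketch below assumes each word is supplied already split into its letters; `word_vector`, `USER_WORDS`, and `WORD_DB` are hypothetical names.

```python
# Letter order as in Table 1; position i in a vector counts occurrences of LETTERS[i].
LETTERS = "ㄱㅏㄴㅑㄷㅓㄹㅕㅁㅗㅂㅛㅅㅜㅇㅠㅈㅡㅊㅣㅋㅌㅍㅎ"
INDEX = {letter: i for i, letter in enumerate(LETTERS)}


def word_vector(letters_of_word):
    """Sum the one-hot vectors of the given letters (equivalent to counting them)."""
    vector = [0] * len(LETTERS)
    for letter in letters_of_word:
        vector[INDEX[letter]] += 1
    return vector


# User-specified words from Table 2, each paired with its letters (split by hand here).
USER_WORDS = {
    "작곡": "ㅈㅏㄱㄱㅗㄱ",
    "녹음": "ㄴㅗㄱㅇㅡㅁ",
    "공연": "ㄱㅗㅇㅇㅕㄴ",
}

WORD_DB = {word: word_vector(letters) for word, letters in USER_WORDS.items()}

for word, vector in WORD_DB.items():
    print(word, vector)
```

Note that summation discards letter order, so two words made of the same letters in a different order would share one word vector.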

When a first input word composed of at least two letters is entered by a user in an electronic document and a delete command for the first input word is then received from the user, the typo vector generation unit 113 deletes the first input word from the electronic document, looks up the one-hot vector of each letter of the first input word in the letter vector storage unit 111, and sums those one-hot vectors to generate a typo vector.

For example, assume that the first input word is '작걱'. When the user enters the first input word '작걱', composed of the letters 'ㅈㅏㄱㄱㅓㄱ', in the electronic document and then issues a delete command for '작걱' to the electronic terminal device 110, the typo vector generation unit 113 deletes '작걱' from the electronic document and looks up the one-hot vector of each of the letters 'ㅈㅏㄱㄱㅓㄱ' in the letter vector storage unit 111 shown in Table 1.

The typo vector generation unit 113 then sums the one-hot vectors of the letters 'ㅈㅏㄱㄱㅓㄱ' that make up the first input word '작걱' to generate the typo vector '[3 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]'.
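The text does not say how an input word is split into its letters. One common approach, assumed here, is the standard Unicode arithmetic for precomposed Hangul syllables (U+AC00 to U+D7A3). In this sketch, `to_letters` and `typo_vector` are hypothetical names, and letters outside the 24 in Table 1 (double consonants, compound vowels) are rejected rather than handled.

```python
LETTERS = "ㄱㅏㄴㅑㄷㅓㄹㅕㅁㅗㅂㅛㅅㅜㅇㅠㅈㅡㅊㅣㅋㅌㅍㅎ"
INDEX = {letter: i for i, letter in enumerate(LETTERS)}

# Standard Unicode ordering of initial consonants, vowels, and final consonants.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def to_letters(word):
    """Split precomposed Hangul syllables into their individual letters."""
    parts = []
    for ch in word:
        offset = ord(ch) - 0xAC00
        if not 0 <= offset < 11172:
            raise ValueError(f"not a precomposed Hangul syllable: {ch!r}")
        lead, rest = divmod(offset, 21 * 28)
        vowel, tail = divmod(rest, 28)
        parts.append(CHOSEONG[lead] + JUNGSEONG[vowel] + JONGSEONG[tail])
    return "".join(parts)


def typo_vector(word):
    """Sum the one-hot vectors of the word's letters, as described in the text."""
    vector = [0] * len(LETTERS)
    for letter in to_letters(word):
        if letter not in INDEX:
            # Simplification of this sketch: Table 1 lists only 24 basic letters.
            raise ValueError(f"letter not in Table 1: {letter!r}")
        vector[INDEX[letter]] += 1
    return vector


print(to_letters("작걱"))
print(typo_vector("작걱"))
```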

The vector similarity calculation unit 114 calculates the vector similarity between the typo vector and the word vector of each user-specified word stored in the word database 112.

Here, the vector similarity between the typo vector and the word vector of each user-specified word may be calculated according to Equation 1.

Here, M is the vector similarity between two vectors, S is the cosine similarity between the two vectors, and D is the Euclidean distance between the two vectors. S and D may be calculated according to Equation 2 and Equation 3 below, respectively.

[Equation 2]

$$S = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^{2}}\,\sqrt{\sum_{i} B_i^{2}}}$$

Here, S is the cosine similarity between vectors A and B. It takes a value between -1 and 1, and a larger value means the two vectors are more similar. A_i is the i-th component of vector A, and B_i is the i-th component of vector B.

[Equation 3]

$$D = \sqrt{\sum_{i} \left(A_i - B_i\right)^{2}}$$

In Equation 3, D is the Euclidean distance, and A_i and B_i are the i-th components of the two vectors. In general, the smaller the Euclidean distance between two vectors, the more similar they are, and the larger the distance, the more dissimilar they are.
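The two measures defined for S and D can be sketched as follows. How Equation 1 combines them into M is not recoverable from this text, so `combined_similarity` below is an assumption: the form S + 1/(1 + D) was chosen only because it reproduces, to within rounding, the example values quoted later in the description. All function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity S between two equal-length vectors.

    Word and typo vectors always contain at least two letters, so neither
    norm is zero in this setting.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance D between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_similarity(a, b):
    # ASSUMPTION: this combination is a guess at Equation 1, not the patent's
    # confirmed formula. It rewards high cosine similarity and small distance.
    return cosine_similarity(a, b) + 1.0 / (1.0 + euclidean_distance(a, b))


# Typo vector for '작걱' and word vector for '작곡', copied from the text.
typo = [3, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
word = [3, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(cosine_similarity(typo, word))
print(euclidean_distance(typo, word))
print(combined_similarity(typo, word))
```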

The typo correction processing unit 115 selects, from among the user-specified words, a first user-specified word whose word vector yields the highest vector similarity, and then displays the first user-specified word in place of the first input word deleted from the electronic document as its correction.

For example, assuming as above that the typo vector is '[3 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]', the vector similarity calculation unit 114 may calculate, according to Equations 1 to 3, the vector similarity between this typo vector and each of the word vectors '[3 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0], [1 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0], [1 0 1 0 0 0 0 1 0 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0], ...' that correspond to the user-specified words '작곡 (composition), 녹음 (recording), 공연 (performance), ...' stored in the word database 112 shown in Table 2.
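As a worked illustration, the cosine similarity and Euclidean distance between the typo vector for '작걱' (written A) and the word vector for '작곡' (written B) follow directly from the standard definitions. Only the components in positions 1, 2, 6, 10, and 17 are nonzero in either vector.

```latex
\begin{aligned}
A \cdot B &= 3\cdot 3 + 1\cdot 1 + 1\cdot 0 + 0\cdot 1 + 1\cdot 1 = 11,\\
\lVert A\rVert &= \lVert B\rVert = \sqrt{3^2 + 1^2 + 1^2 + 1^2} = \sqrt{12},\\
S &= \frac{11}{\sqrt{12}\,\sqrt{12}} = \frac{11}{12} \approx 0.917,\\
D &= \sqrt{(1-0)^2 + (0-1)^2} = \sqrt{2} \approx 1.414.
\end{aligned}
```

The two vectors differ only in positions 6 (ㅓ) and 10 (ㅗ), which is why the distance is small. If Equation 1 combined the two measures as S + 1/(1 + D), which is only a guess on the basis of the quoted numbers, the result would be about 1.33.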

For example, assuming as above that the first input word is '작걱' and that the calculated vector similarities are '1.33, 0.577, 0.516, ...', the typo correction processing unit 115 selects, from among the user-specified words '작곡, 녹음, 공연, ...', the first user-specified word '작곡', whose word vector '[3 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0]' yields the highest vector similarity, and then displays '작곡' in place of the first input word '작걱' deleted from the electronic document as its correction.
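The selection step is an argmax over the computed similarities. A minimal sketch using the similarity values quoted in the example (the function name `select_correction` is hypothetical):

```python
def select_correction(similarities):
    """Return the (word, score) pair with the highest vector similarity."""
    return max(similarities.items(), key=lambda item: item[1])


# Similarities between the typo '작걱' and each user-specified word,
# taken from the example in the text.
similarities = {"작곡": 1.33, "녹음": 0.577, "공연": 0.516}

best_word, best_score = select_correction(similarities)
print(best_word, best_score)
```

If two words tie for the highest score, `max` returns the first one encountered; the text does not specify a tie-breaking rule.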

In other words, the electronic terminal device 110 stores a different predetermined one-hot vector for each of a plurality of letters in the letter vector storage unit 111, and stores in the word database 112 a plurality of different user-specified words designated in advance for typo correction together with the word vector corresponding to each of them. In this state, when a first input word composed of at least two letters is entered by a user in an electronic document and a delete command for the first input word is then received from the user, the device deletes the first input word from the electronic document, looks up the one-hot vector of each letter of the first input word in the letter vector storage unit 111, and sums those one-hot vectors to generate a typo vector. It then calculates the vector similarity between the typo vector and the word vector of each user-specified word stored in the word database 112, selects the first user-specified word whose word vector yields the highest vector similarity, and displays it in place of the first input word deleted from the electronic document as its correction. Personalized automatic typo correction can thereby be supported.

According to an embodiment of the present invention, after selecting the first user-specified word whose word vector yields the highest vector similarity, the typo correction processing unit 115 may display the first user-specified word in place of the deleted first input word as its correction only when the highest vector similarity is determined to be greater than or equal to a predetermined threshold. When the highest vector similarity is determined to be below the predetermined threshold, the typo correction processing unit 115 may refrain from displaying the first user-specified word as the correction.

For example, assume that the first input word is '작샤', the first user-specified word is '작곡', the highest vector similarity (the vector similarity between the first input word '작샤' and the first user-specified word '작곡') is '0.919', and the predetermined threshold is '1'. Because the highest vector similarity is below the predetermined threshold, the typo correction processing unit 115 may refrain from displaying the first user-specified word '작곡' as the correction for the first input word '작샤' deleted from the electronic document.
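The threshold rule can be sketched as a small guard around the selected candidate. The threshold of 1 and the score of 0.919 come from the example; the function name `apply_threshold` and the use of `None` to signal "no correction" are assumptions of this sketch.

```python
from typing import Optional


def apply_threshold(candidate: str, score: float, threshold: float) -> Optional[str]:
    """Return the candidate only if its similarity is at or above the threshold."""
    return candidate if score >= threshold else None


# '작샤' vs. '작곡': 0.919 is below the threshold of 1, so no correction is shown.
print(apply_threshold("작곡", 0.919, 1.0))

# '작걱' vs. '작곡': 1.33 clears the threshold, so '작곡' is shown as the correction.
print(apply_threshold("작곡", 1.33, 1.0))
```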

In this case, according to an embodiment of the present invention, the typo correction processing unit 115 may include a word information updating unit 116.

The word information updating unit 116 operates when the maximum vector similarity is determined to be less than the predetermined threshold. If, within a predetermined waiting time from the time the first input word was deleted from the electronic document, the user re-enters on the electronic document a second input word composed of at least two letters that replaces the first input word, the word information updating unit 116 refers to the letter vector storage unit 111 to look up the one-hot vector for each letter of the second input word and sums those one-hot vectors to generate a word vector corresponding to the second input word. It then newly designates the second input word as a user-specified word and additionally stores the second input word and its word vector in the word database 112.

For example, as in the example above, suppose that the first input word is '작샤' and the maximum vector similarity is therefore determined to be less than the predetermined threshold. If, within the predetermined waiting time from the time '작샤' was deleted from the electronic document, the user re-enters into the electronic terminal device 110 a second input word '작사' ("lyric writing"), composed of the letters 'ㅈㅏㄱㅅㅏ', to replace '작샤', the word information updating unit 116 may refer to the letter vector storage unit 111, as in Table 1 above, to look up the one-hot vector for each of the letters 'ㅈㅏㄱㅅㅏ' that make up '작사'.

The word information updating unit 116 may then sum the one-hot vectors of the letters 'ㅈㅏㄱㅅㅏ' to generate the word vector '[1 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0]' corresponding to the second input word '작사'.
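The summation of one-hot vectors can be sketched in a few lines. Note the assumptions: the 24-letter order in `LETTERS` is inferred so that the result matches the word vectors quoted in this description (it is not copied from Table 1), the names `one_hot` and `word_vector` are hypothetical, and the input is taken as an already decomposed letter sequence such as 'ㅈㅏㄱㅅㅏ' rather than a composed syllable string.

```python
# ASSUMPTION: index order inferred from the quoted word vectors, not from Table 1.
LETTERS = "ㄱㅏㄴㅑㄷㅓㄹㅕㅁㅗㅂㅛㅅㅜㅇㅠㅈㅡㅊㅣㅋㅌㅍㅎ"


def one_hot(letter):
    """One-hot vector for a single letter; raises ValueError for unknown letters."""
    vec = [0] * len(LETTERS)
    vec[LETTERS.index(letter)] = 1
    return vec


def word_vector(jamo):
    """Sum the one-hot vectors of every letter in a decomposed word."""
    if len(jamo) < 2:
        raise ValueError("a word must consist of at least two letters")
    total = [0] * len(LETTERS)
    for letter in jamo:
        for i, bit in enumerate(one_hot(letter)):
            total[i] += bit
    return total


lyric_vector = word_vector("ㅈㅏㄱㅅㅏ")  # letters of '작사'
```

Under this assumed ordering, `lyric_vector` equals the vector quoted above for '작사': the letter 'ㅏ' occurs twice, so its position holds 2, and summing discards letter order.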

After that, the word information updating unit 116 may newly designate the second input word '작사' as a user-specified word and additionally store '작사' and its word vector '[1 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0]' in the word database 112, which is organized as in Table 2 above.

Accordingly, information on the newly designated user-specified word may be additionally stored in the word database 112, as shown in Table 3 below.

[Table 3]
Plurality of user-specified words | Word vector
작곡 (composition) | [3 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
녹음 (recording) | [1 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0]
공연 (performance) | [1 0 1 0 0 0 0 1 0 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0]
작사 (lyric writing) | [1 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0]
... | ...

In other words, when the word information updating unit 116 confirms that a new word not yet stored in the word database 112 has been entered on the electronic document as the user's own correction of a typo, it additionally stores that word in the word database 112 as a new user-specified word. As a result, when a typo similar to the new user-specified word is later entered in an electronic document, the typo can be corrected automatically to the new user-specified word.
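The learning behaviour of the word information updating unit 116 might be sketched as below. Everything here is an assumption made for illustration: the class and method names, the use of explicit timestamps in seconds, the 5-second waiting time, and the letter order in `LETTERS` (inferred from the quoted word vectors). The description itself only says that the waiting time is predetermined.

```python
# ASSUMPTION: letter order inferred from the word vectors quoted in this description.
LETTERS = "ㄱㅏㄴㅑㄷㅓㄹㅕㅁㅗㅂㅛㅅㅜㅇㅠㅈㅡㅊㅣㅋㅌㅍㅎ"


class WordInfoUpdater:
    """Hypothetical sketch of unit 116: learn a re-entered word as a new entry."""

    def __init__(self, waiting_time, database):
        self.waiting_time = waiting_time  # seconds (assumed unit)
        self.database = database          # word -> word vector
        self.deleted_at = None            # time of the last uncorrected deletion

    def on_uncorrected_deletion(self, now):
        """Call when a deleted word scored below the threshold."""
        self.deleted_at = now

    def on_word_entered(self, word, jamo, now):
        """Store the re-entered word if it arrives within the waiting time."""
        if self.deleted_at is None:
            return False
        in_time = (now - self.deleted_at) <= self.waiting_time
        self.deleted_at = None
        if not in_time or len(jamo) < 2 or word in self.database:
            return False
        vector = [0] * len(LETTERS)
        for letter in jamo:
            vector[LETTERS.index(letter)] += 1  # sum of one-hot vectors
        self.database[word] = vector
        return True


database = {}
updater = WordInfoUpdater(waiting_time=5.0, database=database)
updater.on_uncorrected_deletion(now=100.0)               # '작샤' deleted, no correction
stored = updater.on_word_entered("작사", "ㅈㅏㄱㅅㅏ", now=103.0)
```

Passing timestamps in explicitly keeps the sketch independent of any particular clock or UI event loop; a real editor would take them from its own input events.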

FIG. 2 is a flowchart illustrating an operating method of an electronic terminal device capable of performing individually customized automatic typo correction according to an embodiment of the present invention.

In step S210, a letter vector storage unit is maintained, in which a different predetermined one-hot vector is stored for each of a plurality of letters (the plurality of letters consisting of a plurality of consonants and a plurality of vowels).

In step S220, a word database is maintained, which stores a plurality of different user-specified words designated in advance for typo correction together with a word vector corresponding to each of them (each word vector being generated by summing the one-hot vectors of the at least two letters that make up the word).

In step S230, after the user has entered a first input word composed of at least two letters on an electronic document, when a deletion command for the first input word is received from the user, the first input word is deleted from the electronic document, the one-hot vector for each letter of the first input word is looked up in the letter vector storage unit, and those one-hot vectors are summed to generate a typo vector.

In step S240, the vector similarity between the typo vector and the word vector of each of the plurality of user-specified words stored in the word database is calculated.

In step S250, the first user-specified word corresponding to the word vector with the maximum vector similarity is selected from among the plurality of user-specified words and is displayed as the correction word in place of the first input word deleted from the electronic document.

In this case, according to an embodiment of the present invention, the vector similarity between the typo vector and the word vector of each of the plurality of user-specified words may be calculated according to Equation 1 above.
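Equation 1 is defined over the cosine similarity S and the Euclidean distance D of two vectors, but its exact form is not reproduced in this text. The sketch below therefore computes only S and D and leaves their combination as a parameter. The function names are hypothetical, and `assumed_combine` is a placeholder chosen for illustration, not the formula of the present invention.

```python
import math


def cosine_similarity(a, b):
    """S: dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_distance(a, b):
    """D: straight-line distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def vector_similarity(a, b, combine):
    """M: `combine` stands in for Equation 1, which is not shown here."""
    return combine(cosine_similarity(a, b), euclidean_distance(a, b))


def assumed_combine(s, d):
    # ASSUMPTION: placeholder only; rewards high S and low D.
    return s + 1.0 / (1.0 + d)


# Typo vector for '작샤' (letters 'ㅈㅏㄱㅅㅑ') under an inferred letter order,
# and the word vector for '작곡' as listed in Table 3.
typo = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
word = [3, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
m = vector_similarity(typo, word, assumed_combine)
```

For these two vectors S is about 0.645 and D is about 2.646, and the placeholder gives roughly 0.9198. That is close to the 0.919 quoted for this pair in the description, but the agreement does not establish that the placeholder is Equation 1.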

In addition, according to an embodiment of the present invention, in step S250, after the first user-specified word corresponding to the word vector with the maximum vector similarity is selected from among the plurality of user-specified words, the first user-specified word may be displayed as the correction word in place of the first input word deleted from the electronic document only when the maximum vector similarity is determined to be equal to or greater than a predetermined threshold. When the maximum vector similarity is determined to be less than the predetermined threshold, the first user-specified word may not be displayed as the correction word for the deleted first input word.

In this case, according to an embodiment of the present invention, step S250 may include the following step. When the maximum vector similarity is determined to be less than the predetermined threshold, and the user re-enters on the electronic document, within a predetermined waiting time from the time the first input word was deleted, a second input word composed of at least two letters that replaces the first input word, the one-hot vector for each letter of the second input word is looked up in the letter vector storage unit, those one-hot vectors are summed to generate a word vector corresponding to the second input word, the second input word is newly designated as a user-specified word, and the second input word and its word vector are additionally stored in the word database.

The operating method of an electronic terminal device capable of performing individually customized automatic typo correction according to an embodiment of the present invention has been described above with reference to FIG. 2. Because this operating method may correspond to the configuration of the operation of the electronic terminal device 110 described with reference to FIG. 1, a more detailed description is omitted.

The operating method of an electronic terminal device capable of performing individually customized automatic typo correction according to an embodiment of the present invention may be implemented as a computer program stored in a storage medium for execution in combination with a computer.

In addition, the operating method according to an embodiment of the present invention may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

As described above, the present invention has been described with reference to specific details such as particular components, and to limited embodiments and drawings. These are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the present invention pertains can make various modifications and variations based on this description.

Therefore, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set out below but also everything that is equal or equivalent to those claims falls within the scope of the spirit of the present invention.

110: electronic terminal device capable of performing individually customized automatic typo correction
111: letter vector storage unit 112: word database
113: typo vector generation unit 114: vector similarity calculation unit
115: typo correction processing unit 116: word information updating unit

Claims

An electronic terminal device capable of performing individually customized automatic typo correction, comprising:
a letter vector storage unit storing a different predetermined one-hot vector for each of a plurality of letters, the plurality of letters consisting of a plurality of consonants and a plurality of vowels;
a word database storing a plurality of different user-specified words designated in advance for typo correction and a word vector corresponding to each of the plurality of user-specified words, the word vector being a vector generated by summing the one-hot vectors of the at least two letters constituting each word;
a typo vector generation unit that, after a first input word composed of at least two letters has been entered on an electronic document by a user and a deletion command for the first input word is received from the user, deletes the first input word from the electronic document, looks up the one-hot vector for each letter constituting the first input word with reference to the letter vector storage unit, and sums those one-hot vectors to generate a typo vector;
a vector similarity calculation unit that calculates, according to Equation 1 below, the vector similarity between the typo vector and the word vector corresponding to each of the plurality of user-specified words stored in the word database; and
a typo correction processing unit that selects, from among the plurality of user-specified words, a first user-specified word corresponding to the word vector with the maximum vector similarity, displays the first user-specified word as a correction word in place of the first input word deleted from the electronic document when the maximum vector similarity is determined to be equal to or greater than a predetermined threshold, and does not display the first user-specified word as the correction word when the maximum vector similarity is determined to be less than the predetermined threshold,
wherein the typo correction processing unit comprises
a word information updating unit that, when the maximum vector similarity is determined to be less than the predetermined threshold and a second input word composed of at least two letters replacing the first input word is re-entered on the electronic document by the user within a predetermined waiting time from the time the first input word was deleted from the electronic document, looks up the one-hot vector for each letter constituting the second input word with reference to the letter vector storage unit, sums those one-hot vectors to generate a word vector corresponding to the second input word, newly designates the second input word as a user-specified word, and additionally stores the second input word and the word vector corresponding to the second input word in the word database.
[Equation 1]

Here, M is the vector similarity between two vectors, S is the cosine similarity between the two vectors, and D is the Euclidean distance between the two vectors.

(Deleted)

A method of operating an electronic terminal device capable of performing individually customized automatic typo correction, the method comprising:
maintaining a letter vector storage unit in which a different predetermined one-hot vector is stored for each of a plurality of letters, the plurality of letters consisting of a plurality of consonants and a plurality of vowels;
maintaining a word database storing a plurality of different user-specified words designated in advance for typo correction and a word vector corresponding to each of the plurality of user-specified words, the word vector being a vector generated by summing the one-hot vectors of the at least two letters constituting each word;
after a first input word composed of at least two letters has been entered on an electronic document by a user and a deletion command for the first input word is received from the user, deleting the first input word from the electronic document, looking up the one-hot vector for each letter constituting the first input word with reference to the letter vector storage unit, and summing those one-hot vectors to generate a typo vector;
calculating, according to Equation 1 below, the vector similarity between the typo vector and the word vector corresponding to each of the plurality of user-specified words stored in the word database; and
selecting, from among the plurality of user-specified words, a first user-specified word corresponding to the word vector with the maximum vector similarity, displaying the first user-specified word as a correction word in place of the first input word deleted from the electronic document when the maximum vector similarity is determined to be equal to or greater than a predetermined threshold, and not displaying the first user-specified word as the correction word when the maximum vector similarity is determined to be less than the predetermined threshold,
wherein the step of not displaying the first user-specified word comprises,
when the maximum vector similarity is determined to be less than the predetermined threshold and a second input word composed of at least two letters replacing the first input word is re-entered on the electronic document by the user within a predetermined waiting time from the time the first input word was deleted from the electronic document, looking up the one-hot vector for each letter constituting the second input word with reference to the letter vector storage unit, summing those one-hot vectors to generate a word vector corresponding to the second input word, newly designating the second input word as a user-specified word, and additionally storing the second input word and the word vector corresponding to the second input word in the word database.
[Equation 1]

Here, M is the vector similarity between two vectors, S is the cosine similarity between the two vectors, and D is the Euclidean distance between the two vectors.

(Deleted)

A computer-readable recording medium on which is recorded a computer program for executing the method of claim 5 in combination with a computer.

A computer program stored in a storage medium for executing the method of claim 5 in combination with a computer.