KR102508200B1 - Artificial intelligence learning data collection method for improving recognition rate using eraser function and candidate list

Info

Publication number: KR102508200B1
Application number: KR1020210036910A
Authority: KR
Inventors: 유왕상
Original assignee: 주식회사 아이텍솔루션
Priority date: 2021-03-22
Filing date: 2021-03-22
Publication date: 2023-03-09
Also published as: KR20220131774A

Abstract

The present specification discloses a method by which an AI (Artificial Intelligence) processor collects learning data from a user in order to train an AI model. The AI processor receives, from a user device, an image containing a mathematical expression input by the user; based on that image, uses the AI model to predict 1) a character string of the mathematical expression and 2) a candidate list related to the character string; transmits 1) the character string and 2) the candidate list to the user device; receives, from the user device, learning data generated based on the character string and the candidate list; and may train the AI model using the learning data.

Description

Artificial intelligence learning data collection method for improving recognition rate using eraser function and candidate list

The present specification relates to a method, and an apparatus therefor, for building a learning model with a higher recognition rate by effectively collecting the error data that arises when an artificial intelligence model reads characters input by a user in typed or stroke form.

AI learning data is a generic term for data used to train AI models, such as machine learning and deep learning models. For AI models that read characters input by a user in typed or stroke form, learning data has conventionally been collected through deliberate manual work by developers.

In addition, AI learning data sets are initially collected manually, and once a commercial service is running, data from the service's users is collected indiscriminately. As a result, the data is used for additional training without distinguishing data that is essential for improving the function from data that is not.

Korean Laid-Open Patent Publication No. 10-2016-0101683 (2016-08-25)
Korean Laid-Open Patent Publication No. 10-2016-0018495 (2016-02-17)
Korean Laid-Open Patent Publication No. 10-2018-0060971 (2018-06-07)

An object of the present specification is to propose a method for securing high-purity data (for example, raw data for training and the corresponding correct-answer data) that is effective for improving the recognition rate. The method does so by naturally linking the series of actions a user performs to reach the desired recognition result, that is, the process of correcting misrecognized characters, to the learning data collection step.

Another object of the present specification is to propose a data collection method that effectively secures data for the parts an already trained AI model recognizes poorly, so that better performance can be expected even from a small amount of additional training.

The technical problems to be solved by the present specification are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those of ordinary skill in the art to which the present specification pertains from the detailed description below.

One aspect of the present specification is a method by which an AI (Artificial Intelligence) processor collects learning data from a user in order to train an AI model, the method comprising: receiving, from a user device, an image containing a mathematical expression input by the user; based on the image containing the mathematical expression input by the user, predicting, using the AI model, 1) a character string of the mathematical expression and 2) a candidate list related to the character string; transmitting, to the user device, 1) the character string of the mathematical expression and 2) the candidate list related to the character string; receiving, from the user device, learning data generated based on the character string of the mathematical expression and the candidate list; and training the AI model using the learning data.
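The five steps above can be sketched in Python as follows. This is only an illustrative outline under stated assumptions, not the implementation described in the specification: the class name `RecognitionServer`, the `predict` and `train` callables, and the tuple-based data formats are all hypothetical, since the specification does not define interfaces or data layouts.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical data formats; the specification does not define them.
Prediction = Tuple[str, List[str]]   # (predicted character string, candidate list)
TrainingPair = Tuple[str, str]       # (model output, user-confirmed answer)


@dataclass
class RecognitionServer:
    """Illustrative stand-in for the device that hosts the AI model."""

    predict: Callable[[bytes], Prediction]        # wraps inference with the AI model
    train: Callable[[List[TrainingPair]], None]   # wraps a training update
    collected: List[TrainingPair] = field(default_factory=list)

    def handle_image(self, image: bytes) -> Prediction:
        # Receive the image, predict the string and the candidate list,
        # and return both so they can be sent to the user device.
        return self.predict(image)

    def handle_training_data(self, pairs: List[TrainingPair]) -> None:
        # Receive the learning data generated on the user device,
        # keep a record of it, and train the model with it.
        self.collected.extend(pairs)
        self.train(pairs)
```

Passing the model in as two callables keeps the sketch independent of any particular recognition model or training framework.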

In addition, the learning data may consist only of pairs of a correct-answer candidate that the user selected from the candidate list through the user device and the character string of the mathematical expression corresponding to that correct-answer candidate.
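As an illustration of how such a pair might be formed on the user device, consider the Python sketch below. The function name `pair_from_candidate`, its parameters, and the (predicted string, selected candidate) tuple layout are assumptions made for this example; the specification does not define a data format.

```python
from typing import List, Optional, Tuple


def pair_from_candidate(
    predicted: str,
    candidates: List[str],
    selected_index: Optional[int],
) -> Optional[Tuple[str, str]]:
    """Return (predicted string, selected candidate), or None if nothing was selected.

    A pair is produced only when the user actually picks a candidate, so
    expressions that were recognized correctly contribute no learning data.
    """
    if selected_index is None:
        return None
    if not 0 <= selected_index < len(candidates):
        raise IndexError("selected_index is outside the candidate list")
    return (predicted, candidates[selected_index])
```

Returning `None` when no candidate is chosen reflects the stated goal of collecting only the data that corresponds to recognition errors.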

In addition, the user device may 1) display an icon for erasing a specific character in the character string of the mathematical expression on the display unit and entering it again, 2) when the icon is selected by the user, receive from the user a selection of the character the user wants to erase, 3) stop displaying the character the user wants to erase on the display unit, 4) receive from the user a character that replaces the character the user wants to erase, and 5) display the replacement character at the position where the character the user wants to erase had been displayed.

In addition, the learning data may consist only of pairs of the character the user wants to erase and the replacement character.
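The eraser flow and the resulting pair can be sketched in Python as follows. The class `EraserSession` and its method are hypothetical names, and representing the displayed string as a list with one entry per character is an assumption; an actual device would also need to handle the replacement stroke input and its recognition, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EraserSession:
    """Holds the recognized string as it is shown on the display unit."""

    chars: List[str]  # one entry per displayed character

    def erase_and_replace(self, position: int, replacement: str) -> Tuple[str, str]:
        """Swap the character at `position` and return (erased, replacement)."""
        if not 0 <= position < len(self.chars):
            raise IndexError("position is outside the displayed string")
        erased = self.chars[position]
        # The replacement is shown where the erased character used to be.
        self.chars[position] = replacement
        return (erased, replacement)
```

For example, erasing the last character of a displayed "x+1" and entering "7" leaves "x+7" on screen and yields the pair ("1", "7") as learning data.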

Another aspect of the present specification is a mathematical expression recognition device for collecting learning data from a user in order to train an AI (Artificial Intelligence) model, the device comprising: a memory containing the AI model; a communication module connected to a user device to transmit and receive signals; and a processor that controls the memory and the communication module. The processor may receive, from the user device through the communication module, an image containing a mathematical expression input by the user; based on that image, predict, using the AI model, 1) a character string of the mathematical expression and 2) a candidate list related to the character string; transmit, to the user device, 1) the character string of the mathematical expression and 2) the candidate list related to the character string; receive, from the user device, learning data generated based on the character string of the mathematical expression and the candidate list; and train the AI model using the learning data.

Yet another aspect of the present specification is a method by which a user device collects learning data from a user in order to train an AI (Artificial Intelligence) model, the method comprising: receiving a mathematical expression from the user through a display unit; transmitting an image containing the mathematical expression to a mathematical expression recognition device; receiving, from the mathematical expression recognition device, a character string of the mathematical expression predicted by the mathematical expression recognition device based on the image containing the mathematical expression; displaying the character string of the mathematical expression to the user through the display unit; displaying an icon for erasing a specific character in the character string of the mathematical expression on the display unit and entering it again; when the icon is selected, receiving from the user a selection of the character the user wants to erase; ceasing to display the character the user wants to erase on the display unit and receiving from the user a character that replaces it; displaying the replacement character at the position where the character the user wants to erase had been displayed; and transmitting, to the mathematical expression recognition device, learning data for training the AI model, generated based on the replacement character.

In addition, the learning data may consist only of pairs of the character the user wants to erase and the replacement character.

According to an embodiment of the present specification, AI learning data is collected through the process the user carries out to reach a recognition result, so that high-purity data effective for improving the recognition rate can be secured efficiently.

In addition, according to an embodiment of the present specification, a method is proposed for effectively learning data for the parts an already trained AI model recognizes poorly, so that better performance of the AI model can be expected even from a small amount of additional training.

The effects obtainable from the present specification are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art to which the present specification pertains from the description below.

FIG. 1 is a diagram conceptually illustrating the configuration of a system for recognizing mathematical expressions according to the present specification.
FIG. 2 is a diagram illustrating the configuration of a mathematical expression recognition device according to an embodiment of the present specification.
FIG. 3 is an example of a DNN model to which the present specification can be applied.
FIG. 4 is an example of a candidate recommendation method in general character recognition that can be applied in the present specification.
FIGS. 5 and 6 are examples of a display unit to which the present specification can be applied.
FIG. 7 is an example of a conventional mathematical expression recognition method.
FIG. 8 is an example of a learning method using a candidate list to which the present specification can be applied.
FIG. 9 is an example of a learning method using an eraser function to which the present specification can be applied.
FIG. 10 is an embodiment to which the present specification can be applied.
FIG. 11 is an embodiment of an AI processor to which the present specification can be applied.
The accompanying drawings, which are included as part of the detailed description to aid understanding of the present specification, provide embodiments of the present specification and, together with the detailed description, explain its technical features.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure in which they appear, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification and do not in themselves have distinct meanings or roles. In describing the embodiments disclosed in this specification, a detailed description of related known technology is omitted where it is judged that it could obscure the gist of the embodiments. The accompanying drawings are provided only to make the embodiments disclosed in this specification easier to understand; the technical idea disclosed in this specification is not limited by the accompanying drawings, and it should be understood to cover all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present specification.

Terms including ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

When a component is described as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is described as being "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this application, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

FIG. 1 is a diagram conceptually illustrating the configuration of a system for recognizing mathematical expressions according to the present specification.

Referring to FIG. 1, a system for recognizing mathematical expressions may include a user device 100 and a mathematical expression recognition device 300.

The user device 100 may connect to the mathematical expression recognition device 300 over a communication network. Using the user device 100, the user connects to the mathematical expression recognition device 300 and exchanges data with it, and can thereby input a mathematical expression and have it read.

The user device 100 is a device to which an IP address is assigned and can perform network communication over the Internet or the like. The user device 100 may be a wireless communication device that provides portability and mobility. For example, the user device 100 may include any kind of handheld wireless communication device, such as a navigation device, a PCS (Personal Communication System), GSM (Global System for Mobile communications), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (W-Code Division Multiple Access), or Wibro (Wireless Broadband Internet) terminal, a smartphone, a smart pad, or a tablet PC, as well as a desktop PC, a slate PC, a notebook (laptop) computer, a PMP (Portable Multimedia Player), and the like.

The user device 100 to which the present specification is applicable is of course not limited to the types described above and may include any type of device capable of communicating with an external device.

The user device 100 described above may have an app for recognizing mathematical expressions installed or preloaded. By connecting to the mathematical expression recognition device 300 through the app installed on the user device 100, the user may receive from the mathematical expression recognition device 300 the mathematical expression recognition service described below.

The mathematical expression recognition device 300 may read a mathematical expression input to the user device 100 connected through the communication network. The mathematical expression recognition device 300 may include a processor 310, a memory 330, a communication module 350, and the like.

The processor 310 may include hardware and/or software that performs language processing, and may control the running of software as well as the input/output and functions of the mathematical expression recognition device 300.

The memory 330 may store data as a pre-built knowledge base and may hold various databases related to machine learning. The knowledge base may include various kinds of information for reading a mathematical expression input by the user.

The communication module 350 may include a wireless communication module or an RF module. The wireless communication module may include, for example, Wi-Fi, BT, GPS, or NFC.

The mathematical expression recognition device 300 is described in detail with reference to FIG. 2.

FIG. 2 is a diagram illustrating the configuration of a mathematical expression recognition device according to an embodiment of the present specification.

Referring to FIG. 2, the device 300 for recognizing mathematical expressions according to an embodiment of the present specification may include a processor 310, a memory 330, and a communication module 350.

The processor 310 may include one or more application processors (APs), one or more communication processors (CPs), or at least one AI processor (artificial intelligence processor). The application processor, communication processor, and AI processor may each be included in different integrated circuit (IC) packages or may be included in a single IC package.

The application processor may run an operating system or application programs to control a number of hardware or software components connected to the application processor, and may perform processing and computation on various data, including multimedia data. As an example, the application processor may be implemented as a system on chip (SoC). The processor 310 may further include a graphics processing unit (GPU).

The communication processor may manage the data link and convert communication protocols in communication with the user device 100 connected over the network. As an example, the communication processor may be implemented as an SoC. The communication processor may perform at least part of the multimedia control function.

In addition, the communication processor may control data transmission and reception by the communication module 350. The communication processor may also be implemented as at least part of the application processor.

The application processor or the communication processor may load into volatile memory, and process, commands or data received from at least one of the non-volatile memory or other components connected to it. The application processor or the communication processor may also store, in non-volatile memory, data received from or generated by at least one of the other components.

The memory 330 may include internal memory or external memory. The internal memory may include at least one of volatile memory (for example, DRAM (dynamic RAM), SRAM (static RAM), SDRAM (synchronous dynamic RAM), etc.) or non-volatile memory (for example, OTPROM (one time programmable ROM), PROM (programmable ROM), EPROM (erasable and programmable ROM), EEPROM (electrically erasable and programmable ROM), mask ROM, flash ROM, NAND flash memory, NOR flash memory, etc.). According to one example, the internal memory may take the form of an SSD (solid state drive). The external memory may include a flash drive, for example, CF (compact flash), SD (secure digital), Micro-SD (micro secure digital), Mini-SD (mini secure digital), xD (extreme digital), or a memory stick.

The communication module 350 may include a wireless communication module or an RF module. The wireless communication module may include, for example, Wi-Fi, BT, GPS, or NFC. For example, the wireless communication module may provide a wireless communication function using radio frequencies. Additionally or alternatively, the wireless communication module may include a network interface, a modem, or the like for connecting the user device 100 to a network (e.g., the Internet, a LAN, a WAN, a telecommunication network, a cellular network, a satellite network, POTS, or a 5G network).

The RF module may handle the transmission and reception of data, for example, the transmission and reception of RF signals or called electronic signals. As an example, the RF module may include a transceiver, a PAM (power amp module), a frequency filter, or an LNA (low noise amplifier). The RF module may also include components for transmitting and receiving electromagnetic waves in free space in wireless communication, for example, conductors or wires.

The processor 310 may include an AI processor 311, a data learning unit 311a, a data preprocessing unit 311b, a data selection unit 311c, a model evaluation unit 311d, a response module 315, and the like.

The AI processor 311 may train a neural network using a program stored in the memory 330. In particular, the AI processor 311 may train a neural network for recognizing mathematical expressions input from the user device 100. Here, the neural network may be designed to simulate the structure of the human brain (for example, the neuron structure of a human neural network) on a computer. The neural network may include an input layer, an output layer, and at least one hidden layer. Each layer may include at least one neuron having a weight, and the neural network may include synapses that connect neurons to one another. In the neural network, each neuron may output the value of an activation function applied to the input signals received through its synapses together with the weights and/or bias.
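The per-neuron computation described above can be illustrated with a short Python sketch. The specification does not name a particular activation function; the sigmoid used here, along with the function names `sigmoid` and `neuron_output`, is an assumption chosen only to make the example concrete.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic activation, used here purely as an example activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Apply the activation function to the weighted sum of the inputs plus the bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Each input arrives over one synapse and is scaled by that synapse's weight.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)
```

A layer of a network applies this computation once per neuron, and training adjusts the weights and biases rather than the activation function itself.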

A plurality of network nodes may exchange data according to their respective connection relationships so as to simulate the synaptic activity of neurons, in which neurons send and receive signals through synapses. Here, the neural network may include a deep learning model that evolved from the neural network model. In a deep learning model, a plurality of network nodes may be located in different layers and exchange data according to convolutional connection relationships. Examples of neural network models include various deep learning techniques such as deep neural networks (DNN), convolutional neural networks (CNN), recurrent neural networks, restricted Boltzmann machines, deep belief networks, and deep Q-Networks, and they can be applied in fields such as vision recognition, speech recognition, natural language processing, and voice/signal processing.

Meanwhile, the processor 310 that performs the functions described above may be a general-purpose processor (for example, a CPU), or it may be an AI-dedicated processor (for example, a GPU) for AI learning.

The memory 330 may store various programs and data required for the operation of the user device 100 and/or the mathematical expression recognition device 300. The memory 330 is accessed by the AI processor 311, which may read, write, modify, delete, and update the data in it. The memory 330 may also store a neural network model (for example, a deep learning model) generated through a learning algorithm for data classification/recognition according to an embodiment of the present specification. Furthermore, the memory 330 may store not only the learning model 221 but also input data, learning data, learning history, and the like.

Meanwhile, the AI processor 311 may include a data learning unit 311a that trains a neural network for data classification/recognition. The data learning unit 311a may learn criteria regarding which training data to use for data classification/recognition and how to classify and recognize data using that training data. The data learning unit 311a may train the deep learning model by acquiring training data and applying the acquired training data to the deep learning model.

The data learning unit 311a may be manufactured in the form of at least one hardware chip and mounted on the mathematical expression recognition device 300. For example, the data learning unit 311a may be manufactured as a dedicated hardware chip for artificial intelligence, or as part of a general-purpose processor (CPU) or a graphics processor (GPU), and mounted on the mathematical expression recognition device 300. The data learning unit 311a may also be implemented as a software module. When implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. In this case, the at least one software module may be provided by an operating system (OS) or by an application.

Using the acquired training data, the data learning unit 311a may train the neural network model so that it has criteria for how to classify/recognize given data. The learning methods used by the data learning unit 311a may be classified into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning refers to a method of training an artificial neural network when labels for the training data are given, where a label is the correct answer (or result value) that the network should infer when the training data is input to it. Unsupervised learning refers to a method of training an artificial neural network when no labels for the training data are given. Reinforcement learning refers to a method of training an agent defined in a particular environment to select, in each state, the action or sequence of actions that maximizes the cumulative reward. The data learning unit 311a may also train the neural network model using a learning algorithm that includes error backpropagation or gradient descent. Once trained, the neural network model may be referred to as the learning model 331. The learning model 331 may be stored in the memory 330 and used to infer results for new input data other than the training data.
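As an illustration only, supervised learning with gradient descent can be sketched on a toy problem. The sketch below assumes a one-weight linear model and a mean squared error loss, neither of which is specified in this description; the function name `train_linear` and the sample data are hypothetical.

```python
def train_linear(samples, lr=0.05, epochs=500):
    """Fit y ~ w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Labeled training data: each input x is paired with its label y = 2x + 1.
labeled = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_linear(labeled)
```

The labels play the role of the "correct answers" described above; a real recognition model would have many more parameters and would obtain its gradients through backpropagation.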

Meanwhile, the AI processor 311 may further include a data preprocessing unit 311b and/or a data selection unit 311c in order to improve the analysis results obtained with the learning model 331 or to save the resources or time required to generate the learning model 331.

The data preprocessing unit 311b may preprocess acquired data so that it can be used for learning/inference for situation determination. For example, the data preprocessing unit 311b may extract feature information from input data acquired through an input device, and the feature information may be extracted in a format such as a feature vector, a feature point, or a feature map.

The data selection unit 311c may select the data needed for learning from among the training data obtained by the data learning unit 311a or preprocessed by the data preprocessing unit 311b. The selected training data may be provided to the learning model 331. For example, based on a mathematical expression that the user entered by voice, text, or any of various input devices, the data selection unit 311c may select as training data only the data on objects contained in an expression judged to be "correct". The data selection unit 311c may also select the data needed for inference from among input data acquired through an input device or input data preprocessed by the preprocessing unit.

The AI processor 311 may further include a model evaluation unit 311d to improve the analysis results of the neural network model. The model evaluation unit 311d may input evaluation data into the neural network model and, when the analysis results output for the evaluation data do not satisfy a predetermined criterion, cause the model learning unit to train the model again. Here, the evaluation data may be preset data for evaluating the learning model 331. For example, the model evaluation unit 311d may determine that the criterion is not satisfied when, among the trained model's analysis results for the evaluation data, the number or ratio of evaluation data items with inaccurate results exceeds a preset threshold.
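The evaluation rule of the model evaluation unit 311d can be sketched as a simple threshold check. This is a minimal sketch under assumptions: the function name `needs_retraining` and the default threshold of 0.1 are hypothetical, since the description only states that a preset threshold exists.

```python
def needs_retraining(predictions, expected, max_error_ratio=0.1):
    """Return True when the share of inaccurate results exceeds the threshold."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have equal length")
    if not expected:
        return False  # no evaluation data, so nothing to judge
    wrong = sum(1 for p, e in zip(predictions, expected) if p != e)
    return wrong / len(expected) > max_error_ratio
```

For example, one wrong result out of four evaluation items gives an error ratio of 0.25, which triggers retraining at a threshold of 0.2 but not at a threshold of 0.25, because the rule requires the ratio to exceed the threshold.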

The response module 315 may, under the control of the processor, return candidates for the input mathematical expression to the user device. This is described in detail below.

The communication module 350 may transmit the results of AI processing by the AI processor 311 to the user device 100.

The user device 100 may also include the components of the mathematical expression recognition device 300 described above.

Deep Neural Network (DNN) model

FIG. 3 is an example of a DNN model to which the present specification can be applied.

A deep neural network (DNN) is an artificial neural network (ANN) with multiple hidden layers between an input layer and an output layer. Like ordinary artificial neural networks, a deep neural network can model complex non-linear relationships.
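To make the layer structure concrete, the following is a minimal sketch of a forward pass through a network with two hidden layers. The weights are hand-picked for illustration, and the ReLU activation is an assumption that this description does not specify; all names are hypothetical.

```python
def relu(values):
    """Element-wise non-linearity applied after each hidden layer."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Pass inputs through the hidden layers (with ReLU) and a linear output layer."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    out_weights, out_biases = layers[-1]
    return dense(activations, out_weights, out_biases)


layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer 1
    ([[1.0, 1.0]], [0.0]),                     # hidden layer 2
    ([[2.0]], [1.0]),                          # output layer
]
```

With these weights the network computes 2 * |x1 - x2| + 1, a non-linear function of its inputs that a single linear layer could not represent.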

For example, in a deep neural network structure for an object identification model, each object may be represented as a hierarchical composition of basic image elements. The additional layers can aggregate the features gathered progressively from the lower layers. This property allows a deep neural network to model complex data with fewer units (nodes) than a similarly performing artificial neural network.

As the number of hidden layers grows, the artificial neural network is said to become "deeper", and the machine learning paradigm that uses a sufficiently deep artificial neural network as its learning model is called deep learning. A sufficiently deep artificial neural network used for deep learning is generally referred to as a deep neural network (DNN).

In this specification, the artificial neural network used for deep learning in the AI processor 311 and/or the learning model 331 is referred to collectively as a DNN, but other deep learning approaches may of course be applied as long as they can output meaningful data in a similar manner.

FIG. 4 is an example of a candidate recommendation method in general character recognition that can be applied in this specification.

Referring to FIG. 4, the user may enter characters through the display unit 410 included in the user device 100. For example, the display unit 410 may implement a touch screen by forming a layered structure with a touch sensor or by being formed integrally with it. Such a touch screen may function as an input unit that provides an input interface between the user device 100 and the user and, at the same time, provide an output interface between the user device 100 and the user.

The display unit 410 may output the characters entered by the user. The user device 100 may receive from the user a PenSize value associated with the thickness of the entered characters.

The user device 100 may receive a candidate list 420 for the characters entered by the user from the mathematical expression recognition device 300 and display it. For example, when the user handwrites '가나다' on the display unit 410, the mathematical expression recognition device 300 may use the learning model 331 to generate a list of likely candidates such as '가나다', '가나나', and '가나아' and deliver it to the user device 100. The user may select the appropriate candidate from the candidate list shown on the display unit 410.

However, it is difficult to apply this commonly used method directly to mathematical expression recognition. A mathematical expression may consist of just one or two digits or letters, but it may also have a complex structure unlike ordinary text, such as fractions, subscripts and superscripts, and vertically stacked forms. Moreover, when an expression with such a complex combined structure is entered into the user device 100, the number of possible entries in the candidate list generated by the learning model 331 may become too large.

FIGS. 5 and 6 are examples of a display unit to which the present specification can be applied.

Referring to FIGS. 5 and 6, to solve the problem described above, the user device 100 may receive from the mathematical expression recognition device 300 a candidate list for each element of the entered mathematical expression.

For example, the mathematical expression recognition device 300 may apply the learning model 331 to the image containing the mathematical expression received from the user device 100, predict each element of the expression, and generate a candidate list for each element.

Referring to FIG. 5(a), the user may enter a mathematical expression into the user device 100, for example through the display unit 410.

The user device 100 may display the entered mathematical expression and transmit an image containing it to the mathematical expression recognition device 300. The mathematical expression recognition device 300 may analyze the image to determine 1) the character string of the mathematical expression and 2) a candidate list for each element in that character string, and transmit both to the user device 100.

For example, the character string of the mathematical expression may be the set of elements that the learning model 331 predicts with the highest accuracy.

The candidate list may be a list of the elements that the learning model 331 predicts with accuracy within a certain range.
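The relationship between the predicted character string and the per-element candidate lists can be sketched as follows. The sketch assumes that the model emits, for every element, a mapping from symbol to confidence; the function name `decode`, the cutoff of 0.05, the cap of five candidates, and the choice to leave the top symbol out of its own candidate list are all hypothetical illustrations rather than details from this description.

```python
def decode(element_scores, min_confidence=0.05, max_candidates=5):
    """Split per-element scores into a best string and per-element candidate lists.

    element_scores holds one non-empty dict per formula element,
    mapping each possible symbol to its confidence.
    """
    best_string = []
    candidate_lists = []
    for scores in element_scores:
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        # The highest-confidence symbol goes into the predicted string.
        best_string.append(ranked[0][0])
        # Runners-up inside the confidence range become the candidates.
        runners_up = [sym for sym, conf in ranked[1:] if conf >= min_confidence]
        candidate_lists.append(runners_up[:max_candidates])
    return "".join(best_string), candidate_lists


scores = [
    {"1": 0.90, "l": 0.07, "|": 0.03},
    {"o": 0.50, "0": 0.45, "O": 0.04},
]
text, candidates = decode(scores)
```

Keeping one short list per element, rather than enumerating whole-expression alternatives, is what keeps the number of candidates manageable for structured expressions.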

Referring to FIG. 5(b), the user device 100 may display the received character string of the mathematical expression. Even so, although the user intended and entered the expression of FIG. 5(c), the learning model 331 may mispredict the element "0" as "ㅇ" or "o" and deliver that result to the user device 100.

Referring to FIGS. 6(a), 6(b), and 6(c), when the user selects an erroneous character in the character string of the mathematical expression, the user device 100 may display the candidates in the candidate list that relate to the selected character.

The user may select, from the displayed candidates, the correct candidate representing the intended character, and the user device 100 may replace the erroneous character with the selected candidate and display the mathematical expression containing the replacement.
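The replacement step can be sketched in a few lines. This is an illustrative sketch only; the function name `apply_candidate` and the list-based representation of the expression are assumptions.

```python
def apply_candidate(symbols, index, candidates, choice):
    """Return a new symbol list with symbols[index] replaced by the chosen candidate."""
    if choice not in candidates[index]:
        raise ValueError("choice is not in the candidate list for this symbol")
    corrected = list(symbols)  # copy so the original recognition result is preserved
    corrected[index] = choice
    return corrected
```

Because the correction is applied on the device from a list that was already delivered, no second recognition request is needed for this step.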

Referring to FIG. 6(d), when the error lies not in an individual character but in part of the structure of the mathematical expression, the user may erase only the erroneous part, re-enter it, and request re-recognition of the expression.

For example, when the user selects the eraser icon and then selects the character to erase, the user device 100 may stop displaying the selected character, and the user may enter a replacement character into the user device 100. The user device 100 may display the replacement character at the position where the selected character had been displayed. The user device 100 may then transmit the image of the regenerated mathematical expression to the mathematical expression recognition device 300 to request the recognition procedure described above again.
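The eraser behavior, in which the erased character disappears but its position is kept for the replacement, can be sketched with a small data structure. The `Slot` type and the functions `erase` and `rewrite` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Slot:
    position: tuple         # (x, y) where the symbol is drawn
    symbol: Optional[str]   # None while the symbol is erased


def erase(slots, index):
    """Hide the selected symbol but keep its position reserved."""
    return [replace(s, symbol=None) if i == index else s
            for i, s in enumerate(slots)]


def rewrite(slots, index, new_symbol):
    """Place the replacement at the position of the erased symbol."""
    if slots[index].symbol is not None:
        raise ValueError("symbol must be erased before it is rewritten")
    return [replace(s, symbol=new_symbol) if i == index else s
            for i, s in enumerate(slots)]
```

A usage example: erasing the first slot of `[Slot((0, 0), "f"), Slot((10, 0), "x")]` and rewriting it with "5" leaves the second slot untouched, so only the rewritten part needs to be recognized again.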

The operations of the mathematical expression recognition device 300 described above may of course also be performed in a similar manner on the user device 100 itself, offline, through an app installed on the user device 100. To this end, the app may include the learning model 331.

FIG. 7 is an example of a conventional mathematical expression recognition method.

Referring to FIG. 7, in conventional optical and stroke-based mathematical expression recognition (for example, the MS math app), a user correcting an entered expression either edited the original handwriting with an eraser and had it recognized again, or erased the incorrect part of the result and requested correction again.

In this approach, each time the expression is re-recognized, the user device 100 must ask the artificial intelligence server, or the application that performs the function, to carry out the same task as the first time. Because AI recognition requires substantial computing resources, the cost of several recognition passes was wasted just to recognize a single expression completely.

In contrast, with the method presented in this specification, recognition can in most cases be completed through the candidate list delivered together with the initial prediction result, so no additional cost is incurred.

Moreover, through the process in which the user approves the correct character via the candidate list, the artificial intelligence model can obtain training data at no additional cost.

Because there is no re-recognition step, the expression can be corrected immediately, which can increase the speed of entering expressions.

Unlike the conventional approach, the eraser function described above directly modifies the recognized result, so correctly recognized characters are kept as they are and only the modified part is re-recognized.

More specifically, typeset portions are recognized at a far higher rate than handwritten portions, which can be advantageous when the structure of an expression is modified and recognized again.

In addition, a character that the artificial intelligence model recognized correctly can be expected to be recognized correctly next time as well, whereas a character corrected with the eraser function is likely to be misrecognized again. Collecting training data from characters corrected with the eraser function therefore secures training data effectively for characters with a low recognition rate.

FIG. 8 is an example of a learning method using the candidate list to which the present specification can be applied, and FIG. 9 is an example of a learning method using the eraser function to which the present specification can be applied.

Referring to FIGS. 8 and 9, the user device 100 and the AI processor 311 may, in addition to the operations of FIGS. 5 and 6 described above, perform a learning method for obtaining artificial intelligence training data.

Referring to FIG. 8, the initial learning model 331 may be in a state trained on artificial intelligence training data collected by the developer.

Referring to FIG. 8(a), the user device 100 may receive a mathematical expression from the user in typed or stroke form.

Referring to FIG. 8(b), the AI processor 311 may receive the image of the mathematical expression from the user device 100 and deliver the result recognized through the learning model 331 to the user device 100.

Referring to FIG. 8(c), to correct a misrecognized character, the user selects the symbol of that character, the user device 100 outputs the candidate list for the selected symbol, and the user may select the desired character from the candidate list.

Referring to FIG. 8(e), if the candidate list contains the correct character, the user may select it from the candidate list, and the user device 100 may replace the misrecognized character with the correct character and display it. In this case, to train the learning model 331, the user device 100 may pair the correct character with the raw data originally entered by the user, store the pair, and deliver it to the AI processor 311. The AI processor 311 may train the learning model 331 using the received data pair. Here, the user device 100 may either 1) deliver the entire expression entered by the user paired with all of the correct characters, or 2) deliver only the pair of the misrecognized character and the correct character, which reduces the signaling cost between the user device 100 and the AI processor 311.
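The two delivery options can be contrasted in a short sketch. The payload layout is an assumption: this description does not define a message format, so the keys `raw`, `recognized`, and `label`, the function name `build_payload`, and the per-symbol alignment of the raw input are all hypothetical.

```python
def build_payload(raw_strokes, recognized, corrected, full=False):
    """Build the training payload sent from the device to the AI processor.

    raw_strokes[i] is the raw input for symbol i; recognized and corrected
    are symbol lists of the same length.
    """
    if full:
        # Option 1: the entire expression paired with all correct characters.
        return {"raw": list(raw_strokes),
                "recognized": list(recognized),
                "label": list(corrected)}
    # Option 2: only the symbols that the user actually corrected.
    changed = [i for i, (r, c) in enumerate(zip(recognized, corrected)) if r != c]
    return {"raw": [raw_strokes[i] for i in changed],
            "recognized": [recognized[i] for i in changed],
            "label": [corrected[i] for i in changed]}
```

Option 2 sends less data per correction, which is the signaling saving mentioned above, while option 1 preserves the context of the whole expression.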

Additionally, in the following cases the user device 100 may determine that the raw data is in poor condition and therefore unsuitable as training data, and may refrain from delivering the data pair to the AI processor 311:

(1) The user is not satisfied with the final result and does not perform the final approval step

(2) A structural formula rather than a symbol (for example, a formula containing a structure such as a fraction or sigma) was erased

(3) More than two symbols are corrected in a single correction

(4) The user only entered additional formula input without erasing any symbol
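The four exclusion cases above can be expressed as a single filter on the device. This is a sketch under assumptions: the `CorrectionSession` record and its field names are hypothetical, and how the device actually detects each condition is not described here.

```python
from dataclasses import dataclass


@dataclass
class CorrectionSession:
    approved: bool          # the user performed the final approval step
    erased_structure: bool  # a structural formula (fraction, sigma, ...) was erased
    corrected_symbols: int  # number of symbols corrected in one correction
    erased_symbols: int     # number of symbols removed with the eraser
    added_input: bool       # the user wrote additional formula input


def should_send_pair(session):
    """Return False in any of the four cases where the raw data is unsuitable."""
    if not session.approved:                               # case (1)
        return False
    if session.erased_structure:                           # case (2)
        return False
    if session.corrected_symbols > 2:                      # case (3)
        return False
    if session.added_input and session.erased_symbols == 0:  # case (4)
        return False
    return True
```

Filtering on the device, before transmission, means unsuitable pairs never consume network or training resources.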

Referring to FIG. 8(d), if the candidate list does not contain the correct character, the user may decline to select from the candidate list. In this case, the user device 100 may display the symbol originally entered by the user in preparation for the eraser function.

Referring to FIGS. 9(a) to 9(d), the user may select the eraser icon and re-enter only a specific character (for example, re-enter the character f as 5). When the user is satisfied with the new result presented by the user device 100 and performs an approval step (for example, a step that uses the result, such as inserting it into a document), the user device 100 may, to train the learning model 331, pair the correct character with the raw data originally entered by the user, store the pair, and deliver it to the AI processor 311. The AI processor 311 may train the learning model 331 using the received data pair. In this case as well, the user device 100 may either 1) deliver the entire expression entered by the user paired with all of the correct characters, or 2) deliver only the pair of the misrecognized character and the correct character, which reduces the signaling cost between the user device 100 and the AI processor 311.

Additionally, as in the preceding step, the user device 100 may determine that the raw data is in poor condition and therefore unsuitable as training data, and may refrain from delivering the data pair to the AI processor 311, in the same cases (1) to (4) listed above.

FIG. 10 is an embodiment to which the present specification can be applied.

Referring to FIG. 10, the user device 100 is connected to the AI processor 311 through a network and may include an app for mathematical expression recognition.

The user device 100 receives a mathematical expression from the user through the display unit 410 (S1010).

The user device 100 transmits an image containing the mathematical expression to the AI processor 311 through the network (S1020).

The AI processor 311 uses the learning model 331 to predict the character string of the mathematical expression from the image (S1030). For example, the AI processor 311 may predict the character string through optical character recognition (OCR) using the learning model 331. More specifically, the AI processor 311 may use the learning model 331 to predict the character string as the set of characters with the highest confidence values.

The AI processor 311 uses the learning model 331 to predict a candidate list for each element in the predicted character string (S1040). For example, the candidate list may be a set of characters whose confidence values fall within a certain range.

The AI processor 311 transmits the character string and the candidate list to the user device 100 through the network (S1050).

The user device 100 displays the character string of the mathematical expression on the display unit 410 (S1060).

The user device 100 performs error correction of the mathematical expression with the user and stores training data for training the learning model 331 (S1070).

For example, the user device 100 may receive the user's selection of an erroneous character in the character string and display one or more candidates from the candidate list that relate to that character. More specifically, the user device 100 may display the candidates sorted by their confidence values. The user device 100 may then receive the user's selection of the correct candidate from among those displayed, replace the erroneous character with the selected candidate, and show the result on the display unit 410.

이 경우, 사용자 디바이스(100)는 학습 모델(331)을 학습시키기 위해, 정답 문자와 최초 사용자가 입력한 수학식과 관련된 원시 데이터를 쌍으로 만들어 저장할 수 있다.In this case, in order to train the learning model 331, the user device 100 may make a pair of correct answer text and raw data related to a mathematical formula input by the first user and store them.

또한, 사용자 디바이스(100)는 수학식의 문자열 중 특정 문자를 지우고 다시 입력하기 위한 지우개 아이콘을 표시할 수 있다. 사용자에 의해 지우개 아이콘이 선택된 경우, 사용자 디바이스(100)는 사용자로부터 사용자가 지우고자 하는 문자를 선택받고, 사용자가 지우고자 하는 문자를 디스플레이부(410)에 표시하지 않을 수 있다. In addition, the user device 100 may display an eraser icon for erasing and re-entering a specific character in a mathematical string. When the eraser icon is selected by the user, the user device 100 may receive selection of a character the user wants to erase from the user and may not display the character the user wants to erase on the display unit 410 .

The user device 100 may receive, from the user, a character that replaces the character to be erased, and may display the replacement character at the position where the erased character was displayed. For example, the user device 100 may re-determine the replacement character through an installed app, or may perform steps S1020 to S1060 again in order to obtain a prediction from the learning model 331.

In this case as well, in order to train the learning model 331, the user device 100 may pair the correct character with the raw data originally input by the user, and store the pair.
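The pairing of a correct character with the originally input raw data, for both the candidate-selection path and the eraser path, might be stored along the following lines. This is a minimal sketch under assumptions: the raw data is represented here as lists of stroke points, and `TrainingSample`, `SampleStore`, and the `source` field are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: raw input is a list of strokes, each stroke a list of (x, y) points.
Stroke = List[Tuple[float, float]]


@dataclass
class TrainingSample:
    """Hypothetical learning-data pair: raw input plus the correct character."""
    raw_strokes: List[Stroke]
    label: str
    source: str  # "candidate" (picked from the list) or "eraser" (re-entered)


class SampleStore:
    """Keeps learning-data pairs on the device until they are transferred."""

    VALID_SOURCES = ("candidate", "eraser")

    def __init__(self) -> None:
        self._samples: List[TrainingSample] = []

    def add(self, raw_strokes: List[Stroke], label: str, source: str) -> TrainingSample:
        if source not in self.VALID_SOURCES:
            raise ValueError(f"unknown correction source: {source!r}")
        sample = TrainingSample(raw_strokes=raw_strokes, label=label, source=source)
        self._samples.append(sample)
        return sample

    def pending(self) -> List[TrainingSample]:
        """Return a copy of the samples that have not been transferred yet."""
        return list(self._samples)


store = SampleStore()
strokes = [[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]]
store.add(strokes, "2", "candidate")  # correction chosen from the candidate list
store.add(strokes, "2", "eraser")     # correction re-entered after erasing
```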

The user device 100 transfers the learning data to the AI processor 311 (S1080). For example, in order to train the learning model 331, the user device 100 may transfer the correct candidate selected by the user, or the replacement character input by the user, to the AI processor 311. In this case, the user device 100 may 1) transfer the entire expression input by the user paired with the entire set of correct characters, or 2) transfer only the pair of the misrecognized element and the correct element, which reduces the signaling cost between the user device 100 and the AI processor 311.
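The difference between the two transfer options can be illustrated with a small sketch. The payload field names, the JSON encoding, and the per-symbol stroke representation are assumptions chosen for the example; this document does not define a wire format.

```python
import json
from typing import Dict, List


def full_payload(raw_expression: List[list], corrected: List[str]) -> Dict:
    """Option 1: the entire raw expression paired with every correct character."""
    return {"mode": "full", "raw": raw_expression, "labels": corrected}


def delta_payload(recognized: List[str], corrected: List[str]) -> Dict:
    """Option 2: only the (misrecognized element, correct element) pairs."""
    pairs = [
        {"index": i, "recognized": r, "correct": c}
        for i, (r, c) in enumerate(zip(recognized, corrected))
        if r != c
    ]
    return {"mode": "delta", "pairs": pairs}


def encoded_size(payload: Dict) -> int:
    """Size in bytes of the JSON-encoded payload."""
    return len(json.dumps(payload).encode("utf-8"))


# Hypothetical raw data: four stroke points per symbol.
raw_expression = [[[0, 0], [1, 0], [0, 1], [1, 1]] for _ in range(5)]
recognized = ["x", "+", "z", "=", "5"]
corrected = ["x", "+", "2", "=", "5"]

full = full_payload(raw_expression, corrected)
delta = delta_payload(recognized, corrected)
saves_bytes = encoded_size(delta) < encoded_size(full)
```

With a single misrecognized character, the second option sends one small pair instead of the whole expression, which is where the reduction in signaling cost comes from.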

The AI processor 311 retrains the learning model 331 using the learning data (S1090). In this way, data for training the learning model 331 can be collected efficiently from user devices 100 in various usage environments.

FIG. 11 shows an embodiment of an AI processor to which the present specification can be applied.

FIG. 11 illustrates the operation of the AI processor 311 of FIG. 10 in more detail. The AI model may include the learning model 331.

1. The AI processor 311 receives, from the user device, an image including the mathematical expression input by the user. For example, the user device 100 may receive the mathematical expression from the user through the display unit 410. The user device 100 may transmit the image including the mathematical expression to the AI processor 311 over a network.

2. Based on the image including the mathematical expression input by the user, the AI processor 311 uses the connected AI model to predict 1) the character string of the mathematical expression and 2) a candidate list related to the character string.

3. The AI processor 311 transmits 1) the character string of the mathematical expression and 2) the candidate list related to the character string to the user device. For example, the user device 100 may receive, from the AI processor 311, the character string of the mathematical expression predicted by the AI model based on the image including the mathematical expression, and the candidate list related to that character string. The user device 100 may display the character string of the mathematical expression to the user through the display unit 410, and may receive, from the user, a selection of one or more erroneous characters in the character string. In addition, based on the candidate list related to the character string of the mathematical expression, the user device 100 displays one or more candidates related to the erroneous character on the display unit 410, and receives, from the user, a selection of the correct candidate from among the one or more candidates. Here, the correct candidate may be the candidate that matches the character the user intended. The user device 100 may replace the erroneous character with the correct candidate and display the result on the display unit 410. The user device 100 may also display an icon for erasing a specific character of the character string from the display unit 410 and entering it again. This icon may be the eraser icon described above. When the user selects the icon, the user device 100 may receive, from the user, a selection of the character to be erased. The user device 100 may stop displaying the character to be erased on the display unit 410 and may receive, from the user, a character that replaces it. The user device 100 may display the replacement character at the position where the erased character was displayed. In this way, the user device 100 may generate an image including a mathematical expression in which the error has been corrected. The user device 100 may transfer the image including the corrected mathematical expression to the AI processor 311. For example, the corrected mathematical expression may include the replacement character input by the user.

4. The AI processor 311 receives, from the user device, learning data generated based on the character string of the mathematical expression and the candidate list. For example, the user device 100 may receive, from the AI processor 311, 1) a character string of the corrected mathematical expression predicted by the AI model based on the image including the corrected mathematical expression, and 2) a candidate list related to that character string, and may display the character string of the corrected mathematical expression to the user through the display unit 410. The predicted character string of the corrected mathematical expression can therefore be expected to have higher confidence than the character string of the original mathematical expression.

5. The AI processor 311 may train the AI model using the learning data. The learning data may consist only of a pair of the correct candidate that the user selected from the candidate list through the user device and the corresponding character string of the mathematical expression, or only of a pair of the character that the user chose to erase through the eraser icon and the replacement character.
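As an illustration of step 2 above, one way a character string and a candidate list could be derived together is to take, for each character position, the highest-scoring symbol as the prediction and the top-k symbols as that position's candidates. This is only a sketch under assumptions: this document does not state how the AI model produces its outputs, and the per-position score dictionaries and the parameter `k` are hypothetical.

```python
from typing import Dict, List, Tuple


def predict_with_candidates(
    position_scores: List[Dict[str, float]], k: int = 3
) -> Tuple[List[str], List[List[Tuple[str, float]]]]:
    """Return the predicted symbols and, per position, the top-k (symbol, score) pairs."""
    symbols: List[str] = []
    candidates: List[List[Tuple[str, float]]] = []
    for scores in position_scores:
        # Rank the symbols of this position by score, highest first.
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        symbols.append(ranked[0][0])
        candidates.append(ranked[:k])
    return symbols, candidates


# Hypothetical model scores for a two-character expression.
scores = [
    {"x": 0.9, "y": 0.1},
    {"2": 0.41, "z": 0.48, "7": 0.11},
]
predicted, candidate_list = predict_with_candidates(scores, k=2)
```

In this example the second position is predicted as "z" while "2" remains available as a candidate, which is the situation in which the user's selection from the candidate list yields a learning-data pair.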

The present specification described above can be implemented as computer-readable code on a medium on which a program is recorded. Computer-readable media include all types of recording devices that store data readable by a computer system. Examples of computer-readable media include an HDD (Hard Disk Drive), an SSD (Solid State Disk), an SDD (Silicon Disk Drive), ROM, RAM, CD-ROM, magnetic tape, a floppy disk, and an optical data storage device, and also include implementations in the form of a carrier wave (for example, transmission over the Internet). Accordingly, the above detailed description should not be construed as limiting in any respect and should be considered illustrative. The scope of this specification should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of this specification are included in its scope.

In addition, although the description above has focused on services and embodiments, these are merely examples and do not limit the present specification. Those of ordinary skill in the art to which this specification belongs will appreciate that various modifications and applications not illustrated above are possible without departing from the essential characteristics of the services and embodiments. For example, each component specifically shown in the embodiments may be modified when implemented. Differences related to such modifications and applications should be construed as falling within the scope of the present specification as defined in the appended claims.

Claims

delete

A method for a user device to collect learning data from a user in order to train an AI (Artificial Intelligence) model, the method comprising:
receiving an input of a mathematical expression from the user through a display unit;
transmitting an image including the mathematical expression to a mathematical expression recognition device;
receiving, from the mathematical expression recognition device, a character string of the mathematical expression predicted by the mathematical expression recognition device based on the image including the mathematical expression, and a candidate list related to the character string of the mathematical expression;
displaying the character string of the mathematical expression to the user through the display unit;
receiving, from the user, a selection of an erroneous character in the character string of the mathematical expression;
displaying, on the display unit, one or more candidates related to the erroneous character based on the candidate list;
based on receiving, from the user, a selection of a correct character from among the candidates:
generating first learning data for training the AI model; and
transmitting the first learning data to the mathematical expression recognition device;
based on not receiving a selection of the correct character from the user:
displaying an icon for erasing a specific character of the character string of the mathematical expression on the display unit and inputting it again;
receiving, from the user, a selection of a character to be erased when the icon is selected;
receiving, from the user, a replacement character for the character to be erased, without displaying the character to be erased on the display unit;
displaying the replacement character at the position where the character to be erased was displayed;
generating second learning data for training the AI model based on the replacement character; and
transmitting the first learning data or the second learning data to the mathematical expression recognition device,
wherein the first learning data consists of a pair of 1) the erroneous character and 2) the correct character, and
wherein the second learning data consists of a pair of 1) the character to be erased and 2) the replacement character.

delete

A user device that collects learning data from a user in order to train an AI (Artificial Intelligence) model, the user device comprising:
a communication module for communicating with a mathematical expression recognition device;
a display unit; and
a processor that functionally controls the communication module and the display unit,
wherein the processor is configured to:
receive an input of a mathematical expression from the user through the display unit;
transmit an image including the mathematical expression to the mathematical expression recognition device;
receive, from the mathematical expression recognition device, a character string of the mathematical expression predicted by the mathematical expression recognition device based on the image including the mathematical expression, and a candidate list related to the character string of the mathematical expression;
display the character string of the mathematical expression to the user through the display unit;
receive, from the user, a selection of an erroneous character in the character string of the mathematical expression;
display, on the display unit, one or more candidates related to the erroneous character based on the candidate list;
based on receiving, from the user, a selection of a correct character from among the candidates:
generate first learning data for training the AI model and transmit the first learning data to the mathematical expression recognition device;
based on not receiving a selection of the correct character from the user:
display an icon for erasing a specific character of the character string of the mathematical expression on the display unit and inputting it again;
receive, from the user, a selection of a character to be erased when the icon is selected;
receive, from the user, a replacement character for the character to be erased, without displaying the character to be erased on the display unit;
display the replacement character at the position where the character to be erased was displayed;
generate second learning data for training the AI model based on the replacement character; and
transmit the first learning data or the second learning data to the mathematical expression recognition device,
wherein the first learning data consists of a pair of 1) the erroneous character and 2) the correct character, and
wherein the second learning data consists of a pair of 1) the character to be erased and 2) the replacement character.