KR102344145B1

KR102344145B1 - Early childhood learning method using by handwriting recognition, and recording medium for recording the early childhood learning method

Info

Publication number: KR102344145B1
Application number: KR1020210090741A
Authority: KR
Inventors: 최정민
Original assignee: 주식회사 천재교과서
Priority date: 2021-07-12
Filing date: 2021-07-12
Publication date: 2021-12-29

Abstract

Disclosed are a learning method for children using handwriting recognition, which provides descriptive questions on a screen of computer terminals or portable terminals used by a child and uses input handwriting image data to automatically perform scoring and provide a score by using handwriting recognition and natural language processing technology when the child directly writes an answer to the question, so that the child can perform learning anytime, and a recording medium storing the same. According to the present invention, the learning method for children comprises the following steps: (a) receiving a user's handwriting through a touch input; (b) processing an input of the user's handwriting through an optical character reader (OCR); (c) post-processing an OCR processing result to recognize the user's handwriting; (d) pre-processing the recognized user's handwriting; (e) automatically scoring the preprocessed handwriting recognition result through natural language processing; and (f) outputting a scoring result by post-processing the automatic scoring result.

Description

Early childhood learning method using by handwriting recognition, and recording medium for recording the early childhood learning method

The present invention relates to an early childhood learning method using handwriting recognition and to a recording medium on which the method is recorded. More specifically, it relates to a method in which descriptive questions are presented on the screen of a computer terminal or portable terminal used by a young child and, when the child handwrites an answer, the input handwriting image data is automatically scored using handwriting recognition and natural language processing, so that the child can study at any time, 24 hours a day.

The legacy scoring system that the applicant previously provided in the field of early childhood learning requires a supervising teacher to check each answer image submitted by the children one by one and to grade it by judging whether it is correct or incorrect.

In the aftermath of COVID-19, the number of users (students) of the 'Milk T' learning service provided by the applicant has increased. This has made it necessary to automate the legacy scoring system. In addition, because a limited number of supervising teachers manage many students, answers cannot be judged correct or incorrect (graded) in real time.

Having people make these judgments directly incurs high labor costs, and the grading service needs to become faster and its results more consistent.

Accordingly, a need has arisen for handwriting recognition and automatic scoring services that fully automate (digitize) the grading of simple-type and descriptive-type questions that have fixed answers.

A related prior patent document is Korean Registered Patent Publication No. 10-2013919 (registered August 19, 2019), which describes a handwriting learning management method that provides immediate feedback using multisensory stimuli, and a computer-readable recording medium on which a handwriting learning management program for the method is recorded.

An object of the present invention is to provide an early childhood learning method using handwriting recognition in which descriptive questions are presented on the screen of a computer terminal or portable terminal used by a young child and, when the child handwrites an answer, the input handwriting image data is automatically scored using handwriting recognition and natural language processing, so that the child can study at any time, 24 hours a day.

To achieve the above object, an early childhood learning method according to an embodiment of the present invention may include: (a) receiving a user's handwriting through touch input; (b) performing OCR processing on the received handwriting; (c) post-processing the OCR result to recognize the user's handwriting; (d) pre-processing the recognized handwriting; (e) automatically scoring the pre-processed handwriting recognition result through natural language processing; and (f) post-processing the automatic scoring and outputting a scoring result.

In step (c), the handwriting recognition unit may refine the data by applying a binarization process, a labeling process, and a manual label-merging process to the handwriting image produced from the user's handwriting.

In step (c), for the binarization process, the handwriting recognition unit may take the handwriting image as a single-channel grayscale image and set each pixel to the maximum value if it exceeds a threshold and to 0 if it is at or below the threshold.

In step (c), for the labeling process, the handwriting recognition unit may apply connected component labeling: it scans the image from left to right and top to bottom and, for each pixel whose value is not 0, checks the pixels above and to the left. If both are 0, it assigns a new label (number) to the current pixel; if only one of the two is non-zero, it assigns that label; and if both are non-zero, it assigns one of the two labels and records the two labels as equivalent. After repeating this check for every pixel, it reassigns labels according to the recorded equivalences.

In step (e), the natural language processing may perform word embedding by representing each word of the recognized handwriting as an embedding vector, which is a dense vector, and may use the embedding vectors to derive the similarity between two different words or sentences.

In step (e), the automatic scoring may compare the handwriting recognition result with the content's model answer. For short-answer questions, the scoring result is derived by rule-based processing and edit distance calculation. For descriptive questions, features are extracted from both the input answer and the model answer, and the scoring result is derived by calculating their similarity.

For descriptive questions, the feature extraction may be performed through spacing correction, morphological analysis of nouns and verbs, chunking of noun phrases and verb phrases, and recognition of negative expressions using negation tags.

According to the present invention, handwriting image data from young children is collected, refined, and used for additional training of the OCR model, so that young children's handwriting can be recognized more accurately.

The present invention can also correct typographical errors.

For short-answer questions, the present invention determines whether an answer is correct by exact matching and, after morphological analysis, can also grade answers on a keyword basis.

For descriptive questions, unlike short-answer questions, the present invention generates several groups of data for the extended model answers through typo correction, compares the extended model answers with the actual answer, and recognizes the answer by producing a similarity score that indicates how close it is to the correct answer.

As a result, the present invention grades descriptive answers in real time, so that learners can see which keywords matter most in the subject they are studying. After grading, the elements that were required for the correct answer and the elements that were not required are presented, which shows learners what they missed in the subject and can improve their understanding. Learners can also be shown which important keywords an actual answer requires.

FIG. 1 is a configuration diagram schematically showing the overall configuration of an early childhood learning system using handwriting recognition according to an embodiment of the present invention.
FIG. 2 is a configuration diagram schematically showing the internal configuration of the early childhood learning system according to an embodiment of the present invention.
FIG. 3 is a diagram showing the process by which the handwriting recognition unit of the early childhood learning server refines young children's handwriting image data according to an embodiment of the present invention.
FIG. 4 is a diagram showing an example of connected component labeling by the handwriting recognition unit according to an embodiment of the present invention.
FIG. 5 is a diagram showing the automatic scoring process of the automatic scoring unit according to an embodiment of the present invention.
FIG. 6 is a diagram showing the short-answer and descriptive-answer processing of the automatic scoring unit according to an embodiment of the present invention.
FIG. 7 is a flowchart showing the early childhood learning method according to an embodiment of the present invention.
FIG. 8 is a diagram showing sample data of young children's handwritten answers according to an embodiment of the present invention.
FIG. 9A is a diagram illustrating a good result of OCR processing of a user's handwriting according to an embodiment of the present invention.
FIG. 9B is a diagram illustrating a bad result of OCR processing of a user's handwriting according to an embodiment of the present invention.
FIG. 10 is a diagram illustrating the form of the refined handwriting data produced by the handwriting recognition unit according to an embodiment of the present invention.
FIG. 11 is a diagram illustrating the labeling process within the handwriting image data refinement performed by the handwriting recognition unit according to an embodiment of the present invention.
FIG. 12 is a diagram illustrating the form of the young children's handwriting training data stored in the early childhood learning DB according to an embodiment of the present invention.
FIG. 13 is a diagram illustrating a kernel of the deep CNN and a feature map of image data that has passed through the kernel according to an embodiment of the present invention.
FIG. 14 is a diagram illustrating image features learned by the deep CNN according to an embodiment of the present invention.
FIG. 15 is a diagram showing an example in which the automatic scoring unit compares a learner's input answer with the extended model answers according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only to make the disclosure of the present invention complete and to fully inform those of ordinary skill in the art to which the invention belongs of the scope of the invention, which is defined only by the scope of the claims. Accordingly, in some embodiments, well-known process steps, well-known device structures, and well-known techniques are not described in detail so that the present invention is not interpreted ambiguously. Like reference numerals refer to like elements throughout the specification.

In the drawings, thicknesses are enlarged to show the various layers and regions clearly. Similar parts are given the same reference numerals throughout the specification. When a part such as a layer, film, region, or plate is said to be "on" another part, this includes not only the case where it is "directly on" the other part but also the case where a further part lies between them. Conversely, when a part is said to be "directly on" another part, no other part lies between them. Likewise, when a part such as a layer, film, region, or plate is said to be "under" another part, this includes not only the case where it is "directly under" the other part but also the case where a further part lies between them. Conversely, when a part is said to be "directly under" another part, no other part lies between them.

Spatially relative terms such as "below", "beneath", "lower", "above", and "upper" may be used to describe easily the relationship between one element or component and another, as shown in the drawings. Spatially relative terms should be understood to cover different orientations of the element in use or in operation in addition to the orientation shown in the drawings. For example, if an element shown in the drawings is turned over, an element described as "below" or "beneath" another element may be placed "above" that element. The exemplary term "below" can therefore cover both the downward and upward directions. An element may also be oriented in other directions, and spatially relative terms are then interpreted according to that orientation.

In this specification, when a part is said to be connected to another part, this includes not only the case where the two are directly connected but also the case where they are electrically connected with another element between them. In addition, when a part is said to include a component, this means that it may further include other components, not that other components are excluded, unless specifically stated otherwise.

In this specification, terms such as first, second, and third may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second or third component, and similarly a second or third component may be named interchangeably.

Unless otherwise defined, all terms used in this specification (including technical and scientific terms) have the meaning commonly understood by those of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive manner unless they are clearly and specifically defined.

Hereinafter, an early childhood learning system and method using handwriting recognition according to a preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a configuration diagram schematically showing the overall configuration of an early childhood learning system using handwriting recognition according to an embodiment of the present invention.

Referring to FIG. 1, the early childhood learning system 100 according to an embodiment of the present invention may include a student terminal 110, an early childhood learning server 120, an early childhood learning database (DB) 130, and a teacher terminal 140.

The student terminal 110 and the teacher terminal 140 may each be a smartphone or tablet PC carried by the student or teacher, or a computer terminal with a wired connection.

The student terminal 110 receives learning questions such as multiple-choice, short-answer, and descriptive questions from the early childhood learning server 120, displays them on its screen, and receives the student's answer as handwriting entered through touch input.

The early childhood learning server 120 receives from the student terminal 110 the user's handwritten answer entered through touch input, scores it automatically, and provides the scoring result to the student terminal 110.

The early childhood learning DB 130 stores image data of the handwriting input from the student terminal 110, or stores handwriting input data for short-answer and descriptive questions as image data.

The early childhood learning DB 130 also stores young children's handwriting data as well as training datasets and data on the content's model answers.

The teacher terminal 140 transmits to the early childhood learning server 120, as image data, the teacher's manual scoring of the young children's handwriting data, handwritten answer data, and the like input from the student terminal 110.

FIG. 2 is a configuration diagram schematically showing the internal configuration of the early childhood learning system according to an embodiment of the present invention.

Referring to FIG. 2, the early childhood learning server 120 according to an embodiment of the present invention broadly includes a handwriting recognition unit 210 and an automatic scoring unit 220.

The handwriting recognition unit 210 receives the user's handwriting through touch input, pre-processes it, performs optical character reader (OCR) processing, and recognizes the user's handwriting through post-processing. The handwriting recognition unit 210 may also extract distinctive features from the image data of the received handwriting by learning kernel parameters, and may recognize characters through a deep convolutional neural network (CNN).
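To make the role of a kernel concrete, the following is a minimal sketch of a single kernel sliding over a small image to produce a feature map. It is not the patent's network: the kernel values here are fixed by hand (a simple vertical-edge detector chosen for illustration), whereas in a deep CNN they would be learned, and the function name `convolve2d_valid` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image (no padding, stride 1).

    As in most CNN libraries, the kernel is applied without flipping
    (cross-correlation). Each output value is the sum of element-wise
    products between the kernel and the image patch under it.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            sum(
                image[y + i][x + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


# Illustrative input: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d_valid(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the column where the edge sits, which is the sense in which a kernel "extracts a feature" from the handwriting image.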

The automatic scoring unit 220 may pre-process the recognized handwriting, score it automatically through natural language processing, and output the scoring result through post-processing.

The handwriting recognition unit 210 may include a pre-processing unit 212, an OCR engine unit 214, and a post-processing unit 216.

The pre-processing unit 212 pre-processes the input user's handwriting.

The OCR engine unit 214 applies OCR to the pre-processed handwriting.

The post-processing unit 216 post-processes the OCR-processed handwriting.

The automatic scoring unit 220 may include a pre-processing unit 222, a natural language processing (NLP) scoring engine unit 224, and a post-processing unit 226.

The pre-processing unit 222 pre-processes the handwriting recognition data that was post-processed by the handwriting recognition unit 210.

The NLP scoring engine unit 224 scores the pre-processed handwriting recognition data through natural language processing (NLP).

The post-processing unit 226 post-processes the scoring result for the handwriting recognition data.

FIG. 3 is a diagram showing the process by which the handwriting recognition unit of the early childhood learning server refines young children's handwriting image data according to an embodiment of the present invention.

Referring to FIG. 3, the handwriting recognition unit 210 according to an embodiment of the present invention may refine the data by applying a binarization process, a labeling process, and a manual label-merging process to the handwriting image produced from the user's handwriting.

In the binarization process, the handwriting image is taken as a single-channel grayscale image, and each pixel is set to the maximum value if it exceeds a threshold and to 0 if it is at or below the threshold.
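The binarization rule can be sketched in a few lines of Python. This is an illustrative sketch only: the source does not state the threshold or the maximum value, so the defaults of 127 and 255 below are assumptions, as is the function name `binarize`.

```python
from typing import List

GrayImage = List[List[int]]  # rows of single-channel intensities, e.g. 0-255


def binarize(image: GrayImage, threshold: int = 127, max_value: int = 255) -> GrayImage:
    """Set pixels above the threshold to max_value and all others to 0.

    threshold and max_value are assumed defaults, not values from the source.
    """
    return [
        [max_value if pixel > threshold else 0 for pixel in row]
        for row in image
    ]


sample = [
    [12, 200, 127],
    [128, 90, 255],
]
print(binarize(sample))  # [[0, 255, 0], [255, 0, 255]]
```

Note that a pixel exactly equal to the threshold (127 here) maps to 0, matching the "at or below" case in the description.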

The labeling process may be carried out as follows, using the connected component labeling technique shown in FIG. 4. FIG. 4 is a diagram showing an example of connected component labeling by the handwriting recognition unit according to an embodiment of the present invention.

1. Scan the image from left to right and top to bottom.

2. For each pixel whose value is not 0, check the pixels above and to the left.

1) If both are 0, assign a new label (number) to the current pixel.

2) If only one of the two is non-zero, assign that label.

3) If both are non-zero, assign one of the two labels and record the two labels as equivalent.

3. Repeat step 2, checking above and to the left, for every pixel.

4. Reassign labels according to the recorded equivalences.
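The four steps above correspond to the classic two-pass labeling scheme with 4-connectivity (only the upper and left neighbors are checked). The sketch below is one possible realization, not the patent's code: the union-find table used to record equivalences and the final renumbering of labels to 1..n are implementation choices made here for illustration.

```python
from typing import Dict, List

Image = List[List[int]]


def label_components(binary: Image) -> Image:
    """Two-pass connected component labeling (upper and left neighbors only)."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    parent: Dict[int, int] = {}  # union-find table holding label equivalences

    def find(label: int) -> int:
        # Follow parent links to the representative label of the set.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    next_label = 1
    # Steps 1-3: scan left to right, top to bottom, checking up and left.
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:
                # 2-1) neither neighbor is labeled: start a new label
                parent[next_label] = next_label
                labels[y][x] = next_label
                next_label += 1
            elif up != 0 and left != 0:
                # 2-3) both labeled: take one label, record the equivalence
                root_up, root_left = find(up), find(left)
                smaller, larger = min(root_up, root_left), max(root_up, root_left)
                parent[larger] = smaller
                labels[y][x] = smaller
            else:
                # 2-2) exactly one neighbor is labeled: reuse its label
                labels[y][x] = up or left

    # Step 4: replace every label by its representative, renumbered 1..n.
    compact: Dict[int, int] = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = compact.setdefault(root, len(compact) + 1)
    return labels


strokes = [
    [255, 0, 255],
    [255, 0, 255],
    [255, 255, 255],
    [0, 0, 0],
    [0, 255, 0],
]
for row in label_components(strokes):
    print(row)
```

In the example, the two arms of the U-shaped stroke first receive different labels and are merged when the scan reaches the bottom-right corner of the U, while the isolated pixel below becomes a second component.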

In the manual label-merging process, labels entered by a person who has looked at the handwriting image and judged it directly are received and merged.

Natural language processing is one of the major fields of artificial intelligence. It studies how human language phenomena can be described using machines such as computers, and implements the results. Because its object of study is language, it is naturally closely related to linguistics, which studies language itself, and to the cognitive science of language, which explores the internal mechanisms of language phenomena. Its implementations rely heavily on mathematical and statistical tools, and it is a representative field that makes particularly heavy use of machine learning tools. Applications include information retrieval, question-answering (QA) systems, automatic document classification, clustering of newspaper articles, and conversational agents. Although natural language has been studied for a long time, as of 2018 computers still do not understand it the way people do. Instead, techniques that process large amounts of information using shallow probability and statistics, without a deep understanding of language, have advanced considerably.

The natural language processing in the automatic scoring unit 220 according to an embodiment of the present invention may perform word embedding by representing each word of the recognized handwriting as an embedding vector, which is a dense vector, and may use the embedding vectors to derive the similarity between two different words or sentences.
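The source does not say which similarity measure is applied to the embedding vectors. A common choice is cosine similarity, sketched below under that assumption. The three-dimensional embedding table is invented purely for illustration (real embeddings are learned and have many more dimensions), and averaging word vectors to obtain a sentence vector is likewise an assumed, simple baseline.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]

# Toy embedding table; the numbers are made up for this example.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sentence_vector(tokens: Sequence[str], table: Dict[str, Vector]) -> Vector:
    """Average the vectors of the known tokens; unknown tokens are skipped."""
    known = [table[token] for token in tokens if token in table]
    if not known:
        return []
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


near = cosine_similarity(TOY_EMBEDDINGS["dog"], TOY_EMBEDDINGS["puppy"])
far = cosine_similarity(TOY_EMBEDDINGS["dog"], TOY_EMBEDDINGS["car"])
print(near > far)  # True
```

With dense vectors, related words end up close together, so a misspelled or paraphrased answer can still score as similar to the model answer even when the strings differ.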

FIG. 5 is a diagram showing the automatic scoring process of the automatic scoring unit according to an embodiment of the present invention.

Referring to FIG. 5, the automatic scoring in the automatic scoring unit 220 according to an embodiment of the present invention is a technique that compares the handwriting recognition result with the content's model answer.

The automatic scoring unit 220 divides the full set of learner answers into a sampled training set and a test set, extracts features from the training sample after it has been manually scored, trains on the extracted features, and feeds the result into the automatic scoring model to perform automatic scoring.

The automatic scoring unit 220 may then score the test data automatically and derive the scoring result.
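The division of learner answers into a manually scored training sample and a test set might look like the sketch below. The sampling fraction, the fixed random seed, and the function name `split_answers` are assumptions for illustration; the source does not specify how the sample is drawn.

```python
import random
from typing import List, Sequence, Tuple


def split_answers(
    answers: Sequence[str], train_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Randomly split answers into (training sample, test data).

    train_fraction and seed are assumed values, not taken from the source.
    """
    shuffled = list(answers)
    random.Random(seed).shuffle(shuffled)  # local generator, reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


all_answers = [f"answer-{i}" for i in range(10)]
train_sample, test_data = split_answers(all_answers)
print(len(train_sample), len(test_data))  # 2 8
```

Only the training sample would go to teachers for manual scoring; the remaining answers are left for the trained model to score.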

Automatic scoring is divided into short-answer scoring and descriptive-answer scoring.

As shown in FIG. 6, for short-answer questions the scoring result is derived by rule-based processing and edit distance calculation, and for descriptive questions features are extracted from both the input answer and the model answer and the scoring result is derived by calculating their similarity. FIG. 6 is a diagram showing the short-answer and descriptive-answer processing of the automatic scoring unit according to an embodiment of the present invention.
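For the short-answer path, edit distance is usually understood as the Levenshtein distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below combines it with one simple rule (ignore whitespace, accept exact matches). The tolerance `max_distance=1` and the function names are assumptions; the source does not give the actual rules or threshold.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def score_short_answer(answer: str, model_answer: str, max_distance: int = 1) -> bool:
    """Accept exact matches, then near matches within max_distance edits.

    Whitespace is removed first as a simple rule; max_distance is an
    assumed tolerance, not a value from the source.
    """
    normalized = "".join(answer.split())
    target = "".join(model_answer.split())
    if normalized == target:
        return True
    return edit_distance(normalized, target) <= max_distance


print(edit_distance("kitten", "sitting"))                       # 3
print(score_short_answer("photo synthesis", "photosynthesis"))  # True
print(score_short_answer("respiration", "photosynthesis"))      # False
```

A small tolerance lets the scorer absorb a single OCR misread or spelling slip, which matters for young children's handwriting, while still rejecting unrelated answers.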

서술형의 경우에서 특성 추출은, 띄어씌기 교정 과정, 명사, 동사에 대한 형태소 분석 과정, 명사구, 동사구에 대한 청킹 과정, 부정형 태그를 이용한 부정 표현 인식 과정을 통해 실행할 수 있다.In the case of the narrative type, feature extraction can be performed through a spacing correction process, a morpheme analysis process for nouns and verbs, a chunking process for noun phrases and verb phrases, and a negative expression recognition process using indefinite tags.

FIG. 7 is a flowchart illustrating the early childhood learning method according to an embodiment of the present invention.

Referring to FIG. 7, the early childhood learning server 120 according to an embodiment of the present invention first provides a question, such as a short-answer or descriptive question, to the student terminal 110, and then receives from the student terminal 110 the user's handwritten answer entered by touch input (S710).

That is, as shown in FIG. 8, the early childhood learning server 120 receives from the student terminal 110 the handwritten answer entered by the child through touch input. FIG. 8 is a diagram showing sample data of children's handwritten answers according to an embodiment of the present invention.

Next, the early childhood learning server 120 performs OCR processing on the received handwriting (S720).

That is, the early childhood learning server 120 inputs the child's handwriting received from the student terminal 110 into a handwriting OCR engine and, as shown in FIGS. 9A and 9B, performs OCR processing so that the meaning of the handwritten characters can be recognized. FIG. 9A is a diagram illustrating a good OCR result for a user's handwriting according to an embodiment of the present invention, and FIG. 9B is a diagram illustrating a poor OCR result for a user's handwriting according to an embodiment of the present invention.

Computer vision (CV) is the field of artificial intelligence that aims to give computers the ability to recognize images or video as humans do through sight. Within CV, the task of recognizing handwritten, cursive, and printed characters is called optical character recognition (OCR). OCR has been commercialized in various industries, including credit card and resident registration card recognition, license plate recognition, and reading of printed matter. Conventional OCR models were largely studied using probabilistic and statistical methods. However, because these methods rely on hand-designed preprocessing and probability-based models, they cannot cope with certain situations, and in some cases the model structure is complex and computation may be slow. Recently, advances in hardware and artificial intelligence have led to OCR based on deep convolutional neural networks (CNNs), which achieves a higher recognition rate and better performance than conventional OCR.

Because children's handwriting data is written by children whose pen control is still somewhat lacking, it differs from adult handwriting in 1) the freely varying sizes of the initial, medial, and final components of each character, and 2) the slant of the data caused by hastily scrawled characters.

Next, the early childhood learning server 120 post-processes the OCR result to recognize the user's handwriting (S730).

At this time, as shown in FIG. 10, the handwriting recognition unit 210 of the early childhood learning server 120 may refine the handwriting image of the child user's handwriting through a binarization process, a labeling process, and a manual label-merging process. FIG. 10 is a diagram illustrating the form of the refined children's handwriting data of the handwriting recognition unit according to an embodiment of the present invention.

In the binarization process, the handwriting recognition unit 210 may take the handwriting image as a grayscale single-channel image and convert each pixel to the maximum value if it exceeds a threshold, and to 0 if it is less than or equal to the threshold.
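The thresholding rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function name `binarize`, the threshold of 127, and the maximum value of 255 are assumptions chosen for an 8-bit grayscale image.

```python
def binarize(gray, threshold=127, max_value=255):
    """Return a binary image: max_value where a pixel exceeds threshold, else 0.

    `gray` is a list of rows of grayscale intensities (single channel).
    The default threshold and max_value are assumed values for 8-bit images.
    """
    return [[max_value if px > threshold else 0 for px in row] for row in gray]


sample = [
    [10, 200, 127],
    [128, 0, 255],
]
binary = binarize(sample)
```

A pixel exactly equal to the threshold (127 here) maps to 0, matching the "less than or equal to" branch of the rule.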

In addition, in the labeling process shown in FIG. 11, the handwriting recognition unit 210 may, according to the connected component labeling technique, scan the image from left to right and top to bottom, check the upper and left neighbors of each pixel whose value is not 0, and assign a new label (number) to the current pixel if both neighbors are 0. FIG. 11 is a diagram illustrating the labeling process in the handwriting image data refinement performed by the handwriting recognition unit according to an embodiment of the present invention. If only one of the two neighbors is nonzero, the handwriting recognition unit 210 assigns that neighbor's label; if both are nonzero, it assigns one of the two labels and records the two labels as equivalent. After the check of the upper and left neighbors has been repeated for every pixel, the labels may be reassigned according to the recorded equivalences.
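The labeling procedure above corresponds to the classic two-pass connected component labeling algorithm. The sketch below is one possible reading of it; the use of a union-find table to record equivalences and the final renumbering to consecutive labels are implementation choices assumed here, and the name `label_components` is hypothetical.

```python
def label_components(binary):
    """Two-pass connected component labeling using the upper and left neighbors."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # parent[i] is the union-find parent of label i; 0 is background

    def find(x):
        # Follow parents to the representative label, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # First pass: provisional labels plus recorded equivalences.
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))      # new label, its own parent
                labels[y][x] = len(parent) - 1
            elif up == 0 or left == 0:
                labels[y][x] = up or left       # copy the single nonzero label
            else:
                labels[y][x] = left             # pick one, mark both equivalent
                root_up, root_left = find(up), find(left)
                if root_up != root_left:
                    parent[max(root_up, root_left)] = min(root_up, root_left)

    # Second pass: replace each label by its representative, renumbered 1..n.
    remap = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = remap.setdefault(root, len(remap) + 1)
    return labels


image = [
    [255, 0, 255],
    [255, 255, 255],
    [0, 0, 0],
    [0, 255, 0],
]
labeled = label_components(image)
```

In this example the two strokes in the top row first receive different provisional labels and are merged when the scan reaches the pixel that joins them in the second row.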

In addition, as shown in FIG. 12, the handwriting recognition unit 210 may recognize children's handwriting data by training a deep learning model on the children's handwriting data stored in the early childhood learning DB 130. FIG. 12 is a diagram illustrating the form of the children's handwriting training data stored in the early childhood learning DB according to an embodiment of the present invention.

Deep learning refers to networks modeled on the form of neurons, that is, nerve cells. Such a network generally consists of an input layer, a hidden layer, and an output layer. As the network model becomes more complex, it may have multiple hidden layers. The output of one layer is fed as input to the next layer, and through this chaining information is propagated from left to right. Deep learning trains by updating the weights in the direction that minimizes the error between the training data and the training output. Training continues until the loss value is minimized. The trained model can be used for regression analysis, classification, pattern recognition, and various other tasks.

A convolutional neural network (CNN) is a neural network inspired by the way animals recognize and distinguish objects. It arose from the idea that, when distinguishing one object from another, animals respond sensitively to particular parts of the object. Taking a hint from this brain activity, the CNN was introduced, and it is now widely used in the image and video fields.

The main function of a CNN is feature extraction from images. Feature extraction means finding distinctive features in image data, and a CNN basically extracts image features by learning kernel parameters. The visualized kernels are shown in FIG. 13. FIG. 13 is a diagram illustrating the kernels of a deep CNN and the feature maps of image data that have passed through the kernels according to an embodiment of the present invention, and FIG. 14 is a diagram illustrating the learned image features of a deep CNN according to an embodiment of the present invention.
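To make the role of a kernel concrete, the following sketch slides a small kernel over an image and produces a feature map. It is illustrative only: the kernel values are hand-picked here, whereas a CNN learns them during training, and the function name `convolve2d` is an assumption. As in most CNN libraries, the operation is a cross-correlation without padding.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[y + i][x + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


# Hand-picked kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1]]
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
feature_map = convolve2d(image, edge_kernel)
```

The feature map is nonzero only at the column where the intensity jumps, which is the kind of localized response that the kernel visualizations in FIG. 13 depict.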

In FIG. 14, the left shows the input image, the center shows the main regions (image features) learned by the deep neural network, and the right shows the heatmap overlaid on the input data.

To train a model that is robust across diverse data in deep learning, additional training on the target dataset improves the recognition rate. Accordingly, as shown in Table 1 below, the applicant collects a total of about 2.6 million children's handwriting images every year, of which about 1.5 million are accumulated in the Korean and English subjects, excluding mathematics.

The applicant refined the children's handwriting data in its possession to build a training dataset and performed additional training on it, thereby improving the recognition rate for children's handwriting.

By building and training on this specialized image dataset of children's handwriting, the applicant achieved average recognition rates of 88% for Korean and 95% for English on its own handwriting image data.

Next, the early childhood learning server 120 inputs the user's handwriting recognized by the handwriting recognition unit 210 into the automatic scoring unit 220 and preprocesses it (S740).

Next, the automatic scoring unit 220 of the early childhood learning server 120 automatically scores the preprocessed handwriting recognition result through natural language processing (S750).

Natural language processing (NLP) is one of the major fields of artificial intelligence; it studies how human language phenomena can be described using machines such as computers, and implements the results. Because its object of study is language, NLP is naturally closely related to linguistics, which studies language itself, and to the cognitive science of language, which explores the internal mechanisms of language phenomena. It makes heavy use of mathematical and statistical tools for implementation and is a representative field that relies especially on machine learning tools. It is applied in information retrieval, question answering (QA) systems, automatic document classification, news article clustering, conversational agents, and more. Although research on natural language has a long history, as of 2018 computers still cannot understand natural language the way humans do. Instead, techniques that process large amounts of information using surface-level probability and statistics, without a deep understanding of language, have advanced considerably.

In the natural language processing according to an embodiment of the present invention, each word of the recognized user's handwriting is represented as an embedding vector, that is, a dense vector (word embedding), and the embedding vectors may be used to derive the similarity between two different words or sentences.
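As an illustration of deriving similarity from embedding vectors, the sketch below uses cosine similarity and averages word vectors to obtain a sentence vector. Both choices are assumptions made for this example; the description does not name a specific similarity measure. The two-dimensional vectors in `toy_embeddings` are made-up values, not trained embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sentence_vector(words, embeddings):
    """Average the embedding vectors of the words that have an embedding."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Made-up 2-dimensional embeddings, for illustration only.
toy_embeddings = {
    "strawberry": [0.9, 0.1],
    "fruit": [0.8, 0.3],
    "bus": [0.0, 1.0],
}
close = cosine_similarity(toy_embeddings["strawberry"], toy_embeddings["fruit"])
far = cosine_similarity(toy_embeddings["strawberry"], toy_embeddings["bus"])
```

With these toy values, `close` is larger than `far`, reflecting that related words lie closer together in the embedding space.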

In automatic scoring, the handwriting recognition result is compared with the model answer of the content. For the short-answer type, the scoring result is derived by rule-based processing and edit-distance calculation; for the descriptive type, features are extracted from both the input answer and the model answer, and their similarity is calculated to derive the scoring result.

For the descriptive type, feature extraction may be performed through spacing correction, morphological analysis of nouns and verbs, chunking of noun phrases and verb phrases, and negation recognition using negation tags.

Next, the early childhood learning server 120 post-processes the automatic scoring and outputs the scoring result (S760).

The important elements in automatic scoring are language processing and the classification of scoring units within the automatic scoring module, for example into the short-answer type and the descriptive type.

The short-answer type only requires simple morphological analysis and a chunking-based judgment of whether the answer is a synonym.

The descriptive type requires feature extraction and similarity calculation.

Feature extraction for the descriptive type goes through morphological analysis; when a word is not registered in the dictionary (that is, morphological analysis fails), named entity recognition and keyword extraction may be required.

By using the mecab morphological analyzer and adding a user dictionary built from education-related dictionary data, morphological analysis specialized for education is possible, in contrast to other morphological analyzers.

A) Typo correction (needed to obtain the expanded model answers)

- Because this is an early childhood learning system, typos can be corrected. Results show that typos are especially frequent in the lower grades.

- For the lower grades, phonological similarity is combined with a syllable-level language model (the probability that a given syllable appears before or after a specific syllable).

- For the upper grades, correction rules (a dictionary) for difficult dictation words and spacing correction are applied.

- Spacing correction can be performed by automatically generating training data or by training a seq2seq model.
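The syllable-level language model mentioned in the list above can be pictured as a table of bigram probabilities. The sketch below estimates, from a word list, how likely one syllable is to follow another. It is a simplified illustration under stated assumptions: the four-word `corpus` is made up, no smoothing is applied, and the input is assumed to be in precomposed (NFC) form so that each Hangul syllable is a single character.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each syllable, which syllables follow it in the corpus words."""
    counts = defaultdict(Counter)
    for word in corpus:
        for prev, nxt in zip(word, word[1:]):
            counts[prev][nxt] += 1
    return counts


def next_syllable_prob(counts, prev, nxt):
    """Estimate P(nxt | prev) by relative frequency (0.0 for unseen contexts)."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


# Tiny made-up corpus, for illustration only.
corpus = ["딸기", "딸기", "딸꾹", "달빛"]
model = train_bigrams(corpus)
p_correct = next_syllable_prob(model, "딸", "기")  # "기" follows "딸" in 2 of 3 cases
p_typo = next_syllable_prob(model, "달", "기")     # "기" never follows "달" here
```

A correction step could then prefer the candidate spelling whose syllable sequence has the higher probability; how this score is combined with phonological similarity is not specified in the description.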

B) Short-answer type

- Iterative/rule processing: whether the answer is correct is determined by exact matching.

- Answer expansion: after morphological analysis, the answer is processed around its keywords. For example, correct answer: 딸기 (strawberry); input answer: 딸기입니다 (It is a strawberry). Here, exact match: False; iterative answer match: True.

- Edit distance (this overlaps somewhat with typo correction): compute the edit distance -> return a score

(e.g., correct answer: 딸기 (strawberry), input answer: 달기 --> similarity score: 4/5 = 80%)

- Processing through chunking is possible. A chunk is a lump, part, group, or bundle. For example, text is divided into chunks such as '축구 동아리 / 농구 동아리' (soccer club / basketball club). This process of dividing into chunks is called chunking.
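The edit-distance score in the list above (딸기 versus 달기 giving 4/5 = 80%) can be reproduced with the sketch below. It rests on an assumption that is not stated in the description: each Hangul syllable is decomposed into its initial, medial, and final letters, and the Levenshtein distance is divided by the length of the longer letter sequence. Under that reading, 딸기 has five letters and differs from 달기 in one of them. The function names are hypothetical.

```python
def to_jamo(text):
    """Decompose precomposed Hangul syllables into (position, index) letter units."""
    units = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:  # range of precomposed Hangul syllables
            units.append(("initial", code // 588))
            units.append(("medial", (code % 588) // 28))
            if code % 28:      # final consonant index 0 means "no final"
                units.append(("final", code % 28))
        else:
            units.append(("other", ch))
    return units


def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def similarity(answer, response):
    """1 - distance / longer length, computed on letter units (assumed formula)."""
    a, b = to_jamo(answer), to_jamo(response)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


score = similarity("딸기", "달기")
```

Comparing whole syllables instead would give a distance of 1 over a length of 2, that is 50%, which does not match the example; this is why the letter-level reading is assumed here.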

C) Sentence-level similarity comparison for the descriptive type (2 m/m --> future enhancement)

- Unlike the short-answer type, multiple data clusters for the expanded model answer are generated through typo correction, and, as shown in FIG. 15, the expanded model answer is compared with the actual answer to produce a similarity indicating how close the answer is to the correct one. The similarity calculation uses a methodology suited to each subject. FIG. 15 is a diagram illustrating an example in which the automatic scoring unit compares a learner's input answer with the expanded model answer according to an embodiment of the present invention.

- Scoring is currently based on descriptive questions that are answered in a single sentence. In Korean, a word for a given expression typically has several synonyms, and these sets of synonyms are built into an education dictionary. The dictionary is constructed by clustering words similar to a given word within a vector cluster. For example, the word 하얗게 (whitely) would share a vector cluster with words such as 뿌옇게 (hazily), 흐리게 (dimly), 불투명하게 (opaquely), 흰 (white), and 하얀 (white). A user word dictionary is built on this basis, the remaining unnecessary data is removed, and the sentence similarity is then computed by a set operation on the remaining words and reported as a percentage of how close the answer is to the correct one.

- Because descriptive answers are scored in real time, learners can see which keywords are most important in the subject they are studying.

- After scoring, the elements that were needed for the correct answer and those that were not are provided, showing learners what they missed in the subject and thereby promoting their understanding.

- Learners can be shown which keywords are important for an actual answer.
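The synonym-dictionary comparison in the list above can be sketched as follows. The example maps every word to the head word of its synonym group, drops stopwords, and scores two answers by the Jaccard ratio of their word sets. The Jaccard ratio is an assumption chosen for illustration, since the description only says the similarity is computed on sets and reported as a percentage. The answers are assumed to be already split into words by a morphological analyzer, and the example sentences are made up.

```python
def normalize(words, synonym_groups, stopwords=frozenset()):
    """Map each word to the first word of its synonym group and drop stopwords."""
    canonical = {}
    for group in synonym_groups:
        for word in group:
            canonical[word] = group[0]
    return {canonical.get(w, w) for w in words if w not in stopwords}


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Synonym group taken from the example in the description.
synonyms = [["하얗게", "뿌옇게", "흐리게", "불투명하게"]]

model_answer = normalize(["물", "하얗게", "변한다"], synonyms)
same_meaning = normalize(["물", "뿌옇게", "변한다"], synonyms)
different = normalize(["물", "끓는다"], synonyms)

score_same = jaccard(model_answer, same_meaning)
score_diff = jaccard(model_answer, different)
```

Because 뿌옇게 is normalized to 하얗게, the first learner answer matches the model answer fully, while the second shares only one of four distinct words.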

D) Essay type

- Essay-type scoring requires several stages.

a) Stage 1: Descriptive scoring data can be collected and joined with connective endings to create stage-1 essay data. For example: "a means ~~. And it has the characteristic ~~."

b) Stage 2: As an extension of the stage-1 dataset, an essay can address two questions. For example: "a means ~~ and has the characteristic ~~. Also, b means ~~ and has the characteristic ~~."

c) Stage 3: This is not a descriptive response to a question; rather, it includes the individual's judgment on the two questions posed within the question. For example: "a means ~~ and has the characteristic ~~. Also, b means ~~ and has the characteristic ~~. Therefore, since a and b are ~~~, the ~~ method should be used in this problem."

d) Stage 4: This includes stage 3 and additionally requires proposing a better method. For example: "a means ~~ and has the characteristic ~~. Also, b means ~~ and has the characteristic ~~. Therefore, since a and b are ~~~, it is better to use the ~~ method in this problem. However, an even better approach is to do ~~ through ~~."

- Such essay-type scoring requires collecting a large amount of descriptive scoring data. Because the applicant can gather a large amount of education-related content, it is collecting data from stage 1 onward using descriptive data, and plans to collect data stage by stage on this basis.

- What this offers the learner is a suggested flow of thought or way of thinking. By providing ways of solving existing problems, it lets learners, whose study previously ended with stereotyped descriptive answers, develop creativity through the essay type.

Meanwhile, the early childhood learning method according to an embodiment of the present invention may be stored in a recording medium as a program executable by a computer device.

That is, the early childhood learning method according to an embodiment of the present invention may be stored in a computer-readable recording medium as a program comprising the steps of: (a) receiving a user's handwriting through touch input; (b) performing OCR processing on the received handwriting; (c) post-processing the OCR result to recognize the user's handwriting; (d) preprocessing the recognized handwriting; (e) automatically scoring the preprocessed handwriting recognition result through natural language processing; and (f) post-processing the automatic scoring and outputting the scoring result.

The recording medium on which the early childhood learning method according to an embodiment of the present invention is recorded may be a magnetic memory device including a magnetic tunnel junction (MTJ) element, and may be implemented as a memory cell array including a plurality of memory cells C arranged in a matrix.

Each of the memory cells C may include an access transistor T and a memory M. The memory cell array may also include a plurality of word lines WL, a plurality of source lines SL, and a plurality of bit lines BL. Each memory cell C may be electrically connected to its corresponding word line WL, source line SL, and bit line BL.

The word lines WL may be arranged parallel to one another along a first direction (x direction), each extending in a second direction (y direction). The source lines SL may extend in the first direction and be arranged parallel to one another along the second direction. Like the source lines SL, the bit lines BL may extend in the first direction and be arranged parallel to one another along the second direction. The source lines SL and the bit lines BL may, however, be arranged alternately along the second direction.

In this memory cell array structure, a source line SL extending in the same direction as the bit lines BL shares memory cells C with the adjacent bit lines BL, alternating between the bit line above and the bit line below. In particular, when any one word line WL is selected, there are no other memory cells C around the selected memory cell C, so the problem of duplicate selection may not occur. In other words, in the memory cell array structure according to an embodiment of the present invention, no memory cell C is immediately adjacent to any given memory cell C along one word line WL; the immediately adjacent memory cells C may be disposed on another word line WL adjacent to that word line WL.

More specifically, the connections between the memory cells C and the word lines WL, source lines SL, and bit lines BL are as follows.

The word lines WL are connected to the gates of the access transistors T of the memory cells C, and each of two adjacent word lines WL may be connected to the gates of access transistors T located at different positions in the second direction.

For example, the second word line WL1 may be connected to the gates of the access transistors T disposed between the first bit line BL0 and the second source line SL1, between the second bit line BL1 and the third source line SL2, and between the third bit line BL2 and the fourth source line SL3, while the third word line WL2 may be connected to the gates of the access transistors T disposed between the first source line SL0 and the first bit line BL0, between the second source line SL1 and the second bit line BL1, and between the third source line SL2 and the third bit line BL2.

The source lines SL are connected to the sources or drains of the access transistors T of the memory cells C, and each source line SL may be connected, alternately along the first direction, to the sources or drains of access transistors T that are connected to different bit lines BL. For example, the second source line SL1 may be connected alternately, along the first direction, to the source or drain of an access transistor T connected to the first bit line BL0 and of an access transistor T connected to the second bit line BL1.

The bit lines BL are connected to the drains or sources of the access transistors T of the memory cells C, and each bit line BL may be connected, alternately along the first direction, to the drains or sources of access transistors T that are connected to different source lines SL. For example, the first bit line BL0 may be connected alternately, along the first direction, to the drain or source of an access transistor T connected to the first source line SL0 and of an access transistor T connected to the second source line SL1. Here, a bit line BL can be regarded as being connected to the drain or source of the corresponding access transistor T through the corresponding memory M.

Based on these connections, selecting any one of the word lines WL together with any one of the source lines SL or bit lines BL selects exactly one memory cell C. For example, when the third word line WL2 and either the third source line SL2 or the third bit line BL2 are selected, the memory cell Cs disposed between the second word line WL1 and the third word line WL2 and between the third source line SL2 and the third bit line BL2 may be selected.

For reference, owing to the characteristics of the magnetic memory device, when one word line and one bit line are selected, the corresponding source line may be determined automatically. Conversely, when one word line and one source line are selected, the corresponding bit line may be determined automatically. Meanwhile, when the memory cell array is divided into blocks according to column source lines (CSL), eight word lines, eight source lines, and eight bit lines may be disposed in one block.

The access transistor T of each memory cell C is turned on or turned off according to the voltage of the word line WL, thereby controlling the supply of current to the memory M. For example, the access transistor T may be a MOS transistor or a bipolar transistor. The memory M of the memory cell C may include a magnetic material; for example, the memory M may include a magnetic tunnel junction (MTJ) element. The memory M may perform its memory function by using the spin transfer torque (STT) phenomenon, in which the magnetization direction of the magnetic material is changed by the input current.

For reference, to describe a magnetic memory device such as a magnetic random access memory (MRAM) briefly: in order to store the "0" and "1" states in the MTJ element, which is the storage element of the magnetic memory device, the current flowing through the MTJ element must be bidirectional. That is, the current flowing through the MTJ element when writing data "0" and the current flowing when writing data "1" must be in opposite directions.

To form a structure that allows current to flow in opposite directions, a magnetic memory device has source lines in addition to bit lines. With the MTJ element and the access transistor (or cell transistor) of a memory cell between them, the bit line and the source line select the direction of the current flowing through the MTJ element of each memory cell by changing the potential difference between them.
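The polarity selection described above can be illustrated with a minimal sketch. The function names `write_bias` and `current_direction`, the write voltage value, and the mapping of "1" to a bit-line-high bias are all assumptions made for illustration; the description states only that the two write currents must flow in opposite directions.

```python
from typing import Tuple


def write_bias(bit: int, v_write: float = 1.0) -> Tuple[float, float]:
    """Return (bit_line_voltage, source_line_voltage) for writing `bit`.

    ASSUMPTION: driving the bit line high writes "1" and driving the
    source line high writes "0". The source text does not say which
    polarity corresponds to which data value.
    """
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (v_write, 0.0) if bit == 1 else (0.0, v_write)


def current_direction(v_bl: float, v_sl: float) -> str:
    """Direction of current through the MTJ element for a given bias."""
    if v_bl > v_sl:
        return "BL->SL"
    if v_bl < v_sl:
        return "SL->BL"
    return "none"
```

Swapping the two line voltages reverses the current through the MTJ element, which is why a source line is needed alongside each bit line.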

The magnetic memory device according to the technical idea of the present invention may have a memory cell array structure that minimizes the size of the unit memory cell while still applying the individual source line method. In other words, the memory cell array may be designed so that one source line is arranged for each bit line. Accordingly, as in the conventional individual source line method, the operating voltage can be lowered by swapping the voltages of the bit line and the source line.

In addition, because the memory cells are arranged so that two source lines share one bit line, the memory cells may be arranged in a zigzag pattern along the word line WL and along the source line SL or the bit line BL, and the size of the unit memory cell may thereby be minimized.

For example, the magnetic memory device according to the technical idea of the present invention may have a unit memory cell Cu of the following size. Specifically, when the pitch between the word lines WL is 2F and the pitch between the source lines SL or the bit lines BL is 4F, the size of the unit memory cell Cu may be 8F². Here, F denotes the minimum lithographic feature size.
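As a short worked check of the figure above, and assuming the unit cell footprint is simply the product of the two pitches (the symbols A_cell, p_WL and p_SL/BL are introduced here only for illustration):

```latex
\[
A_{\mathrm{cell}} = p_{\mathrm{WL}} \times p_{\mathrm{SL/BL}} = 2F \times 4F = 8F^{2}
\]
```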

Meanwhile, because the memory cells are arranged in a zigzag pattern along the word line WL and along the source line SL or the bit line BL, and are connected to the word line WL, the source line SL, and the bit line BL, exactly one memory cell can be selected by selecting one word line and one bit line (or one source line).

Although a memory cell has been described as an example of the recording medium according to the embodiment of the present invention, the present invention is not limited thereto and may equally be applied to a CD, LP, CD-RW, DVD, Blu-ray Disc (BD), and the like.

As described above, the present invention can provide an early childhood learning method using handwriting recognition, and a recording medium recording the same. Descriptive questions are presented on the screen of a computer terminal or portable terminal used by a child, and when the child writes the answer to a question by hand, the input handwriting image data is automatically scored using handwriting recognition and natural language processing technology, so that the child can learn at any time, 24 hours a day.

The above description has focused on embodiments of the present invention, but a person of ordinary skill in the art to which the present invention pertains may make various changes and modifications. Such changes and modifications belong to the present invention as long as they do not depart from the scope of the technical idea provided by the present invention. Accordingly, the scope of rights of the present invention should be determined by the claims set forth below.

100: early childhood learning system 110: student terminal
120: early childhood learning server 130: early childhood learning DB
140: teacher terminal 210: handwriting recognition unit
212: pre-processing unit 214: OCR engine unit
216: post-processing unit 220: automatic scoring unit
222: pre-processing unit 224: NPL scoring engine unit
226: post-processing unit

Claims

1. An early childhood learning method using handwriting recognition, comprising:
(a) receiving a user's handwriting through a touch input;
(b) performing OCR processing on the input handwriting;
(c) recognizing, by a handwriting recognition unit, the user's handwriting by post-processing the OCR processing result;
(d) pre-processing the recognized handwriting;
(e) automatically scoring the pre-processed handwriting recognition result through natural language processing; and
(f) outputting a scoring result by post-processing the automatic scoring result,
wherein the automatic scoring step uses named entity recognition and keyword extraction for descriptive-answer feature extraction and obtains an extended model answer through typo correction, the typo correction using phonological similarity and a syllable-unit language model for lower grades and dictation correction rules and spacing for high school students, and creates descriptive scoring data by comparing the extended model answer with the learner's input answer for descriptive questions that end in a sentence, removing the remaining unnecessary data, and scoring the result as an empty set, and
wherein, in the automatic scoring step, the descriptive scoring data is collected and a connective ending is added to create first-stage essay data; the data set for the first-stage essay data is expanded to create second-stage essay data that includes an essay of the form "a means a thing and has b features"; third-stage essay data is created that includes a part stating the judgment that, in this case, method d should be used; and the third-stage essay data is further extended with a statement of the form "a means a thing and has b features; therefore a and b are c, so in this problem it is better to use method d, but a better method is to do f through e".

2. The method of claim 1,
wherein, in step (c), the handwriting recognition unit refines data through a binarization process, a labeling process, and a manual labeling merging process applied to the handwriting image corresponding to the handwriting input by the user,
wherein, in the binarization process, the handwriting image is input as a single-channel grayscale image, and each pixel value is changed to a maximum value when it exceeds a threshold value and to 0 when it is less than the threshold value, and
wherein, in the labeling process, according to the connected component labeling technique, the image is scanned from left to right and from top to bottom and, for each pixel whose value is not 0, the pixels above and to the left are checked: if both are 0, a new label (number) is allocated; if only one of the two is non-zero, that label is assigned; if neither is 0, one of the two labels is assigned and the two labels are set as equivalent; and the process of checking above and to the left of each pixel is repeated to reset equivalent labels to a single label.
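The binarization and labeling processes recited above can be sketched as follows. This is a minimal pure-Python illustration, not the patented implementation: the names `binarize` and `label_components`, the default threshold of 127, the maximum value of 255, the treatment of pixels exactly equal to the threshold as 0, and the use of a union-find table to record label equivalences are all assumptions.

```python
from typing import Dict, List

Image = List[List[int]]


def binarize(gray: Image, threshold: int = 127, max_value: int = 255) -> Image:
    """Pixels above the threshold become max_value; all others become 0."""
    return [[max_value if p > threshold else 0 for p in row] for row in gray]


def label_components(binary: Image) -> Image:
    """Two-pass connected component labeling using the top and left neighbors."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # parent[i] is the equivalence parent of label i; 0 is background

    def find(x: int) -> int:
        # Follow parents to the representative label, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # First pass: scan left to right, top to bottom.
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0:
                continue
            top = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if top == 0 and left == 0:
                parent.append(len(parent))       # allocate a new label
                labels[y][x] = len(parent) - 1
            elif top == 0 or left == 0:
                labels[y][x] = top or left       # copy the single non-zero label
            else:
                labels[y][x] = min(top, left)    # take one label, record equivalence
                root_a, root_b = find(top), find(left)
                if root_a != root_b:
                    parent[max(root_a, root_b)] = min(root_a, root_b)

    # Second pass: reset equivalent labels to one consecutive number each.
    remap: Dict[int, int] = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = remap.setdefault(root, len(remap) + 1)
    return labels
```

Because only the top and left neighbors are checked, this sketch yields 4-connected components. A U-shaped stroke first receives two labels for its two arms, which are merged into one when the scan reaches the bottom row.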

3. The method of claim 1,
wherein, in step (e), the natural language processing represents the words recognized from the user's handwriting as embedding vectors in dense-vector form (word embedding) and uses the embedding vectors to derive the similarity between two different words or sentences, and
wherein, in step (e), the automatic scoring compares the handwriting recognition result with the model answer for the content; for short-answer questions, derives the scoring result by rule-based processing and edit distance calculation; and for descriptive questions, derives the scoring result by extracting features from the input answer and from the model answer and calculating the similarity between them.
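The two scoring paths recited above can be illustrated with a minimal sketch. The claim names only "edit distance" and "similarity"; the choice of Levenshtein distance and cosine similarity, the function names, the whitespace-stripping rule, and the `max_distance` tolerance are assumptions for illustration only.

```python
import math
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score_short_answer(recognized: str, model_answer: str,
                       max_distance: int = 1) -> bool:
    """Hypothetical rule: ignore whitespace, accept within max_distance edits."""
    def normalize(s: str) -> str:
        return "".join(s.split())
    return edit_distance(normalize(recognized), normalize(model_answer)) <= max_distance


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

A tolerance of one edit lets a single misrecognized character still count as correct for a short answer. For descriptive answers, the embedding vectors passed to `cosine_similarity` would come from a trained word-embedding model, which this sketch does not include.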

4. The method of claim 3,
wherein, for the descriptive type, the feature extraction is performed through spacing correction, morphological analysis of nouns and verbs, chunking of noun phrases and verb phrases, and negative-expression recognition using negation tags.

5. A computer-readable recording medium on which is recorded a program for executing an early childhood learning method, the method comprising:
(a) receiving a user's handwriting through a touch input;
(b) performing OCR processing on the input handwriting;
(c) recognizing the user's handwriting by post-processing the OCR processing result;
(d) pre-processing the recognized handwriting;
(e) automatically scoring the pre-processed handwriting recognition result through natural language processing; and
(f) outputting a scoring result by post-processing the automatic scoring result,
wherein the automatic scoring step uses named entity recognition and keyword extraction for descriptive-answer feature extraction and obtains an extended model answer through typo correction, the typo correction using phonological similarity and a syllable-unit language model for lower grades and dictation correction rules and spacing for high school students, and creates descriptive scoring data by comparing the extended model answer with the learner's input answer for descriptive questions that end in a sentence, removing the remaining unnecessary data, and scoring the result as an empty set, and
wherein, in the automatic scoring step, the descriptive scoring data is collected and a connective ending is added to create first-stage essay data; the data set for the first-stage essay data is expanded to create second-stage essay data that includes an essay of the form "a means a thing and has b features"; third-stage essay data is created that includes a part stating the judgment that, in this case, method d should be used; and essay-type scoring is executed by creating fourth-stage essay data that includes the third-stage essay data and additionally a statement of the form "a means a thing and has b features; therefore a and b are c, so in this problem it is better to use method d, but a better method is to do f through e".