KR20210141421A

KR20210141421A - A system for tracking user knowledge based on artificial intelligence learning and method thereof

Info

Publication number: KR20210141421A
Application number: KR1020210123257A
Authority: KR
Inventors: 최영덕; 이영남; 조중현; 백진언; 김병수; 차영민; 신동민; 배찬; 허재위
Original assignee: (주)뤼이드
Priority date: 2021-02-02
Filing date: 2021-09-15
Publication date: 2021-11-23
Also published as: KR20210141426A; KR20210141420A; KR20210141425A; KR20210141424A; KR20210141791A; KR20210141423A; KR20210141419A; KR20210141422A; KR20210141792A

Abstract

The present disclosure relates to a user knowledge tracking method, performed by a user knowledge tracking system, that predicts the probability that a user answers a problem correctly by inputting problem information into an encoder neural network of a transformer structure and response information into a decoder neural network. Problem information is input into a k-th encoder neural network, and attention information is generated by applying weights to the problem information. Response information is input into a k-th decoder neural network, and query data, which is information on the problem for which the user's correct answer probability is to be predicted, is generated by applying weights to the response information. The system is trained by using the attention information as a weight on the query data. The attention information is regenerated based on a comparison between k and N, where N is the number of stacked encoder neural networks and decoder neural networks, and the system is trained based on the regenerated attention information. The system may include the plurality of encoder neural networks and the plurality of decoder neural networks.

Description

A SYSTEM FOR TRACKING USER KNOWLEDGE BASED ON ARTIFICIAL INTELLIGENCE LEARNING AND METHOD THEREOF

The present invention relates to a method and system for tracking a user's knowledge based on artificial intelligence learning. Specifically, it relates to a user knowledge tracking method and system that predict the probability of a user's correct answer by inputting problem information to an encoder neural network of a transformer structure and response information to a decoder neural network.

Recently, the Internet and electronic devices have come into active use in every field, and the educational environment is changing rapidly as well. In particular, the development of various educational media has allowed learners to choose and use a wider range of learning methods. Among these, education services delivered over the Internet have become a major means of teaching and learning because they overcome temporal and spatial constraints and enable low-cost education.

Combined with various artificial intelligence models, such online education services can now predict the probability that a user will answer an arbitrary problem correctly, which was not possible in the conventional offline education environment, and can thereby provide more efficient learning content.

Various artificial neural network models, such as RNN, LSTM, bidirectional LSTM, and the transformer, have been proposed to predict the probability of a user's correct answer to a given problem. However, these existing models either have layers too shallow to produce the desired inference result or do not use input data optimized for user knowledge tracking, and therefore cannot predict results with sufficient accuracy.

In particular, the transformer model builds an encoder-decoder structure using only attention rather than an RNN structure, so it trains very quickly and also outperforms RNNs. For these reasons it has begun to attract attention as a user knowledge tracking model, but research on optimizing the transformer model for learning content is still lacking.

Specifically, a method is needed for predicting a user's correct answer probability more effectively with a transformer model. Open questions include how the data input to the encoder and to the decoder should each be configured to obtain inference results optimized for learning content, and what technique can prevent the correct answer probability from being predicted on the basis of problems the user has not yet solved, a constraint that follows from the nature of learning content.

The present invention is intended to solve the problems described above and can provide a user knowledge tracking system with improved performance by using an input data format optimized for user knowledge tracking.

In addition, the present invention can provide a user knowledge tracking system with improved performance by appropriately applying upper triangular masking to the encoder neural network and the decoder neural network of the transformer structure.

According to an embodiment of the present specification, a user knowledge tracking method of a user knowledge tracking system may include: inputting problem information into a k-th encoder neural network and generating attention information by applying weights to the problem information; inputting response information into a k-th decoder neural network and generating, by applying weights to the response information, query data that is information on the problem for which the user's correct answer probability is to be predicted; training the user knowledge tracking system by using the attention information as a weight on the query data; and regenerating the attention information based on a comparison between k and N, where N is the number of stacked encoder neural networks and decoder neural networks, and training the user knowledge tracking system based on the regenerated attention information. The user knowledge tracking system may include the plurality of encoder neural networks and the plurality of decoder neural networks.

Also, when k is less than N, the attention information may be regenerated, and the user knowledge tracking system may be trained based on the regenerated attention information.

Also, the method may further include, when k is equal to or greater than N, terminating the training of the user knowledge tracking system and outputting, from the trained user knowledge tracking system, correct answer probability information, which is the probability that the user will answer a problem correctly.
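The comparison between k and N described above can be read as a loop over the stacked layers. The following is a minimal sketch under that reading, not the implementation of the disclosure: `run_stack` and the toy layer callables are hypothetical names introduced only for illustration, and real layers would be trained neural networks rather than simple arithmetic.

```python
from typing import Callable, List, Sequence

Vector = List[float]
EncoderLayer = Callable[[Vector], Vector]
DecoderLayer = Callable[[Vector, Vector], Vector]


def run_stack(problem_info: Vector,
              response_info: Vector,
              encoders: Sequence[EncoderLayer],
              decoders: Sequence[DecoderLayer]) -> Vector:
    """Pass data through N stacked encoder/decoder pairs (hypothetical sketch)."""
    if not encoders or len(encoders) != len(decoders):
        raise ValueError("expected the same non-zero number of encoders and decoders")
    n = len(encoders)
    enc_state, dec_state = problem_info, response_info
    k = 1
    while True:
        # k-th encoder: weight the problem information -> attention information.
        attention_info = encoders[k - 1](enc_state)
        # k-th decoder: weight the response information and use the attention
        # information as a weight on the resulting query data.
        dec_state = decoders[k - 1](dec_state, attention_info)
        if k >= n:
            # k >= N: stop and return the decoder output.
            return dec_state
        # k < N: the next layer regenerates the attention information.
        enc_state = attention_info
        k += 1


# Toy stand-ins for neural network layers (assumptions for illustration only).
def toy_encoder(xs: Vector) -> Vector:
    return [x + 1.0 for x in xs]


def toy_decoder(ys: Vector, att: Vector) -> Vector:
    return [y + a for y, a in zip(ys, att)]


result = run_stack([0.0, 0.0], [1.0, 1.0],
                   [toy_encoder, toy_encoder],
                   [toy_decoder, toy_decoder])
print(result)
```

With N = 2 the loop body runs twice: the second encoder receives the attention information produced by the first, which is one way to interpret "regenerating" that information.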

Also, generating the attention information may include: optionally performing key-query masking, an operation that prevents attention from being performed on empty values (zero padding); and performing upper triangular masking, an operation that prevents attention from being performed on information at future positions when predicting the next problem.
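The two masking operations can be illustrated with a small standard-library sketch. It assumes that attention scores are arranged as a square matrix with one row per query position and one column per key position, and that a blocked position is excluded before a softmax; the function names and the treatment of padding as a per-key flag are assumptions made for this example, not details stated in the disclosure.

```python
import math
from typing import List

NEG_INF = float("-inf")


def upper_triangular_mask(n: int) -> List[List[bool]]:
    """True marks a blocked cell: query i may not attend to a future key j > i."""
    return [[j > i for j in range(n)] for i in range(n)]


def key_padding_mask(is_padding: List[bool]) -> List[List[bool]]:
    """True marks a blocked cell: no query may attend to a zero-padded key."""
    n = len(is_padding)
    return [[is_padding[j] for j in range(n)] for _ in range(n)]


def combine(a: List[List[bool]], b: List[List[bool]]) -> List[List[bool]]:
    """A cell is blocked if either mask blocks it."""
    return [[x or y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def masked_softmax(scores: List[List[float]],
                   blocked: List[List[bool]]) -> List[List[float]]:
    """Row-wise softmax that gives blocked cells a weight of exactly zero."""
    out = []
    for row, mask_row in zip(scores, blocked):
        kept = [NEG_INF if b else s for s, b in zip(row, mask_row)]
        top = max(kept)
        if top == NEG_INF:
            out.append([0.0] * len(row))  # every key is blocked for this query
            continue
        exps = [math.exp(s - top) for s in kept]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


# Three positions; the last one is zero padding. All raw scores are equal.
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
mask = combine(upper_triangular_mask(3), key_padding_mask([False, False, True]))
weights = masked_softmax(scores, mask)
for row in weights:
    print(row)
```

In this toy case the first position can only attend to itself, and the padded third key never receives any weight.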

Also, the problem information may consist of a plurality of problems expressed as vectors, and the response information may consist of the user's response to each of the plurality of problems expressed as vectors.

According to another embodiment of the present specification, a user knowledge tracking system including a plurality of encoder neural networks and a plurality of decoder neural networks may include: a k-th encoder neural network that receives problem information and generates, by applying weights to the problem information, attention information to be used as a weight on query data; and a k-th decoder neural network that receives response information, generates, by applying weights to the response information, the query data that is information on the problem for which the user's correct answer probability is to be predicted, and trains the user knowledge tracking system by using the attention information as a weight on the query data.

According to an embodiment of the present invention, a user knowledge tracking system with improved performance can be provided by using an input data format optimized for user knowledge tracking.

In addition, according to an embodiment of the present invention, a user knowledge tracking system with improved performance can be provided by appropriately applying upper triangular masking to the encoder neural network and the decoder neural network of the transformer structure.

FIG. 1 is a diagram for explaining a user knowledge tracking system according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining in detail the operation of the user knowledge tracking system of FIG. 1.
FIG. 3 is a diagram for explaining an example of an artificial neural network structure used in a conventional user knowledge tracking system.
FIG. 4 is a diagram for explaining another example of an artificial neural network structure used in a conventional user knowledge tracking system.
FIG. 5 is a diagram for explaining the configuration of each type of input data.
FIG. 6 is a diagram for explaining key-query masking and upper triangular masking.
FIG. 7 is a diagram for explaining an artificial neural network structure that outputs a response prediction result using lower triangular masking and problem response information.
FIG. 8 is a diagram for explaining an artificial neural network structure that outputs a response prediction result using upper triangular masking and problem response information.
FIG. 9 is a diagram for explaining an artificial neural network structure that outputs a response prediction result using upper triangular masking, problem response information, and stacked SAKT.
FIG. 10 is a graph comparing the ACC performance of the artificial neural network structures of FIGS. 2 and 7 to 9.
FIG. 11 is a graph comparing the AUC performance of the artificial neural network structures of FIGS. 2 and 7 to 9.
FIG. 12 is a flowchart for explaining the operation of a user knowledge tracking system according to an embodiment of the present invention.
FIG. 13 is a flowchart for explaining in detail the operation of a problem processing unit, a response processing unit, or a problem response processing unit according to an embodiment of the present invention.

The specific structural or step-by-step descriptions of the embodiments according to the concept of the present invention disclosed in this specification or application are given only for the purpose of explaining those embodiments. Embodiments according to the concept of the present invention may be implemented in various forms and should not be construed as limited to the embodiments described in this specification or application.

Since embodiments according to the concept of the present invention may be modified in various ways and may take various forms, specific embodiments are illustrated in the drawings and described in detail in this specification or application. However, this is not intended to limit the embodiments according to the concept of the present invention to a specific disclosed form, and the embodiments should be understood to include all modifications, equivalents, and substitutes that fall within the spirit and technical scope of the present invention.

Terms such as first and/or second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of rights according to the concept of the present invention, a first component may be named a second component, and similarly a second component may be named a first component.

When a component is described as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is described as being "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between. Other expressions describing the relationship between components, such as "between" and "directly between" or "adjacent to" and "directly adjacent to", should be interpreted in the same way.

The terms used in this specification are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" are intended to indicate that a stated feature, number, step, operation, component, part, or combination thereof exists, and should be understood not to exclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in this specification.

In describing the embodiments, descriptions of technical content that is well known in the technical field to which the present invention belongs and that is not directly related to the present invention are omitted. Omitting unnecessary description conveys the gist of the present invention more clearly without obscuring it.

Hereinafter, the present invention is described in detail by describing preferred embodiments of the present invention with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining a user knowledge tracking system according to an embodiment of the present invention.

Referring to FIG. 1, a user knowledge tracking system 5 based on a transformer model is shown. The user knowledge tracking system 5 according to the embodiment may include embedding performing units 10 and 30, an encoder neural network 20, and a decoder neural network 40.

The transformer model follows the encoder-decoder structure of the existing seq2seq model but is implemented with attention alone. Although the transformer model does not use an RNN, it keeps the encoder-decoder structure in which, as in seq2seq, the encoder receives an input sequence and the decoder produces an output sequence. A distinguishing feature is that the encoder and the decoder may each consist of N units.

The attention mechanism was proposed to solve two problems attributed to RNN-based seq2seq: the information loss that results from compressing all information into a single fixed-size vector, and the vanishing gradient problem.

Under the attention mechanism, at every time step at which the decoder predicts an output word, the entire input data of the encoder is consulted once more. However, the input data is not consulted in equal proportion; instead, more attention is paid to the part of the input data that is related to the data to be predicted at that time step.
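The idea of consulting every input but weighting the relevant ones more heavily can be sketched as follows. The text does not specify how relevance is scored, so this example assumes the scaled dot-product scoring commonly used in transformer models; `attention_weights` and `attend` are hypothetical helper names.

```python
import math
from typing import List


def attention_weights(query: List[float], keys: List[List[float]]) -> List[float]:
    """Softmax over scaled dot products between one query and every key."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: List[float],
           keys: List[List[float]],
           values: List[List[float]]) -> List[float]:
    """Weighted average of the values, using the attention weights."""
    w = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w_i * v[j] for w_i, v in zip(w, values)) for j in range(dim)]


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]

w = attention_weights(query, keys)
context = attend(query, keys, values)
print(w)
print(context)
```

Every input contributes to the result, but the key most similar to the query receives the largest weight, and the weights sum to 1.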

Referring again to FIG. 1, the user knowledge tracking system 5 may train an artificial neural network on a large volume of users' problem-solving results for a problem database and, based on this, predict the probability that a specific user will correctly answer an arbitrary problem in the problem database.

In an educational domain whose purpose is to improve the user's ability, providing problems that the user is certain to answer correctly may be inefficient. It would be more efficient to provide problems that the user is likely to get wrong or problems that can raise the target test score.

The user knowledge tracking system 5 according to an embodiment of the present invention may generate a user model that reflects user characteristics more accurately and analyze in real time which problems have high learning efficiency for the user, for example problems that the user is likely to get wrong, problems that can raise the target test score, or problem types that the user repeatedly gets wrong, and may provide problems of the types in which the user is particularly weak.

In addition, the user knowledge tracking system 5 may train an artificial neural network on a large volume of users' problem-solving results for the problem database and, based on this, predict the score that a specific user would receive on an actual test. A learning plan customized to the user can be provided according to the predicted score range, allowing the user to study more efficiently.

Referring to FIG. 1, the embedding performing units 10 and 30 may perform embedding on input data. In an embodiment of the present invention, the input data may include problem information, response information, or problem response information.

The problem information may be information on problems of various types and difficulty levels that are provided to measure the user's level of knowledge. The response information may be the answer the user selected in response to the problem information, or information on whether the user answered the problem correctly or incorrectly. The problem response information may be information on a set in which the problem information is matched with the corresponding response information of the user.

In an embodiment, the problem information may be denoted 'E', short for Example; the response information 'R', short for Response; and the problem response information 'I', short for Interaction. The correct answer probability information may be denoted 'r*'.
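As a rough illustration of the E, R, and I notation, the sketch below models one interaction as a matched problem and response and splits an interaction log into the two streams. The class name, field names, and the use of integer problem identifiers and boolean correctness are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interaction:
    """I: a problem (E) matched with the user's response to it (R)."""
    problem_id: int  # E, here a hypothetical integer identifier
    correct: bool    # R, here reduced to right or wrong


def split_streams(log: List[Interaction]) -> Tuple[List[int], List[bool]]:
    """Separate an interaction log into problem and response sequences."""
    problems = [item.problem_id for item in log]
    responses = [item.correct for item in log]
    return problems, responses


log = [Interaction(101, True), Interaction(102, False), Interaction(103, True)]
problems, responses = split_streams(log)
print(problems)
print(responses)
```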

The embedding performing units 10 and 30 may represent input data, for example a problem, a response, or a set of a problem and a response, as vectors and embed them in the user knowledge tracking system 5. Input data can be represented as vectors in a latent space in various ways. For example, one such way is the technique used in artificial intelligence of converting words into numerical values. Even when the expressions or forms entered by users differ, the meaning of words, sentences, and text can be captured by computing their relatedness and expressing it numerically.

The problem information expressed as vectors may be embedded by the embedding performing unit 10 and input to the encoder neural network 20. The response information expressed as vectors may be embedded by the embedding performing unit 30 and input to the decoder neural network 40.

According to an embodiment of the present invention, a user knowledge tracking system 5 with improved performance can be provided by inputting problem information to the encoder and response information to the decoder as the input data of a transformer model optimized for online learning content.

The encoder neural network 20 may generate attention information based on the embedded problem information. The attention information may be problem information to which weights have been assigned while passing through a plurality of layers of the encoder neural network 20. In particular, the attention information may be information generated through self-attention in the encoder neural network. The attention information may be expressed mathematically as probabilities, and the sum of all attention information is 1. The attention information may be input to the decoder neural network 40 and used as a weight on the query data of the decoder neural network to train the user knowledge tracking system 5.

An artificial neural network can use attention information to learn, in line with its objective function, which parts are important. In particular, self-attention means applying attention to oneself: it may be an operation that assigns weights to the parts of a given piece of data that should be considered important within that data and reflects them back onto the data itself. Attention in the existing seq2seq finds relationships using information from two different sources, the encoder-side data and the decoder-side data, so the information obtained through self-attention could not be found with the attention structure of the existing seq2seq.

Each time the decoder neural network 40 predicts an output result rk*, the user knowledge tracking system 5 may refer again to the entire input data (E1, E2, …, Ek, R1, R2, …, Rk-1) and, according to the attention information, attend to the data related to that output result.

The decoder neural network 40 may generate a response prediction result based on the embedded response information and the attention information. The decoder neural network 40 may perform, on the response information, multi-head attention, which performs the self-attention described above at least once.

In this way, the decoder neural network 40 may generate correct answer probability information by performing multi-head attention on the query data generated from the response information, based on the attention information that the encoder neural network 20 weighted according to importance within the problem information.

According to an embodiment of the present invention, a user knowledge tracking system 5 with improved performance can be provided by inputting problem information to the encoder and response information to the decoder as input data optimized for user knowledge tracking.

In addition, the present invention can provide a user knowledge tracking system 5 with improved performance by appropriately applying upper triangular masking to the encoder neural network and the decoder neural network of the transformer structure.

FIG. 2 is a diagram for explaining in detail the operation of the user knowledge tracking system of FIG. 1.

Referring to FIG. 2, the user knowledge tracking system 5 may include an encoder neural network 20 and a decoder neural network 40. The encoder neural network 20 may in turn include a problem processing unit 21 and a non-linearization performing unit 22, and the decoder neural network 40 may include a first response processing unit 41, a second response processing unit 42, and a non-linearization performing unit 43.

The embedding performing units of FIG. 1 are omitted from FIG. 2, but the embedding of the input data can be understood with reference to the description of FIG. 1 above.

The problem information may consist of a plurality of problems (E1, E2, …, Ek) expressed as vectors. The response information may consist of the user's responses (R1, R2, …, Rk-1) to the respective problems (E1, E2, …, Ek), expressed as vectors. The correct answer probability information may consist of the user's correct answer probabilities (r1*, r2*, …, rk*) for the respective problems, expressed as vectors.

In an embodiment, the correct answer probability information rk* may be information on the probability that the user answers problem Ek correctly, given that the user's response to problem E1 is R1, the response to problem E2 is R2, …, and the response to problem Ek-1 is Rk-1.
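
In probabilistic notation, this definition can be written roughly as follows. The symbol c_k, denoting whether the user's response to problem Ek is correct, is introduced here only for illustration and does not appear in the specification.

```latex
r_k^{*} = P\left(c_k = 1 \mid E_1, E_2, \ldots, E_k,\; R_1, R_2, \ldots, R_{k-1}\right)
```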

The problem processing unit 21 may receive the problem information and perform a series of operations related to self-attention. These operations may include dividing the problem information into a query, a key, and a value; generating a plurality of head values for each; generating weights from the plurality of query head values and the plurality of key head values; masking the generated weights; and applying the masked weights to the plurality of value head values to generate prediction data.

The prediction data generated by the problem processing unit 21 may be the attention information.

In particular, during the masking operation the problem processing unit 21 may perform upper triangular masking as well as key-query masking. Key-query masking and upper triangular masking are described in detail below with reference to FIG. 6.

The non-linearization performing unit 22 may non-linearize the prediction data output from the problem processing unit 21. A ReLU function may be used for the non-linearization.

Although not shown in the drawing, there may be one or more encoder neural networks 20. The attention information generated by the encoder neural network 20 may be input to an encoder neural network 20 again, so that the series of operations related to self-attention and non-linearization is repeated several times.

Thereafter, the attention information may be divided into a key and a value and input to the second response processing unit 42. The attention information may be used as a weight on the query data input to the second response processing unit 42 and thereby used to train the user knowledge tracking system.

The first response processing unit 41 may receive the response information and, like the problem processing unit 21, perform a series of operations related to self-attention. These operations may include dividing the response information into a query, a key, and a value; generating a plurality of head values for each; generating weights from the plurality of query head values and the plurality of key head values; masking the generated weights; and applying the masked weights to the plurality of value head values to generate prediction data.

The prediction data generated by the first response processing unit 41 may be the query data.

The second response processing unit 42 may receive the query data from the first response processing unit 41 and the attention information from the encoder neural network 20, and may output the correct answer probability information.

The attention information may be input to the decoder neural network 40 and used as a weight on the decoder's query data to train the user knowledge tracking system 5.

The attention information may be information on weights assigned so that a specific region of the query data is considered more heavily. Specifically, at every time step at which the decoder neural network 40 predicts an output result (rk*), the user knowledge tracking system 5 may refer again to the entire input data (E1, E2, …, Ek, R1, R2, …, Rk-1) at the encoder neural network 20, attending to the data related to that output result.

Through the above operation, the second response processing unit 42 may generate rk*, the user's correct answer probability information for the problem information Ek.
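
The interaction between the query data and the attention information described above can be illustrated with a minimal single-head sketch. It is an illustration under stated assumptions, not the implementation of the second response processing unit 42: the function name `cross_attention` is hypothetical, the attention information is used directly as both key and value without learned projections, scaled dot-product scoring is assumed, and the final mapping of the output to a probability is not shown.

```python
import math


def cross_attention(query_data, attention_info):
    """Each decoder query at position i attends to encoder outputs at positions <= i.

    query_data and attention_info are lists of equal-length vectors (lists of floats).
    """
    dim = len(query_data[0])
    result = []
    for i, query in enumerate(query_data):
        # Slicing to i + 1 has the same effect as upper triangular masking:
        # positions after i are never visible to query i.
        visible = attention_info[: i + 1]
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in visible]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the visible encoder outputs, used here as the values.
        result.append([sum(w * v[c] for w, v in zip(weights, visible)) for c in range(dim)])
    return result


attended = cross_attention([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]])
```

In this toy call the first query can see only the first encoder output, so its result is that output unchanged, while the second query mixes both encoder outputs and favors the one it matches.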

Although not shown in the drawing, there may be one or more decoder neural networks 40. The correct answer probability information generated by the decoder neural network 40 may be input to a decoder neural network 40 again, so that the series of operations related to self-attention, multi-head attention, and non-linearization is repeated several times.

Like the problem processing unit 21, the first response processing unit 41 and the second response processing unit 42 may perform upper triangular masking as well as key-query masking during the masking operation.

FIG. 3 is a diagram for explaining an example of an artificial neural network structure used in a conventional user knowledge tracking system.

The user knowledge tracking system of FIG. 3 includes an input data processing unit that attends to the specific part of the input data related to the data to be predicted.

However, in the knowledge tracking system of FIG. 3, the layers of this input data processing unit are not sufficiently deep, so the system is limited in its ability to correctly analyze a large number of problems and the user responses to them.

FIG. 4 is a diagram for explaining another example of an artificial neural network structure used in a conventional user knowledge tracking system.

The user knowledge tracking system of FIG. 4 may include a problem response processing unit and a non-linearization performing unit that perform operations similar to those of the encoder neural network or the decoder neural network of FIG. 2 described above.

However, the knowledge tracking system of FIG. 4 fails to overcome the limitation of existing systems in which the problem response information (I) is provided as the key and value and the problem information (E) is provided as the query.

To solve these problems, the user knowledge tracking system according to an embodiment of the present invention can predict the correct answer probability using only the problem information and the response information, which carry less data than the problem response information, and can implement the layers in which attention is performed deeply enough to realize an artificial neural network with improved accuracy.

FIG. 5 is a diagram for explaining the composition of each type of input data according to an embodiment of the present invention.

Referring to FIG. 5, the input data may include problem information (E), problem response information (I), and response information (R). Depending on the artificial neural network model being implemented, specific data among the three types of input data may be selected and used.

The problem identification information may be a unique value assigned to each problem. A user or a computer can determine which problem a given problem is from its problem identification information.

The problem category information may indicate what type of problem a given problem is. For example, for a TOEIC test problem, the problem category may indicate whether the problem belongs to the listening part or the reading part.

The position information may indicate where a given piece of data is located within the entire data. Unlike an RNN structure, a transformer structure does not itself represent the order of the input data, so the position of each piece of data within the entire data sequence must be indicated separately in order to distinguish the order. The position information may be embedded together with the input data, added to the embedded input data, and input to the encoder neural network and the decoder neural network.
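
As a rough illustration of adding position information to embedded input data, the sketch below sums an item embedding and a position embedding element by element. The helper names `make_table` and `embed_sequence`, the table sizes, and the random initialization are assumptions made for the example; the specification does not state how the embeddings are obtained.

```python
import random


def make_table(num_rows, dim, seed):
    """Build a toy embedding table with small random values (stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def embed_sequence(item_ids, item_table, position_table):
    """Add the position embedding for each index to the embedding of the item at that index."""
    return [
        [a + b for a, b in zip(item_table[item], position_table[pos])]
        for pos, item in enumerate(item_ids)
    ]


item_table = make_table(num_rows=10, dim=4, seed=0)
position_table = make_table(num_rows=5, dim=4, seed=1)
# The same problem (id 3) appears at positions 0 and 1 and receives different vectors.
sequence = embed_sequence([3, 3, 7], item_table, position_table)
```

Because the position vector differs at each index, two occurrences of the same problem are distinguishable to the network even though attention itself is order-agnostic.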

The response correctness information may indicate whether the user's response is correct or incorrect. For example, if the user's response is correct, it may be expressed as a vector representing '1'. Conversely, if the user's response is incorrect, it may be expressed as a vector representing '0'.

The elapsed time information may be information expressing, as a vector, the time the user took to solve a problem. It may be expressed in seconds, minutes, hours, or the like, and any time exceeding a certain threshold (for example, 300 seconds) may be treated as equal to that threshold (300 seconds).

The timestamp information may be information expressing, as a vector, the point in time at which the user solved a problem. It may be expressed as an hour, day, month, year, or the like.

The problem information (E) may include the problem identification information, the problem category information, and the position information. That is, it may include information on which problem it is, what type of problem it is, and where it is located within the entire problem data.

The response information (R) may include the position information and the response correctness information. That is, it may include information on where the user's response is located within the entire response data and whether the user's response is correct or incorrect.

The problem response information (I) may include the problem identification information, the problem category information, the position information, the response correctness information, the elapsed time information, and the timestamp information. That is, in addition to all the information in the problem information (E) and the response information (R), it may further include the elapsed time information and the timestamp information.
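
The relationship among the three types of input data can be sketched as plain records. The class name `Interaction`, the field names, the tuple layouts, and the string timestamp are hypothetical choices for illustration; only the list of fields and the 300-second example threshold come from the description above.

```python
from dataclasses import dataclass

MAX_ELAPSED_SECONDS = 300  # example threshold taken from the description


@dataclass
class Interaction:
    """Problem response information (I): every field recorded for one solved problem."""
    problem_id: int
    category: str
    position: int
    correct: int  # 1 if the response is correct, 0 if incorrect
    elapsed_seconds: float
    timestamp: str


def to_problem_info(record):
    """Problem information (E): identification, category, and position."""
    return (record.problem_id, record.category, record.position)


def to_response_info(record):
    """Response information (R): position and correctness."""
    return (record.position, record.correct)


def capped_elapsed(seconds):
    """Treat any time above the threshold as equal to the threshold."""
    return min(seconds, MAX_ELAPSED_SECONDS)


record = Interaction(101, "listening", 0, 1, 452.0, "2021-02-02T09:00:00")
```

Here E and R are both projections of I that drop the elapsed time and timestamp fields, which is the data reduction the description refers to.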

The user knowledge tracking system according to an embodiment of the present invention can predict whether the user will answer correctly using only the problem information (E) and the response information (R) instead of the problem response information (I), which reduces the amount of data used and thereby improves computational performance and memory efficiency. In terms of accuracy as well, inputting the problem information (E) to the encoder and the response information (R) to the decoder is optimized for an online learning environment, so the correct answer probability can be predicted with improved accuracy.

FIG. 6 is a diagram for explaining key-query masking and upper triangular masking.

Although FIG. 6 shows upper triangular masking being performed after key-query masking, the two may be performed simultaneously, or upper triangular masking may be performed first.

Key-query masking is optional and may be an operation that prevents attention from being performed on empty values (zero padding) by imposing a penalty on them. In the prediction data, the values subjected to key-query masking may be represented as 0 and the remaining values as 1.

For convenience of description, FIG. 6 shows the last values of the query and the key as masked by key-query masking, but this may vary depending on the embodiment.

Upper triangular masking may be an operation that prevents attention from being performed on information corresponding to future positions when predicting the next problem. For example, it may mask values so that a prediction is not computed from problems the user has not yet solved. As with key-query masking, the values subjected to upper triangular masking may be represented as 0 and the remaining values as 1.

An arbitrary large negative value may then be applied to the masked values of the prediction data, so that they have a probability close to 0 when expressed probabilistically through the softmax function.
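
The two masks and the large-negative-value step can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names, the constant -1e9, and the choice to combine the masks by element-wise product are assumptions. Masks use 0 for masked positions and 1 for the rest, as in the description.

```python
import math

NEG_INF = -1e9  # stands in for the "arbitrary large negative value" in the text


def upper_triangular_mask(n):
    """1 where query position i may attend to key position j (j <= i), else 0."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def key_query_mask(valid):
    """1 where both the query and the key position hold real data, 0 where either is padding."""
    return [[1 if (q and k) else 0 for k in valid] for q in valid]


def combine(mask_a, mask_b):
    """Element-wise product: a position stays visible only if both masks allow it."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(mask_a, mask_b)]


def masked_softmax(scores, mask):
    """Replace masked scores with NEG_INF, then apply a row-wise softmax."""
    result = []
    for score_row, mask_row in zip(scores, mask):
        filled = [s if m else NEG_INF for s, m in zip(score_row, mask_row)]
        top = max(filled)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - top) for v in filled]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# Four positions, the last of which is zero padding.
mask = combine(upper_triangular_mask(4), key_query_mask([True, True, True, False]))
weights = masked_softmax([[0.0] * 4 for _ in range(4)], mask)
# A fully masked row (the padded query, row 3) degenerates to a uniform distribution,
# so such rows would normally be discarded downstream.
```

Because the product of the two masks is commutative, the sketch is consistent with the statement that the masks may be applied in either order or simultaneously.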

In the conventional transformer structure, key-query masking is performed in the encoder neural network, and upper triangular masking is performed together with key-query masking in the decoder neural network. In an embodiment of the present invention, upper triangular masking is performed in both the encoder neural network and the decoder neural network, so that the correct answer probability information depends only on the problem information already provided to the user (E1, E2, …, Ek) and the response information the user has already submitted (R1, R2, …, Rk-1).

FIG. 7 is a diagram for explaining an artificial neural network structure (LTMTI) that outputs a response prediction result using lower triangular masking and problem response information.

Referring to FIG. 7, problem response information may be input to the encoder neural network as input data, and problem information may be input to the decoder neural network as input data. In addition, the decoder neural network may perform only multi-head attention, with the self-attention process omitted.

Furthermore, in the problem response processing unit and the problem processing unit of FIG. 7, attention may be performed in which the lower triangle of the prediction data is masked rather than the upper triangle.

FIG. 8 is a diagram for explaining an artificial neural network structure (UTMTI) that outputs a response prediction result using upper triangular masking and problem response information.

Referring to FIG. 8, problem response information may be input to the encoder neural network as input data, and problem information may be input to the decoder neural network as input data.

FIG. 9 is a diagram for explaining an artificial neural network structure (SSAKT) that outputs a response prediction result using upper triangular masking, problem response information, and stacked SAKT.

Referring to FIG. 9, problem response information may be input to the encoder neural network as input data, and problem information may be input to the decoder neural network as input data. In addition, the problem response processing unit may perform multi-head attention rather than self-attention, and the response processing unit may perform self-attention.

FIG. 10 is a graph for comparing the ACC performance of the artificial neural network structures of FIG. 2 and FIGS. 7 to 9.

ACC may be an indicator of sensitivity. ACC may represent the proportion of correct responses among all response information, which includes incorrect responses. N may denote the number of stacked encoders and decoders. d_model may denote the output dimension of all sub-layers of the model. The user knowledge tracking system of FIG. 2 is labeled SAINT.

The results show that the ACC of SAINT is, on average, about 1.8% higher than that of the other models.

FIG. 11 is a graph for comparing the AUC performance of the artificial neural network structures of FIG. 2 and FIGS. 7 to 9.

AUC may represent the ratio of correct predictions to all predictions. N may denote the number of stacked encoders and decoders. d_model may denote the output dimension of all sub-layers of the model. The user knowledge tracking system of FIG. 2 is labeled SAINT. The results show that the AUC of SAINT is, on average, about 1.07% higher than that of the other models.

FIG. 12 is a flowchart for explaining the operation of the user knowledge tracking system according to an embodiment of the present invention.

Referring to FIG. 12, in step S1201, the user knowledge tracking system may input problem information to the k-th encoder neural network and response information to the k-th decoder neural network.

In step S1203, the user knowledge tracking system may generate attention information by reflecting a weight in the problem information, and may generate query data by reflecting a weight in the response information.

Specifically, weights may be reflected in the problem information from the problem information itself through self-attention. Self-attention may be an operation that assigns weights to the parts of given data that should be considered important and reflects them back onto the data itself.

For the response information, weights may be reflected not only through self-attention but also through multi-head attention performed based on the attention information.

In step S1205, the query data output from the first response processing unit of the k-th decoder neural network may be input to the second response processing unit. The query data may be the prediction data output from the first response processing unit.

In step S1207, the user knowledge tracking system may be trained by using the attention information as a weight on the query data of the second response processing unit.

In step S1209, the user knowledge tracking system compares k with N. If k is greater than or equal to N, it may perform step S1211; if k is less than N, it may return to step S1203 and repeat steps S1203 to S1207.

Since as many as N encoder neural networks and decoder neural networks may be stacked, the above process may be repeated until the operation has been completed for all stacked encoder neural networks and decoder neural networks.
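
The loop over k in steps S1203 to S1209 can be sketched as below. This is one possible reading of the flowchart, in which the k-th decoder uses the attention information regenerated by the k-th encoder; the function `run_stack` and the representation of each layer as a plain callable are assumptions for illustration, and the toy layers in the usage lines only demonstrate the control flow.

```python
def run_stack(problem_info, response_info, encoders, decoders):
    """Pass the inputs through N stacked encoder/decoder pairs, comparing k with N each round."""
    n = len(encoders)
    if n == 0 or n != len(decoders):
        raise ValueError("expected the same non-zero number of encoders and decoders")
    encoder_state, decoder_state = problem_info, response_info
    k = 1
    while True:
        # S1203: the k-th encoder (re)generates the attention information.
        encoder_state = encoders[k - 1](encoder_state)
        # S1203 to S1207: the k-th decoder combines its input with that attention information.
        decoder_state = decoders[k - 1](decoder_state, encoder_state)
        # S1209: stop once every stacked pair has been applied.
        if k >= n:
            break
        k += 1
    return decoder_state  # S1211: basis for the correct answer probability information


# Toy layers: each encoder adds 1, each decoder adds the encoder output to its input.
toy_encoders = [lambda x: x + 1] * 3
toy_decoders = [lambda d, e: d + e] * 3
toy_result = run_stack(0, 0, toy_encoders, toy_decoders)
```

With three toy pairs the encoder state takes the values 1, 2, 3 and the decoder state accumulates 1, 3, 6, which shows that each round consumes the output of the previous one.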

In step S1211, the user knowledge tracking system may output the user's correct answer probability information from the trained user knowledge tracking system.

This is the inference process: the input data is processed according to the weights determined in the training process, and correct answer probability information indicating the probability that the user correctly answers the problem being solved is output.

FIG. 13 is a flowchart for explaining in detail the operation of the problem processing unit, the response processing unit, or the problem response processing unit according to an embodiment of the present invention.

Referring to FIG. 13, in step S1301, each component may receive the query, key, and value for the problem information or the response information.

Each of these may be a value expressed as a vector, distinguished according to the role in which it is used.

In step S1303, each component may generate a plurality of head values for each of the query, key, and value.

In step S1305, each component may generate weights from the plurality of query head values and the plurality of key head values, and in step S1307, a masking operation including key-query masking and upper triangular masking may be performed.

Thereafter, in step S1309, the prediction data may be generated by applying the masked weights to the plurality of value head values.

The prediction data generated by the problem processing unit may be the attention information, the prediction data generated by the first response processing unit may be the query data, and the prediction data generated by the second response processing unit may be the correct answer probability information.
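
Steps S1301 to S1309 can be traced in a compact multi-head attention sketch. It rests on several simplifying assumptions that are not stated in the specification: head values are obtained by slicing each vector into equal parts rather than by learned linear projections, weights come from scaled dot products, only upper triangular masking is applied (key-query masking of padding is omitted), and the function names are hypothetical.

```python
import math


def split_heads(vectors, num_heads):
    """S1303: slice each vector into num_heads equal parts, giving one sequence per head."""
    head_dim = len(vectors[0]) // num_heads
    return [[v[h * head_dim:(h + 1) * head_dim] for v in vectors] for h in range(num_heads)]


def softmax(row):
    top = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_attention(query, key, value, num_heads):
    """S1301: receive query, key, and value; return one output vector per position."""
    head_outputs = []
    for q, k, v in zip(split_heads(query, num_heads),
                       split_heads(key, num_heads),
                       split_heads(value, num_heads)):
        scale = math.sqrt(len(q[0]))
        rows = []
        for i, q_i in enumerate(q):
            # S1305: weights from the query head values and the key head values.
            scores = [sum(a * b for a, b in zip(q_i, k_j)) / scale for k_j in k]
            # S1307: upper triangular masking hides future positions j > i.
            scores = [s if j <= i else -1e9 for j, s in enumerate(scores)]
            weights = softmax(scores)
            # S1309: apply the masked weights to the value head values.
            rows.append([sum(w * v_j[c] for w, v_j in zip(weights, v))
                         for c in range(len(v[0]))])
        head_outputs.append(rows)
    # Concatenate the per-head results back into full-width vectors.
    return [sum((head[i] for head in head_outputs), []) for i in range(len(query))]


x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
prediction = multi_head_attention(x, x, x, num_heads=2)
```

Passing the same sequence as query, key, and value gives self-attention, as in the problem processing unit and the first response processing unit; passing query data together with attention information as key and value corresponds to the second response processing unit.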

The user knowledge tracking system according to the present invention can achieve improved performance by using an optimized input data format and by appropriately applying upper triangular masking to the encoder neural network and the decoder neural network of the transformer structure.

The embodiments of the present invention disclosed in this specification and the drawings merely present specific examples to explain the technical content of the present invention and aid understanding of it, and are not intended to limit the scope of the present invention. It will be apparent to those of ordinary skill in the art to which the present invention pertains that other modifications based on the technical idea of the present invention can be practiced in addition to the embodiments disclosed herein.

5: User knowledge tracking system
10, 30: Embedding performing unit
20: Encoder neural network
21: Problem processing unit
22: Non-linearization performing unit
40: Decoder neural network
41: First response processing unit
42: Second response processing unit
43: Non-linearization performing unit

Claims

1. A user knowledge tracking method of a user knowledge tracking system, the method comprising:
generating attention information by inputting problem information to a k-th encoder neural network and reflecting a weight in the problem information;
generating query data, which is information on a problem for which a user's correct answer probability is to be predicted, by inputting response information to a k-th decoder neural network and reflecting a weight in the response information;
training the user knowledge tracking system by using the attention information as a weight on the query data; and
regenerating the attention information based on a comparison between k and N, N being the number of stacked encoder neural networks and decoder neural networks, and training the user knowledge tracking system based on the regenerated attention information,
wherein the user knowledge tracking system comprises the plurality of encoder neural networks and the plurality of decoder neural networks.

2. The method of claim 1,
wherein, when k is less than N, the attention information is regenerated and the user knowledge tracking system is trained based on the regenerated attention information.

3. The method of claim 2, further comprising:
when k is greater than or equal to N, terminating the training of the user knowledge tracking system and outputting, from the trained user knowledge tracking system, correct answer probability information, which is the probability that the user answers the problem correctly.

4. The method of claim 1, wherein the generating of the attention information comprises:
optionally performing key-query masking, which is an operation that prevents attention from being performed on empty values (zero padding); and
performing upper triangular masking, which is an operation that prevents attention from being performed on information corresponding to future positions when predicting the next problem.

5. The method of claim 1,
wherein the problem information consists of a plurality of problems expressed as vectors, and
the response information consists of the user's responses to the respective problems, expressed as vectors.

6. A user knowledge tracking system comprising a plurality of encoder neural networks and a plurality of decoder neural networks, the system comprising:
a k-th encoder neural network that receives problem information and generates attention information, to be used as a weight on query data, by reflecting a weight in the problem information; and
a k-th decoder neural network that receives response information, generates the query data, which is information on a problem for which a user's correct answer probability is to be predicted, by reflecting a weight in the response information, and trains the user knowledge tracking system by using the attention information as a weight on the query data.