KR102338217B1

KR102338217B1 - control method for language learning system

Info

Publication number: KR102338217B1
Application number: KR1020210059380A
Authority: KR
Inventors: 한지우
Original assignee: 한지우
Priority date: 2021-05-07
Filing date: 2021-05-07
Publication date: 2021-12-09

Abstract

A control method of a language learning system comprises the following steps: a server obtains an access time point of a first terminal; the server obtains learning information generated by the first terminal and shares the learning information in real time with a second terminal corresponding to the first terminal; the server obtains an end time point of the first terminal; and the server generates achievement information based on the learning information and shares it with the first terminal and the second terminal.

Description

Control method for language learning system

The present invention provides a language learning system for learners including infants and young children, students, and adults. In particular, it relates to a control method of a language learning system that, for learners such as infants and students who need learning guidance and observation by a guardian or teacher, creates a free learning atmosphere with no observer watching nearby, while still allowing the guardian or teacher to observe the learning through a terminal (the second terminal).

Conventional learning systems are passive, relying simply on educational videos or worksheets. With an educational video, the learner watches the video and repeats sentences or words, or writes answers corresponding to muted sections.

However, when the learner is an infant or young child, such a conventional learning system imposes temporal and spatial constraints, because the guardian or teacher must be present during learning in order to observe its progress. When the learner is a student, the student feels burdened by being observed by others, and learning does not proceed smoothly.

In addition, conventional learning systems have not provided a way to prevent students, including infants and young children, from neglecting their learning when no guardian or teacher is observing.

Registered Patent Publication No. 10-1060285, 2011.08.23.

The problem to be solved by the present invention is to provide a control method of a language learning system including a server, a first terminal, and a second terminal, which prevents a student learning through the system from losing concentration because of direct supervision by a guardian or teacher, while still allowing the guardian or teacher to observe the learning process indirectly.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

A control method of a language learning system according to one aspect of the present invention for solving the above problem includes: obtaining, by a server, an access time point of a first terminal; obtaining, by the server, learning information generated by the first terminal and sharing the learning information in real time with a second terminal corresponding to the first terminal; obtaining, by the server, an end time point of the first terminal; and generating, by the server, achievement information based on the learning information and sharing it with the first terminal and the second terminal. The step of sharing the learning information in real time may further include matching, by the server, the first terminal and the second terminal.
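The four top-level steps can be sketched as a small in-memory server. This is only an illustration under stated assumptions: the class and method names (`Server`, `Session`, `on_access`, `on_learning_info`, `on_end`), the `outbox` list standing in for network delivery, and the placeholder achievement string are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Session:
    access_time: Optional[float] = None   # access time point of the first terminal
    end_time: Optional[float] = None      # end time point of the first terminal
    learning_info: List[str] = field(default_factory=list)


class Server:
    """Hypothetical in-memory stand-in for the server (100)."""

    def __init__(self, matches: Dict[str, str]) -> None:
        self.matches = matches                     # first terminal id -> matched second terminal id
        self.sessions: Dict[str, Session] = {}
        self.outbox: List[Tuple[str, str]] = []    # (recipient terminal, message)

    def on_access(self, first: str, now: float) -> None:
        # Step 1: record the access time point and notify the matched second terminal.
        self.sessions[first] = Session(access_time=now)
        self.outbox.append((self.matches[first], f"{first} connected"))

    def on_learning_info(self, first: str, info: str) -> None:
        # Step 2: store the learning information and forward it immediately.
        self.sessions[first].learning_info.append(info)
        self.outbox.append((self.matches[first], info))

    def on_end(self, first: str, now: float) -> None:
        # Steps 3 and 4: record the end time point, then share achievement
        # information with both terminals (a placeholder summary string here).
        session = self.sessions[first]
        session.end_time = now
        achievement = f"{len(session.learning_info)} items in {now - session.access_time:.0f} s"
        for recipient in (first, self.matches[first]):
            self.outbox.append((recipient, achievement))
```

A real deployment would replace `outbox` with a push channel to each terminal; the list is used here only so the flow can be inspected.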

Here, the step of obtaining the access time point may further include: notifying, by the server, the second terminal of the access of the first terminal; obtaining, by the server, age information and gender information corresponding to the first terminal; extracting, by the server through an artificial intelligence model, a first keyword corresponding to the age information and gender information, and sharing at least one piece of first video information corresponding to the first keyword with the first terminal; and obtaining, by the server, at least one keyword of interest from the first terminal and sharing at least one piece of second video information corresponding to the keyword of interest with the first terminal. The step of obtaining the end time point may further include notifying, by the server, the second terminal of the termination of the first terminal.

In addition, the step of obtaining the learning information may further include: obtaining, by the server, selected video information from the first terminal; generating, by the server, learning information including captured-image information and audio-recording information of the student from the first terminal; and evaluating, by the server, the learning information based on the selected video information and sharing evaluation information with the second terminal. The step of matching the first terminal and the second terminal may further include: sharing, by the server, the captured-image information and audio-recording information with the second terminal in real time; and generating, by the server, a chat screen shared by the first terminal and the second terminal.

Additionally, the step of evaluating the learning information and sharing the evaluation information with the second terminal may further include: obtaining, by the server, at least one of pronunciation error information, erroneous word information, and a lip motion map based on the learning information, and matching a lip motion map to each of the pronunciation error information and the erroneous word information; obtaining, by the server, pronunciation information corresponding to at least one of the pronunciation error information and the erroneous word information; obtaining, by the server, an example lip motion video corresponding to the pronunciation information and matching the pronunciation information with the example lip motion video; determining, by the server, the similarity between the example lip motion video and the lip motion map according to a preset criterion to obtain a lip motion score; obtaining, by the server through an artificial intelligence model, a re-learning need score based on preset weights corresponding to each of the pronunciation error information and the erroneous word information; and generating, by the server, evaluation information based on the lip motion score and the re-learning need score.

The step of obtaining at least one of the pronunciation error information, erroneous word information, and lip motion map based on the learning information may further include: obtaining, by the server through the artificial intelligence model, a phonetic symbol correlation value based on the pronunciation error information and the erroneous word information; when the phonetic symbol correlation value is equal to or greater than a preset value, obtaining, by the server, a plurality of words containing the phonetic symbol information corresponding to the phonetic symbol correlation value; and obtaining, by the server, third video information containing the plurality of words and sharing it with the first terminal as a first review video.

The step of generating the achievement information further includes: comparing, by the server, first evaluation information and second evaluation information with third evaluation information; and obtaining, by the server, achievement information including an achievement rate graph generated from the comparison of the first to third evaluation information. The first evaluation information is the evaluation information generated first, the second evaluation information is the evaluation information generated immediately before, and the third evaluation information is the most recently generated evaluation information and is generated based on the review video.
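The scoring and comparison described above can be sketched as follows. The patent obtains the re-learning need score through an artificial intelligence model with preset weights; the plain weighted sum below is a simplifying assumption, as are the weight values, the `Evaluation` fields, and the function names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Evaluation:
    lip_motion_score: float     # similarity to the example lip motion video (assumed 0..100)
    pronunciation_errors: int   # count of pronunciation error items
    wrong_words: int            # count of erroneous word items


# Hypothetical preset weights; the patent does not give values.
PRONUNCIATION_WEIGHT = 1.0
WRONG_WORD_WEIGHT = 2.0


def relearning_need_score(ev: Evaluation) -> float:
    """Weighted sum standing in for the model-based re-learning need score."""
    return (PRONUNCIATION_WEIGHT * ev.pronunciation_errors
            + WRONG_WORD_WEIGHT * ev.wrong_words)


def achievement_summary(history: List[Evaluation]) -> Dict[str, float]:
    """Compare the latest (third) evaluation with the first and the previous one."""
    if not history:
        raise ValueError("at least one evaluation is required")
    first = history[0]
    previous = history[-2] if len(history) >= 2 else history[0]
    latest = history[-1]
    return {
        "vs_first": latest.lip_motion_score - first.lip_motion_score,
        "vs_previous": latest.lip_motion_score - previous.lip_motion_score,
    }
```

The two differences returned by `achievement_summary` are the kind of values an achievement rate graph could plot per session.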

In addition, the step of generating the learning information including the captured-image information and audio-recording information may further include: when a section in which the audio-recording information is blank for longer than a preset time is obtained, obtaining, by the server, the captured-image information corresponding to the blank section; determining, by the server, whether a student object corresponding to the first terminal can be obtained from the captured-image information corresponding to the blank section; obtaining, by the server, at least one of a first non-learning event and a second non-learning event based on whether the object is obtained; when the first non-learning event is obtained, increasing, by the server, the volume value of the selected video information to a preset volume; when the second non-learning event is obtained, sharing, by the server, the time of occurrence of the second non-learning event with the second terminal; calculating, by the server, a total learning time based on the access time point and the end time point while excluding the time corresponding to the blank sections; and calculating, by the server, a recommended learning time based on the total learning time and transmitting it to the first terminal.

The step of obtaining the second non-learning event includes: obtaining, by the server, a background image from the first terminal; obtaining, by the server, a plurality of symbols included in the background image; and obtaining, by the server, the second non-learning event when at least a preset number of symbols is detected among the plurality of symbols.
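The learning-time calculation can be illustrated with a short sketch. It assumes times are given in seconds, that silent sections arrive as non-overlapping `(start, end)` pairs, and that the function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def long_blank_sections(silences: List[Interval], min_seconds: float) -> List[Interval]:
    """Keep only silent sections longer than the preset time."""
    return [(s, e) for s, e in silences if e - s > min_seconds]


def net_learning_seconds(access: float, end: float, blanks: List[Interval]) -> float:
    """Total learning time between access and end, excluding blank sections.

    Blank sections are clipped to the session window and assumed not to overlap.
    """
    excluded = sum(max(0.0, min(e, end) - max(s, access)) for s, e in blanks)
    return (end - access) - excluded
```

For example, a 3600-second session with blank sections at 600 to 900 seconds and 3500 to 3700 seconds counts 300 + 100 excluded seconds, because the second section is clipped at the end time point.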

In addition, the step of obtaining at least one of the pronunciation error information, erroneous word information, and lip motion map based on the learning information further includes: obtaining, by the server, first sound patterns, which are per-sentence sound patterns, based on the selected video information; obtaining, by the server, second sound patterns corresponding to the first sound patterns from the audio-recording information; matching, by the server, first and second sound patterns judged to be similar at or above a preset criterion, and obtaining as error information any second sound pattern left unmatched; matching, by the server, each second sound pattern obtained as error information with the first sound pattern having the same time of occurrence; defining, by the server, as pronunciation error information the error information in which the sound shapes of the matched first and second sound patterns are similar at or above a preset criterion; and defining, by the server, as erroneous word information the error information in which the sound shapes are similar below the preset criterion. The pronunciation information includes the first sound pattern.
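The two-threshold classification can be sketched as below. The patent does not define how sound-pattern similarity is measured, so this sketch substitutes text strings for sound patterns and `difflib.SequenceMatcher.ratio()` for the similarity measure. Both thresholds and all names are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

MATCH_THRESHOLD = 0.95          # hypothetical: at or above this, the sentence counts as correct
PRONUNCIATION_THRESHOLD = 0.6   # hypothetical: at or above this, the error is a pronunciation error


def similarity(first: str, second: str) -> float:
    """Stand-in similarity between a reference pattern and a spoken pattern."""
    return SequenceMatcher(None, first, second).ratio()


def classify(pairs: List[Tuple[str, str]]) -> List[str]:
    """Classify (first pattern, second pattern) pairs aligned by time of occurrence."""
    labels = []
    for first, second in pairs:
        score = similarity(first, second)
        if score >= MATCH_THRESHOLD:
            labels.append("correct")
        elif score >= PRONUNCIATION_THRESHOLD:
            labels.append("pronunciation_error")   # similar shape, slightly off
        else:
            labels.append("wrong_word")            # shape differs too much
    return labels
```

With real audio, `similarity` would compare acoustic features rather than characters; only the threshold structure carries over from the text.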

Here, the step of matching the first and second sound patterns having the same time of occurrence may further include: comparing the first and second sound patterns matched in the error information and, when the section in which the second sound pattern differs from the first sound pattern has a sound shape that is repeated at least once, calculating, by the server, an average shape of the at least two repeated shapes; obtaining, by the server, a third sound pattern by replacing the plurality of repeated shapes in the differing section with the single average shape; comparing, by the server, the third sound pattern with the corresponding first sound pattern; when the first and third sound patterns are similar at or above a preset criterion, obtaining, by the server, the second sound pattern corresponding to the third sound pattern as stuttered-section information; and sharing, by the server, the video information corresponding to the stuttered-section information with the first terminal as a second review video.

When the server shares at least one of the first and second review videos with the first terminal, the control method of the present invention may further include: outputting, by the server, the lip motion map and example lip motion video corresponding to the first or second review video to the first terminal; and obtaining, by the server, a real-time lip motion map from the first terminal and simultaneously outputting the real-time lip motion map to the first terminal.
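The stutter check can be sketched with sound shapes modeled as short tuples of numbers. This is a simplification under stated assumptions: the sketch collapses consecutive near-identical shapes anywhere in the spoken pattern (the text limits this to the differing section), the tolerance-based comparison replaces the unspecified similarity criterion, and all names are hypothetical.

```python
from typing import List, Tuple

Shape = Tuple[float, ...]


def close(a: Shape, b: Shape, tol: float) -> bool:
    """Two shapes are treated as the same sound if every component is within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def mean_shape(run: List[Shape]) -> Shape:
    """Component-wise average of a run of repeated shapes."""
    return tuple(sum(values) / len(run) for values in zip(*run))


def collapse_repeats(pattern: List[Shape], tol: float) -> List[Shape]:
    """Build the third sound pattern: each run of repeats becomes one average shape."""
    collapsed: List[Shape] = []
    run: List[Shape] = []
    for shape in pattern:
        if run and close(run[-1], shape, tol):
            run.append(shape)
        else:
            if run:
                collapsed.append(mean_shape(run))
            run = [shape]
    if run:
        collapsed.append(mean_shape(run))
    return collapsed


def is_stutter(reference: List[Shape], spoken: List[Shape], tol: float = 0.25) -> bool:
    """True if the spoken pattern matches the reference only after collapsing repeats."""
    third = collapse_repeats(spoken, tol)
    return (len(third) < len(spoken)
            and len(third) == len(reference)
            and all(close(r, t, tol) for r, t in zip(reference, third)))
```

One caveat of this simplification: a reference sentence that legitimately repeats a sound would also be collapsed, which is why the original restricts the averaging to the section where the two patterns differ.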

Finally, the step of obtaining the plurality of symbols included in the background image may further include: generating, by the server, reference coordinates on the background image; obtaining, by the server, a plurality of first symbols included in the background image; and obtaining, by the server, from among the first symbols, a plurality of second symbols whose density around the reference coordinates is within a preset interval.

The step of obtaining the second non-learning event when at least the preset number of symbols is detected may further include: obtaining, by the server, the number of detected second symbols; when the obtained number is equal to or greater than a preset criterion, determining, by the server, whether a lip motion map has been obtained; when the lip motion map has been obtained, obtaining, by the server, the lip motion score corresponding to the lip motion map; obtaining, by the server, an average lip motion score corresponding to the first evaluation information and the second evaluation information; and obtaining, by the server, the first non-learning event when the lip motion score is determined to be lower than the average lip motion score by at least a preset range.

The step of determining whether the lip motion map has been obtained may further include: when the lip motion map has not been obtained, determining, by the server, whether the student object has been obtained; when the student object has been obtained, obtaining, by the server, the first non-learning event; and when the student object has not been obtained, obtaining, by the server, the second non-learning event. The reference coordinates are the center coordinates of the background image, or the center coordinates of the region with the highest calculated lip motion score among the regions in which the lip motion map is obtained.
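The branching above can be written out as a small decision function. The sketch treats symbols as 2-D points and reads "density within a preset interval of the reference coordinates" as "within a preset distance of the reference point", which is an interpretation rather than something the text states. The names, and the use of `None` for "no event", are likewise assumptions.

```python
import math
from enum import Enum
from typing import List, Optional, Tuple

Point = Tuple[float, float]


class NonLearningEvent(Enum):
    FIRST = 1    # student present but not learning: raise the video volume
    SECOND = 2   # student absent: report the time to the second terminal


def second_symbols(symbols: List[Point], reference: Point, max_distance: float) -> List[Point]:
    """Select the first symbols lying within max_distance of the reference coordinates."""
    return [p for p in symbols if math.dist(p, reference) <= max_distance]


def classify_event(symbol_count: int, min_count: int,
                   lip_score: Optional[float], average_lip_score: float,
                   margin: float, student_visible: bool) -> Optional[NonLearningEvent]:
    """Follow the branches in the text; lip_score is None when no lip motion map exists."""
    if symbol_count < min_count:
        return None
    if lip_score is not None:
        # Lip motion map obtained: compare with the average of earlier evaluations.
        if lip_score <= average_lip_score - margin:
            return NonLearningEvent.FIRST
        return None
    # No lip motion map: fall back to whether the student object is visible.
    return NonLearningEvent.FIRST if student_visible else NonLearningEvent.SECOND
```

For instance, with enough second symbols detected and no lip motion map, an empty camera frame yields `NonLearningEvent.SECOND`, while a visible but silent student yields `NonLearningEvent.FIRST`.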

Other specific details of the invention are included in the detailed description and drawings.

The effects produced by the control method of the language learning system of the present invention are as follows.

First, because learning takes place through the first terminal (student terminal), there are no spatial or temporal constraints on learning.

Second, because the guardian's or teacher's observation of the learner (including infants, adolescents, and adults) is indirect, the pressure the learner would feel from direct observation is relieved.

Third, because the indirect observation takes place through the second terminal (guardian or teacher terminal), there are no spatial or temporal constraints on observation.

Fourth, sentences and words containing errors (wrong or incorrect answers) are detected automatically, so self-grading is not required.

Fifth, pronunciation information corresponding to the erroneous sentences and words is obtained, and a plurality of words containing that pronunciation information is provided for review, which can reduce how often the corresponding errors occur.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a diagram showing a student learning through a first terminal according to an embodiment of the present invention.
FIG. 2 is a diagram showing learning information being shared in real time with a second terminal matched with the first terminal according to an embodiment of the present invention.
FIG. 3 is an achievement rate graph according to an embodiment of the present invention.
FIG. 4 is a system configuration diagram according to an embodiment of the present invention.
FIG. 5 is a basic flowchart according to an embodiment of the present invention.
FIG. 6 is a server configuration diagram according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of its scope; the invention is defined only by the scope of the claims.

The terminology used herein is for the purpose of describing the embodiments and is not intended to limit the present invention. In this specification, the singular also includes the plural unless the phrase specifically states otherwise. As used herein, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components other than those stated. Like reference numerals refer to like components throughout the specification, and "and/or" includes each of the stated components and every combination of one or more of them. Although "first", "second", and similar terms are used to describe various components, these components are of course not limited by the terms. The terms are used only to distinguish one component from another. Accordingly, a first component mentioned below may of course be a second component within the technical spirit of the present invention.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the meaning commonly understood by those of ordinary skill in the art to which this invention belongs. In addition, terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are clearly and specifically defined.

As used herein, the term "unit" or "module" refers to software or to a hardware component such as an FPGA or ASIC, and a "unit" or "module" performs certain roles. However, "unit" or "module" is not limited to software or hardware. A "unit" or "module" may be configured to reside on an addressable storage medium or may be configured to run on one or more processors. Thus, as an example, a "unit" or "module" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functionality provided within the components and "units" or "modules" may be combined into a smaller number of components and "units" or "modules", or further separated into additional components and "units" or "modules".

Spatially relative terms such as "below", "beneath", "lower", "above", and "upper" may be used to easily describe the relationship between one component and other components as shown in the drawings. Spatially relative terms should be understood as including different orientations of the components during use or operation, in addition to the orientation shown in the drawings. For example, if a component shown in a drawing is turned over, a component described as "below" or "beneath" another component may be placed "above" that other component. Accordingly, the exemplary term "below" may include both the downward and upward directions. Components may also be oriented in other directions, and the spatially relative terms may be interpreted according to the orientation.

In this specification, a computer means any kind of hardware device including at least one processor and, depending on the embodiment, may be understood as also encompassing the software configuration operating on that hardware device. For example, a computer may be understood to include, but is not limited to, smartphones, tablet PCs, desktops, notebooks, and the user clients and applications running on each device.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

According to FIG. 4, the present invention includes a server 100, a first terminal 200, and a second terminal 300, where the first terminal 200 and the second terminal 300 may be electronic devices.

In one embodiment, the electronic device may include at least one of a smartphone, a tablet personal computer (tablet PC), a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a server, a personal digital assistant (PDA), a portable multimedia player (PMP), an MP3 player, a mobile medical device, a camera, a wearable device, or an AI speaker.

In addition, the first terminal 200 may be the terminal of a student who learns using the language learning system according to the present invention, and the second terminal 300 may be the terminal of the student's guardian or of the student's teacher.

As shown in FIG. 6, the server 100 may include a memory 110, a communication unit 120, and a processor 130.

메모리(110)는 서버(100)의 동작에 필요한 각종 프로그램 및 데이터를 저장할 수 있다. 메모리(110)는 비휘발성 메모리(110), 휘발성 메모리(110), 플래시메모리(110)(flash-memory), 하드디스크 드라이브(HDD) 또는 솔리드 스테이트 드라이브(SSD) 등으로 구현될 수 있다.The memory 110 may store various programs and data necessary for the operation of the server 100 . The memory 110 may be implemented as a non-volatile memory 110 , a volatile memory 110 , a flash-memory 110 , a hard disk drive (HDD), or a solid state drive (SSD).

통신부(120)는 외부 장치와 통신을 수행할 수 있다. 특히, 통신부(120)는 와이파이 칩, 블루투스 칩, 무선 통신 칩, NFC칩, 저전력 블루투스 침(BLE 칩) 등과 같은 다양한 통신 칩을 포함할 수 있다. 이때, 와이파이 칩, 블루투스 칩, NFC 칩은 각각 LAN 방식, WiFi 방식, 블루투스 방식, NFC 방식으로 통신을 수행한다. 와이파이 칩이나 블루투스칩을 이용하는 경우에는 SSID 및 세션 키 등과 같은 각종 연결 정보를 먼저 송수신 하여, 이를 이용하여 통신 연결한 후 각종 정보들을 송수신할 수 있다. 무선 통신칩은 IEEE, 지그비, 3G(3rd Generation), 3GPP(3rd Generation Partnership Project), LTE(Long Term Evolution) 등과 같은 다양한 통신 규격에 따라 통신을 수행하는 칩을 의미한다. The communication unit 120 may communicate with an external device. In particular, the communication unit 120 may include various communication chips such as a Wi-Fi chip, a Bluetooth chip, a wireless communication chip, an NFC chip, and a Bluetooth low power chip (BLE chip). At this time, the Wi-Fi chip, the Bluetooth chip, and the NFC chip perform communication in a LAN method, a WiFi method, a Bluetooth method, and an NFC method, respectively. In the case of using a Wi-Fi chip or a Bluetooth chip, various types of connection information such as an SSID and a session key are first transmitted and received, and then various types of information can be transmitted and received after a communication connection using this. The wireless communication chip refers to a chip that performs communication according to various communication standards, such as IEEE, ZigBee, 3rd Generation (3G), 3rd Generation Partnership Project (3GPP), and Long Term Evolution (LTE).

The processor 130 may control the overall operation of the server 100 using various programs stored in the memory 110. The processor 130 may include a RAM, a ROM, a graphics processing unit, a main CPU, first to n-th interfaces, and a bus. The RAM, the ROM, the graphics processing unit, the main CPU, and the first to n-th interfaces may be connected to one another through the bus.

The RAM stores the O/S and application programs. Specifically, when the server 100 boots, the O/S is stored in the RAM, and data of the applications selected by the user may be stored in the RAM.

The ROM stores an instruction set for booting the system, among other things. When a turn-on command is input and power is supplied, the main CPU copies the O/S stored in the memory 110 to the RAM according to the instructions stored in the ROM and executes the O/S to boot the system. When booting is complete, the main CPU copies the application programs stored in the memory 110 to the RAM and executes the copied application programs to perform various operations.

The graphics processing unit uses a calculation unit (not shown) and a rendering unit (not shown) to generate a screen containing various objects such as items, images, and text. Here, the calculation unit may be a component that uses the control command received from the input unit to calculate attribute values, such as the coordinates, shape, size, and color with which each object is to be displayed according to the layout of the screen. The rendering unit may be a component that generates screens of various layouts containing the objects, based on the attribute values calculated by the calculation unit. The screen generated by the rendering unit may be displayed in the display area of a display.

The main CPU accesses the memory 110 and boots using the O/S stored in the memory 110. The main CPU then performs various operations using the programs, content, and data stored in the memory 110.

The first to n-th interfaces are connected to the components described above. One of the first to n-th interfaces may be a network interface connected to an external device through a network.

Furthermore, the processor 130 may control an artificial intelligence model. In this case, the processor 130 may of course include a dedicated graphics processor (e.g., a GPU) for controlling the artificial intelligence model.

Meanwhile, the artificial intelligence model according to the present invention may be a model based on supervised learning or unsupervised learning. Furthermore, the artificial intelligence model according to the present invention may include a support vector machine (SVM), a decision tree, a neural network, and the like, as well as methodologies derived from them.

In one embodiment, the artificial intelligence model according to the present invention may be a model based on a convolutional neural network (CNN) trained on input training data. However, the present invention is not limited thereto, and various artificial intelligence models may of course be applied. For example, models such as a deep neural network (DNN), a recurrent neural network (RNN), or a bidirectional recurrent deep neural network (BRDNN) may be used as the artificial intelligence model, without limitation.

Here, a convolutional neural network (CNN) is a kind of multilayer perceptron designed to require minimal preprocessing. A convolutional neural network consists of one or more convolutional layers with ordinary artificial neural network layers stacked on top of them, and additionally makes use of weights and pooling layers. Thanks to this structure, a convolutional neural network can fully exploit input data that has a two-dimensional structure. A convolutional neural network can also be trained with standard backpropagation. Compared with other feedforward neural network techniques, convolutional neural networks tend to be easier to train and have the advantage of using fewer parameters.

A deep neural network (DNN) is an artificial neural network (ANN) that has a plurality of hidden layers between an input layer and an output layer.

Here, the deep neural network may be built from perceptrons. A perceptron consists of several input values, one processor, and one output value. The processor multiplies each input value by its weight and sums the weighted input values. The processor then applies an activation function to the sum and outputs a single output value. If a particular value is desired as the output of the activation function, the weights applied to the input values can be adjusted and the output value recalculated with the adjusted weights. Each perceptron may use a different activation function. Each perceptron takes the outputs passed from the previous layer as its inputs and computes its output with the activation function. That output is passed on as an input to the next layer. Going through this process finally yields a number of output values.
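The perceptron computation described above can be sketched as follows. This is a minimal illustration only: the function names, the sigmoid activation, and the example weights are assumptions chosen for the sketch, and, like the description, it uses no bias term.

```python
import math


def sigmoid(x: float) -> float:
    """Example activation function (an assumption; any activation could be used)."""
    return 1.0 / (1.0 + math.exp(-x))


def perceptron(inputs, weights, activation=sigmoid):
    """Multiply each input by its weight, sum, and apply the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return activation(total)


def layer(inputs, weight_rows, activation=sigmoid):
    """One layer: every perceptron receives the same inputs from the previous layer."""
    return [perceptron(inputs, row, activation) for row in weight_rows]


# Two inputs -> hidden layer of two perceptrons -> one output perceptron.
hidden = layer([1.0, 0.5], [[0.2, -0.4], [0.7, 0.1]])
output = perceptron(hidden, [1.0, -1.0])
```

Adjusting the values in `weight_rows` and calling `layer` again corresponds to the recalculation with modified weights mentioned in the text.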

A recurrent neural network (RNN) is a neural network in which the connections between the units of the artificial neural network form a directed cycle. Unlike a feedforward neural network, a recurrent neural network can use its internal memory to process arbitrary inputs.

A deep belief network (DBN) is a generative graphical model used in machine learning; in deep learning it refers to a deep neural network composed of multiple layers of latent variables. Its distinguishing feature is that there are connections between layers but no connections between units within a layer.

Because it is a generative model, a deep belief network can be used for pre-training: initial weights are learned through pre-training, and the weights can then be fine-tuned through backpropagation or another discriminative algorithm. This property is very useful when training data is scarce, because the less training data there is, the more strongly the initial weight values affect the resulting model. Pre-trained initial weights are closer to the optimal weights than randomly chosen initial weights, which makes it possible to improve both the performance and the speed of the fine-tuning stage.

The above description of artificial intelligence and its training methods is given by way of example, and the artificial intelligence and training methods used in the embodiments described below are not limited to it. For example, any kind of artificial intelligence technology and training method that a person skilled in the art could apply to solve the same problem may be used to implement the system according to the disclosed embodiments.

Referring to FIG. 5, a control method of a language learning system according to one aspect of the present invention includes: obtaining, by the server 100, an access time of a first terminal 200, which is a student terminal; obtaining, by the server 100, learning information generated by the first terminal 200 and sharing the learning information in real time with a second terminal 300, which is a parent terminal corresponding to the first terminal 200; obtaining, by the server 100, an end time of the first terminal 200; and generating, by the server 100, achievement information based on the learning information and sharing it with the first terminal 200 and the second terminal 300.
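The four steps above can be sketched as a small server-side session object. This is a hedged illustration, not the claimed implementation: `LearningSession`, its method names, the `outbox` list standing in for network delivery, and the contents of the achievement summary (item count and duration) are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class LearningSession:
    first_terminal: str                      # student terminal (200)
    second_terminal: str                     # guardian or teacher terminal (300)
    access_time: Optional[datetime] = None
    end_time: Optional[datetime] = None
    learning_info: list = field(default_factory=list)
    outbox: list = field(default_factory=list)  # (recipient, kind, payload)

    def on_access(self, now: datetime) -> None:
        """Step 1: record the access time of the first terminal."""
        self.access_time = now

    def on_learning_info(self, info: dict) -> None:
        """Step 2: store learning information and forward it to the second terminal."""
        self.learning_info.append(info)
        self.outbox.append((self.second_terminal, "learning", info))

    def on_end(self, now: datetime) -> None:
        """Step 3: record the end time of the first terminal."""
        self.end_time = now

    def share_achievement(self) -> dict:
        """Step 4: build a placeholder achievement summary and send it to both terminals."""
        achievement = {
            "items": len(self.learning_info),
            "duration_s": (self.end_time - self.access_time).total_seconds(),
        }
        for terminal in (self.first_terminal, self.second_terminal):
            self.outbox.append((terminal, "achievement", achievement))
        return achievement


session = LearningSession("student-200", "parent-300")
session.on_access(datetime(2021, 5, 7, 9, 0))
session.on_learning_info({"video": "selected", "score": 80})
session.on_end(datetime(2021, 5, 7, 9, 30))
summary = session.share_achievement()
```

In a real deployment the `outbox` entries would be pushed over whatever channel the terminals use; the list is only there to make the flow observable.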

In the step in which the server 100 obtains the access time of the first terminal 200, which is a student terminal, the access time (learning start time) may be shared with the second terminal 300 matched with the first terminal 200, as shown in FIG. 2.

In the step in which the server 100 obtains the learning information generated by the first terminal 200 and shares it in real time with the second terminal 300, which is the parent terminal corresponding to the first terminal 200, the first terminal 200 collects, in real time, learning information about the learning carried out by the student corresponding to the first terminal 200, as shown in FIG. 1, and this information may be shared with the second terminal 300, as shown in FIG. 2.

Here, the learning information includes video information collected or generated by the first terminal 200 for the student (first to third video information, selected video information, first and second review videos, etc.) and error information (pronunciation error information, word error information, etc.), and may be learning-related information (first to third evaluation information, captured video information, recorded audio information, etc.) generated as the student studies the learning videos (the selected video, the first and second review videos, etc.).

Here, the step of sharing the learning information in real time may further include matching, by the server 100, the first terminal 200 and the second terminal 300.

In addition, the step of obtaining the access time may further include: notifying, by the server 100, the second terminal 300 that the first terminal 200 has connected; obtaining, by the server 100, age information and gender information corresponding to the first terminal 200; extracting, by the server 100 through the artificial intelligence model, a first keyword corresponding to the age information and the gender information, and sharing at least one piece of first video information corresponding to the first keyword with the first terminal 200; and obtaining, by the server 100, at least one interest keyword from the first terminal 200 and sharing at least one piece of second video information corresponding to the interest keyword with the first terminal 200.

In the step in which the server 100 notifies the second terminal 300 that the first terminal 200 has connected, the second terminal 300 has already downloaded the application needed to receive the service provided by the system of the present invention, and the server 100 may send a specific signal (vibration, sound, screen output, etc.) to the second terminal 300 through the application installed on it, thereby notifying the second terminal 300 in real time that the student corresponding to the first terminal 200 has started learning.

In the step in which the server 100 obtains the age information and gender information corresponding to the first terminal 200, the server 100 may carry out the step by obtaining personal information from the membership information of the student corresponding to the first terminal 200.

Here, in order to log in to the server 100, the first terminal 200 may have already downloaded the language learning program provided by the system of the present invention, or may use the service by accessing a site linked to the server 100 of the present invention.

In the step in which the server 100 extracts, through the artificial intelligence model, the first keyword corresponding to the age information and gender information and shares at least one piece of first video information corresponding to the first keyword with the first terminal 200, the age information and gender information are given as input, and the artificial intelligence model may provide the first keyword as output based on them.

In addition, the server 100 may obtain, as the first video information, videos that include the first keyword as a hashtag (#), either from the server 100 itself or from another server. Each of the keywords included in the first keyword is associated with a detection count, and the higher its detection count, the higher the keyword is placed in the first-keyword list.

Accordingly, when a plurality of pieces of first video information are provided to the first terminal 200, first video information that includes a keyword placed high in the first-keyword list may be recommended near the top of the screen of the first terminal 200, and the first video information whose hashtags include the largest number of the keywords in the first keyword may be placed at the very top of the screen of the first terminal 200.

For example, suppose the student corresponding to the first terminal 200 is an 8-year-old girl, and the first keyword output for an 8-year-old female includes "magical girl", "cat", "puppy", "dress", and so on, listed in descending order of detection count. In that case, the server 100 places first video information that has "magical girl" as a hashtag near the top of the screen of the first terminal 200, but may output first video information that has both "magical girl" and "cat" as hashtags above first video information that has only "magical girl" as a hashtag.
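The ordering in this example can be sketched as a sort over hashtag matches. The function name `rank_videos`, the video titles, and the exact sort key (number of matched keywords first, then the best-ranked matched keyword) are assumptions for illustration; the description only fixes that more matches and higher-ranked keywords move a video toward the top.

```python
def rank_videos(first_keywords, videos):
    """Order videos for the first terminal's screen.

    first_keywords: keywords sorted by descending detection count.
    videos: mapping of video title -> set of hashtags.
    Videos with no matching hashtag are left out.
    """
    position = {kw: i for i, kw in enumerate(first_keywords)}
    scored = []
    for title, hashtags in videos.items():
        matched = [position[tag] for tag in hashtags if tag in position]
        if matched:
            # More matched keywords first; ties broken by the best-ranked keyword.
            scored.append((-len(matched), min(matched), title))
    return [title for _, _, title in sorted(scored)]


first_keywords = ["magical girl", "cat", "puppy", "dress"]
videos = {
    "A": {"magical girl"},
    "B": {"magical girl", "cat"},
    "C": {"dress"},
    "D": {"robot"},
}
ordering = rank_videos(first_keywords, videos)
```

With this data, video "B" (two matching hashtags) is placed above "A" (one), which mirrors the magical girl and cat example in the text.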

More specifically, the step in which the server 100 obtains at least one interest keyword from the first terminal 200 and shares at least one piece of second video information corresponding to the interest keyword with the first terminal 200 may further include: obtaining, by the server 100, a search term from the first terminal 200; obtaining, by the server 100, at least one interest keyword from the search term; and obtaining, by the server 100, at least one piece of second video information corresponding to the interest keyword and sharing it with the first terminal 200.

For example, when login information corresponding to the first terminal 200 is generated, the server 100 automatically recommends first video information based on the personal information corresponding to the login information.

If the student corresponding to the first terminal 200 then wishes to search within the first video information, or to obtain second video information, that is, new video information, a search term is entered in the search box provided on the screen of the first terminal 200, and the server 100 may obtain at least one piece of second video information found using the interest keyword extracted from the search term and share it on the screen of the first terminal 200.

Here, the server 100 may train the artificial intelligence model so that the interest keyword becomes output information corresponding to the age information and the gender information.

For example, the control method of the present invention may further include: obtaining, by the server 100, an interest keyword; obtaining, by the server 100, the age information and gender information corresponding to the interest keyword; and training, by the server 100, an artificial intelligence model that evaluates the association between the age and gender information and the first keyword, using training data that includes the interest keyword, the age information, and the gender information.

In this way, the interest keyword may eventually be obtained as the first keyword.
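The feedback loop described here, in which interest keywords observed for a given age and gender later come back as first keywords, can be illustrated with a deliberately simple stand-in. The sketch below is not the artificial intelligence model of the description: it replaces the model with a frequency table, and the class name `KeywordModel`, its methods, and the `top_n` parameter are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class KeywordModel:
    """Frequency-count stand-in for the keyword recommendation model."""

    def __init__(self):
        # (age, gender) -> Counter of interest keywords seen for that group
        self._counts = defaultdict(Counter)

    def train(self, age: int, gender: str, interest_keyword: str) -> None:
        """Record one (age, gender, interest keyword) training example."""
        self._counts[(age, gender)][interest_keyword] += 1

    def first_keywords(self, age: int, gender: str, top_n: int = 4) -> list:
        """Return keywords for the group, highest detection count first."""
        counts = self._counts.get((age, gender), Counter())
        return [kw for kw, _ in counts.most_common(top_n)]


model = KeywordModel()
for kw in ["magical girl"] * 3 + ["cat"] * 2 + ["puppy"]:
    model.train(8, "F", kw)
keywords = model.first_keywords(8, "F")
```

A learned model could generalize across neighbouring ages or unseen groups, which this table cannot; the sketch only shows how detection counts lead to the ordered first-keyword list.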

In addition, the step of obtaining the end time further includes notifying, by the server 100, the second terminal 300 that the first terminal 200 has ended its session, so that the guardian (or teacher) corresponding to the second terminal 300 can confirm that the class of the student corresponding to the first terminal 200 has ended.

The step of obtaining the learning information may further include: obtaining, by the server 100, the selected video information from the first terminal 200; generating, by the server 100, learning information that includes captured video information and recorded audio information of the student from the first terminal 200; and evaluating, by the server 100, the learning information based on the selected video information and sharing the evaluation information with the second terminal 300.

In the step in which the server 100 obtains the selected video information from the first terminal 200, the selected video information may be either the first video information or the second video information.

In the step in which the server 100 generates the learning information that includes the captured video information and recorded audio information of the student from the first terminal 200, the captured video information and recorded audio information may be acquired through a camera, a speaker, a microphone, and the like built into or connected to the first terminal 200, as shown in FIG. 1, and transmitted from the first terminal 200 to the server 100.

In the step in which the server 100 evaluates the learning information based on the selected video information and shares the evaluation information with the second terminal 300, the amount of learning, the real-time score, and the like are shared on the screen of the second terminal 300, as shown in FIG. 2, so that the guardian (or teacher) can check the child's (or student's) current learning progress and level of achievement.

The step of matching the first terminal 200 and the second terminal 300 may further include: sharing, by the server 100, the captured video information and recorded audio information with the second terminal 300 in real time; and generating, by the server 100, a chat screen shared by the first terminal 200 and the second terminal 300.

In the step in which the server 100 shares the captured video information and recorded audio information with the second terminal 300 in real time, the server 100 shares with the second terminal 300 the audio and video being acquired in real time from the first terminal 200, as shown in FIG. 2, so that the guardian (or teacher) can check the child's (or student's) current learning situation.

Here, as illustrated, the video information being played on the first terminal 200 can be viewed on the second terminal 300.

In the step in which the server 100 generates the chat screen shared by the first terminal 200 and the second terminal 300, the server 100 forwards each message entered on the first terminal 200 or the second terminal 300 to the matched terminal, as shown in FIG. 2, and may generate and provide a chat screen on which both sides can see the history of the messages they have exchanged.
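The relay behaviour of the chat screen can be sketched as follows. The class `ChatRoom`, the terminal identifiers, and the in-memory `inbox` and `history` structures are assumptions made for illustration; actual delivery to the terminals is outside the sketch.

```python
class ChatRoom:
    """Relays messages between two matched terminals and keeps a shared history."""

    def __init__(self, first_terminal: str, second_terminal: str):
        # Each terminal's peer is the terminal it was matched with.
        self.peers = {first_terminal: second_terminal,
                      second_terminal: first_terminal}
        self.inbox = {first_terminal: [], second_terminal: []}
        self.history = []  # (sender, text), visible to both terminals

    def send(self, sender: str, text: str) -> str:
        """Deliver a message to the matched terminal and return its identifier."""
        if sender not in self.peers:
            raise KeyError(f"{sender} is not part of this chat room")
        receiver = self.peers[sender]
        self.history.append((sender, text))
        self.inbox[receiver].append(text)
        return receiver


room = ChatRoom("student-200", "parent-300")
room.send("parent-300", "How is the lesson going?")
room.send("student-200", "Almost done!")
```

Both terminals would render `room.history` to show the full exchange, while `inbox` models what each side newly receives.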

The step of evaluating the learning information and sharing the evaluation information with the second terminal 300 may further include: obtaining, by the server 100, at least one of pronunciation error information, error word information, and a lip-motion map based on the learning information, and matching the corresponding lip-motion map to each of the pronunciation error information and the error word information; obtaining, by the server 100, pronunciation information corresponding to at least one of the pronunciation error information and the error word information; obtaining, by the server 100, an example lip-motion video corresponding to the pronunciation information and matching the pronunciation information with the example lip-motion video; determining, by the server 100, the similarity between the example lip-motion video and the lip-motion map according to a preset criterion to obtain a lip-motion score; obtaining, by the server 100 through the artificial intelligence model, a re-learning need score based on preset weights corresponding to each of the pronunciation error information and the error word information; and generating, by the server 100, the evaluation information based on the lip-motion score and the re-learning need score.

In the step in which the server 100 obtains at least one of the pronunciation error information, the error word information, and the lip-motion map based on the learning information and matches the corresponding lip-motion map to each of the pronunciation error information and the error word information, the pronunciation error information is the section in which non-matching speech is detected, in the case where speech that does not match the read-along sentence provided by the server 100 to the first terminal 200 is detected in the recorded audio information.

The error word information is (i) a word that was misspelled or left blank in the input to the first terminal 200, in the case where the server 100 has sent dictation information for listen-and-write practice to the first terminal 200, and (ii) the section in which non-matching speech is detected, in the case where speech that does not match a read-along word provided by the server 100 to the first terminal 200 is detected in the recorded audio information.
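A word-level comparison of this kind can be sketched with Python's standard `difflib`. The sketch assumes the learner's response is already available as text, either typed dictation input or a transcript produced by some speech recognizer that is not shown; the function name `mismatched_words` is hypothetical, and locating the time section of each mismatch in the recording is outside the sketch.

```python
from difflib import SequenceMatcher


def mismatched_words(expected: str, actual: str) -> list:
    """Return the expected words that were misspelled, misread, or missing."""
    exp = expected.lower().split()
    act = actual.lower().split()
    matcher = SequenceMatcher(None, exp, act, autojunk=False)
    errors = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace": a different word was given; "delete": the word was left out.
        if tag in ("replace", "delete"):
            errors.extend(exp[i1:i2])
    return errors


dictation_errors = mismatched_words("I like apples", "I like aples")
reading_errors = mismatched_words("the cat sat on the mat", "the cat sit on mat")
```

Extra words that the learner adds (the "insert" opcode) are ignored here, since the description only defines errors relative to the expected words.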

The lip-motion map is information obtained by identifying a plurality of feature points on the student's mouth shape, as captured in the captured video information, and mapping the movement of those feature points.

Specifically, when incorrect-answer information consisting of at least one of the pronunciation error information and the error word information occurs, the method may further include: obtaining, by the server 100, the lip-motion map corresponding to the section between the time at which the incorrect-answer information began and the time at which it ended; and matching, by the server 100, the lip-motion map to the incorrect-answer information.
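Cutting out the part of the lip-motion map that belongs to one incorrect answer can be sketched as a time-window filter. The `Frame` type, the field names, the use of seconds as the time unit, and the `error_id` keys are assumptions for illustration; how feature points are detected in the video is not covered.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    t: float       # timestamp in seconds from the start of the recording
    points: list   # lip feature points as (x, y) tuples


def segment_for_error(frames, start: float, end: float) -> list:
    """Return the frames recorded between the start and end of an incorrect answer."""
    return [f for f in frames if start <= f.t <= end]


def match_errors(frames, errors: dict) -> dict:
    """Map each error id to its lip-motion segment; errors: id -> (start, end)."""
    return {eid: segment_for_error(frames, s, e) for eid, (s, e) in errors.items()}


frames = [Frame(t=i * 0.5, points=[(0.0, float(i))]) for i in range(6)]  # 0.0 .. 2.5 s
matched = match_errors(frames, {"err-1": (0.5, 1.5), "err-2": (2.0, 2.5)})
```

Each value in `matched` is the portion of the lip-motion map that would be attached to the corresponding incorrect-answer record.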

The step in which the server 100 obtains the pronunciation information corresponding to at least one of the pronunciation error information and the error word information may further include: obtaining, by the server 100, the sentence or word corresponding to the section between the time at which the incorrect-answer information, consisting of at least one of the pronunciation error information and the error word information, began and the time at which it ended; and extracting, by the server 100, pronunciation information for that sentence or word.

Here, the pronunciation information may be data representing the sound of the sentence or word correctly pronounced aloud, or data representing the sound in the selected video information that corresponds to the section between the time at which the incorrect-answer information began and the time at which it ended.

In the step in which the server 100 obtains the example lip-motion video corresponding to the pronunciation information and matches the pronunciation information with the example lip-motion video, the example lip-motion video is video information showing the lip shapes with which a native speaker, the video's producer, or a 3D character pronounces the pronunciation information.

The step in which the server 100 judges the similarity between the example oral motion image and the oral motion map according to a preset criterion and obtains an oral motion score may further include: the server 100 extracting an example oral motion map from the example oral motion image; the server 100 extracting a first vector quantity, which is the vector value of the oral motion map, and a second vector quantity, which is the vector value of the example oral motion map; the server 100 determining the similarity based on the first and second vector quantities; and the server 100 obtaining an oral motion score proportional to the similarity.

That is, the example mouth shape and the student's mouth shape for the same pronunciation information are compared, and a score is assigned according to how similar the two are.
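The vector comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the description does not say how the vector quantities are encoded or which similarity measure is used, so cosine similarity, the 0 to 100 scale, and the name `oral_motion_score` are all hypothetical choices.

```python
import math

def oral_motion_score(student_vec, example_vec, max_score=100.0):
    """Return a score proportional to the cosine similarity of two vectors.

    student_vec: first vector quantity (from the student's oral motion map).
    example_vec: second vector quantity (from the example oral motion map).
    Cosine similarity and max_score are assumptions, not taken from the source.
    """
    if len(student_vec) != len(example_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(student_vec, example_vec))
    norm = math.sqrt(sum(a * a for a in student_vec)) * math.sqrt(
        sum(b * b for b in example_vec)
    )
    if norm == 0.0:
        return 0.0  # no lip movement captured in one of the maps
    similarity = max(0.0, dot / norm)  # treat opposite directions as "not similar"
    return max_score * similarity
```

Any monotone similarity measure would satisfy the "score proportional to similarity" requirement; cosine similarity is used here only because it is insensitive to the overall magnitude of the motion.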

In the step in which the server 100 obtains, through an artificial intelligence model, a re-learning necessity score based on preset weights corresponding to the pronunciation error information and the erroneous word information respectively, the preset weights are determined for the incorrect answer information (comprising the pronunciation error information and the erroneous word information) from factors such as the difficulty of each problem, the difficulty of each piece of image information, the difficulty of each piece of pronunciation information, the number of times similar incorrect answer information has been generated, and the number of times identical incorrect answer information has been generated. The higher the difficulty and the more often similar or identical incorrect answer information has been generated, the higher the weight.

Accordingly, the higher the weight, the proportionally higher the re-learning necessity score. As a result, in the step in which the server 100 generates evaluation information based on the oral motion score and the re-learning necessity score, the required number of re-learning sessions corresponding to the evaluation information may increase.

Here, the artificial intelligence model serves to calculate the number of times similar incorrect answer information and the number of times identical incorrect answer information have been generated. That is, it takes a plurality of pieces of pronunciation error information and a plurality of pieces of erroneous word information as input variables and outputs these two counts.
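A minimal sketch of the weighting idea follows. The description states only that the weight rises with difficulty and with the similar and identical error counts, and that the score rises with the weight. The additive formula, the step sizes, and the names `WrongAnswer`, `weight`, and `relearning_score` are assumptions made for illustration, and the counts are taken as given rather than produced by a model.

```python
from dataclasses import dataclass

@dataclass
class WrongAnswer:
    difficulty: float     # assumed scale: 0.0 (easy) to 1.0 (hard)
    similar_count: int    # times similar incorrect answers occurred
    identical_count: int  # times the identical incorrect answer occurred

def weight(item, similar_step=0.1, identical_step=0.2):
    """Hypothetical weight: grows with difficulty and with both repeat counts."""
    return (
        1.0
        + item.difficulty
        + similar_step * item.similar_count
        + identical_step * item.identical_count
    )

def relearning_score(items):
    """Hypothetical re-learning score: the sum of the weights of all incorrect answers."""
    return sum(weight(item) for item in items)
```

For example, `relearning_score([WrongAnswer(0.5, 2, 1)])` is larger than `relearning_score([WrongAnswer(0.5, 0, 0)])`, reflecting that repeated mistakes call for more review.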

The step of acquiring at least one of the pronunciation error information, the erroneous word information, and the oral motion map based on the learning information may further include: the server 100 obtaining, through an artificial intelligence model, a phonetic symbol correlation value based on the pronunciation error information and the erroneous word information; when the phonetic symbol correlation value is equal to or greater than a preset value, the server 100 acquiring a plurality of words containing the phonetic symbol information corresponding to that correlation value; and the server 100 acquiring third image information containing the plurality of words and sharing it with the first terminal 200 as a first review image.

In the step in which the server 100 obtains, through the artificial intelligence model, the phonetic symbol correlation value based on the pronunciation error information and the erroneous word information, the pronunciation error information and the correct answer information corresponding to the erroneous word information are compared to determine whether phonetic symbols with similar or identical sounds exist. If they do, the degree of similarity is obtained as the correlation value.

Here, the correlation value may be obtained as 0 when the sounds are not similar and as 1 when they are identical.

In the step in which, when the phonetic symbol correlation value is equal to or greater than the preset value, the server 100 acquires a plurality of words containing the phonetic symbol information corresponding to that value, in one embodiment where the preset value is 0.5 (50%), the method may further include: the server 100 acquiring, as the phonetic symbol information, at least one phonetic symbol whose correlation value is at least 0.5 and at most 1; and the server 100 acquiring a plurality of words that contain the phonetic symbol information or have similar phonetic symbols.
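The thresholding step in this embodiment can be sketched as below. The correlation values are taken as already computed (the model that produces them is not described in enough detail to reproduce), and the dictionary layout, the ASCII phoneme labels, and the name `select_review_words` are hypothetical.

```python
def select_review_words(correlations, lexicon, threshold=0.5):
    """Pick words that contain any phonetic symbol whose correlation value
    lies in [threshold, 1.0].

    correlations: {phonetic symbol: correlation value between 0 and 1}
    lexicon:      {word: list of the phonetic symbols it contains}
    Both structures are assumptions made for this sketch.
    """
    symbols = {s for s, value in correlations.items() if threshold <= value <= 1.0}
    return sorted(word for word, phones in lexicon.items() if symbols & set(phones))

# Example with made-up data: the student keeps confusing the "TH" sound.
correlations = {"TH": 0.8, "R": 0.3}
lexicon = {
    "think": ["TH", "IH", "NG", "K"],
    "red": ["R", "EH", "D"],
    "bath": ["B", "AE", "TH"],
}
review_words = select_review_words(correlations, lexicon)  # ["bath", "think"]
```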

The step in which the server 100 acquires third image information containing the plurality of words and shares it with the first terminal 200 as the first review image may further include: the server 100 obtaining, through the artificial intelligence model, at least one second keyword corresponding to the plurality of words; and the server 100 acquiring, as the third image information, at least one piece of recommended image information corresponding to the second keyword.

In the step in which the server 100 obtains, through the artificial intelligence model, at least one second keyword corresponding to the plurality of words, the second keyword may relate to a story topic that can be derived from the plurality of words.

Specifically, when the artificial intelligence model receives the plurality of words as input, it may derive a plurality of topics from those words and output second keywords that represent the topics and can serve as hashtags for them.

In the step in which the server 100 acquires, as the third image information, at least one piece of recommended image information corresponding to the second keyword, recommended image information that includes the second keyword as a hashtag, that is, recommended image information whose story follows the topic indicated by the second keyword, may be acquired as the third image information.

Accordingly, the present invention can acquire image information in which sentences or words having phonetic symbols similar or identical to the pronunciations the student frequently gets wrong are likely to appear, and provide it to the first terminal 200 as the third image information, which is image information for review.

The step of generating achievement information further includes: the server 100 comparing first evaluation information and second evaluation information with third evaluation information; and the server 100 acquiring achievement information that includes an achievement rate graph generated from the result of comparing the first to third evaluation information. The first evaluation information is the evaluation information generated first, the second evaluation information is the evaluation information generated immediately before, and the third evaluation information is the most recently generated evaluation information and is generated based on the review image.

As shown in FIG. 3, the first to third evaluation information may be quantified and expressed as scores, and the achievement information obtained by comparing them may be presented as a graph such as the one illustrated.

The step of generating learning information including shooting information and recording information may further include: when a section in which the recording information is blank for longer than a preset time is obtained, the server 100 acquiring the shooting information corresponding to that blank section; the server 100 determining, from the shooting information corresponding to the blank section, whether a student object corresponding to the first terminal 200 can be acquired; the server 100 acquiring at least one of a first non-learning event and a second non-learning event based on whether the object was acquired; when the first non-learning event is acquired, the server 100 raising the volume value of the selected image information to a preset volume; when the second non-learning event is acquired, the server 100 sharing the time at which the second non-learning event occurred with the second terminal 300; the server 100 calculating the total learning time from the access time point and the end time point while excluding the time corresponding to the blank section; and the server 100 calculating a recommended learning time based on the total learning time and transmitting it to the first terminal 200.

The additional steps included in the above step of generating learning information including shooting information and recording information serve to detect and respond to situations in which the student cannot concentrate on learning, such as not studying or leaving the seat.

Specifically, in the step in which the server 100 acquires the shooting information corresponding to a section in which the recording information is blank for longer than the preset time, the server 100 stores the voice information of the student registered in advance on the first terminal 200 and can determine, based on that voice information, whether a corresponding voice was captured in the recording information.

Accordingly, a section in which no voice corresponding to the voice information was captured is judged to be a blank section of the recording information, and the shooting information captured at the time corresponding to the blank section can be identified.

In the step in which the server 100 determines, from the shooting information corresponding to the blank section, whether a student object corresponding to the first terminal 200 can be acquired, the student object may be image information corresponding to the facial image of the student registered in advance through the first terminal 200.

In the step in which the server 100 acquires at least one of the first non-learning event and the second non-learning event based on whether the object was acquired, the first non-learning event is an event that may occur when the object is acquired, and corresponds to the student dozing off during learning or doing something other than learning.

The second non-learning event is an event that may occur when the object is not acquired, and may mean that the student has left the space in which the first terminal 200 is located during learning.

In the step in which, when the first non-learning event is acquired, the server 100 raises the volume value of the selected image information to a preset volume, raising the volume to a certain level can act as an alarm that encourages the student corresponding to the first terminal 200 to concentrate.

In the step in which, when the second non-learning event is acquired, the server 100 shares the time at which the second non-learning event occurred with the second terminal 300, the server 100 may send an alarm to the second terminal 300, which is the terminal of the guardian (or teacher), to report that the student has left the seat. Because the second terminal 300 can share the real-time shooting information, as shown in FIG. 2, the guardian can confirm the student's absence by checking what the first terminal 200 is currently capturing.

In the steps in which the server 100 calculates the total learning time from the access time point and the end time point while excluding the time corresponding to the blank section, and in which the server 100 calculates a recommended learning time based on the total learning time and transmits it to the first terminal 200, the recommended learning time can be provided by comparing the student's total learning time with the recommended daily, weekly, or monthly learning amount registered in advance by the first terminal 200 or the teacher.
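The blank-section handling described above can be sketched as three small functions. This is a sketch under assumptions: times are plain seconds, the intervals in which the registered student's voice was recognized are given as input, blank sections do not overlap, and every function name and event label is hypothetical.

```python
def find_blanks(voiced, start, end, min_gap):
    """Return the (begin, end) gaps longer than min_gap in which the student's
    voice was not captured. `voiced` is a list of (begin, end) intervals."""
    blanks = []
    cursor = start
    for v_start, v_end in sorted(voiced):
        if v_start - cursor > min_gap:
            blanks.append((cursor, v_start))
        cursor = max(cursor, v_end)
    if end - cursor > min_gap:
        blanks.append((cursor, end))
    return blanks

def classify_blank(student_visible):
    """First event: student is on camera but not speaking (e.g. dozing).
    Second event: student is not on camera (left the seat)."""
    return "first_non_learning" if student_visible else "second_non_learning"

def total_learning_seconds(start, end, blanks):
    """Session length minus the time spent in blank sections."""
    return (end - start) - sum(b_end - b_start for b_start, b_end in blanks)

# Example: a 600 s session with one 300 s silence (preset time: 120 s).
blanks = find_blanks([(0, 100), (400, 500)], start=0, end=600, min_gap=120)
learned = total_learning_seconds(0, 600, blanks)  # 300
```

The recommended learning time would then be derived by comparing `learned` against the registered daily, weekly, or monthly target; that comparison is left out because the description does not specify it.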

The step of acquiring the second non-learning event may further include: the server 100 acquiring a background image from the first terminal 200; the server 100 acquiring a plurality of symbols included in the background image; and the server 100 acquiring the second non-learning event when at least a preset number of the plurality of symbols are detected.

In the step in which the server 100 acquires the background image from the first terminal 200, the background image is the image with the object excluded.

In the step in which the server 100 acquires the plurality of symbols included in the background image, the symbols may include the coordinates of points such as the corners of a bookshelf, the corners of a door frame, a door handle, the edges and corners of a wall, the corners of a chair headrest, the corners of a chair backrest, the center of a wall clock, and the center of a desk clock.

In the step in which the server 100 acquires the second non-learning event when at least a preset number of the plurality of symbols are detected, when an object is acquired from the shooting information, certain symbols are hidden by the object and are not detected; when the object is absent, or is located at the edge of the background image, at least the preset number of symbols may be detected.

That is, whether the object is acquired can be judged in real time from the number of symbols detected, and from this it can be determined whether the student has left the seat or is dozing during learning.
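The counting rule can be sketched in a few lines. Symbol detection itself (finding bookshelf corners, door handles, and so on in a frame) is outside the scope of this sketch; symbols are represented by hypothetical string identifiers, and the name `student_absent` is an assumption.

```python
def student_absent(visible_symbols, registered_symbols, min_visible):
    """Return True when at least `min_visible` registered background symbols
    are visible in the current frame, i.e. the student is no longer
    occluding them."""
    detected = len(set(visible_symbols) & set(registered_symbols))
    return detected >= min_visible

registered = {"door_handle", "clock_center", "shelf_corner", "chair_top"}
# Student seated: the chair and the shelf corner are hidden behind them.
seated = student_absent({"door_handle", "clock_center"}, registered, min_visible=4)
# Student gone: every registered symbol is visible.
empty = student_absent(registered, registered, min_visible=4)
```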

The step of acquiring at least one of the pronunciation error information, the erroneous word information, and the oral motion map based on the learning information may further include: the server 100 acquiring a first sound pattern, which is a per-sentence sound pattern, based on the selected image information; the server 100 acquiring, from the recording information, a second sound pattern corresponding to the first sound pattern; the server 100 matching first and second sound patterns judged to be similar to at least a preset degree, and acquiring as error information any second sound pattern to which no sound pattern was matched; the server 100 matching, based on the time of occurrence of each second sound pattern acquired as error information, the first sound pattern having the same time of occurrence to that second sound pattern; the server 100 defining as pronunciation error information the error information in which the matched first and second sound patterns have sound shapes similar to at least a preset degree; and the server 100 defining as erroneous word information the error information in which the sound shapes are similar to less than the preset degree. The pronunciation information includes the first sound pattern.

Here, a sound pattern, including the first and second sound patterns, is information about the voice system and includes breathing, phonation, and articulation.

In the step in which the server 100 acquires the first sound pattern, which is a per-sentence sound pattern, based on the selected image information, the first sound pattern relates to the audio output by the selected image information (the correct answer).

In the step in which the server 100 acquires, from the recording information, the second sound pattern corresponding to the first sound pattern, the second sound pattern relates to the audio contained in the recording information (the student's answer).

In the step in which the server 100 matches first and second sound patterns judged to be similar to at least a preset degree and acquires as error information any second sound pattern to which no sound pattern was matched, a second sound pattern judged similar to a first sound pattern (the correct answer) is classified as correct, and a second sound pattern for which no similar first sound pattern exists is classified as error information, that is, as an incorrect answer.

In the step in which the server 100 matches, based on the time of occurrence of a second sound pattern acquired as error information, the first sound pattern having the same time of occurrence to that second sound pattern, the first sound pattern that is the correct answer for the second sound pattern is identified, so that the correct word or sentence corresponding to the second sound pattern can be acquired.

In the step in which the server 100 defines as pronunciation error information the error information in which the matched first and second sound patterns have sound shapes similar to at least a preset degree, a second sound pattern (error information, an incorrect answer) whose sound shape is similar to that of the matched first sound pattern (the correct answer) may be classified as an error of pronunciation rather than as a wrong word.

In the step in which the server 100 defines as erroneous word information the error information in which the sound shapes are similar to less than the preset degree, a second sound pattern (error information, an incorrect answer) whose sound shape is not similar to that of the matched first sound pattern (the correct answer) may be classified as a wrong word.
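The two-stage classification above reduces to two threshold checks, sketched below. How the pattern similarity and the sound shape similarity are computed is not specified in the description, so both are taken here as precomputed numbers between 0 and 1; the threshold values and the name `classify_answer` are hypothetical.

```python
def classify_answer(pattern_similarity, shape_similarity,
                    match_threshold=0.8, shape_threshold=0.6):
    """Classify one student utterance against the time-aligned correct answer.

    pattern_similarity: similarity of the second sound pattern to the first.
    shape_similarity:   similarity of their sound shapes (contours).
    Both thresholds are assumed values, not taken from the source.
    """
    if pattern_similarity >= match_threshold:
        return "correct"              # patterns match: not error information
    if shape_similarity >= shape_threshold:
        return "pronunciation_error"  # right word, wrong pronunciation
    return "word_error"               # different word altogether
```

The ordering matters: the shape check is only applied to utterances that already failed the pattern match, mirroring the order of the steps in the description.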

The step of matching first and second sound patterns having the same time of occurrence may further include: when comparison of the first and second sound patterns matched in the error information shows that the section in which the second sound pattern differs from the first sound pattern contains a sound shape repeated at least once, the server 100 calculating an average shape of the at least two repeated shapes; the server 100 replacing the plurality of repeated shapes located in the differing section with the single average shape to obtain a third sound pattern; the server 100 comparing the third sound pattern with the corresponding first sound pattern; when the first and third sound patterns are similar to at least a preset degree, the server 100 acquiring the second sound pattern corresponding to the third sound pattern as stuttered section information; and the server 100 sharing image information corresponding to the stuttered section information with the first terminal 200 as a second review image.

In the step in which the server 100 calculates the average shape of the at least two repeated shapes, a section containing a repeated sound shape can be taken to mean that the student corresponding to the first terminal 200 stuttered while speaking. The stuttered section is therefore processed into speech without the stutter, so that it can be judged whether the second sound pattern containing the repeated section is a correct answer.

In the step in which the server 100 replaces the plurality of repeated shapes located in the differing section with the single average shape to obtain the third sound pattern, the second sound pattern processed so that the stuttered section no longer contains the stutter is stored as the third sound pattern.

In the steps in which the server 100 compares the third sound pattern with the corresponding first sound pattern and, when the two are similar to at least a preset degree, acquires the second sound pattern corresponding to the third sound pattern as stuttered section information, if the third sound pattern is judged similar to the first sound pattern, the second sound pattern (the unprocessed form of the third sound pattern) is judged to be stuttered section information, which is not an incorrect answer. Otherwise, it may be judged to be error information, which is an incorrect answer.
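The stutter handling can be illustrated with a toy representation. Here a sound pattern is assumed to be a list of segments, each a tuple of numbers describing its sound shape, and two consecutive segments count as a repetition when they differ by at most a tolerance in every component. That representation, the tolerance, and the names `collapse_repeats` and `is_stutter` are all assumptions; the description does not define how sound shapes are encoded or compared.

```python
def _mean(run):
    """Element-wise mean of a run of equal-length segments."""
    return tuple(sum(column) / len(run) for column in zip(*run))

def _close(a, b, tol):
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

def collapse_repeats(segments, tol=0.1):
    """Build the 'third sound pattern': every run of consecutive segments
    close to the first segment of the run is replaced by the run's mean."""
    collapsed, run = [], []
    for seg in segments:
        if run and _close(seg, run[0], tol):
            run.append(seg)
        else:
            if run:
                collapsed.append(_mean(run))
            run = [seg]
    if run:
        collapsed.append(_mean(run))
    return collapsed

def is_stutter(student_segments, reference_segments, tol=0.1):
    """True when the student's pattern contains repetitions and matches the
    reference once those repetitions are collapsed."""
    third = collapse_repeats(student_segments, tol)
    if len(third) == len(student_segments):
        return False  # nothing was repeated, so this is not a stutter
    if len(third) != len(reference_segments):
        return False
    return all(_close(a, b, tol) for a, b in zip(third, reference_segments))

reference = [(1.0, 2.0), (3.0, 4.0)]
stuttered = [(1.0, 2.0), (1.05, 2.0), (3.0, 4.0)]  # first segment said twice
```

With this data, `is_stutter(stuttered, reference)` holds, so the utterance would be kept as stuttered section information rather than counted as an incorrect answer.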

In the step in which the server 100 shares image information corresponding to the stuttered section information with the first terminal 200 as the second review image, the image information (the selected image information, the first to third image information, and so on) corresponding to the second sound pattern that constitutes the stuttered section information is classified as the second review image and shared with the first terminal 200, and the time point or section corresponding to the stuttered section information is played back so that the student corresponding to the first terminal 200 can review it.

When the server 100 shares at least one of the first and second review images with the first terminal 200, the control method may further include: the server 100 outputting, to the first terminal 200, the oral motion map and the example oral motion image corresponding to the first and second review images; and the server 100 acquiring a real-time oral motion map from the first terminal 200 and simultaneously outputting that real-time oral motion map to the first terminal 200.

Specifically, the example oral motion image, which shows the lip shape for the correct answer, is output on the screen of the first terminal 200 together with the oral motion map, which shows the student's lip shape for the incorrect answer, so that the two can be compared. In addition, the student can see the lip shape they are currently making on the screen of the first terminal 200, as in a mirror, and can thereby correct their lip movements while pronouncing.

The step of acquiring the plurality of symbols included in the background image may further include: the server 100 generating reference coordinates on the background image; the server 100 acquiring a plurality of first symbols included in the background image; and the server 100 acquiring, from among the first symbols, a plurality of second symbols that are clustered within a preset distance of the reference coordinates.
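Selecting the second symbols can be sketched as a distance filter around the reference coordinates. Symbols are assumed to be 2D pixel coordinates and closeness is assumed to mean Euclidean distance; the name `select_second_symbols` is hypothetical. `math.dist` requires Python 3.8 or later.

```python
import math

def select_second_symbols(first_symbols, reference, max_distance):
    """Keep only the symbols lying within `max_distance` of the reference
    coordinates (for example the image center, where the student usually sits)."""
    return [p for p in first_symbols if math.dist(p, reference) <= max_distance]

first_symbols = [(320, 240), (330, 260), (20, 15), (620, 470)]
second_symbols = select_second_symbols(first_symbols, reference=(320, 240), max_distance=50)
```

Restricting the count to symbols near the reference point focuses detection on the area the student normally occludes, so symbols at the frame edges that are always visible do not inflate the count.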

The step of acquiring the second non-learning event when at least the preset number of symbols are detected may further include: the server 100 acquiring the number of second symbols detected; when the acquired number is equal to or greater than a preset criterion, the server 100 determining whether an oral motion map has been acquired; when an oral motion map has been acquired, the server 100 acquiring the oral motion score corresponding to that map; the server 100 acquiring the average oral motion score corresponding to the first evaluation information and the second evaluation information; and the server 100 acquiring the first non-learning event when the oral motion score is judged to be lower than the average oral motion score by at least a preset range.

The step of determining whether the oral motion map has been obtained may further include: determining, by the server 100, whether a student object has been obtained when the oral motion map has not been obtained; obtaining, by the server 100, a first non-learning event when the student object has been obtained; and obtaining, by the server 100, a second non-learning event when the student object has not been obtained. The reference coordinates are the center coordinates of the background image, or the center coordinates of the region, among the regions in which the oral motion map is obtained, for which the highest oral motion score was calculated.
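The branching described in the two preceding paragraphs can be sketched as a single decision function. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, an oral motion map that has not been obtained is represented by a score of `None`, and returning no event when the symbol count is below the criterion, or when the score has not dropped far enough, is an assumption, because the specification does not state what happens in those cases.

```python
from enum import Enum
from typing import Optional


class NonLearningEvent(Enum):
    FIRST = 1   # first non-learning event
    SECOND = 2  # second non-learning event


def classify_non_learning_event(
    second_symbol_count: int,
    min_symbol_count: int,
    oral_motion_score: Optional[float],
    average_oral_motion_score: float,
    preset_range: float,
    student_object_obtained: bool,
) -> Optional[NonLearningEvent]:
    """Decide which non-learning event, if any, is obtained."""
    # The check only proceeds when enough second symbols were detected.
    if second_symbol_count < min_symbol_count:
        return None  # assumption: no event below the preset criterion

    if oral_motion_score is not None:
        # Oral motion map obtained: compare its score with the average.
        drop = average_oral_motion_score - oral_motion_score
        if drop >= preset_range:
            return NonLearningEvent.FIRST
        return None  # assumption: no event when the drop is small

    # Oral motion map not obtained: fall back to the student object.
    if student_object_obtained:
        return NonLearningEvent.FIRST
    return NonLearningEvent.SECOND


# Example: enough symbols, no oral motion map, no student in the frame.
event = classify_non_learning_event(
    second_symbol_count=5,
    min_symbol_count=3,
    oral_motion_score=None,
    average_oral_motion_score=70.0,
    preset_range=20.0,
    student_object_obtained=False,
)
```

In this reading, a first non-learning event corresponds to a student who is present but not practicing, and a second non-learning event corresponds to a student who cannot be found in the frame.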

In addition, the control method of the present invention may recognize the change in tone (vibration) that arises between languages when the same person pronounces a foreign language and Korean, obtain tone information corresponding to Korean during learning, and obtain a first non-learning event when that tone information is detected in the recording information.

Here, the change in tone between pronouncing the native language and the foreign language may result from the student having learned by imitating the tone of the teacher who taught the foreign language.

Specifically, the control method may further include: obtaining, by the server 100, tone information from the first terminal 200; obtaining, by the server 100, a fourth sound pattern corresponding to the tone information; obtaining, by the server 100, voice information corresponding to the fourth sound pattern from the recording information; and obtaining, by the server 100, a first non-learning event.
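The tone-based steps above can be illustrated with a rough sketch. The specification does not say how tone information or the fourth sound pattern is represented, so the sketch makes several assumptions: the fourth sound pattern is modeled as a pitch band in hertz for the student's Korean speech, the recording information is assumed to have already been reduced to one mean pitch value per short segment, and a match requires a minimum run of consecutive segments inside the band. The names `PitchBand`, `find_matching_runs`, and `has_first_non_learning_event` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PitchBand:
    """Assumed stand-in for the fourth sound pattern: a pitch range in Hz."""
    low_hz: float
    high_hz: float

    def contains(self, hz: float) -> bool:
        return self.low_hz <= hz <= self.high_hz


def find_matching_runs(
    segment_pitches_hz: List[float],
    band: PitchBand,
    min_run: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of runs of at least
    min_run consecutive segments whose mean pitch lies inside the band."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, hz in enumerate(segment_pitches_hz):
        if band.contains(hz):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the recording.
    if start is not None and len(segment_pitches_hz) - start >= min_run:
        runs.append((start, len(segment_pitches_hz)))
    return runs


def has_first_non_learning_event(
    segment_pitches_hz: List[float],
    band: PitchBand,
    min_run: int = 2,
) -> bool:
    """A first non-learning event is obtained when any run matches."""
    return bool(find_matching_runs(segment_pitches_hz, band, min_run))


# Example: segments 2 to 4 fall inside the assumed Korean pitch band.
korean_band = PitchBand(low_hz=170.0, high_hz=200.0)
pitches = [250.0, 252.0, 180.0, 185.0, 190.0, 248.0]
runs = find_matching_runs(pitches, korean_band)
```

Requiring a minimum run length is a design choice of this sketch, intended to keep a single stray segment from triggering the event; the specification does not impose it.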

Meanwhile, the steps of a method or algorithm described in connection with the embodiments of the present invention may be implemented directly in hardware, in a software module executed by hardware, or in a combination of the two. The software module may reside in random access memory (RAM), read only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable recording medium well known in the art to which the present invention pertains.

The components of the present invention may be implemented as a program (or application) and stored in a medium so as to be executed in combination with a computer, which is hardware. The components of the present invention may be implemented with software programming or software elements; likewise, the embodiments may be implemented in a programming or scripting language such as C, C++, Java, or assembler, with the various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms executed on one or more processors.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be embodied in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

100: server
110: memory
120: communication unit
130: processor
200: first terminal
300: second terminal

Claims

A control method of a language learning system, the method comprising:
obtaining, by a server, an access time point of a first terminal;
obtaining, by the server, learning information generated by the first terminal, and sharing the learning information in real time with a second terminal corresponding to the first terminal;
obtaining, by the server, an end time point of the first terminal; and
generating, by the server, achievement information based on the learning information and sharing the achievement information with the first terminal and the second terminal,
wherein the step of sharing the learning information in real time further comprises
matching, by the server, the first terminal and the second terminal,
wherein the step of obtaining the access time point further comprises:
notifying, by the server, the second terminal of the connection of the first terminal;
obtaining, by the server, age information and gender information corresponding to the first terminal;
extracting, by the server, through an artificial intelligence model, a first keyword corresponding to the age information and the gender information, and sharing at least one piece of first image information corresponding to the first keyword with the first terminal; and
obtaining, by the server, at least one keyword of interest from the first terminal, and sharing at least one piece of second image information corresponding to the keyword of interest with the first terminal,
wherein the step of obtaining the end time point further comprises
notifying, by the server, the second terminal of the termination of the first terminal,
wherein the step of obtaining the learning information further comprises:
obtaining, by the server, image information selected at the first terminal;
generating, by the server, learning information including shooting information and recording information of a student from the first terminal; and
evaluating, by the server, the learning information based on the selected image information, and sharing evaluation information with the second terminal,
wherein the step of matching the first terminal and the second terminal further comprises:
sharing, by the server, the shooting information and the recording information with the second terminal in real time; and
generating, by the server, a chat screen shared by the first terminal and the second terminal,
wherein the step of evaluating the learning information and sharing the evaluation information with the second terminal further comprises:
obtaining, by the server, at least one of pronunciation error information, erroneous word information, and an oral motion map based on the learning information, and matching, to each of the pronunciation error information and the erroneous word information, the oral motion map corresponding thereto;
obtaining, by the server, pronunciation information corresponding to at least one of the pronunciation error information and the erroneous word information;
obtaining, by the server, an example oral motion image corresponding to the pronunciation information, and matching the pronunciation information with the example oral motion image;
obtaining, by the server, an oral motion score by determining, according to a preset criterion, the similarity between the example oral motion image and the oral motion map;
obtaining, by the server, through an artificial intelligence model, a re-learning required score based on a preset weight corresponding to each of the pronunciation error information and the erroneous word information; and
generating, by the server, evaluation information based on the oral motion score and the re-learning required score,
wherein the step of obtaining at least one of pronunciation error information, erroneous word information, and an oral motion map based on the learning information further comprises:
obtaining, by the server, through an artificial intelligence model, a phonetic symbol correlation value based on the pronunciation error information and the erroneous word information;
obtaining, by the server, when the phonetic symbol correlation value is equal to or greater than a preset value, a plurality of words including phonetic symbol information corresponding to the phonetic symbol correlation value; and
obtaining, by the server, third image information including the plurality of words, and sharing the third image information with the first terminal as a first review image,
wherein the step of generating the achievement information comprises:
comparing, by the server, first evaluation information and second evaluation information with third evaluation information; and
obtaining, by the server, achievement information including an achievement rate graph generated based on a result of comparing the first to third evaluation information,
wherein the first evaluation information is the earliest evaluation information,
the second evaluation information is the evaluation information generated immediately before, and
the third evaluation information is the most recently generated evaluation information and is generated based on the review image.

delete

The method of claim 1,
wherein the step of generating learning information including the shooting information and the recording information further comprises:
obtaining, by the server, when a section in which the recording information is blank lasts for a preset time or longer, shooting information corresponding to the blank section;
determining, by the server, whether a student object corresponding to the first terminal can be obtained from the shooting information corresponding to the blank section;
obtaining, by the server, at least one of a first non-learning event and a second non-learning event based on whether the student object is obtained;
increasing, by the server, when the first non-learning event is obtained, a volume value corresponding to the selected image information to a preset volume;
sharing, by the server, when the second non-learning event is obtained, the time point at which the second non-learning event occurred with the second terminal;
calculating, by the server, a total learning time based on the access time point and the end time point, with the time corresponding to the blank section excluded from the total learning time; and
calculating, by the server, a recommended learning time based on the total learning time and transmitting the recommended learning time to the first terminal,
wherein the step of obtaining the second non-learning event comprises:
obtaining, by the server, a background image from the first terminal;
obtaining, by the server, a plurality of symbols included in the background image; and
obtaining, by the server, a second non-learning event when a preset number of symbols or more is detected among the plurality of symbols.

The method of claim 1,
wherein the step of obtaining at least one of pronunciation error information, erroneous word information, and an oral motion map based on the learning information further comprises:
obtaining, by the server, a first sound pattern, which is a sound pattern for each sentence, based on the selected image information;
obtaining, by the server, a second sound pattern corresponding to the first sound pattern from the recording information;
matching, by the server, a first sound pattern and a second sound pattern whose similarity is determined to be at or above a preset standard, and obtaining, as error information, any other second sound pattern that remains unmatched;
matching, by the server, to each second sound pattern obtained as the error information, the first sound pattern having the same generation time point, based on the generation time point of that second sound pattern;
defining, by the server, as pronunciation error information, error information in which the matched first sound pattern and second sound pattern are similar in sound shape at or above a preset standard; and
defining, by the server, as erroneous word information, error information in which the similarity in sound shape is below the preset standard,
wherein the pronunciation information includes the first sound pattern.

The method of claim 6,
wherein the step of matching the first sound pattern and the second sound pattern having the same generation time point further comprises:
comparing the first sound pattern and the second sound pattern matched in the error information and, when the second sound pattern has a sound shape in which the section that differs from the first sound pattern is repeated at least once, calculating, by the server, an average shape of the at least two repeated shapes;
obtaining, by the server, a third sound pattern by replacing the plurality of repeated shapes located in the differing section with one average shape;
comparing, by the server, the third sound pattern with the corresponding first sound pattern;
obtaining, by the server, when the first sound pattern and the third sound pattern are similar at or above a preset standard, stuttered section information for the second sound pattern corresponding to the third sound pattern; and
sharing, by the server, image information corresponding to the stuttered section information with the first terminal as a second review image,
wherein the control method further comprises:
outputting, by the server, when the server shares at least one of the first and second review images with the first terminal, the oral motion map and the example oral motion image corresponding to the first and second review images to the first terminal; and
obtaining, by the server, a real-time oral motion map from the first terminal, and simultaneously outputting the real-time oral motion map to the first terminal.

The method of claim 5,
wherein the step of obtaining a plurality of symbols included in the background image further comprises:
generating, by the server, reference coordinates on the background image;
obtaining, by the server, a plurality of first symbols included in the background image; and
obtaining, by the server, from among the first symbols, a plurality of second symbols clustered around the reference coordinates at a spacing equal to or less than a preset interval,
wherein the step of obtaining a second non-learning event when the preset number of symbols or more is detected further comprises:
obtaining, by the server, the number of detected second symbols;
determining, by the server, whether the oral motion map has been obtained when the obtained number is equal to or greater than a preset criterion;
obtaining, by the server, an oral motion score corresponding to the oral motion map when the oral motion map has been obtained;
obtaining, by the server, an average oral motion score corresponding to the first evaluation information and the second evaluation information; and
obtaining, by the server, a first non-learning event when the oral motion score is determined to be lower than the average oral motion score by a preset range or more,
wherein the step of determining whether the oral motion map has been obtained further comprises:
determining, by the server, whether the student object has been obtained when the oral motion map has not been obtained;
obtaining, by the server, a first non-learning event when the student object has been obtained; and
obtaining, by the server, a second non-learning event when the student object has not been obtained, and
wherein the reference coordinates are the center coordinates of the background image, or the center coordinates of the region, among the regions in which the oral motion map is obtained, for which the highest oral motion score was calculated.