KR102414626B1 - Foreign language pronunciation training and evaluation system

Info

Publication number: KR102414626B1
Application number: KR1020200138953A
Authority: KR
Inventors: 신정훈; 김원응
Original assignee: 주식회사 에듀템
Priority date: 2020-10-26
Filing date: 2020-10-26
Publication date: 2022-06-30
Also published as: KR20220054964A

Abstract

The present invention relates to a foreign language pronunciation training and evaluation system, comprising: a native speaker pronunciation reproducing unit that plays back a native speaker's pronunciation of a specific text; a learner pronunciation receiving unit that receives a learner's pronunciation of the specific text; a fluency measuring unit that measures vocal fluency by comparing speech graphs of the native speaker's pronunciation and the learner's pronunciation; an accuracy measuring unit that converts the learner's pronunciation into an evaluation text and measures pronunciation accuracy by comparing the evaluation text with the specific text; and a pronunciation evaluation unit that performs a pronunciation evaluation based on the vocal fluency and the pronunciation accuracy and generates an evaluation result.

Description

Foreign language pronunciation training and evaluation system

The present invention relates to a foreign language pronunciation training and evaluation system, and more particularly, to a foreign language pronunciation training and evaluation system capable of improving the efficiency of foreign language learning by accurately providing evaluation results on spoken pronunciation during the learning process.

As interest in foreign languages grows, so does the need for efficient and systematic methods of learning them. Because practical communication skills have recently been emphasized in language education, spoken language as a means of communication, and speaking in particular, is becoming more important, and the same trend appears in foreign language education.

In general, speaking is evaluated manually: a number of evaluation experts listen directly to a learner's utterances and assess them. In contrast, an automatic evaluation method can be considered, in which a speaking evaluation system assesses the learner's utterances automatically without a human evaluator.

However, automatic evaluation methods to date apply a uniform standard regardless of the learner's nationality or native language, so their accuracy may be problematic. For example, the Korean pronunciation of a learner whose native language is Japanese may differ from that of a learner whose native language is Chinese because of the influence of each native language, and evaluating both learners' pronunciation by the same standard can reduce the accuracy of the pronunciation evaluation.

Accordingly, there is a demand for a technology and system that can evaluate a learner's utterances relatively accurately.

Korean Patent Registration No. 10-0733469 (2007.06.22)

An embodiment of the present invention seeks to provide a foreign language pronunciation training and evaluation system capable of improving the efficiency of foreign language learning by accurately providing evaluation results on spoken pronunciation during the learning process.

An embodiment of the present invention seeks to provide a foreign language pronunciation training and evaluation system capable of evaluating vocalization and pronunciation separately, through the similarity of speech graphs and the matching of words corresponding to the pronunciation.

An embodiment of the present invention seeks to provide a foreign language pronunciation training and evaluation system capable of performing pronunciation evaluation by comparing the per-section areas of speech graphs and the lengths and positions of silent sections.

Among the embodiments, the foreign language pronunciation training and evaluation system includes: a native speaker pronunciation reproducing unit that plays back a native speaker's pronunciation of a specific text; a learner pronunciation receiving unit that receives a learner's pronunciation of the specific text; a fluency measuring unit that measures vocal fluency by comparing speech graphs of the native speaker's pronunciation and the learner's pronunciation; an accuracy measuring unit that converts the learner's pronunciation into an evaluation text and measures pronunciation accuracy by comparing the evaluation text with the specific text; and a pronunciation evaluation unit that performs a pronunciation evaluation based on the vocal fluency and the pronunciation accuracy and generates an evaluation result.

The native speaker pronunciation reproducing unit may provide visual highlighting of the specific text in step with the progress of the native speaker's pronunciation during playback, and, when a peak value on the speech graph of the native speaker's pronunciation exceeds a preset threshold, may visually mark the word corresponding to that peak value.

The fluency measuring unit may perform a preprocessing operation for the speech graph comparison through the steps of: obtaining a first speech graph for the native speaker's pronunciation; obtaining a second speech graph for the learner's pronunciation; detecting the utterance start point of each of the first and second speech graphs and aligning the graphs; and scaling the second speech graph according to the gender of the learner's pronunciation, taking the gender of the native speaker's pronunciation as the reference.

The fluency measuring unit may measure the vocal fluency through the steps of: detecting the silent (mute) sections of each of the first and second speech graphs, identifying the position and length of each silent section relative to the entire pronunciation section, and performing a first fluency evaluation based on the differences between the silent sections; dividing the entire pronunciation section of each of the first and second speech graphs into a plurality of partial pronunciation sections, calculating the graph area of each partial pronunciation section, and performing a second fluency evaluation based on the differences in graph area; and determining an evaluation score by integrating the first and second fluency evaluations.
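The two fluency evaluations described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each speech graph is already available as a list of non-negative energy values, and the function names, the silence threshold, the number of segments, the `1 / (1 + difference)` scoring, and the equal weighting are all hypothetical choices that the source does not specify.

```python
from typing import List, Tuple


def silent_intervals(env: List[float], threshold: float) -> List[Tuple[float, float]]:
    """Return (relative position, relative length) of each run below threshold."""
    n = len(env)
    intervals = []
    start = None
    # A sentinel equal to the threshold closes a silent run at the very end.
    for i, v in enumerate(list(env) + [threshold]):
        if v < threshold:
            if start is None:
                start = i
        elif start is not None:
            intervals.append((start / n, (i - start) / n))
            start = None
    return intervals


def segment_areas(env: List[float], n_segments: int) -> List[float]:
    """Split the graph into n_segments parts and sum each part (rectangle-rule area)."""
    n = len(env)
    bounds = [k * n // n_segments for k in range(n_segments + 1)]
    return [sum(env[bounds[k]:bounds[k + 1]]) for k in range(n_segments)]


def silence_score(native: List[float], learner: List[float], threshold: float) -> float:
    """First fluency evaluation (assumed form): compare silent-section positions and lengths."""
    a = silent_intervals(native, threshold)
    b = silent_intervals(learner, threshold)
    diff = float(abs(len(a) - len(b)))  # each unmatched silent section costs 1
    for (pos_a, len_a), (pos_b, len_b) in zip(a, b):
        diff += abs(pos_a - pos_b) + abs(len_a - len_b)
    return 1.0 / (1.0 + diff)


def area_score(native: List[float], learner: List[float], n_segments: int) -> float:
    """Second fluency evaluation (assumed form): compare per-segment graph areas."""
    a = segment_areas(native, n_segments)
    b = segment_areas(learner, n_segments)
    total = sum(a) or 1.0
    diff = sum(abs(x - y) for x, y in zip(a, b)) / total
    return 1.0 / (1.0 + diff)


def fluency_score(native: List[float], learner: List[float],
                  threshold: float = 0.1, n_segments: int = 2,
                  w_silence: float = 0.5) -> float:
    """Integrate both evaluations with a hypothetical weighted sum."""
    return (w_silence * silence_score(native, learner, threshold)
            + (1.0 - w_silence) * area_score(native, learner, n_segments))
```

With this scoring, identical graphs yield a score of 1.0, and the score decreases as the silent sections or the per-segment areas diverge.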

The accuracy measuring unit may measure the pronunciation accuracy through the steps of: detecting, in the evaluation text, main words corresponding to peak values on the corresponding speech graph; detecting, in the evaluation text, auxiliary words connected immediately before and after each of the main words; and determining an evaluation score according to whether each of the main words and auxiliary words matches the specific text.
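The accuracy measurement above can be sketched as follows. The sketch assumes that speech recognition has already produced the evaluation text as a word list and that a peak energy value is known for each recognized word. The positional word comparison, the weights, and the function name are hypothetical simplifications, since the source does not state how matches are scored.

```python
from typing import List


def accuracy_score(reference: List[str], recognized: List[str],
                   peaks: List[float], threshold: float,
                   w_main: float = 2.0, w_aux: float = 1.0) -> float:
    """Score main words (peak above threshold) and their neighbours against the reference.

    recognized[i] is the i-th word of the evaluation text and peaks[i] is the
    peak value of the speech graph within that word (assumed to be given).
    """
    # Main words: words whose peak value exceeds the threshold.
    main_idx = {i for i, p in enumerate(peaks) if p > threshold}

    # Auxiliary words: words directly before or after a main word.
    aux_idx = set()
    for i in main_idx:
        for j in (i - 1, i + 1):
            if 0 <= j < len(recognized) and j not in main_idx:
                aux_idx.add(j)

    def matches(i: int) -> bool:
        # Simplification: compare words at the same position, ignoring case.
        return i < len(reference) and recognized[i].lower() == reference[i].lower()

    total = w_main * len(main_idx) + w_aux * len(aux_idx)
    if total == 0:
        return 0.0
    earned = (w_main * sum(matches(i) for i in main_idx)
              + w_aux * sum(matches(i) for i in aux_idx))
    return earned / total
```

Weighting main words more heavily than auxiliary words is one plausible reading of the distinction the text draws between them; a real system would also need a word alignment step to handle inserted or dropped words.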

The system may further include a learning content recommendation unit that accumulates and manages training and evaluation histories by difficulty level, week, course, and topic of foreign language learning, and recommends learning content according to the learner's level.

The disclosed technology may have the following effects. However, this does not mean that a specific embodiment must include all of the following effects or only the following effects, so the scope of the disclosed technology should not be understood as being limited thereby.

The foreign language pronunciation training and evaluation system according to an embodiment of the present invention can improve the efficiency of foreign language learning by accurately providing evaluation results on spoken pronunciation during the learning process.

The foreign language pronunciation training and evaluation system according to an embodiment of the present invention can evaluate vocalization and pronunciation separately, through the similarity of speech graphs and the matching of words corresponding to the pronunciation.

The foreign language pronunciation training and evaluation system according to an embodiment of the present invention can perform pronunciation evaluation by comparing the per-section areas of speech graphs and the lengths and positions of silent sections.

FIG. 1 is a diagram illustrating a foreign language pronunciation training and evaluation system according to the present invention.
FIG. 2 is a diagram illustrating the system configuration of the training and evaluation device of FIG. 1.
FIG. 3 is a diagram illustrating the functional configuration of the training and evaluation device of FIG. 1.
FIG. 4 is a flowchart illustrating a foreign language pronunciation training and evaluation process according to the present invention.
FIGS. 5 and 6 are diagrams illustrating an embodiment of an interface provided by the foreign language pronunciation training and evaluation system according to the present invention.

Since the description of the present invention merely presents embodiments for structural or functional explanation, the scope of the present invention should not be construed as being limited by the embodiments described in the text. That is, since the embodiments can be modified in various ways and can take various forms, the scope of the present invention should be understood to include equivalents capable of realizing the technical idea. In addition, the objects or effects presented in the present invention do not mean that a specific embodiment must include all of them or only such effects, so the scope of the present invention should not be understood as being limited thereby.

Meanwhile, the terms used in the present application should be understood as follows.

Terms such as "first" and "second" are used to distinguish one component from another, and the scope of rights should not be limited by these terms. For example, a first component may be termed a second component, and similarly, a second component may be termed a first component.

When a component is referred to as being "connected to" another component, it may be directly connected to that other component, but it should be understood that intervening components may exist. On the other hand, when a component is referred to as being "directly connected to" another component, it should be understood that no intervening component exists. Other expressions describing the relationship between components, such as "between" and "directly between" or "adjacent to" and "directly adjacent to", should be interpreted in the same way.

A singular expression should be understood to include the plural unless the context clearly indicates otherwise. Terms such as "include" or "have" are intended to specify the presence of a stated feature, number, step, operation, component, part, or combination thereof, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Identification codes for the steps (for example, a, b, c) are used for convenience of description and do not describe the order of the steps. Unless a specific order is clearly stated in context, the steps may occur in an order different from the stated one. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

The present invention can be implemented as computer-readable code on a computer-readable recording medium, and the computer-readable recording medium includes all types of recording devices that store data readable by a computer system. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. In addition, the computer-readable recording medium may be distributed across network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

Unless otherwise defined, all terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and should not be interpreted as having ideal or excessively formal meanings unless explicitly defined in the present application.

FIG. 1 is a diagram illustrating a foreign language pronunciation training and evaluation system according to the present invention.

Referring to FIG. 1, the foreign language pronunciation training and evaluation system 100 may include a user terminal 110, a training and evaluation device 130, and a database 150.

The user terminal 110 may correspond to a computing device on which a learner can view learning content for foreign language learning and input foreign language pronunciation. The user terminal 110 may be implemented as a smartphone, a laptop, or a computer, but is not limited thereto and may also be implemented as various other devices such as a tablet PC. The user terminal 110 may be connected to the training and evaluation device 130 through a network, and a plurality of user terminals 110 may be connected to the training and evaluation device 130 simultaneously.

In addition, the user terminal 110 may install and run a dedicated program or application for foreign language learning. The user terminal 110 may include a microphone module capable of recording foreign language pronunciation and a speaker module capable of playing it back, and may further include, if necessary, a camera module for video conversation between the user and a native speaker.

The training and evaluation device 130 may be implemented as a server, corresponding to a computer or program, that provides various functions for supporting foreign language learning by evaluating the user's foreign language pronunciation during the learning process and providing feedback on it. The training and evaluation device 130 may be connected to the user terminal 110 through a wired network or a wireless network such as Bluetooth or WiFi, and may communicate with the user terminal 110 through the network. In addition, the training and evaluation device 130 may operate in connection with an external system (not shown in FIG. 1), for example an external payment system, an online system, or an authentication system.

The database 150 may correspond to a storage device that stores the various information required during the operation of the training and evaluation device 130. The database 150 may store information on various learning content for foreign language learning and information on native speaker pronunciation for foreign language learning, but is not limited thereto, and may store information that the training and evaluation device 130 collects or processes in various forms during evaluation and training.

FIG. 2 is a diagram illustrating the system configuration of the training and evaluation device of FIG. 1.

Referring to FIG. 2, the training and evaluation device 130 may be implemented to include a processor 210, a memory 230, a user input/output unit 250, and a network input/output unit 270.

The processor 210 may execute the procedures that process each step of the operation of the training and evaluation device 130, may manage the memory 230 that is read from or written to throughout that process, and may schedule synchronization times between the volatile and non-volatile memory in the memory 230. The processor 210 may control the overall operation of the training and evaluation device 130, and may be electrically connected to the memory 230, the user input/output unit 250, and the network input/output unit 270 to control the data flow among them. The processor 210 may be implemented as the CPU (Central Processing Unit) of the training and evaluation device 130.

The memory 230 may include an auxiliary storage device, implemented as non-volatile memory such as an SSD (Solid State Drive) or HDD (Hard Disk Drive), that stores the data required by the training and evaluation device 130, and may include a main storage device implemented as volatile memory such as RAM (Random Access Memory).

The user input/output unit 250 may include an environment for receiving user input and an environment for outputting specific information to the user. For example, the user input/output unit 250 may include an input device with an adapter, such as a touch pad, touch screen, on-screen keyboard, or pointing device, and an output device with an adapter, such as a monitor or touch screen. In one embodiment, the user input/output unit 250 may correspond to a computing device accessed through a remote connection, in which case the training and evaluation device 130 may operate as an independent server.

The network input/output unit 270 includes an environment for connecting to an external device or system through a network, and may include, for example, an adapter for communication over a LAN (Local Area Network), MAN (Metropolitan Area Network), WAN (Wide Area Network), or VAN (Value Added Network).

FIG. 3 is a diagram illustrating the functional configuration of the training and evaluation device of FIG. 1.

Referring to FIG. 3, the training and evaluation device 130 may include a native speaker pronunciation reproducing unit 310, a learner pronunciation receiving unit 320, a fluency measuring unit 330, an accuracy measuring unit 340, a pronunciation evaluation unit 350, a learning content recommendation unit 360, and a control unit 370.

The native speaker pronunciation reproducing unit 310 may play back a native speaker's pronunciation of a specific text. The native speaker pronunciation may correspond to recordings of a native speaker reading the various texts included in the learning content, may be generated separately by topic, sentence, and word, and may be stored and managed in the database 150. The native speaker pronunciation reproducing unit 310 may read the data stored in the database 150 at the user's request or as learning progresses and play back the native speaker pronunciation, and may provide it through the user terminal 110 if necessary. In this case, the native speaker pronunciation reproducing unit 310 may operate in conjunction with the dedicated program or application on the user terminal 110.

In one embodiment, the native speaker pronunciation reproducing unit 310 may provide visual highlighting of the specific text in step with the progress of the native speaker's pronunciation during playback. That is, while playing back the native speaker pronunciation, the native speaker pronunciation reproducing unit 310 may display the corresponding text through the related interface and highlight the text in synchronization with the playback, so that the user can immediately recognize which part of the text is being pronounced.

In one embodiment, when a peak value on the speech graph of the native speaker pronunciation exceeds a preset threshold, the native speaker pronunciation reproducing unit 310 may visually mark the word corresponding to that peak value. Here, the speech graph may correspond to a waveform graph obtained by converting the audio signal of the native speaker pronunciation into a waveform. For example, the speech graph may be a graph in which the energy level of the native speaker pronunciation is plotted as a waveform. A peak value may correspond to the maximum value of a crest on the curve of the speech graph.

That is, the native speaker pronunciation reproducing unit 310 may detect peak values on the speech graph, compare each peak value with the preset threshold, and, when the peak value is greater than the threshold, determine the word corresponding to that peak value and display it visually. In this way, the learner can easily identify strongly stressed words while listening to the native speaker pronunciation and can effectively practice an intonation close to that of a native speaker.
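The peak-based word marking described above can be sketched as follows. The sketch assumes the speech graph is a list of energy values and that each word's span in that list is already known (for example from a forced alignment, which the source does not describe). The function names and the local-maximum definition of a peak are illustrative assumptions.

```python
from typing import List, Tuple


def local_peaks(env: List[float], threshold: float) -> List[int]:
    """Indices of local maxima of the speech graph that exceed the threshold."""
    return [i for i in range(1, len(env) - 1)
            if env[i] > env[i - 1] and env[i] >= env[i + 1] and env[i] > threshold]


def stressed_words(env: List[float],
                   words: List[Tuple[str, int, int]],
                   threshold: float) -> List[str]:
    """Return words whose span [start, end) contains a peak above the threshold.

    words holds (text, start_index, end_index) triples; the spans are assumed
    to be supplied by some alignment step outside this sketch.
    """
    peaks = local_peaks(env, threshold)
    return [text for text, start, end in words
            if any(start <= p < end for p in peaks)]


# Example: only the middle word contains a peak above 0.5.
graph = [0.0, 0.2, 0.1, 0.0, 0.9, 0.3, 0.0, 0.4, 0.1]
spans = [("the", 0, 3), ("quick", 3, 6), ("fox", 6, 9)]
marked = stressed_words(graph, spans, threshold=0.5)
```

The returned words are the ones an interface would mark visually for the learner.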

The learner pronunciation receiving unit 320 may receive the learner's pronunciation of the specific text. After the native speaker pronunciation is played back, the learner may listen to it and practice pronouncing the same text, and the learner pronunciation receiving unit 320 may record the learner's pronunciation as it is and collect it as a voice file. Alternatively, the learner pronunciation receiving unit 320 may receive an audio signal of the learner's pronunciation instead of a voice file, and may receive, as data on the learner's pronunciation, a waveform graph obtained from that signal through signal analysis. The learner pronunciation receiving unit 320 may store the collected learner pronunciation data in the database 150.

The fluency measuring unit 330 may measure vocal fluency by comparing the speech graphs of the native speaker pronunciation and the learner pronunciation. The fluency measuring unit 330 may evaluate the fluency of the learner's vocalization through various methods based on the speech graphs. Vocal fluency may correspond to how close the learner's vocalization is to that of a native speaker; specifically, it may correspond to an assessment of how close the learner's intonation and stress are to a native speaker's when pronouncing a specific word or text. The fluency measuring unit 330 may evaluate vocal fluency by measuring the similarity of the speech graphs based on the learner's pronunciation, and may calculate and provide an evaluation score.

In one embodiment, the fluency measurement unit 330 may perform a preprocessing operation for vocalization graph comparison, as follows. The fluency measurement unit 330 may obtain a first vocalization graph for the native speaker pronunciation, obtain a second vocalization graph for the learner pronunciation, detect and align the utterance start point of each of the first and second vocalization graphs, and scale the second vocalization graph according to the gender of the learner pronunciation relative to the gender of the native speaker pronunciation.

More specifically, the fluency measurement unit 330 may determine the start of the first word, that is, the utterance start point (time point), on each vocalization graph through speech-to-text (STT) speech recognition, and may align the vocalization graphs so that their utterance start points coincide. The utterance start point may correspond to the point at which the magnitude of the voice signal first changes. In addition, the fluency measurement unit 330 may normalize the vocalization graphs to compensate for the pitch difference between male and female voices. Normalization reduces the deviation in raw high and low values caused by differences in vocal volume, which can improve the accuracy of the evaluation result.
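By way of non-limiting illustration, the alignment and normalization described above may be sketched in Python as follows. The sketch assumes each vocalization graph is already available as a list of non-negative intensity samples. It substitutes a simple amplitude threshold for the STT-based detection of the utterance start point, and uses min-max scaling as one possible normalization. The function names, the threshold value, and the scaling method are assumptions made for illustration and are not specified in this description.

```python
from typing import List, Sequence


def find_onset(envelope: Sequence[float], threshold: float) -> int:
    """Return the index of the first sample above the threshold.

    Assumption: a fixed threshold stands in for STT-based onset detection.
    Returns len(envelope) when no sample exceeds the threshold.
    """
    for i, value in enumerate(envelope):
        if value > threshold:
            return i
    return len(envelope)


def align_to_onset(envelope: Sequence[float], threshold: float) -> List[float]:
    """Drop the leading silence so the graph starts at the utterance start point."""
    return list(envelope[find_onset(envelope, threshold):])


def normalize(envelope: Sequence[float]) -> List[float]:
    """Min-max scale to [0, 1] to reduce plain loudness differences (hypothetical choice)."""
    if not envelope:
        return []
    lo, hi = min(envelope), max(envelope)
    if hi == lo:
        return [0.0] * len(envelope)
    return [(value - lo) / (hi - lo) for value in envelope]


def preprocess(envelope: Sequence[float], threshold: float) -> List[float]:
    """Align to the utterance start point, then normalize."""
    return normalize(align_to_onset(envelope, threshold))
```

Under these assumptions, two recordings of the same contour that differ only in leading silence and loudness yield nearly identical preprocessed graphs.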

In one embodiment, the fluency measurement unit 330 may detect the silent (mute) sections of each of the first and second vocalization graphs, identify the position and length of each silent section relative to the entire pronunciation section, and perform a first fluency evaluation based on the differences between the silent sections. It may also divide the entire pronunciation section of each of the first and second vocalization graphs into a plurality of partial pronunciation sections, calculate the graph area of each partial pronunciation section, and perform a second fluency evaluation based on the differences in graph area. The evaluation score may then be finalized by integrating the first and second fluency evaluations.

More specifically, the fluency measurement unit 330 may evaluate the fluency of vocalization from different perspectives and combine the results to determine a final fluency score. The first fluency evaluation may correspond to an evaluation of vocal fluency obtained by comparing the position and length of each silent section on the vocalization graphs. Here, a silent (mute) section may arise when the learner pauses between sentences or between words while pronouncing. That is, a silent section is a section of the vocalization graph in which the voice signal stays at or below a specific intensity, and may correspond to a section in which the learner is not vocalizing. As a result, by comparing the silent sections, the fluency measurement unit 330 can effectively measure the learner's expressiveness and sense of rhythm.
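By way of non-limiting illustration, the silent-section comparison may be sketched in Python as follows. The sketch assumes each vocalization graph is a list of intensity samples, treats a run of samples at or below a threshold as a silent section, and expresses position and length as fractions of the entire pronunciation section. The scoring formula, the pairing of silent sections in order of appearance, and all names and default values are hypothetical, since this description does not fix them.

```python
from typing import List, Sequence, Tuple


def silent_sections(envelope: Sequence[float], threshold: float,
                    min_len: int = 2) -> List[Tuple[int, int]]:
    """Return (start, length) for each run of samples at or below the threshold."""
    sections = []
    start = None
    for i, value in enumerate(envelope):
        if value <= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                sections.append((start, i - start))
            start = None
    if start is not None and len(envelope) - start >= min_len:
        sections.append((start, len(envelope) - start))
    return sections


def silence_score(native: Sequence[float], learner: Sequence[float],
                  threshold: float, min_len: int = 2) -> float:
    """Score in [0, 1]; 1.0 means silent sections match in position and length."""
    def relative(envelope):
        total = len(envelope)
        return [(s / total, n / total)
                for s, n in silent_sections(envelope, threshold, min_len)]

    a, b = relative(native), relative(learner)
    count = max(len(a), len(b))
    if count == 0:
        return 1.0
    penalty = 0.0
    for k in range(count):
        if k < len(a) and k < len(b):
            # Hypothetical penalty: position difference plus length difference.
            penalty += min(1.0, abs(a[k][0] - b[k][0]) + abs(a[k][1] - b[k][1]))
        else:
            penalty += 1.0  # a pause present in only one of the two graphs
    return 1.0 - penalty / count
```

A learner who reads straight through a text in which the native speaker pauses twice would receive the lowest score under this formula, which is one way to reflect the rhythm difference described above.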

In addition, the second fluency evaluation may be performed by measuring and comparing the areas of the vocalization graphs, and may correspond to an evaluation of vocal fluency obtained by comparing the similarity in form (for example, shape) of the vocalization graphs. The second fluency evaluation differs in method from the first fluency evaluation in that it can be performed without separately measuring the maximum (high) value, the minimum (low) value, or the silent (mute) sections on the vocalization graph. The second fluency evaluation may be performed based on the similarity of the total area over the entire pronunciation section of the vocalization graph, or by dividing the entire pronunciation section into partial pronunciation sections and comparing the area of each section.
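By way of non-limiting illustration, the per-section area comparison may be sketched in Python as follows. The sketch assumes both vocalization graphs are lists of non-negative intensity samples that have already been aligned and brought to a comparable length, approximates the area of a partial pronunciation section by the sum of its samples, and scores each section by the ratio of the smaller area to the larger. The number of sections and the ratio-based score are assumptions made for illustration.

```python
from typing import List, Sequence


def segment_areas(envelope: Sequence[float], n_segments: int) -> List[float]:
    """Split the graph into n_segments equal parts and sum the samples in each."""
    n = len(envelope)
    areas = []
    for k in range(n_segments):
        lo = k * n // n_segments
        hi = (k + 1) * n // n_segments
        areas.append(sum(envelope[lo:hi]))
    return areas


def area_score(native: Sequence[float], learner: Sequence[float],
               n_segments: int = 4) -> float:
    """Average per-section area similarity in [0, 1] (hypothetical formula)."""
    native_areas = segment_areas(native, n_segments)
    learner_areas = segment_areas(learner, n_segments)
    total = 0.0
    for x, y in zip(native_areas, learner_areas):
        larger = max(x, y)
        # Two empty sections count as a perfect match.
        total += 1.0 if larger == 0 else min(x, y) / larger
    return total / n_segments
```

Setting `n_segments` to 1 reduces the sketch to the whole-section comparison that the paragraph above also mentions.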

In addition, the fluency measurement unit 330 may finalize the evaluation score for the fluency of vocalization by integrating the first and second fluency evaluations, and may apply various aggregation algorithms to do so. For example, the fluency measurement unit 330 may apply a weight to each of the first and second fluency evaluations and calculate the final evaluation score as a weighted sum. The weight for each evaluation result may be set in advance by the training and evaluation device 130, and may be set differently depending on the content of the text used for learning, the difficulty of pronunciation, the composition of the words, and the like.
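By way of non-limiting illustration, the weighted sum may be sketched in Python as follows. The default weights and the division by the weight total (so that the result stays on the same scale as the inputs) are assumptions; this description states only that the weights are preset and may vary with the learning text.

```python
def combine_fluency(first_score: float, second_score: float,
                    w_first: float = 0.5, w_second: float = 0.5) -> float:
    """Weighted combination of the first and second fluency evaluations.

    The weights are hypothetical presets; they are divided by their total so
    that the combined score stays within the range of the input scores.
    """
    total = w_first + w_second
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_first * first_score + w_second * second_score) / total
```

For a text with many pauses, for instance, a larger `w_first` would emphasize the silent-section comparison over the area comparison.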

The accuracy measurement unit 340 may convert the learner pronunciation into an evaluation text and measure the accuracy of pronunciation by comparing the evaluation text with the specific text. The training and evaluation device 130 may evaluate foreign language learning from various perspectives, and may evaluate the accuracy of pronunciation independently of the fluency of vocalization. The accuracy measurement unit 340 may be implemented and operated independently of the fluency measurement unit 330, and the accuracy measurement may be performed in parallel with the fluency measurement. The accuracy measurement may be performed on text that an STT algorithm converts from the voice signal of the learner's pronunciation, and may produce a quantified result according to the degree of match with the text of the native speaker pronunciation.

In one embodiment, the accuracy measurement unit 340 may evaluate the accuracy of pronunciation by detecting, in the evaluation text, main words corresponding to peak values on the corresponding vocalization graph; detecting, in the evaluation text, auxiliary words connected before and after each of the main words; and determining an evaluation score according to whether each of the main words and auxiliary words matches the specific text.

More specifically, the accuracy measurement unit 340 may detect the main words based on peak values on the vocalization graph. That is, the main words may correspond to the most strongly stressed words when a specific sentence is pronounced, and to the words that determine its intonation. The accuracy measurement unit 340 may also perform the accuracy evaluation based on the result of comparing the main words.

In addition, the accuracy measurement unit 340 may determine the words connected immediately before and after the main words as auxiliary words, and may reflect the result of comparing the auxiliary words in the accuracy evaluation. That is, the accuracy measurement unit 340 may calculate the final evaluation score for the accuracy of pronunciation by integrating the primary comparison result for the main words with the secondary comparison result for the auxiliary words. For example, the accuracy measurement unit 340 may calculate a first accuracy as the ratio of the number of matching main words to the total number of main words, calculate a second accuracy as the ratio of the number of matching auxiliary words to the total number of auxiliary words, and determine the final evaluation score for the accuracy of pronunciation as a weighted sum of the first and second accuracies.
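By way of non-limiting illustration, the two-stage accuracy calculation may be sketched in Python as follows. The sketch assumes the positions of the main words in the specific text are already known (for example, from peak detection on the vocalization graph), compares the evaluation text and the specific text word by word at the same positions without handling inserted or dropped words, and uses hypothetical weights of 0.7 and 0.3.

```python
from typing import List, Sequence


def auxiliary_indices(main_idx: Sequence[int], n_words: int) -> List[int]:
    """Indices of the words immediately before and after each main word."""
    main = set(main_idx)
    aux = set()
    for i in main:
        for j in (i - 1, i + 1):
            if 0 <= j < n_words and j not in main:
                aux.add(j)
    return sorted(aux)


def match_ratio(reference: Sequence[str], recognized: Sequence[str],
                indices: Sequence[int]) -> float:
    """Share of the given positions where the recognized word equals the reference word."""
    if not indices:
        return 1.0
    hits = sum(
        1 for i in indices
        if i < len(recognized) and recognized[i].lower() == reference[i].lower()
    )
    return hits / len(indices)


def accuracy_score(reference: Sequence[str], recognized: Sequence[str],
                   main_idx: Sequence[int],
                   w_main: float = 0.7, w_aux: float = 0.3) -> float:
    """Weighted sum of the main-word accuracy and the auxiliary-word accuracy."""
    main = sorted(set(main_idx))
    aux = auxiliary_indices(main, len(reference))
    first = match_ratio(reference, recognized, main)   # first accuracy
    second = match_ratio(reference, recognized, aux)   # second accuracy
    return w_main * first + w_aux * second
```

In this sketch, mispronouncing a single stressed word lowers the score more than mispronouncing one of its neighbors, because the main-word weight is the larger of the two.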

The pronunciation evaluation performing unit 350 may perform a pronunciation evaluation based on the fluency of vocalization and the accuracy of pronunciation to generate an evaluation result. That is, the pronunciation evaluation performing unit 350 may measure the learner's pronunciation separately and independently in terms of vocalization and pronunciation, and may generate and provide a final evaluation result by integrating the two. In addition, the pronunciation evaluation performing unit 350 may visualize and provide real-time statistical indicators of the learner's learning and evaluation through various interfaces. This allows the learner to grasp the state of his or her foreign language learning effectively and can increase the learner's motivation to learn.

The learning content recommendation unit 360 may accumulate and manage training and evaluation histories by difficulty, week, course, and topic of foreign language learning, and may recommend learning content according to the learner's learning level. The learning content recommendation unit 360 may manage the learning and evaluation data received from the pronunciation evaluation performing unit 350 for each user and store it in the database 150.

In addition, the learning content recommendation unit 360 may provide a history of foreign language learning for each user, and may visualize and provide detailed statistical indicators through various interfaces. Through the visualized information, the user can effectively recognize his or her learning progress and situation, and the need for additional learning.

In addition, the learning content recommendation unit 360 may recommend learning content that a user needs based on that user's evaluation results. To this end, the learning content recommendation unit 360 may refer to the learning records of other users, and may use recommendation algorithms such as collaborative filtering (CF) and content-based filtering (CBF).

In addition, the learning content recommendation unit 360 may recommend learning content suited to the learner's level. The learning content may be classified by learning type. For example, shadowing may correspond to content in which the learner studies by repeating after the native speaker pronunciation, and read aloud may correspond to content in which the learner studies by reading the learning text aloud without a native speaker pronunciation. The learning content recommendation unit 360 may recommend appropriate learning content by adjusting the learning difficulty according to the learner's level. For learning content at each difficulty level, the number of repetitions and the difficulty of the learning text may be set in various ways, together with additional functions such as hiding the English text.

In addition, the learning content recommendation unit 360 may provide learning content for intensive practice of particular pronunciations, based on a record of the pronunciations the learner frequently gets wrong. For example, if a learner frequently mispronounces 'R' and 'L' during learning, the learning content recommendation unit 360 may support the learner by concentrating on learning content that trains those pronunciations.
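By way of non-limiting illustration, selecting content for frequently mispronounced sounds may be sketched in Python as follows. The error log format, the tagging of each content item with the sounds it trains, the minimum count, and the content titles are all hypothetical; this description does not specify how the record of mispronunciations is stored.

```python
from collections import Counter
from typing import Dict, List, Sequence


def frequent_errors(error_log: Sequence[str], min_count: int = 3) -> List[str]:
    """Sounds the learner has missed at least min_count times, most frequent first."""
    counts = Counter(error_log)
    return [sound for sound, n in counts.most_common() if n >= min_count]


def recommend_for_errors(contents: Dict[str, Sequence[str]],
                         error_log: Sequence[str],
                         min_count: int = 3) -> List[str]:
    """Titles of content items tagged with any frequently missed sound."""
    targets = set(frequent_errors(error_log, min_count))
    return [title for title, sounds in contents.items()
            if targets & set(sounds)]


# Hypothetical catalogue: title -> sounds the item is designed to train.
catalogue = {
    "Red lorry, yellow lorry": ["R", "L"],
    "Think about this": ["TH"],
    "Light rain": ["L", "R"],
}
log = ["R", "L", "R", "TH", "L", "R", "L"]
picked = recommend_for_errors(catalogue, log)
```

With this log, `picked` contains the two items tagged with 'R' or 'L', and the 'TH' item is left out because that sound was missed only once.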

The control unit 370 may control the overall operation of the training and evaluation device 130, and may manage the control flow or data flow among the native speaker pronunciation reproducing unit 310, the learner pronunciation receiving unit 320, the fluency measurement unit 330, the accuracy measurement unit 340, the pronunciation evaluation performing unit 350, and the learning content recommendation unit 360.

FIG. 4 is a flowchart illustrating a foreign language pronunciation training and evaluation process according to the present invention.

Referring to FIG. 4, the training and evaluation device 130 may reproduce the native speaker pronunciation of a specific text through the native speaker pronunciation reproducing unit 310 (step S410). The training and evaluation device 130 may receive the learner pronunciation of the specific text through the learner pronunciation receiving unit 320 (step S430). The training and evaluation device 130 may measure the fluency of vocalization through the fluency measurement unit 330 by comparing the vocalization graphs of the native speaker pronunciation and the learner pronunciation (step S450).

In addition, the training and evaluation device 130 may, through the accuracy measurement unit 340, convert the learner pronunciation into an evaluation text and measure the accuracy of pronunciation by comparing the evaluation text with the specific text (step S470). The training and evaluation device 130 may, through the pronunciation evaluation performing unit 350, perform a pronunciation evaluation based on the fluency of vocalization and the accuracy of pronunciation to generate an evaluation result (step S490).

In one embodiment, the training and evaluation device 130 may further include a learning content recommendation unit 360 that accumulates and manages training and evaluation histories by week, course, and topic of foreign language learning and recommends learning content for improving training and evaluation results.

FIGS. 5 and 6 are diagrams illustrating an embodiment of an interface provided by the foreign language pronunciation training and evaluation system according to the present invention.

Referring to FIGS. 5 and 6, the training and evaluation device 130 may provide various functions that support training and evaluation of a learner's pronunciation during foreign language learning. For systematic foreign language learning, the training and evaluation device 130 may sequentially provide learning topics designed for each week and each course according to a learning curriculum.

In the interface shown in FIG. 5, a weekly assignment name 510 may be displayed at the top center of the screen, and a course name 520 may be displayed at the top right, providing information about the content currently being studied. In addition, the learning text 530 currently being studied may be displayed in the upper part of the screen in both Korean and the foreign language, one above the other. The training and evaluation device 130 may also interoperate with an external system to provide external content as learning content.

In addition, the training and evaluation device 130 may provide a highlighting function for the learning text 530 displayed on the screen, so that the learning text 530 is highlighted in sync with the native speaker pronunciation currently being played. This allows the learner to see intuitively which word of the displayed learning text 530 the native speaker is currently pronouncing.

In addition, the training and evaluation device 130 may visualize and display the vocalization graph 550 of the native speaker pronunciation in the center of the screen. The vocalization graph 550 may be drawn as a curve according to the intensity (or energy level) of the pronunciation, so that the learner can see and intuitively understand the intensity and intonation of each section of the pronunciation while listening to the native speaker. The training and evaluation device 130 may display the main words 540 corresponding to peak values on the vocalization graph 550 of the native speaker pronunciation alongside the graph, above it. This allows the learner to understand intuitively which words in the learning text 530 require particular attention to pronunciation.

In addition, the training and evaluation device 130 may record the learner's foreign language pronunciation, and may receive the learner's foreign language pronunciation in real time and display the corresponding vocalization graph 550 in the center of the screen. This allows the learner to understand intuitively the differences in intensity, intonation, and stress between the native speaker pronunciation and his or her own actual pronunciation, and to identify, from the sections where the difference is large, the parts on which study should be concentrated.

Here, the training and evaluation device 130 may detect the utterance start point of the learner's foreign language pronunciation and, using that point as a reference, align the learner's vocalization graph so that it coincides with the utterance start point of the native speaker pronunciation before displaying it on the screen. The learner may begin pronouncing after listening to the native speaker pronunciation, but it is not easy to start at exactly the right moment. The learner may therefore not start at the expected time, and if the recording were displayed as-is, the learner's vocalization graph could be out of sync with that of the native speaker pronunciation. To prevent this, the training and evaluation device 130 may automatically remove the silent section at the very beginning of the learner pronunciation and automatically align the displayed vocalization graph to the learner's first utterance start point.

In addition, the training and evaluation device 130 may evaluate the learner's pronunciation against the native speaker pronunciation in two aspects and display the evaluation result visually at the bottom center of the screen. In FIGS. 5 and 6, the evaluation result may be displayed in a speech bubble, and the overall evaluation result 570 for the learning text 530 may be displayed as pass (PASS) or fail (FAIL). Along with this, the degree of match 620 between the native speaker pronunciation and the learner pronunciation may be displayed as a more specific numerical value.

In addition, for the pronunciation evaluation, the training and evaluation device 130 may collect the position and length of each silent (mute) section 630 on the vocalization graph 550, and may compare the silent sections 630 of the native speaker pronunciation and the learner pronunciation to evaluate the fluency of vocalization. To compare the vocalization graphs, the training and evaluation device 130 may detect the utterance start point 610 on each vocalization graph 550 and align the vocalization graphs 550 based on the utterance start points 610.

In addition, the training and evaluation device 130 may evaluate the accuracy of pronunciation using STT speech recognition. For example, when the learner mumbles through the learning text 530, the vocalization graphs 550 may be similar to each other while the accuracy of pronunciation differs considerably, and the points added or deducted when the evaluations are integrated may bring the overall evaluation result to 0 points. A deduction may be applied when the accuracy of pronunciation is below 20%, and bonus points may be applied when the accuracy of pronunciation is 100%.
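By way of non-limiting illustration, the deduction and bonus rule may be sketched in Python as follows. The 20% and 100% thresholds come from the paragraph above; the sizes of the deduction and the bonus, the 0 to 100 scale, and the use of the fluency score as the base are assumptions, since this description does not state them.

```python
def overall_score(fluency: float, accuracy: float,
                  deduction: float = 100.0, bonus: float = 5.0) -> float:
    """Adjust a fluency score (0 to 100) by the pronunciation accuracy (percent).

    deduction and bonus are hypothetical amounts. The default deduction is
    large enough that a mumbled reading with a similar vocalization graph
    but accuracy below 20% ends at 0 points.
    """
    score = fluency
    if accuracy < 20.0:
        score -= deduction
    elif accuracy >= 100.0:
        score += bonus
    # Clamp to the assumed 0 to 100 scale.
    return max(0.0, min(100.0, score))
```

With these defaults, a fluency score of 90 combined with 10% accuracy yields 0, while the same fluency score with 100% accuracy yields 95.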

In addition, the training and evaluation device 130 may provide, through the interface for foreign language learning, a control bar 560 for listening to the native speaker pronunciation. In FIG. 5, the control bar 560 may include a re-recording function, a volume control function, play and stop functions, a playback speed function, and a font size adjustment function.

Although the present invention has been described above with reference to preferred embodiments, those skilled in the art will understand that the present invention may be modified and changed in various ways without departing from the spirit and scope of the invention as set forth in the claims below.

100: foreign language pronunciation training and evaluation system
510: weekly assignment name 520: course name
530: learning text 540: main word
550: vocalization graph 560: control bar
570: overall evaluation result
610: utterance start point 620: degree of match
630: silent section

Claims

A foreign language pronunciation training and evaluation system comprising:
a native speaker pronunciation reproducing unit that reproduces a native speaker pronunciation of a specific text;
a learner pronunciation receiving unit that receives a learner pronunciation of the specific text;
a fluency measurement unit that measures the fluency of vocalization by comparing vocalization graphs of the native speaker pronunciation and the learner pronunciation;
an accuracy measurement unit that converts the learner pronunciation to generate an evaluation text and measures the accuracy of pronunciation by comparing the evaluation text with the specific text; and
a pronunciation evaluation performing unit that performs a pronunciation evaluation based on the fluency of vocalization and the accuracy of pronunciation to generate an evaluation result,
wherein the fluency measurement unit performs a preprocessing operation for the vocalization graph comparison through the steps of: obtaining a first vocalization graph for the native speaker pronunciation; obtaining a second vocalization graph for the learner pronunciation; detecting and aligning an utterance start point of each of the first and second vocalization graphs; and scaling the second vocalization graph according to the gender of the learner pronunciation relative to the gender of the native speaker pronunciation, and
measures the fluency of vocalization through the steps of: detecting silent (mute) sections of each of the first and second vocalization graphs, identifying the position and length of each silent section relative to the entire pronunciation section, and performing a first fluency evaluation based on the differences between the silent sections; dividing the entire pronunciation section of each of the first and second vocalization graphs into a plurality of partial pronunciation sections, calculating the graph area of each of the partial pronunciation sections, and performing a second fluency evaluation based on the differences in graph area; and finalizing an evaluation score by integrating the first and second fluency evaluations.

The foreign language pronunciation training and evaluation system of claim 1, wherein the native speaker pronunciation reproducing unit
provides visual highlighting of the specific text in step with the progress of the native speaker pronunciation during playback, and,
when a peak value on the vocalization graph of the native speaker pronunciation exceeds a preset threshold, visually displays the word corresponding to that peak value.

(Deleted)

The foreign language pronunciation training and evaluation system of claim 1, wherein the accuracy measurement unit measures the accuracy of pronunciation through the steps of:
detecting, in the evaluation text, main words corresponding to peak values on the corresponding vocalization graph;
detecting, in the evaluation text, auxiliary words connected before and after each of the main words; and
determining an evaluation score according to whether each of the main words and the auxiliary words matches the specific text.

The foreign language pronunciation training and evaluation system of claim 1,
further comprising a learning content recommendation unit that accumulates and manages training and evaluation histories by difficulty, week, course, and topic of foreign language learning and recommends learning content according to the learner's learning level.