KR100898104B1

KR100898104B1 - Learning system and method by interactive conversation

Info

Publication number: KR100898104B1
Application number: KR1020070042708A
Authority: KR
Inventors: 김시은
Original assignee: 김시은
Priority date: 2007-05-02
Filing date: 2007-05-02
Publication date: 2009-05-18
Also published as: KR20080097619A

Abstract

The present invention relates to an interactive learning system and method. The system comprises control means (50, 60), voice recognition means (20), voice output means (30), image output means (40), and data storage means (70). The system outputs question data from the learning material, then outputs the plurality of stored answer data corresponding to that question through the voice output means in a pre-stored standard voice so that the user can select one. When the user repeats the selected answer aloud, the system receives the utterance through the voice recognition means and compares its pronunciation and accent with the pre-stored standard-voice answer data. If the comparison result reaches a predetermined speech recognition value, the system proceeds to the next exchange. If it does not, the system outputs, through the voice output means, data instructing the user to input the selected answer again, and then compares the pronunciation and accent of the newly input answer to determine whether the predetermined speech recognition value has been reached. As a result, a learner's inaccurate pronunciation is not simply marked as a wrong answer; instead, interactive language learning continues while the learner's pronunciation is corrected.

Learning, conversation, pronunciation, correction, speech recognition

Description

Interactive Learning Systems and Methods {LEARNING SYSTEM AND METHOD BY INTERACTIVE CONVERSATION}

FIG. 1 is a schematic flowchart illustrating the flow of an interactive learning method according to an embodiment of the present invention.

FIG. 2 is a schematic diagram illustrating tree-structured question material and answer material used as learning material in a conventional interactive learning method.

FIG. 3 is a schematic block diagram illustrating the configuration of an interactive learning system according to an embodiment of the present invention.

FIG. 4 is a schematic flowchart illustrating the flow of a conventional interactive learning method.

FIG. 5 is a schematic diagram illustrating (a) the divergent tree-structured conversation flow of the conventional interactive learning method and, by contrast, (b) the convergent tree-structured conversation flow of the interactive learning method of the present invention.

<Brief description of the main parts of the drawings>

10: key input unit    20: voice recognition unit

21: microphone    30: voice output unit

31: speaker    40: image output unit

41: monitor    50: main processor

60: subprocessor    61: background image processor

62: background sound processor    63: voice output processor

64: voice recognition processor    70: data storage unit

71: background image database    72: background sound database

73: voice output database    74: voice recognition database

The present invention relates to an interactive learning system and method, and more particularly to a language learning system and method in which a machine and a learner (the user) converse with each other using a computer and a speech recognition program. Unlike question-bank style learning, which merely presents a question, receives an answer, and marks it right or wrong in short, isolated exchanges, the invention lets a relatively long, situation-based dialogue continue without interruption even when the learner's inaccurate pronunciation cannot be recognized. It supports long conversations that appear to flow naturally, and it further includes the learner's name in the conversation so that the interactive learning feels closer to a real situation.

In general, when learning a language such as a second foreign language, listening to and speaking many sentences is considered important. Accordingly, language learning materials in which the learner listens to sentences on recorded tapes or CDs and repeats them are widely known. However, such materials are not interactive and depend heavily on one-sided effort by the learner, so learners easily become bored and efficient learning is difficult to achieve.

To solve this problem, as technologies such as computers, the Internet, and speech recognition software have advanced, interactive question-and-answer exchanges between a user and a computer have become possible, and improved learning methods have been proposed in which the user studies interactively and autonomously on a computer. That is, a method is known in which a computer program outputs a question, the user answers it, and the computer understands the answer through its speech recognition function and decides whether it is correct or incorrect. Such methods are disclosed, for example, in the publications of Korean Patent Registration No. 10-0447667 (patentee: 이경목 (Kyung-Mok Lee), “Interactive Language Learning System Using a Computer with a Speech Recognition Function and Learning Dolls”) and Korean Patent Registration No. 10-0296272 (patentee: 이경목 (Kyung-Mok Lee), “Interactive Language Learning System”).

As shown in FIG. 4 attached to this specification, in such a conventional learning system using a computer or the Internet, once a learning level is set (S316), pre-stored learning material is output as voice, background image, and background sound through the user's monitor and speaker according to a program routine (S318, S319). When the user gives an answer, the system recognizes it and determines whether it matches the presented answer material (S320), decides whether it is correct or incorrect (S322, S323), and asks whether to repeat the lesson (S324). As in the structure illustrated in FIG. 2, this learning system stores several answer items for each question item and a separate question item for each answer item, and it claims that language learning is thereby possible through tree-structured interactive learning.

However, because this learning system judges an answer correct or incorrect according to whether the user's answer matches the pre-stored answer material, it may be adequate when, for example, a foreigner whose native language is English arrives at Gimpo Airport in Korea, is asked questions in English, and answers in English. When the user is learning a second foreign language, on the other hand, it often happens that the computer's speech recognition program cannot recognize the answer because of the inaccurate pronunciation of a user who is unfamiliar with that language. The answer is then treated as incorrect, the conversation cannot proceed along the tree-structured conversation path shown in FIG. 2, and the interactive learning program can hardly continue at all. This learning system is therefore limited as a system for learning a second foreign language rather than one's native language.

There is therefore a need for a technology that uses a computer and a speech recognition program to let a learner study a second foreign language efficiently, naturally, and interactively along one of a plurality of predetermined conversation paths without interruption, even when the learner inputs an answer with inaccurate pronunciation.

The present invention was devised to meet the need described above. Its object is to provide a new interactive language learning system and method in which a machine and a learner (the user) converse with each other using a computer and a speech recognition program. Unlike question-bank style learning, which merely presents a question, receives an answer, and marks it right or wrong in short, isolated exchanges, the invention can correct the learner's pronunciation even when an inaccurate pronunciation cannot be recognized, while letting a relatively long, situation-based dialogue continue without interruption. It supports long conversations that appear to flow naturally, and it further includes the learner's name in the conversation so that the interactive learning feels closer to a real situation.

This object is achieved by the interactive learning system and method of the present invention.

An interactive learning system according to one aspect of the present invention comprises: voice recognition means that includes voice input means for receiving a human voice and converting it into an electrical signal, analyzes the converted voice signal, converts it, on the basis of pronunciation and accent, into words matching the content the voice signal represents, and outputs the words to control means; voice output means that includes a speaker, receives a voice signal from the control means, processes it, and outputs it through the speaker; image output means that includes a monitor, receives an image signal from the control means, processes it, and outputs it through the monitor; key input means; and data storage means that stores, as a learning material database, question data based on various situations that can arise in real-life conversation, a plurality of answer data for each question, and image data and sound data suited to the content of each question and each answer.

In this system, the control means outputs the question data of the learning material read from the data storage means through the image output means, the voice output means, or both. It then outputs the plurality of stored answer data for that question through the voice output means in a pre-stored standard voice so that the user can select one. When the user repeats the selected answer aloud, the control means receives the utterance through the voice recognition means and compares its pronunciation and accent with the pre-stored standard-voice answer data. If the comparison result reaches a predetermined speech recognition value, the control means proceeds to the next exchange. If it does not, the control means outputs, through the voice output means, data instructing the user to input the selected answer again, and then compares the pronunciation and accent of the answer input in response to determine whether the predetermined speech recognition value has been reached.

In a preferred embodiment, the control means further controls the following operation before it outputs the plurality of stored answer data for the question in the pre-stored standard voice: when an arbitrary answer from the user arrives through the voice recognition means, the control means recognizes it, extracts its meaning, and compares the extracted meaning with the answer data read from the data storage means. If the comparison result reaches a predetermined meaning recognition value, the system proceeds to the next exchange. If it does not, the control means outputs the plurality of stored answer data for the question through the voice output means in the pre-stored standard voice so that the user can select one.

In a preferred embodiment, the control means further controls an operation of including the user's previously entered name when it outputs the question data of the learning material read from the data storage means and when it outputs other instruction data to the user.
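One way to picture the name-inclusion operation is a stored prompt with a placeholder that is filled in before output. The sketch below is only an illustration: the `{name}` placeholder, the `personalize` function, and the sample sentences are assumptions and do not appear in the specification.

```python
PLACEHOLDER = "{name}"  # hypothetical marker inside stored question/instruction text


def personalize(template: str, learner_name: str) -> str:
    """Fill the learner's previously entered name into a stored prompt."""
    name = learner_name.strip()
    if not name:
        # The specification assumes the name was entered beforehand.
        raise ValueError("learner name must be entered before the lesson starts")
    return template.replace(PLACEHOLDER, name)


# Question data and a repetition instruction, both carrying the learner's name.
question = personalize("Good morning, {name}. Where are you travelling today?", "Sieun")
retry = personalize("{name}, please say that answer once more.", "Sieun")
```

The same function serves both cases named in the text: question data and other instruction data sent to the user.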

An interactive learning method according to one aspect of the present invention uses a system comprising: voice recognition means that includes voice input means for receiving a human voice and converting it into an electrical signal, analyzes the converted voice signal, converts it, on the basis of pronunciation and accent, into words matching the content the voice signal represents, and outputs the words to control means; voice output means that includes a speaker, receives a voice signal from the control means, processes it, and outputs it through the speaker; image output means that includes a monitor, receives an image signal from the control means, processes it, and outputs it through the monitor; key input means; and data storage means that stores, as a learning material database, question data based on various situations that can arise in real-life conversation, a plurality of answer data for each question, and image data and sound data suited to the content of each question and each answer.

The method comprises: a conversation start step in which the control means outputs the question data of the learning material read from the data storage means through the image output means, the voice output means, or both; a standard answer presentation step in which the plurality of stored answer data for the question are output through the voice output means in a pre-stored standard voice so that the user can select one; a speech recognition value determination step in which what the user says in repeating the selected answer is received through the voice recognition means and its pronunciation and accent are compared with the pre-stored standard-voice answer data; an answer repetition instruction step in which, if the comparison result reaches a predetermined speech recognition value, the method proceeds to the next exchange, and if it does not, data instructing the user to input the selected answer again is output through the voice output means; a repeated speech recognition value determination step in which the pronunciation and accent of the answer input in response are compared to determine whether the predetermined speech recognition value has been reached; and a correction step in which, if the repeated comparison result reaches the predetermined speech recognition value, the method proceeds to the next exchange, and if it does not, the answer repetition instruction step and the repeated speech recognition value determination step are repeated.
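The sequence of steps in this method can be sketched as a single function. This is a minimal sketch under stated assumptions, not the patented implementation: `run_exchange`, the `speak` and `listen` callbacks, the default threshold, and the `max_attempts` cap are all hypothetical, and `word_overlap` is a text-only stand-in for the pronunciation and accent comparison. The sketch also assumes the selected answer is the stored answer closest to what was heard, which the specification does not state.

```python
from typing import Callable, List, Tuple


def word_overlap(heard: str, target: str) -> float:
    """Stand-in score: shared words divided by all distinct words (0.0 to 1.0)."""
    a, b = set(heard.lower().split()), set(target.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def run_exchange(
    question: str,
    answers: List[str],
    speak: Callable[[str], None],
    listen: Callable[[], str],
    score: Callable[[str, str], float] = word_overlap,
    threshold: float = 0.9,
    max_attempts: int = 5,
) -> Tuple[int, bool]:
    """Run one question/answer exchange; return (attempts used, passed)."""
    speak(question)                      # conversation start step
    for answer in answers:               # standard answer presentation step
        speak(answer)
    heard = listen()                     # learner repeats one of the answers
    # Assumption: the stored answer closest to the utterance is the selected one.
    target = max(answers, key=lambda a: score(heard, a))
    attempts = 1
    # Correction step: repeat instruction and re-scoring until the value is reached.
    while score(heard, target) < threshold and attempts < max_attempts:
        speak(f"Please repeat: {target}")   # answer repetition instruction step
        heard = listen()                     # repeated determination step
        attempts += 1
    return attempts, score(heard, target) >= threshold


# Scripted demo: the first attempt drops a word, the second matches.
spoken: List[str] = []
replies = iter(["I am going Seoul", "I am going to Seoul"])
result = run_exchange(
    "Where are you going?",
    ["I am going to Seoul", "I am staying home"],
    speak=spoken.append,
    listen=lambda: next(replies),
)
```

The attempt cap is a practical guard added for the sketch; the text itself describes repeating until the recognition value is reached.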

In a preferred embodiment, the method further comprises, before the standard answer presentation step, an arbitrary-answer meaning extraction and comparison step in which, when an arbitrary answer from the user arrives through the voice recognition means, the answer is recognized, its meaning is extracted, and the extracted meaning is compared with the answer data read from the data storage means. If the comparison result reaches a predetermined meaning recognition value, the method proceeds to the next exchange; if it does not, the standard answer presentation step is performed.

Preferred embodiments of the interactive learning system and method according to the present invention are described in detail below with reference to the accompanying drawings. FIG. 1 is a schematic flowchart illustrating the flow of an interactive learning method according to an embodiment of the present invention, and FIG. 3 is a schematic block diagram illustrating the configuration of an interactive learning system according to an embodiment of the present invention.

As shown in FIG. 3, an interactive learning system according to one aspect of the present invention may include voice recognition means (20), voice output means (30), image output means (40), key input means (10), control means (50, 60), and a data storage unit (70). The voice recognition means (20) is connected to, or includes, voice input means such as a microphone (21) that receives a human voice and converts it into an electrical signal; it analyzes the converted voice signal, converts it, on the basis of pronunciation and accent, into words matching the content the signal represents, and outputs them to the control means (50, 60). The voice output means (30) is connected to, or includes, an output device such as a speaker (31); it receives a voice signal from the control means, processes it, and outputs it through the speaker. The image output means (40) is connected to, or includes, a display device such as a monitor (41); it receives an image signal from the control means, processes it, and outputs it through the monitor.

The key input means (10) is, for example, a keyboard with a plurality of buttons and can be used by the user to enter data. The data storage means (70) corresponds to what is shown in the drawing as the data storage unit (70). It can store, as a learning material database, question data based on various situations that can arise in real-life conversation, a plurality of answer data for each question, and image data and sound data suited to the content of each question and each answer.
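The learning material database described here can be pictured as question records that each carry several answer records, with image and sound data attached to both. The sketch below is one possible in-memory layout, offered as an assumption: the class names, field names, file names, and sample sentences are hypothetical. Both answers point to the same follow-up question, in the spirit of the convergent flow named for FIG. 5(b).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Answer:
    text: str                     # answer sentence
    standard_voice: str           # hypothetical path to the standard-voice clip
    image: str                    # image data suited to this answer
    sound: str                    # sound data suited to this answer
    next_question: Optional[str]  # id of the question that follows, or None


@dataclass
class Question:
    qid: str
    text: str
    image: str                    # background image for the question
    sound: str                    # background sound for the question
    answers: List[Answer] = field(default_factory=list)


lesson: Dict[str, Question] = {
    "q1": Question(
        qid="q1", text="May I see your passport?",
        image="airport.png", sound="terminal.wav",
        answers=[
            Answer("Here you are.", "q1_a1.wav", "passport.png", "terminal.wav", "q2"),
            Answer("Just a moment, please.", "q1_a2.wav", "bag.png", "terminal.wav", "q2"),
        ],
    ),
    "q2": Question(qid="q2", text="How long will you stay?",
                   image="desk.png", sound="terminal.wav"),
}

first = lesson["q1"]
# Collect the distinct questions reachable from the first question's answers.
follow_ups = {lesson[a.next_question].text for a in first.answers if a.next_question}
```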

Such a system can be implemented on a local personal computer. More preferably, it can be implemented through a remote communication system, for example a website on the Internet. Websites for learning are well known. In this case the user's local computer handles the output side, with a monitor, microphone, speaker, and the like, while the operator of the learning website handles database management and the control of input and output data. Such learning website systems are well known in the art, and the system of FIG. 3 can easily be reconfigured as a website system, so it is noted that such learning website systems also fall within the scope of the present invention.

In the interactive learning system configured in this way, the control means (50, 60) may be divided into a main processor (50) and a subprocessor (60), as in the illustrated example. The subprocessor (60) may include a background image processor (61), a background sound processor (62), a voice output processor (63), and a voice recognition processor (64). The data storage unit (70) may be composed of separate databases: a background image database (DB) (71), a background sound DB (72), a voice output DB (73), and a voice recognition DB (74). The example in FIG. 3 illustrates only one possible system configuration. The present invention is not limited to it, and a person skilled in the art will readily understand that various specific configurations can implement the features of the invention.

The characteristic feature of the interactive learning system and method of the present invention lies in the way the control means (50, 60) operates and, according to one embodiment, concerns the process by which the learner corrects pronunciation by repeating one of a plurality of answers spoken in a standard voice. Specifically, when learning starts, the control means (50, 60) reads the background image, the background sound, and the question data of the learning material from the data storage unit (70), combines them, and outputs them to the image or voice output means. The user, that is, the learner, can see the question data on the monitor (41) against the background image and background sound, and can hear it through the speaker (31) connected to the voice output unit (30). After the question data is output, the control means outputs the plurality of stored answer data for the question through the voice output means in the pre-stored standard voice so that the user can select one. The learner selects one of the answers output through the speaker (31) and speaks it, thereby inputting it through the microphone (21).

The control means receives, through the voice recognition means, what the user says in repeating the answer selected from the standard-voice answer data, and compares this input with the pre-stored standard-voice answer data in terms of pronunciation and accent. If the comparison result reaches the predetermined speech recognition value, the system proceeds to the next exchange, that is, a dialogue consisting of the next question and answer data. Here, the speech recognition value can be defined as a value obtained by comparing pronunciation and accent and judging, within a predetermined margin of error, whether they match. It can also be defined in the various other ways used in the field of speech recognition. A person of ordinary skill in the art will readily understand that any definition is possible as long as it indicates whether the voice data input by the user's utterance matches the pre-stored standard-voice data within a predetermined range. Conversely, if the comparison result does not reach the predetermined speech recognition value, the control means outputs, through the voice output means, data instructing the user to input the selected answer again.
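Because the text leaves the definition of the speech recognition value open, the following is only one hypothetical way to compute such a value. It assumes an upstream recognizer has already produced a phoneme sequence and a stress pattern for both the learner's utterance and the standard voice; the function names, the 0.3 accent weight, and the 0.85 threshold are illustrative assumptions. Sequence similarity is taken from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from typing import Sequence


def similarity(a: Sequence, b: Sequence) -> float:
    """Ratio in [0, 1] of matching elements between two sequences."""
    return SequenceMatcher(None, list(a), list(b)).ratio()


def recognition_value(
    heard_phonemes: Sequence[str],
    heard_stress: Sequence[int],
    ref_phonemes: Sequence[str],
    ref_stress: Sequence[int],
    accent_weight: float = 0.3,   # assumed weighting between accent and pronunciation
) -> float:
    """Combine pronunciation and accent similarity into one value."""
    pronunciation = similarity(heard_phonemes, ref_phonemes)
    accent = similarity(heard_stress, ref_stress)
    return (1 - accent_weight) * pronunciation + accent_weight * accent


def passes(value: float, threshold: float = 0.85) -> bool:
    """True when the value reaches the predetermined speech recognition value."""
    return value >= threshold


# Standard voice "Seoul" versus a learner who ends the word with the wrong consonant.
reference = (["S", "OW", "L"], [1])
exact = recognition_value(["S", "OW", "L"], [1], *reference)
off = recognition_value(["S", "OW", "R"], [1], *reference)
```

With these inputs `passes(exact)` is true and `passes(off)` is false, so the second utterance would trigger the repetition instruction.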

Following the repetition instruction, the user, that is, the learner, speaks the selected answer into the microphone (21) again. The pronunciation and accent of the answer thus input are compared with those of the standard voice to determine whether the predetermined speech recognition value has been reached. This process can be repeated several times until the learner becomes accustomed to satisfactory pronunciation and accent. The repetition serves to correct the learner's pronunciation, so that the learner can learn an unfamiliar second foreign language easily and efficiently.

In a preferred embodiment of the present invention, a further process may be included in which, when question data is provided, the learner inputs by voice an answer of his or her own making, before the learning system has provided any answer data in the standard voice. Specifically, before outputting the plurality of stored answer data corresponding to the question data to the voice output means in the pre-stored standard voice so that the user can select one, the control means may receive arbitrary answer data from the user through the voice recognition means. The control means recognizes this arbitrary answer data, extracts its meaning, and compares the extracted meaning with the answer data stored in advance in the data storage means. When the comparison result reaches a predetermined meaning recognition value, the control means proceeds to the next conversation. Here, the meaning recognition value is a value obtained by comparing the meaning extracted from the recognized voice data input by the learner with the meaning of the answer data and determining whether they match within a predetermined error range. This definition of the meaning recognition value will be readily understood by those of ordinary skill in the art. Conversely, when the comparison result does not reach the predetermined recognition value, the control means performs the process described above, that is, it outputs the plurality of stored answer data corresponding to the question data to the voice output means in the pre-stored standard voice so that the user can select one. The learner thus has the flexibility to answer questions freely, which enables interactive language learning closer to real life.
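As a rough illustration of the free-answer branch described above, the following sketch accepts an arbitrary answer when its meaning is close enough to a stored answer and otherwise falls back to presenting the standard answers. The description does not say how meaning is extracted or compared; the word-overlap (Jaccard) score used here is a deliberately simple stand-in, and the names `meaning_score` and `handle_free_answer` are hypothetical.

```python
from typing import Sequence


def meaning_score(utterance: str, stored_answer: str) -> float:
    """Toy stand-in for meaning comparison: Jaccard overlap of word sets."""
    a = set(utterance.lower().split())
    b = set(stored_answer.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def handle_free_answer(
    utterance: str,
    stored_answers: Sequence[str],
    meaning_threshold: float,
) -> str:
    """Decide the next action for an arbitrary (free) answer.

    Returns "next_dialogue" when the best score reaches the assumed
    meaning recognition value, otherwise "present_standard_answers".
    """
    best = max(
        (meaning_score(utterance, s) for s in stored_answers),
        default=0.0,
    )
    if best >= meaning_threshold:
        return "next_dialogue"
    return "present_standard_answers"
```

A real system would replace `meaning_score` with an actual speech recognition and semantic comparison component; only the threshold-and-fallback control flow is taken from the description.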

Furthermore, in a preferred embodiment of the present invention, the control means further controls an operation of including the user's name, received in advance, when outputting the question data of the learning material read from the data storage means and when outputting other instruction data to the user. The name may be obtained by having the learner enter it before learning begins. Alternatively, when the learning system is provided by telecommunication, for example through a website on the Internet, the name, ID, nickname, or the like that the learner registered when signing up for the learning website may be stored in a member registration database and read out for use. Having the learning system address the learner by name at each stage of the question-and-answer dialogue can make the learning process feel more personal and friendly.
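A minimal sketch of this name-inclusion behavior is given below, assuming the name has already been obtained by key input or from a registration record. The function name `personalize` and the "name, message" format are assumptions made for illustration; the description does not prescribe how the name is combined with the output.

```python
from typing import Optional


def personalize(message: str, learner_name: Optional[str]) -> str:
    """Prefix a question or instruction with the learner's name, if known."""
    if learner_name:
        return f"{learner_name}, {message}"
    return message  # no name available: output the message unchanged
```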

FIG. 1 is a flowchart showing a preferred specific embodiment of the interactive learning method according to the present invention, using the system illustrated in FIG. 3.

When learning starts (101), the control means, that is, the learning system, performs a step of recognizing the learner (102) so that the user's name, received in advance, can be included when outputting the question data of the learning material read from the data storage means and when outputting other instruction data to the user. The learner recognition step (102) may be a process in which the learner's name is entered directly, for example through the key input means (10), or, in the case of a learning website system, may include a process of reading the registration information of a registered member.

The learning system then performs a conversation initiation step (103) of outputting the question data of the learning material read from the data storage means to the image or voice output means. It then recognizes the answer input by the user by voice (104) and compares it with the stored answer data (105). The comparison step (105) is an arbitrary-answer meaning extraction and comparison step: when arbitrary answer data from the user arrives through the voice recognition means, the system recognizes it, extracts its meaning, and compares the extracted meaning with the answer data read from the data storage means. When the comparison result reaches the predetermined meaning recognition value, that is, when there is a match, the process proceeds to the next conversation (112). Conversely, when the predetermined recognition value is not reached, that is, when there is a mismatch, the system may instruct the user to repeat the answer once more (106) and compare it again (107). This repetition and comparison (106, 107) may be omitted. If the comparison yields a match, the process proceeds to the next conversation; if it yields a mismatch, the standard answer presentation step (109) is performed.

The standard answer presentation step (109) outputs the plurality of stored answer data corresponding to the initial question data to the voice output means in the pre-stored standard voice so that the user can select one. The user inputs the selected one of the standard-voice answer data through the voice recognition means, and the learning system recognizes this input (110) and performs a speech recognition value determination step of comparing its pronunciation and accent with the pre-stored standard-voice answer data (111). When the comparison result reaches the predetermined speech recognition value, that is, when the standard voice data and the user's voice data match, the process proceeds to the next conversation (112). Conversely, when the predetermined speech recognition value is not reached, that is, when there is a mismatch, the above steps (106 to 110) are repeated. That is, the following are performed: an answer repetition instruction step of outputting to the voice output means data instructing the user to input the selected answer data again; and a repeated speech recognition value determination step of comparing the pronunciation and accent of the answer data input accordingly to determine whether the predetermined speech recognition value has been reached. When the result of the repeated comparison reaches the predetermined speech recognition value, the process proceeds to the next conversation; when it does not, the answer repetition instruction step and the repeated speech recognition value determination step are repeated. This may be called a correction step for correcting the user's pronunciation.
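The flow of FIG. 1 for a single question, as described above, can be traced with the following sketch. It is only one reading of the flowchart: scores are supplied as pre-recorded numbers rather than produced by a recognizer, the function name `run_turn` and its parameters are hypothetical, and reusing step number 106 for the repeat instruction inside the correction loop follows the statement that steps 106 to 110 are repeated.

```python
from typing import Iterable, List


def run_turn(
    meaning_scores: Iterable[float],
    voice_scores: Iterable[float],
    meaning_threshold: float,
    voice_threshold: float,
    retry_free_answer: bool = True,
) -> List[int]:
    """Return the sequence of FIG. 1 step numbers visited for one question."""
    meanings = iter(meaning_scores)
    steps = [103, 104, 105]  # ask question, recognize free answer, compare
    if next(meanings) >= meaning_threshold:
        return steps + [112]  # meaning matched: next conversation
    if retry_free_answer:  # steps 106 and 107 may be omitted
        steps += [106, 107]
        if next(meanings) >= meaning_threshold:
            return steps + [112]
    steps.append(109)  # present standard answers in the standard voice
    for score in voice_scores:
        steps += [110, 111]  # recognize selected answer, compare pronunciation
        if score >= voice_threshold:
            return steps + [112]
        steps.append(106)  # instruct the learner to repeat the answer
    raise RuntimeError("ran out of recorded attempts")
```

For example, a learner whose free answer fails twice and whose second spoken attempt at the selected answer passes would visit steps 103, 104, 105, 106, 107, 109, 110, 111, 106, 110, 111, and 112 in this sketch.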

FIG. 5 is a schematic diagram illustrating (a) the divergent-tree dialogue flow structure of a conventional interactive learning method and, in contrast, (b) the convergent-tree dialogue flow structure of the interactive learning method of the present invention. In a divergent tree structure as illustrated in FIG. 3 and in FIG. 5(a), that is, a dialogue structure in which a plurality of answers are listed for each question, the number of possible dialogues grows geometrically, which makes it difficult to form a natural conversation suited to a specific purpose and situation. In contrast, as shown in FIG. 5(b), the dialogue structure according to the present invention may be characterized in that the dialogue paths from the start to the end of the conversation, after passing through the introduction, branching, and development stages, converge toward an appropriate conclusion. This convergent question-and-answer structure has the advantage that it reflects the various situations that can arise in a specific real-world setting while still guiding the conversation to the intended conclusion.
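The contrast between the two structures can be made concrete with a small sketch. The branching factor, the node names, and the restaurant-style dialogue graph below are invented for illustration and do not come from FIG. 5; the point is only that a divergent tree multiplies its endings at every level, whereas a convergent graph funnels its branches back to a shared conclusion.

```python
from typing import Dict, List, Set


def divergent_leaf_count(branching: int, depth: int) -> int:
    """Endings of a tree where every question offers `branching` answers."""
    return branching ** depth


# Hypothetical convergent dialogue: branches merge back into "confirm".
CONVERGENT_DIALOGUE: Dict[str, List[str]] = {
    "greeting": ["order_food", "order_drink", "ask_menu"],
    "order_food": ["confirm"],
    "order_drink": ["confirm"],
    "ask_menu": ["order_food"],
    "confirm": ["pay"],
    "pay": [],
}


def reachable_endings(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Collect every terminal node reachable from `start` (depth-first)."""
    seen: Set[str] = set()
    endings: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if not graph[node]:
            endings.add(node)  # no outgoing edges: the dialogue ends here
        stack.extend(graph[node])
    return endings
```

With three answers per question and four exchanges, the divergent tree already has 81 distinct endings, while every path through the sample convergent graph finishes at the single node `pay`.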

The learning method of the present invention described above has the advantage that the learner can learn in advance, at the preceding stage, the sentences that will be used in the next conversation. For example, the learner can practice in advance the sentences to be used the next day (or in the conversation of the next stage). When the next conversation actually takes place, the learner can answer selectively, drawing on the sentences learned in advance, and when the learner's answer falls short of the preset recognition value, the sentence is learned once more through pronunciation training. The learning effect can therefore be very high.

The learning method and system according to the present invention can serve as a guider in a specific situation. For example, consider a Korean learner being interviewed for a visa at the US Embassy in Korea. First, the consul asks a question. The learner gives an answer that does not fit the situation, or answers with inaccurate pronunciation. The consul replies that he or she cannot understand what the learner said. In this situation, the learner can use a guider based on the present learning system to begin a dialogue suited to the visa interview. The guider learning system presents answers appropriate to the situation (here, the consul's interview question) and coaches the pronunciation of the sentence the learner selects. Once the learner's pronunciation satisfies the predetermined recognition rate as a result of this practice, the consul can indicate that he or she has understood and move on to the next interview question. In this way, when the learner needs help, activating the guider learning system presents and trains the solution best suited to the situation, so that the learner can actually take the lead in the conversation.

The learning method and system of the present invention may include a learning aid in which the plurality of answers the learner may give are displayed on the screen as text in Korean (that is, the learner's native language). In principle, the sentences the learner may answer with are expressed in the foreign language and stored in the DB; the system recognizes the learner's answer, outputs the corresponding stored data from the DB as foreign-language text or foreign-language speech, and proceeds to the next stage. However, even when the conversation is conducted in a foreign language, the learner perceives the conversation or situation through association in his or her native language. That is, the learner first decides, by native-language association, what answer to give to the question, and only once this first decision is made does the learner, as a second step, express it in the foreign language. In other words, the learner can say the answer in the foreign language only after deciding what to answer. Therefore, by translating into the native language the sentences the learner may answer with or present at each stage of the conversation and displaying them on the computer screen, the secondary waste of hesitating over what to answer can be eliminated. This is highly effective in achieving the original learning purpose of speaking and expressing oneself in a foreign language.
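One possible way to hold the answer data for this learning aid is sketched below: each stored answer pairs the foreign-language sentence with its native-language translation, and only the translation is rendered on screen as a numbered list. The `AnswerOption` record, the `render_native_options` function, and the sample sentences are hypothetical; the description does not specify a DB layout.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AnswerOption:
    foreign_text: str  # sentence the learner is expected to speak
    native_text: str   # translation shown on the monitor


def render_native_options(options: Sequence[AnswerOption]) -> List[str]:
    """Build the numbered native-language lines to display on screen."""
    return [f"{i}. {opt.native_text}" for i, opt in enumerate(options, start=1)]


# Sample data, invented for illustration.
SAMPLE_OPTIONS = [
    AnswerOption("I'd like a steak, please.", "스테이크로 주세요."),
    AnswerOption("Could I see the menu?", "메뉴판 좀 볼 수 있을까요?"),
]
```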

In one embodiment of the present invention, the image output used to convey the content of the conversation effectively may be characterized in that the figure of “I” (the learner) does not appear. In conventional learning methods that use video or animation as an aid to foreign language learning, a character in the video is generally assigned the learner's role, and the conversation proceeds with the learner speaking the foreign language while playing that on-screen role. However, because the “I on the screen” is depicted in the video while the “real I” watching it is outside, the learner ends up taking a third-person view. In the present learning system, the “I on the screen” is not depicted at all. Taking a restaurant as an example, a conventional method might show a “customer” and a “waiter” in animation, with the learner playing the customer; conversing while watching oneself can feel unnatural. In the present learning system, therefore, only the “waiter” appears in the video and the learner does not. By realizing the effect of conversing directly with the waiter from the learner's own point of view, the “learner-as-subject” learning method that characterizes the present learning system is secured.

The technical idea of the interactive learning system and method for language learning conducted interactively between a machine and a learner, using a computer and a speech recognition program, has been described above with reference to the accompanying drawings. This description illustrates the most preferred embodiment of the present invention by way of example and does not limit the present invention.

As described above, the present invention is a language learning system and method conducted interactively between a machine and a learner using a computer and a speech recognition program. Unlike the short-exchange format of question-bank learning, which simply presents a question, receives an answer, and marks it right or wrong, the present invention can correct the learner's pronunciation even when inaccurate pronunciation prevents recognition, allows a relatively long, situation-based interactive lesson to continue without interruption, and permits long conversations that flow naturally. Furthermore, by naturally including the learner's name in the conversation, it provides a new interactive learning system and method enabling interactive learning that feels closer to a real situation.

Claims

An interactive learning system comprising voice recognition means including voice input means, voice output means including a speaker, image output means including a monitor, key input means, control means, and data storage means,

wherein the control means

controls operations of: outputting question data of learning material read from the data storage means through the image output means and the voice output means, or through the image output means or the voice output means; outputting a plurality of stored answer data corresponding to the question data to the voice output means in a pre-stored standard voice so that the user can select one; receiving, through the voice recognition means, answer data that the user has selected from the standard-voice answer data and spoken, and comparing its pronunciation and accent with the pre-stored standard-voice answer data; proceeding to the next conversation when a predetermined speech recognition value is reached; and, when the predetermined speech recognition value is not reached, outputting to the voice output means data instructing the user to input the selected answer data again, and comparing the pronunciation and accent of the answer data input accordingly to determine whether the predetermined speech recognition value has been reached,

and includes the user's name, received in advance, when outputting the question data of the learning material read from the data storage means and when outputting other instruction data to the user.

The interactive learning system of claim 1, wherein the control means further controls operations of: before outputting the plurality of stored answer data corresponding to the question data to the voice output means in the pre-stored standard voice so that the user can select one, when arbitrary answer data from the user is input through the voice recognition means, recognizing the arbitrary answer data, extracting its meaning, and comparing the extracted meaning with the answer data read from the data storage means; proceeding to the next conversation when the comparison result reaches a predetermined meaning recognition value; and, when the predetermined recognition value is not reached, outputting the plurality of stored answer data corresponding to the question data to the voice output means in the pre-stored standard voice so that the user can select one.

The interactive learning system of claim 2, wherein the control means further controls the image output means (40) to display, on the monitor and as text in the learner's native language, the plurality of answer data corresponding to the question data of the learning material read from the data storage means.

delete

An interactive learning method using a system comprising voice recognition means including voice input means, voice output means including a speaker, image output means including a monitor, key input means, control means, and data storage means, the method comprising:

a conversation initiation step in which the control means outputs question data of learning material read from the data storage means through the image output means and the voice output means, or through the image output means or the voice output means;

a learner recognition step of recognizing the learner so that the control means can include the user's name, received in advance, when outputting the question data of the learning material read from the data storage means and when outputting other instruction data to the user;

a standard answer presentation step of outputting a plurality of stored answer data corresponding to the question data to the voice output means in a pre-stored standard voice so that the user can select one;

a speech recognition value determination step of receiving, through the voice recognition means, answer data that the user has selected from the standard-voice answer data and spoken, and comparing its pronunciation and accent with the pre-stored standard-voice answer data;

an answer repetition instruction step of proceeding to the next conversation when the result of the speech recognition value comparison reaches a predetermined speech recognition value and, when it does not, outputting to the voice output means data instructing the user to input the selected answer data again;

a repeated speech recognition value determination step of comparing the pronunciation and accent of the answer data input accordingly to determine whether the predetermined speech recognition value has been reached; and

a correction step of proceeding to the next conversation when the result of the repeated speech recognition value comparison reaches the predetermined speech recognition value and, when it does not, repeating the answer repetition instruction step and the repeated speech recognition value determination step.

The interactive learning method of claim 5, further comprising: an arbitrary-answer meaning extraction and comparison step of, when arbitrary answer data from the user is input through the voice recognition means, recognizing the arbitrary answer data, extracting its meaning, and comparing the extracted meaning with the answer data read from the data storage means; and a step of proceeding to the next conversation when the comparison result reaches a predetermined meaning recognition value and performing the standard answer presentation step when the predetermined recognition value is not reached.

The interactive learning method of claim 6, wherein the control means further controls the image output means (40) to display, on the monitor and as text in the learner's native language, the plurality of answer data corresponding to the question data of the learning material read from the data storage means.

delete