KR20150038892A

KR20150038892A - Method for English study based on voice recognition

Info

Publication number: KR20150038892A
Application number: KR20130117236A
Authority: KR
Inventors: 성기중
Original assignee: 성기중
Priority date: 2013-10-01
Filing date: 2013-10-01
Publication date: 2015-04-09

Abstract

The present invention relates to an English learning method using speech recognition, comprising: a content selection step of requesting the learner, when the learner requests execution, to select the English content to be learned from among the English content stored in a database; an original text display step of displaying on the screen the original text of the selected English content; a speech input step of requesting the learner to read the displayed original text aloud; an input text generation step of converting the speech input by the learner into text through a speech recognition engine to generate input text; a text comparison step of comparing the generated input text with the original text to determine the incorrect text portions of the input text; and an input text display step of displaying the input text on the screen with the incorrect text portions marked so that the learner can easily identify them. According to the present invention, learners can determine clearly and on their own how their English pronunciation is wrong, and can thereby improve their English skills through efficient pronunciation correction.

Description

[0001] The present invention relates to an English learning method using speech recognition. More specifically, it provides an English learning method using speech recognition that improves English ability through efficient pronunciation correction by letting learners check for themselves how their English pronunciation is wrong.

English is becoming increasingly important with the internationalization and globalization of industry. In response, a wide range of English teaching materials and online English classes have become available, and learners invest a great deal of time and money in studying English.

In the past, English learning of this kind focused mainly on vocabulary and grammar and was delivered to learners one-way. As a result, even after many years of education, learners have ample reading comprehension and vocabulary but still cannot hold an actual conversation.

Because of these problems with one-way instruction, interest in two-way (interactive) English learning methods has recently been growing.

Typical examples of such interactive English learning are face-to-face lessons with a native speaker at a language institute and telephone conversation lessons with a native speaker. These approaches are expensive. In addition, because lessons take place at a fixed time and in a fixed place, they are constrained in time and space, so participation is very limited for people with busy daily lives, such as office workers.

Therefore, an English learning program is needed that lets the user listen to and speak English interactively, independent of time and place, and that evaluates the user's pronunciation. Various English learning methods using speech recognition have been developed in response to this demand.

Voice recognition, or speech recognition, is the process of interpreting human speech by computer and converting its content into character data. Conventional English learning methods using such speech recognition are disclosed in 10-2008-0051306 (published June 11, 2008) and 10-2011-0094769 (published August 24, 2011).

However, most conventional English learning methods using speech recognition merely compare the learner's input pronunciation with a native speaker's pronunciation and report the result to the learner only as a score. The learner can know only that the pronunciation is wrong to some degree, and cannot tell exactly how it is wrong.

Because learners cannot tell exactly how their pronunciation is wrong, it is difficult for them to correct it on their own. Correction takes a long time and is inefficient, and even their interest in and will toward learning may decline.

SUMMARY OF THE INVENTION The present invention has been proposed to solve the above-mentioned problems of the prior art. An object of the present invention is to provide an English learning method using speech recognition that lets learners identify for themselves how their English pronunciation is wrong, so that they can correct their pronunciation efficiently, improve their English ability on their own, and stay interested in learning.

In order to achieve the above object, the present invention discloses an English learning method using speech recognition, the method comprising: a content selection step of requesting the learner, when the learner requests execution, to select the English content to be learned from among the English content stored in a DB; an original text display step of displaying the original text of the selected English content on a screen; a voice input step of requesting the learner to input the displayed original text by voice; an input text generation step of converting the learner's input voice into text through a speech recognition engine to generate input text; a text comparison step of comparing the generated input text with the original text to determine the incorrect text portion of the input text; and an input text display step of displaying the input text on the screen with the incorrect text portion marked so that the learner can identify it.

Here, the incorrect text portion displayed in the input text display step may be distinguished in at least one of the following ways: displayed in a color different from the rest of the input text, displayed blinking, or indicated by an arrow.

In addition, in the input text display step, the portion of the original text matching the incorrect text portion of the displayed input text is activated, and when the learner selects the activated original text portion, a native speaker's voice for that text portion can be output.

The method may further include, as a step preceding the content selection step, a mode selection step of requesting the learner to select either a content learning mode or a content production mode. When the content learning mode is selected, the method proceeds to the content selection step. When the content production mode is selected, the method performs a step of requesting the learner to input English content that the learner has produced, and a step of storing the input learner-produced English content in the database so that it is distinguished from the standard English content already stored.

The content selection step may include a step of requesting the learner to select the English content to be learned from among the standard English content and the learner-produced English content, and a step of requesting the learner, when the standard English content is selected, to select the difficulty and level of the English content.

Further, between the original text display step and the speech input step, the method may include determining whether the displayed English content is a dialogue and, if it is, requesting the learner to select the speaker whose lines the learner will input by voice.

According to the English learning method using speech recognition of the present invention, learners can see which part of their English pronunciation is wrong and how they mispronounced it, so they can correct their pronunciation efficiently and significantly improve their English ability. In addition, learners can produce the English content they want to study, or select a speaker in a dialogue, so that they lead the learning themselves. This makes learning more enjoyable and improves the efficiency of English learning.

It is to be noted that, in addition to the effects specifically described above, specific effects that can be easily derived and expected from the characteristic configuration of the present invention can also be included in the effects of the present invention.

FIG. 1 is a diagram showing a schematic basic configuration of a learning system in which an English learning method of the present invention is executed,
FIG. 2 is a flowchart illustrating an English learning method according to an embodiment of the present invention,
FIG. 3 is an exemplary diagram illustrating the original text of a dialogue displayed on the screen,
FIG. 4 is an exemplary diagram illustrating an input text including a wrong text portion being displayed,
FIG. 5 is an exemplary diagram illustrating an example in which the calculated learning statistics are displayed.

Hereinafter, a preferred embodiment of an English learning method using speech recognition according to the present invention will be described in detail with reference to the accompanying drawings.

In the following description of the embodiments, descriptions of matters obvious to those skilled in the art, such as known functions and configurations, are omitted where they would unnecessarily obscure the technical features of the present invention.

The English learning method using speech recognition according to the present invention (hereinafter, the "English learning method") is implemented as software, such as a program or an application, and executed on a learning system such as a computer or a smartphone. Before a specific embodiment of the English learning method is described, the basic configuration of a learning system on which the method is executed is briefly described.

FIG. 1 shows a schematic basic configuration of a learning system on which the English learning method according to the present invention can be executed. As shown in FIG. 1, the learning system includes a storage means 10, such as a DB, that stores the various English content to be learned, the native speaker voice data for that English content, and other information, and a display means 20 that displays the English content to be learned and the like on the screen.

In addition, it includes a voice output means 30, such as a speaker, for outputting voice data of the English content, and a voice input means 40, such as a microphone, through which the learner inputs voice.

It further includes a voice recognition processing means 50 for recognizing, through a voice recognition engine, the voice input through the voice input means 40, and a control means 60 for controlling the overall operation of the means 10 to 50.

FIG. 2 is a flowchart of an English learning method according to an embodiment of the present invention, executed on a learning system having the basic configuration described above. The English learning method according to this embodiment is described in detail below with reference to FIG. 2.

In the English learning method according to the present invention, when the learner requests execution of the method, a learner authentication step S100 may be performed first, in which the learner is requested to input login information and, once the login information is input, the learner is authenticated.

When the authentication is completed through the learner authentication step S100, a mode selection step S200 for requesting the learner to select a mode is performed.

In the mode selection step S200, the learner can select one of two modes: a content learning mode for studying English content, and a content production mode in which the learner directly produces the English content to be studied.

If the learner selects the content learning mode, the learning steps S300 to S900 described below are performed in sequence. If the learner selects the content production mode, the steps of inputting and storing the English content produced by the learner are performed.

In the content production mode, a step S210 of requesting the learner to input the English content that the learner has produced is performed, followed by a step S220 of storing the input learner-produced English content in the DB so that it is distinguished from the standard English content.

Here, English content created and input by the learner through the content production mode is defined as learner-produced English content, and English content already stored in the DB and provided by the learning system is defined as standard English content.
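The distinction between standard and learner-produced English content can be sketched as a tagged record in a store. This is a minimal illustration, not the patent's implementation: the names `ContentKind`, `EnglishContent`, and `ContentStore` are hypothetical, and an in-memory list stands in for the DB.

```python
from dataclasses import dataclass, field
from enum import Enum


class ContentKind(Enum):
    STANDARD = "standard"          # provided by the learning system
    LEARNER_PRODUCED = "learner"   # entered through the content production mode


@dataclass
class EnglishContent:
    title: str
    lines: list[str]
    kind: ContentKind


@dataclass
class ContentStore:
    """In-memory stand-in for the DB (an assumption made for this sketch)."""
    items: list[EnglishContent] = field(default_factory=list)

    def add_learner_content(self, title: str, text: str) -> EnglishContent:
        # Steps S210/S220: accept the learner's text and store it tagged as
        # learner-produced so it stays distinguishable from standard content.
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        content = EnglishContent(title, lines, ContentKind.LEARNER_PRODUCED)
        self.items.append(content)
        return content

    def by_kind(self, kind: ContentKind) -> list[EnglishContent]:
        # Return only the content of the requested kind.
        return [c for c in self.items if c.kind is kind]
```

Tagging each record, rather than keeping two separate stores, is one simple way to keep the two kinds distinguishable while still listing them together at selection time.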

The standard English content may be classified by sentence form, such as conversation or description, and may also be organized hierarchically by difficulty, such as beginner, intermediate, and advanced, and by level, such as first, second, and so on.

Through the content production mode, the learner can input and study English content of personal interest, such as a movie script or a pop song, in addition to the English content provided by the learning system, which increases both interest and learning efficiency.

When the content production mode is completed, or when the content learning mode is selected in the mode selection step S200, a content selection step S300 is performed in which the learner is requested to select the English content to be learned from among the English content stored in the DB.

In the content selection step S300, a step S310 is performed of requesting the learner to choose whether to study the standard English content or the learner-produced English content stored in the DB.

As described above, the standard English content can be organized hierarchically by difficulty and level. Therefore, if the learner selects the standard English content in step S310, a step S320 of requesting the learner to select the difficulty and level is performed.
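The difficulty and level selection of step S320 amounts to filtering a catalog. The sketch below assumes a flat list of records. The field names and the sample titles are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardContent:
    title: str
    form: str         # sentence form, e.g. "conversation" or "description"
    difficulty: str   # e.g. "beginner", "intermediate", "advanced"
    level: int        # level within the difficulty: 1, 2, ...


def select_standard(catalog, difficulty, level):
    """Step S320: narrow the standard content to the chosen difficulty and level."""
    return [c for c in catalog if c.difficulty == difficulty and c.level == level]


# Illustrative catalog; the titles are made up for this example.
catalog = [
    StandardContent("At the library", "conversation", "beginner", 1),
    StandardContent("My hometown", "description", "beginner", 2),
    StandardContent("Job interview", "conversation", "advanced", 1),
]
print([c.title for c in select_standard(catalog, "beginner", 1)])
```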

Once the learner has selected either learner-produced English content or standard English content of a given difficulty and level, an original text display step S400 of displaying the original text of the selected English content is performed.

In step S400, the English content selected by the learner is displayed on the screen, and the native speaker's voice for that content can also be output if the learner so chooses.

For standard English content, native speaker voice data is stored in the DB in advance for each piece of content. For learner-produced English content, the corresponding voice data may be input together with the content in the input step S210 and stored in the DB, or a text-to-speech (TTS) engine may be provided in the learning system so that a voice for the English content can be output.
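One way to read the voice-source options above is as a fallback chain: use a stored recording if one exists, otherwise synthesize. The sketch below assumes that ordering. The function name, the dictionary standing in for the DB, and the `fake_tts` placeholder are all hypothetical, and no real TTS library is used.

```python
from typing import Callable, Dict, Optional


def native_voice_for(
    content_id: str,
    text: str,
    stored_audio: Dict[str, bytes],
    tts: Optional[Callable[[str], bytes]] = None,
) -> Optional[bytes]:
    """Return audio for the content: stored recording first, then TTS, else None."""
    audio = stored_audio.get(content_id)
    if audio is not None:
        return audio          # pre-stored (standard) or learner-supplied recording
    if tts is not None:
        return tts(text)      # synthesize when no recording exists
    return None               # no voice available for this content


def fake_tts(text: str) -> bytes:
    # Placeholder standing in for a real TTS engine; it only encodes the text.
    return ("TTS:" + text).encode("utf-8")
```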

After step S400, it is determined whether the displayed English content is a dialogue, and depending on the result, a step S500 of requesting the learner to select the speaker whose lines the learner will input may be performed.

In other words, if the displayed English content is not a dialogue, the process proceeds directly to the speech input step S600 described later. If it is a dialogue, the learner is asked to select which speaker in the dialogue to practice.

For example, FIG. 3 shows English content that is a dialogue. The learner can select role A or role B and input only the sentences of the selected role by voice. The sentences of the speaker not selected by the learner can be output in the native speaker's voice.

In the case of a dialogue, the learner can thus practice as if taking part in an actual conversation, in the role of the speaker the learner has chosen.
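The dialogue check and speaker selection can be sketched as follows, assuming the content is held as (speaker, sentence) pairs. The function names and the action labels are hypothetical. A's lines come from the example in this description, while B's lines are placeholders because they are not reproduced in the text.

```python
def is_dialogue(lines):
    """Treat content as a dialogue if it has more than one speaker label."""
    return len({speaker for speaker, _ in lines if speaker}) > 1


def plan_turns(lines, chosen_speaker):
    """After step S500: the learner speaks the chosen role, the system plays the rest."""
    plan = []
    for speaker, sentence in lines:
        action = "learner_speaks" if speaker == chosen_speaker else "play_native_voice"
        plan.append((action, speaker, sentence))
    return plan


dialogue = [
    ("A", "Hello, Mike."),
    ("B", "<B's first line>"),    # placeholder, not from the source
    ("A", "Oh, I'm going to the library to check out some book."),
    ("B", "<B's second line>"),   # placeholder, not from the source
]
for step in plan_turns(dialogue, "A"):
    print(step)
```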

When the English content has been displayed through the above steps, or when the speaker has been selected, the speech input step S600 is performed, requesting the learner to read the original text of the displayed English content aloud.

When, in response to step S600, the learner reads the displayed original text aloud, a speech recognition processing step S700 is performed, in which the learner's input speech is recognized through the speech recognition engine.

In this speech recognition processing step S700, first, an input text generation step S710 is performed in which the speech of the learner input through the speech recognition engine is converted into text and an input text is generated.

For example, if the learner selects A in the dialogue illustrated in FIG. 3, the learner reads A's original text aloud: "Hello, Mike.", "Oh, I'm going to the library to check out some book.", and "No, I do not. It's just for homework. I have to write an essay." When the learner's voice is input, it is converted into text through the speech recognition engine to generate the input text.

That is, unlike the original text, the input text is generated by converting the input speech, so when the learner's English pronunciation is wrong, the input text reflects the wrong pronunciation. For example, if in the last sentence the learner pronounces "It's" as "each" and "for" as "pour", the input text for the original text becomes "No, I do not. each just pour homework. I have to write an essay."

When the input text is generated as described above, a text comparison step S720 is performed to compare the generated input text with the original text to determine an incorrect text portion in the input text.

That is, when the input text "No, I do not. each just pour homework. I have to write an essay." is generated, it is compared with the original text, and "each" and "pour" are determined to be the wrong text portions of the input text.
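The text comparison step can be sketched as a word-level diff. The patent does not specify a comparison algorithm, so the use of Python's `difflib.SequenceMatcher`, the simple tokenizer, and the function names below are assumptions made for illustration.

```python
import difflib
import re


def tokenize(text):
    # Keep letters and apostrophes so that "It's" stays one word.
    return re.findall(r"[A-Za-z']+", text)


def find_wrong_parts(original, recognized):
    """Return (original_words, input_words) pairs where the two texts differ."""
    orig = tokenize(original)
    inp = tokenize(recognized)
    matcher = difflib.SequenceMatcher(
        a=[w.lower() for w in orig],
        b=[w.lower() for w in inp],
        autojunk=False,
    )
    wrong = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":            # replace, delete, or insert
            wrong.append((orig[i1:i2], inp[j1:j2]))
    return wrong


original = "No, I do not. It's just for homework. I have to write an essay."
recognized = "No, I do not. each just pour homework. I have to write an essay."
print(find_wrong_parts(original, recognized))
# [(["It's"], ['each']), (['for'], ['pour'])]
```

Comparing case-insensitively and ignoring punctuation keeps recognizer formatting differences from being reported as pronunciation errors. Whether the actual method does so is not stated in the source.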

When the step S720 is completed, an input text display step S800 for displaying the generated input text on the screen is performed.

In step S800, the input text may be displayed below the original text so that the learner can directly check the input text generated from the learner's voice. In particular, when a wrong text portion has been determined in step S720, that portion is displayed distinctly so that the learner can easily identify it.

The wrong text portion may be distinguished in at least one of the following ways: displaying it in a color different from the rest of the input text, displaying it blinking, or indicating it with an arrow.

For example, when the input text "No, I do not. each just pour homework. I have to write an essay." is generated and "each" and "pour" are determined to be wrong text portions, the input text is displayed below the original text as in the example of FIG. 4. In the input text for A's last sentence, the wrong text portions "each" and "pour" are displayed in red, unlike the other portions, and are indicated by arrows below the text.
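The red coloring and arrow marking described here can be approximated in a text terminal. The sketch below assumes ANSI escape codes for color and a row of caret characters standing in for the arrows. The function name and the index-set input are hypothetical.

```python
RED, RESET = "\033[31m", "\033[0m"   # ANSI color codes (terminal assumption)


def render_input_text(words, wrong_indexes):
    """Return (colored_line, marker_line) for the recognized words."""
    shown, markers = [], []
    for i, word in enumerate(words):
        if i in wrong_indexes:
            shown.append(f"{RED}{word}{RESET}")   # wrong word in red
            markers.append("^" * len(word))       # marker under the wrong word
        else:
            shown.append(word)
            markers.append(" " * len(word))
    # The marker line is built from visible widths only, so it stays aligned.
    return " ".join(shown), " ".join(markers).rstrip()


words = "No, I do not. each just pour homework.".split()
line, marker = render_input_text(words, {4, 6})
print(line)
print(marker)
```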

Because the wrong text portions are displayed identifiably, the learner can easily check which part of the original text was mispronounced. In particular, because the display shows that the learner's speech was recognized as "each" and "pour", the learner can clearly identify how the pronunciation was wrong.

Also in step S800, so that a learner who has confirmed which part of the original text was mispronounced can compare that pronunciation with a native speaker's, the portion of the original text matching the wrong text portion of the input text may be activated, and the native speaker's voice for that text portion may be output when the learner selects the activated original text portion.

For example, when the wrong text portions in the input text are "each" and "pour" as described above, the matching "It's" and "for" portions of the original text are activated so that they can be selected, and the native speaker's voice is output when the learner selects one of them (the voice can of course be output again each time the portion is selected). The learner can then imitate the correct native pronunciation, which makes the mispronunciation easy to correct.
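Activation of the matching original text portions can be sketched by mapping the differences back to word positions in the original text. The class name, the word-index interface, and the playback callback below are hypothetical. The alignment again assumes `difflib.SequenceMatcher`, which the patent does not name.

```python
import difflib


class ActivatedOriginal:
    """Tracks which original-text words are selectable after a comparison."""

    def __init__(self, original_words, input_words, play_native_voice):
        self.original_words = original_words
        self.play = play_native_voice   # callback standing in for audio output
        matcher = difflib.SequenceMatcher(
            a=[w.lower() for w in original_words],
            b=[w.lower() for w in input_words],
            autojunk=False,
        )
        # Original word indexes whose counterpart in the input text did not match.
        self.active = {
            i
            for tag, i1, i2, _, _ in matcher.get_opcodes()
            if tag != "equal"
            for i in range(i1, i2)
        }

    def select(self, index):
        """Called when the learner selects a word of the original text."""
        if index not in self.active:
            return False                       # inactive words ignore the selection
        self.play(self.original_words[index])  # may be repeated on every selection
        return True
```

For the example above, constructing the object with the words of "No I do not It's just for homework" and "No I do not each just pour homework" would activate the positions of "It's" and "for".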

Study of the selected English content can be completed through the steps described above. When it is completed, a step S900 of calculating learning statistics from the wrong text portions and displaying them may further be performed. As shown in FIG. 5, for example, the learning statistics are displayed by learning date, so the learner can see at a glance how the learner's results change as learning progresses.
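The text does not say which statistics step S900 computes. As one possible reading, the sketch below aggregates a per-date accuracy percentage from the counts of total and wrong words. The function name, the session tuple layout, and the sample numbers are all assumptions.

```python
from collections import defaultdict
from datetime import date


def learning_stats(sessions):
    """sessions: iterable of (day, total_words, wrong_words) tuples.

    Returns {day: percentage of words recognized correctly}, ordered by date.
    """
    totals = defaultdict(lambda: [0, 0])
    for day, total, wrong in sessions:
        totals[day][0] += total
        totals[day][1] += wrong
    return {
        day: round(100 * (total - wrong) / total, 1)
        for day, (total, wrong) in sorted(totals.items())
        if total                      # skip days with no words spoken
    }


# Illustrative sessions; the numbers are made up for this example.
sessions = [
    (date(2013, 10, 1), 14, 2),
    (date(2013, 10, 1), 6, 0),
    (date(2013, 10, 2), 10, 3),
]
print(learning_stats(sessions))
```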

As described above, the English learning method according to an embodiment of the present invention shows the learner not only which part of the English content was mispronounced but also how it was mispronounced. Learners can therefore compare their own pronunciation with the native speaker's and correct it themselves, for example by adjusting the shape of the mouth and the position of the tongue, and so improve their English ability efficiently.

In addition, it will be understood that the English learning method of the present invention described above can be implemented as a program executable on a computer or a language learning device, or as an application that is downloaded from a server via data communication and executed on a user terminal such as a smartphone.

While the present invention has been shown and described with reference to exemplary embodiments thereof, the invention is not limited to the disclosed embodiments, and it is to be understood that modifications and equivalent constructions within the technical idea of the present invention fall within its scope.

Claims

1. An English learning method using speech recognition, comprising:
A content selection step of requesting a learner to select the English content to be learned from among English content stored in a DB when execution is requested by the learner;
An original text display step of displaying the original text of the selected English content on a screen when the learner selects the English content to be learned;
A voice input step of requesting the learner to input the displayed original text by voice;
An input text generation step of, when the learner inputs a voice, converting the input voice into text through a speech recognition engine to generate an input text;
A text comparison step of comparing the generated input text with the original text to determine an incorrect text portion in the input text; and
An input text display step of displaying the input text on the screen with the incorrect text portion marked so as to be identifiable by the learner.

2. The method according to claim 1,
Wherein the incorrect text portion displayed in the input text display step is distinguished by at least one of being displayed in a color different from the rest of the input text, being displayed blinking, and being indicated by an arrow.

3. The method according to claim 1,
Wherein the input text display step activates the portion of the original text that matches the incorrect text portion of the displayed input text, and outputs a native speaker's voice for that text portion when the learner selects the activated original text portion.

4. The method according to any one of claims 1 to 3,
Further comprising a mode selection step, preceding the content selection step, of requesting the learner to select a desired mode from among a content learning mode and a content production mode,
Wherein, when the content learning mode is selected, the method proceeds to the content selection step, and
When the content production mode is selected, the method performs a step of requesting the learner to input English content produced by the learner, and a step of storing the input learner-produced English content in the DB so that it is distinguished from standard English content already stored.

5. The method of claim 4,
Wherein the content selection step comprises:
Requesting the learner to select the English content to be learned from among the standard English content and the learner-produced English content; and
Requesting the learner to select the difficulty and level of the English content when the standard English content is selected.

6. The method of claim 5,
Further comprising, between the original text display step and the voice input step, determining whether the displayed English content is a dialogue; and
Requesting the learner to select the speaker whose lines the learner will input by voice when the content is a dialogue.