CN109035896B - Oral training method and learning equipment - Google Patents

Oral training method and learning equipment

Info

Publication number
CN109035896B
CN109035896B (application CN201810918494.XA; also published as CN109035896A)
Authority
CN
China
Prior art keywords
information
target
pronunciation
voice
text information
Prior art date
Legal status: Active
Application number
CN201810918494.XA
Other languages
Chinese (zh)
Other versions
CN109035896A (en)
Inventor
Xu Yang (徐杨)
Current Assignee
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd
Priority to CN201810918494.XA
Publication of CN109035896A
Application granted
Publication of CN109035896B


Classifications

    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 5/00 - Electrically-operated educational appliances
    • G09B 5/04 - Electrically-operated educational appliances with audible presentation of the material to be studied
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 19/00 - Teaching not covered by other main groups of this subclass
    • G09B 19/06 - Foreign languages

Abstract

A spoken language training method and learning equipment are disclosed. The method includes: outputting a voice question, and acquiring a target voice answer that the user inputs in response to the voice question and that matches the voice question; extracting target text information and target pronunciation information included in the target voice answer, the target text information being the text information corresponding to the target voice answer and the target pronunciation information being the pronunciation information corresponding to the target voice answer; and outputting the next voice question according to the target text information and the target pronunciation information. By implementing the embodiments of the invention, the spoken language training effect can be improved.

Description

Oral training method and learning equipment
Technical Field
The invention relates to the technical field of spoken language training, and in particular to a spoken language training method and learning equipment.
Background
At present, learning language subjects is a major difficulty for learners. For example, when students study English, they often focus only on memorizing words and reciting grammar points while neglecting listening and speaking practice, which leads to the phenomenon of "dumb English".
In order to reduce the occurrence of "dumb English", electronic devices for spoken language training (such as a family education machine) have appeared on the market. These devices perform spoken language training as follows: they receive spoken information input by a user and score that spoken information.
In practice, this spoken language training mode only supports a single question followed by a single answer, which does not match the multi-turn conversations of real life, so the spoken language training effect is poor.
Disclosure of Invention
The embodiment of the invention discloses a spoken language training method and learning equipment, which can improve the spoken language training effect.
The first aspect of the embodiments of the present invention discloses a spoken language training method, including:
outputting a voice question, and acquiring a target voice answer matched with the voice question and input by a user according to the voice question;
extracting target text information and target pronunciation information included in the target voice answer; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer;
and outputting the next voice question according to the target text information and the target pronunciation information.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the outputting a next speech question according to the target text information and the target pronunciation information includes:
determining standard answer information matched with the voice question;
judging whether the target text information is matched with the text information included in the standard answer information or not, and judging whether the target pronunciation information meets a preset pronunciation standard or not;
when it is judged that the target text information does not match the text information included in the standard answer information and the target pronunciation information meets the preset pronunciation standard, outputting a next voice question according to the target text information and the text information included in the standard answer information;
when the target text information is judged to be matched with the text information included in the standard answer information and the target pronunciation information does not accord with the preset pronunciation standard, outputting a next voice question according to the target pronunciation information and the preset pronunciation standard;
and when the target text information is judged to be not matched with the text information included in the standard answer information and the target pronunciation information is judged not to accord with the preset pronunciation standard, outputting the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information and the preset pronunciation standard.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the outputting a next voice question according to the target text information and the text information included in the standard answer information includes:
determining first target semantic knowledge point information according to the target text information and text information included in the standard answer information;
and outputting the next voice question based on the first target semantic knowledge point information.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the outputting a next speech question according to the target pronunciation information and the preset pronunciation criteria includes:
determining first target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard;
and outputting the next voice question based on the first target pronunciation knowledge point information.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the outputting a next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation standard includes:
determining second target semantic knowledge point information according to the target text information and the text information included in the standard answer information, and determining second target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard;
and outputting the next voice question based on the second target semantic knowledge point information and the second target pronunciation knowledge point information.
A second aspect of the embodiments of the present invention discloses a learning apparatus, including:
an acquisition unit, configured to output a voice question and acquire a target voice answer that the user inputs in response to the voice question and that matches the voice question;
the extraction unit is used for extracting target text information and target pronunciation information included in the target voice answer; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer;
and the output unit is used for outputting the next voice question according to the target text information and the target pronunciation information.
As an optional implementation manner, in the second aspect of the embodiment of the present invention, the output unit includes:
the determining subunit is used for determining standard answer information matched with the voice question;
the judging subunit is used for judging whether the target text information is matched with the text information included in the standard answer information or not and judging whether the target pronunciation information meets a preset pronunciation standard or not;
a first output subunit, configured to output a next voice question according to the target text information and the text information included in the standard answer information when the determining subunit determines that the target text information does not match the text information included in the standard answer information and the target pronunciation information meets the preset pronunciation standard;
the second output subunit is configured to output a next voice question according to the target pronunciation information and the preset pronunciation standard when the judging subunit judges that the target text information matches the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard;
and the third output subunit is configured to output a next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation standard when the judgment subunit judges that the target text information is not matched with the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard.
As an optional implementation manner, in the second aspect of the embodiment of the present invention, the manner in which the first output subunit outputs the next voice question according to the target text information and the text information included in the standard answer information is specifically:
the first output subunit is configured to determine first target semantic knowledge point information according to the target text information and text information included in the standard answer information; and outputting the next voice question based on the first target semantic knowledge point information.
As an optional implementation manner, in the second aspect of the embodiment of the present invention, a manner that the second output subunit is configured to output the next speech question according to the target pronunciation information and the preset pronunciation criterion is specifically:
the second output subunit is used for determining first target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard; and outputting the next voice question based on the first target pronunciation knowledge point information.
As an optional implementation manner, in the second aspect of the embodiment of the present invention, a manner that the third output subunit is configured to output the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation standard is specifically:
the third output subunit is configured to determine second target semantic knowledge point information according to the target text information and the text information included in the standard answer information, and determine second target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard; and outputting the next voice question based on the second target semantic knowledge point information and the second target pronunciation knowledge point information.
A third aspect of an embodiment of the present invention discloses a learning apparatus, including:
a memory storing executable program code;
a processor coupled with the memory;
the processor calls the executable program code stored in the memory to execute the spoken language training method disclosed in the first aspect of the embodiments of the present invention.
A fourth aspect of the embodiments of the present invention discloses a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the spoken language training method disclosed in the first aspect of the embodiments of the present invention.
A fifth aspect of embodiments of the present invention discloses a computer program product, which, when run on a computer, causes the computer to perform some or all of the steps of any one of the methods of the first aspect.
A sixth aspect of the embodiments of the present invention discloses an application publishing platform, where the application publishing platform is configured to publish a computer program product which, when run on a computer, causes the computer to perform some or all of the steps of any one of the methods of the first aspect.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
in the embodiments of the invention, after the voice question is output and the target voice answer that the user inputs in response to the voice question and that matches it is received, the next voice question is further output according to the target text information and the target pronunciation information included in that answer. On the basis of a single question and a single answer, this forms a multi-turn interactive conversation mode that better matches conversations in real life, thereby improving the spoken language training effect.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flow chart illustrating a method for spoken language training according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of another spoken language training method disclosed in the embodiments of the present invention;
FIG. 3 is a schematic flow chart of another spoken language training method disclosed in the embodiments of the present invention;
FIG. 4 is a schematic structural diagram of a learning device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another learning device disclosed in the embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another learning device disclosed in the embodiment of the present invention;
fig. 7 is a schematic structural diagram of another learning device disclosed in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiments of the invention disclose a spoken language training method and learning equipment that can improve the spoken language training effect. The learning equipment may include various learning devices such as a family education machine, a learning phone and a learning tablet, which is not limited in the embodiments of the invention. The operating systems of these learning devices may include, but are not limited to, the Android operating system, the iOS operating system, the Symbian operating system, the BlackBerry operating system and the Windows Phone 8 operating system, which is not limited in the embodiments of the invention either. The details are described below.
Example one
Referring to fig. 1, fig. 1 is a schematic flow chart of a spoken language training method according to an embodiment of the present invention. As shown in fig. 1, the spoken language training method may include the steps of:
101. The learning equipment outputs a voice question and acquires a target voice answer that the user inputs in response to the voice question and that matches the voice question.
In the embodiment of the present invention, the voice question output by the learning device may be a preset spoken English training question, a preset spoken training question in another language, or a preset non-language learning question, which is not limited in the embodiment of the present invention. After the learning device outputs the voice question, it can acquire the target voice answer that the user inputs in response to the voice question and that matches the voice question.
As an optional implementation, the learning device acquiring the target voice answer that the user inputs in response to the voice question and that matches the voice question may include:
after outputting the voice question, the learning equipment detects whether voice information is received;
when voice information is received, the learning equipment determines the voiceprint feature of the voice information and judges whether the voiceprint feature matches a preset voiceprint feature;
when the voiceprint feature is judged to match the preset voiceprint feature, the learning equipment judges whether the voice information contains a voice keyword matching the voice question;
and when the voice information is judged to contain a voice keyword matching the voice question, the learning equipment determines the voice information to be the target voice answer matching the voice question.
By implementing this optional implementation, only voice information that matches the preset voiceprint feature is recognized, so that when several kinds of voice information are detected (for example voice information input by the user and voice information input by other people), the voice information input by other people does not interfere, which improves the reliability of recognizing the voice information input by the user. Furthermore, the voice information input by the user is determined to be the target voice answer only when it contains a voice keyword matching the voice question. For example, when the voice question is "How are you", the user may, after hearing it, ask the person next to them "Do you know what it means?"; the learning device detects this voice information, but because it does not contain a keyword (such as "I") matching the voice question, the learning device does not determine it to be the target voice answer matching the voice question. This further enhances the reliability of obtaining the target voice answer matching the voice question and reduces the probability of obtaining a wrong target voice answer. A filtering step of this kind is sketched below.
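A minimal sketch of this filtering step in Python, with voiceprint extraction, voiceprint comparison and speech recognition injected as callables; all names and the similarity threshold are invented for illustration and are not defined in the patent.

```python
# Hypothetical sketch of the answer-filtering step described above.
# extract_voiceprint, voiceprint_similarity and speech_to_text stand in
# for subsystems the patent does not specify.

def accept_as_target_answer(audio, enrolled_voiceprint, question_keywords,
                            extract_voiceprint, voiceprint_similarity,
                            speech_to_text, similarity_threshold=0.8):
    """Return the recognized text if the audio should be treated as the
    target voice answer, otherwise None."""
    # Step 1: only accept speech whose voiceprint matches the preset one.
    voiceprint = extract_voiceprint(audio)
    if voiceprint_similarity(voiceprint, enrolled_voiceprint) < similarity_threshold:
        return None
    # Step 2: only accept speech containing a keyword matching the question,
    # e.g. "I" for the question "How are you".
    text = speech_to_text(audio)
    words = {w.strip("?,.!").lower() for w in text.split()}
    if not any(k.lower() in words for k in question_keywords):
        return None
    return text
```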
102. The learning equipment extracts target text information and target pronunciation information included in the target voice answer; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer.
In the embodiment of the present invention, when the target voice answer is "I am fine" in voice form, the target text information is "I am fine" in text form, and the target pronunciation information may include the pronunciation information of each word in "I am fine", such as the first pronunciation information of "I", the second pronunciation information of "am" and the third pronunciation information of "fine".
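As a hedged illustration of what the extracted information could look like in code, the structures below are invented for this sketch (the patent does not prescribe any data layout); the phonemes and timings for the "I am fine" example are placeholder values.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class WordPronunciation:
    word: str             # e.g. "fine"
    phonemes: List[str]   # recognized phoneme sequence for this word
    start_ms: int         # where the word starts in the recorded answer
    end_ms: int           # where the word ends

@dataclass
class VoiceAnswer:
    text: str                               # target text information, e.g. "I am fine"
    pronunciation: List[WordPronunciation]  # target pronunciation information, per word

# Example instance for the target voice answer "I am fine" mentioned above.
example = VoiceAnswer(
    text="I am fine",
    pronunciation=[
        WordPronunciation("I",    ["AY"],           0,   200),
        WordPronunciation("am",   ["AE", "M"],      200, 450),
        WordPronunciation("fine", ["F", "AY", "N"], 450, 900),
    ],
)
```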
103. The learning device outputs the next voice question according to the target text information and the target pronunciation information.
In the embodiment of the invention, the learning device can judge from the target text information whether the target voice answer input by the user for the voice question is correct, can judge from the target pronunciation information whether the pronunciation of that answer is standard, and outputs the next voice question by combining the user's answering situation with the user's pronunciation situation, thereby achieving targeted reinforcement.
For example, when the voice question output by the learning device is "How are you", the learning device acquires the target voice answer input by the user. When that answer is "I am fine", the target text information shows that the answer matches the voice question; if a certain word of the answer is not pronounced in a standard way (for example "fine"), a next question containing that word is output (for example "What a fine day! Why not go for a picnic?"), so that the user picks up the correct pronunciation of "fine" in an unobtrusive way. When the answer input by the user is "I am 7 years old", the pronunciation of every word may be perfectly standard, but the target text information shows that the answer does not match the voice question, so a next question that restates the expected exchange is output (for example "How are you? I am fine, and you?").
It can be seen that, by implementing the spoken language training method described in fig. 1, after a voice question is output and a target voice answer that the user inputs in response to the voice question and that matches it is received, the next voice question is further output according to the target text information and the target pronunciation information included in that answer. On the basis of a single question and a single answer, this forms a multi-turn interactive conversation mode that better matches conversations in real life, thereby improving the spoken language training effect.
Example two
Referring to fig. 2, fig. 2 is a schematic flow chart of another spoken language training method according to an embodiment of the present invention. As shown in fig. 2, the spoken language training method may include the steps of:
201. The learning equipment outputs a voice question and acquires a target voice answer that the user inputs in response to the voice question and that matches the voice question.
202. The learning equipment extracts target text information and target pronunciation information included in the target voice answer; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer.
203. The learning device determines standard answer information matched with the voice question.
204. The learning device judges whether the target text information matches the text information included in the standard answer information and whether the target pronunciation information meets a preset pronunciation standard. If the target text information does not match the text information included in the standard answer information and the target pronunciation information meets the preset pronunciation standard, step 205 is performed; if the target text information matches the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard, step 206 is performed; and if the target text information does not match the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard, step 207 is performed.
In the embodiment of the invention, when the spoken language training method is applied to spoken English training, the preset pronunciation standard may include the pronunciation of each of a large number of English words. Optionally, the pronunciation of each English word may include both its British pronunciation and its American pronunciation, so as to meet various learning requirements of the user. A possible layout is sketched below.
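Purely as an illustration, such a preset pronunciation standard could be stored as a word-to-pronunciation table with British and American variants; the word choices and IPA strings below are examples, not values taken from the patent.

```python
# Hypothetical layout of a preset pronunciation standard holding both a
# British ("uk") and an American ("us") reference pronunciation per word.
PRONUNCIATION_STANDARD = {
    "fine":  {"uk": "faɪn",  "us": "faɪn"},
    "old":   {"uk": "əʊld",  "us": "oʊld"},
    "dance": {"uk": "dɑːns", "us": "dæns"},
}

def reference_pronunciation(word: str, accent: str = "us") -> str:
    """Look up the reference pronunciation of a word in the chosen accent."""
    return PRONUNCIATION_STANDARD[word.lower()][accent]
```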
205. The learning device outputs the next voice question according to the target text information and the text information included in the standard answer information.
206. And the learning equipment outputs the next voice question according to the target pronunciation information and the preset pronunciation standard.
207. And the learning equipment outputs the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information and the preset pronunciation standard.
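The branching in steps 204 to 207 can be summarized in a short sketch; the callables and names below are placeholders for matching, scoring and question-building logic that the patent leaves unspecified.

```python
# Hypothetical sketch of the branching in steps 204-207. text_matches()
# and pronunciation_ok() stand in for the matching and pronunciation checks;
# the three from_* callables build the next question for each branch.

def choose_next_question(answer_text, answer_pronunciation, standard_text,
                         text_matches, pronunciation_ok,
                         from_text_gap, from_pron_gap, from_both_gaps):
    matched = text_matches(answer_text, standard_text)
    pron_ok = pronunciation_ok(answer_pronunciation)

    if not matched and pron_ok:
        # Step 205: understanding deviated, pronunciation fine.
        return from_text_gap(answer_text, standard_text)
    if matched and not pron_ok:
        # Step 206: answer correct, pronunciation substandard.
        return from_pron_gap(answer_pronunciation)
    if not matched and not pron_ok:
        # Step 207: both the understanding and the pronunciation need work.
        return from_both_gaps(answer_text, standard_text, answer_pronunciation)
    # Both fine: steps 205-207 do not apply; behaviour is not specified here.
    return None
```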
It can be seen that, by implementing the spoken language training method described in fig. 2, after a voice question is output and a target voice answer that the user inputs in response to the voice question and that matches it is received, the next voice question is further output according to the target text information and the target pronunciation information included in that answer. On the basis of a single question and a single answer, this forms a multi-turn interactive conversation mode that better matches conversations in real life, thereby improving the spoken language training effect.
In addition, by implementing the spoken language training method described in fig. 2: when the target text information included in the target voice answer does not match the text information included in the standard answer information and the target pronunciation information meets the preset pronunciation standard, the next voice question is output according to the target text information and the text information included in the standard answer information, so that when the user's pronunciation is fine but the user's understanding of the voice question deviates, the next voice question can be output in a targeted manner based on that deviation. When the target text information matches the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard, the next voice question is output according to the target pronunciation information and the preset pronunciation standard, so that when the user's pronunciation has problems but the user's understanding of the voice question does not, the next voice question can be output in a targeted manner based on those pronunciation problems. When the target text information does not match the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard, the next voice question is output according to the target text information, the text information included in the standard answer information, the target pronunciation information and the preset pronunciation standard, so that when both the user's pronunciation and the user's understanding of the voice question have problems, the next voice question can be output in a targeted manner by combining the two. In this way the next voice question addresses the specific problems in the target voice answer input by the user, which makes the assisted-learning effect of outputting the next voice question better.
EXAMPLE III
Referring to fig. 3, fig. 3 is a schematic flow chart of another spoken language training method according to an embodiment of the present invention. As shown in fig. 3, the spoken language training method may include the steps of:
301. The learning equipment outputs a voice question and acquires a target voice answer that the user inputs in response to the voice question and that matches the voice question.
302. The learning equipment extracts target text information and target pronunciation information included in the target voice answer; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer.
303. The learning device determines standard answer information matched with the voice question.
304. The learning apparatus judges whether the target text information matches the text information included in the standard answer information and whether the target pronunciation information meets a preset pronunciation standard. Steps 305 to 306 are performed when the target text information does not match the text information included in the standard answer information and the target pronunciation information meets the preset pronunciation standard; steps 307 to 308 are performed when the target text information matches the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard; and steps 309 to 310 are performed when the target text information does not match the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard.
305. The learning device determines first target semantic knowledge point information according to the target text information and the text information included in the standard answer information.
As an optional implementation, the learning device determining the first target semantic knowledge point information according to the target text information and the text information included in the standard answer information may include:
when the learning equipment detects that the target text information has grammar errors, the learning equipment determines grammar knowledge points corresponding to the grammar errors of the target text information;
when the learning equipment detects that the target text information has vocabulary errors, the learning equipment determines vocabulary knowledge points corresponding to the vocabulary errors of the target text information;
when the learning equipment detects that the matching degree between the target text information and the text information included in the standard answer information is lower than a preset matching degree, the learning equipment determines the question corresponding to the target text information;
the learning device determines the first target semantic knowledge point information according to the grammar knowledge points, the vocabulary knowledge points, the question corresponding to the target text information, the voice question and the standard answer information corresponding to the voice question.
For example, suppose the target text information is "I is 7 year old", the voice question is "How are you", and the standard answer information corresponding to the voice question is "I am fine". A grammar error is detected in the target text information and the corresponding grammar knowledge point is determined to be subject-predicate agreement; a vocabulary error is detected and the corresponding vocabulary knowledge point is determined to be "years old"; and the matching degree between the target text information and the text information included in the standard answer information is lower than the preset matching degree, so the question corresponding to the target text information is determined to be "How old are you". Combining the grammar knowledge point, the vocabulary knowledge point, the question corresponding to the target text information, the voice question and the standard answer information corresponding to the voice question, the learning apparatus determines that the first target semantic knowledge point information includes subject-predicate agreement, "years old", "How old are you" and "How are you? I am fine."
By implementing this optional implementation, the first target semantic knowledge point information determined according to the target text information and the text information included in the standard answer information contains not only the grammar knowledge point corresponding to the grammar error in the target text information, the vocabulary knowledge point corresponding to the vocabulary error in the target text information, and the question corresponding to the target text information, but also the voice question and the standard answer information corresponding to the voice question. The next voice question output according to the content of the first target semantic knowledge point information therefore gives the user more help in learning. A sketch of this determination step follows.
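A compressed sketch of step 305 under the same assumptions as before; the grammar, vocabulary and similarity checkers are placeholder callables invented for illustration, not APIs from the patent.

```python
# Hypothetical sketch of assembling the first target semantic knowledge
# point information (step 305). grammar_errors(), vocabulary_errors(),
# text_similarity() and infer_answered_question() are placeholder checkers.

def semantic_knowledge_points(answer_text, question, standard_answer_text,
                              grammar_errors, vocabulary_errors,
                              text_similarity, infer_answered_question,
                              match_threshold=0.6):
    points = []
    for err in grammar_errors(answer_text):
        points.append(("grammar", err))        # e.g. subject-predicate agreement
    for err in vocabulary_errors(answer_text):
        points.append(("vocabulary", err))     # e.g. "years old"
    if text_similarity(answer_text, standard_answer_text) < match_threshold:
        # The answer fits a different question, e.g. "How old are you".
        points.append(("answered_question", infer_answered_question(answer_text)))
    # The voice question and its standard answer are carried along so the
    # next question can restate the intended exchange.
    points.append(("question", question))
    points.append(("standard_answer", standard_answer_text))
    return points
```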
306. And the learning equipment outputs the next voice question based on the first target semantic knowledge point information.
In the embodiment of the present invention, the learning device may output the next speech question based on the first target semantic knowledge point information by outputting a next speech question that contains the first target semantic knowledge point information. For example, when the first target semantic knowledge point information includes subject-predicate agreement, "years old", "How old are you" and "How are you? I am fine.", the next speech question may be "How is he? He is fine. How are you? He is 7 years old. How old are you?"
307. The learning device determines first target pronunciation knowledge point information according to the target pronunciation information and a preset pronunciation standard.
As an alternative embodiment, the learning device determining the first target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation criteria may include:
the learning equipment acquires pronunciation information corresponding to each word in the target pronunciation information;
the learning equipment compares the pronunciation information corresponding to each word with standard pronunciation information corresponding to the word in a preset pronunciation standard to obtain a comparison score corresponding to each word, wherein the higher the comparison score is, the more standard the pronunciation of the word is;
the learning equipment determines, from among these words, the words whose comparison scores are lower than a preset comparison score, and determines first target pronunciation knowledge point information containing those words.
For example, when the target pronunciation information corresponds to "I am fine", the learning device acquires the pronunciation information of each word in "I am fine", such as the first pronunciation information of "I", the second pronunciation information of "am" and the third pronunciation information of "fine". It compares the first pronunciation information with the standard pronunciation information of "I" in the preset pronunciation standard to obtain a first comparison score for "I" (for example 70 points out of 100), compares the second pronunciation information with the standard pronunciation information of "am" to obtain a second comparison score for "am" (for example 60 points out of 100), and compares the third pronunciation information with the standard pronunciation information of "fine" to obtain a third comparison score for "fine" (for example 30 points out of 100). Of the three words, only "fine" has a comparison score lower than 60 points, so "fine" is taken as the first target pronunciation knowledge point information. A next voice question containing this first target pronunciation knowledge point information is then output, which lets the user grasp the pronunciation of "fine" more deeply and improves learning efficiency. This per-word scoring is sketched below.
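A minimal sketch of this per-word scoring, reusing the WordPronunciation structure sketched earlier and assuming a placeholder score_word() callable that compares a word's pronunciation with the preset pronunciation standard and returns a score from 0 to 100.

```python
# Hypothetical sketch of step 307: words scoring below the pass mark become
# the target pronunciation knowledge point information.

def pronunciation_knowledge_points(word_pronunciations, score_word, pass_score=60):
    """Return the words whose comparison score falls below pass_score."""
    weak_words = []
    for wp in word_pronunciations:
        score = score_word(wp)   # e.g. "I" -> 70, "am" -> 60, "fine" -> 30
        if score < pass_score:
            weak_words.append(wp.word)
    return weak_words            # ["fine"] for the example scores above
```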
308. The learning equipment outputs the next voice question based on the first target pronunciation knowledge point information.
In the embodiment of the present invention, the learning device may output the next speech question based on the first target pronunciation knowledge point information by outputting a next speech question that contains the first target pronunciation knowledge point information. When the first target pronunciation knowledge point information is "fine", the next voice question may be "I am fine, thank you. And you?"
309. The learning device determines second target semantic knowledge point information according to the target text information and the text information included in the standard answer information, and determines second target pronunciation knowledge point information according to the target pronunciation information and a preset pronunciation standard.
As an optional implementation, the learning device determining the second target semantic knowledge point information according to the target text information and the text information included in the standard answer information, and determining the second target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard, may include:
when the learning equipment detects that the target text information has grammar errors, the learning equipment determines grammar knowledge points corresponding to the grammar errors of the target text information;
when the learning equipment detects that the target text information has vocabulary errors, the learning equipment determines vocabulary knowledge points corresponding to the vocabulary errors of the target text information;
when the learning equipment detects that the matching degree between the target text information and the text information included in the standard answer information is lower than a preset matching degree, the learning equipment determines the question corresponding to the target text information;
the learning equipment determines the second target semantic knowledge point information according to the grammar knowledge points, the vocabulary knowledge points, the question corresponding to the target text information, the voice question and the standard answer information corresponding to the voice question;
the learning equipment acquires pronunciation information corresponding to each word in the target pronunciation information;
the learning equipment compares the pronunciation information corresponding to each word with standard pronunciation information corresponding to the word in a preset pronunciation standard to obtain a comparison score corresponding to each word, wherein the higher the comparison score is, the more standard the pronunciation of the word is;
the learning equipment determines, from among these words, the words whose comparison scores are lower than the preset comparison score, and determines second target pronunciation knowledge point information containing those words.
For example, suppose again that the target text information is "I is 7 year old", the voice question is "How are you", and the standard answer information corresponding to the voice question is "I am fine". As in the previous example, a grammar error is detected and its grammar knowledge point is determined to be subject-predicate agreement, a vocabulary error is detected and its vocabulary knowledge point is determined to be "years old", and the matching degree between the target text information and the text information included in the standard answer information is lower than the preset matching degree, so the question corresponding to the target text information is determined to be "How old are you". Combining these with the voice question and its standard answer information, the learning device determines that the second target semantic knowledge point information includes subject-predicate agreement, "years old", "How old are you" and "How are you? I am fine." On the pronunciation side, the target pronunciation information is "I is 7 year old" in voice form, and the learning device acquires the pronunciation information of each word: the first pronunciation information of "I", the second pronunciation information of "is", the third pronunciation information of "7", the fourth pronunciation information of "year" and the fifth pronunciation information of "old". Comparing each with the corresponding standard pronunciation information in the preset pronunciation standard yields, for example, 70 points out of 100 for "I", 60 points for "is", 80 points for "7", 40 points for "year" and 30 points for "old". The learning device selects the words whose comparison scores are lower than 60 points, namely "year" and "old", and takes them as the second target pronunciation knowledge point information.
310. And the learning equipment outputs the next voice question by taking the second target semantic knowledge point information and the second target pronunciation knowledge point information as the basis.
In the embodiment of the present invention, the learning device outputs the next speech question based on the second target semantic knowledge point information and the second target pronunciation knowledge point information by outputting a next speech question that contains both. When the second target semantic knowledge point information includes subject-predicate agreement, "years old", "How old are you" and "How are you? I am fine.", and the second target pronunciation knowledge point information is "year" and "old", the next speech question may be "How is he? He is fine. How are you? How old is he? He is 7 years old. How old are you?", where "years" and "old" in the next speech question are output at a higher volume to emphasize the second target pronunciation knowledge point information. One way such emphasis could be realized is sketched below.
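The sketch below marks the weak words up for a speech synthesizer; it assumes an SSML-capable TTS engine and is only an illustration of the idea, not a mechanism described in the patent.

```python
# Hypothetical sketch: wrap the weak words of the next question in a louder
# prosody span so an SSML-capable TTS engine emphasizes them.

def emphasize_weak_words(question_text, weak_words):
    """Return SSML with the weak words rendered at a higher volume."""
    weak = {w.lower() for w in weak_words}
    parts = []
    for token in question_text.split():
        if token.strip("?,.!").lower() in weak:
            parts.append('<prosody volume="loud">{}</prosody>'.format(token))
        else:
            parts.append(token)
    return "<speak>{}</speak>".format(" ".join(parts))

# Example:
# emphasize_weak_words("He is 7 years old. How old are you?", ["years", "old"])
# keeps the sentence intact and wraps "years" and "old" in louder prosody spans.
```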
It can be seen that, by implementing the spoken language training method described in fig. 3, after a voice question is output and a target voice answer that the user inputs in response to the voice question and that matches it is received, the next voice question is further output according to the target text information and the target pronunciation information included in that answer. On the basis of a single question and a single answer, this forms a multi-turn interactive conversation mode that better matches conversations in real life, thereby improving the spoken language training effect.
In addition, by implementing the spoken language training method described in fig. 3, the next speech question can be output in a way that targets the specific problems in the target speech answer input by the user, so that the assisted-learning effect of outputting the next speech question is better.
Example four
Referring to fig. 4, fig. 4 is a schematic structural diagram of a learning device according to an embodiment of the present invention. As shown in fig. 4, the learning apparatus 400 may include an acquisition unit 401, an extraction unit 402, and an output unit 403, in which:
the obtaining unit 401 is configured to output a voice question, and obtain a target voice answer that is input by a user according to the voice question and matches the voice question.
In the embodiment of the present invention, the voice question output by the learning device may be a preset spoken English training question, a preset spoken training question in another language, or a preset non-language learning question, which is not limited in the embodiment of the present invention. After the learning device outputs the voice question, it can acquire the target voice answer that the user inputs in response to the voice question and that matches the voice question.
As an optional implementation, the acquiring unit 401 may acquire the target voice answer that the user inputs in response to the voice question and that matches the voice question as follows:
after outputting the voice question, the obtaining unit 401 detects whether voice information is received;
when voice information is received, the obtaining unit 401 determines the voiceprint feature of the voice information and judges whether the voiceprint feature matches a preset voiceprint feature;
when the voiceprint feature is judged to match the preset voiceprint feature, the obtaining unit 401 judges whether the voice information contains a voice keyword matching the voice question;
and when the voice information is judged to contain a voice keyword matching the voice question, the obtaining unit 401 determines the voice information to be the target voice answer matching the voice question.
By implementing this optional implementation, only voice information that matches the preset voiceprint feature is recognized, so that when several kinds of voice information are detected (for example voice information input by the user and voice information input by other people), the voice information input by other people does not interfere, which improves the reliability of recognizing the voice information input by the user. Furthermore, the voice information input by the user is determined to be the target voice answer only when it contains a voice keyword matching the voice question. For example, when the voice question is "How are you", the user may, after hearing it, ask the person next to them "Do you know what it means?"; the learning device detects this voice information, but because it does not contain a keyword (such as "I") matching the voice question, it does not determine this voice information to be the target voice answer matching the voice question. This further enhances the reliability of obtaining the target voice answer matching the voice question and reduces the probability of obtaining a wrong target voice answer.
An extracting unit 402, configured to extract target text information and target pronunciation information included in the target voice answer acquired by the acquiring unit 401; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer.
In the embodiment of the present invention, when the target voice answer is "I am fine" in voice form, the target text information is "I am fine" in text form, and the target pronunciation information may include the pronunciation information of each word in "I am fine", such as the first pronunciation information of "I", the second pronunciation information of "am" and the third pronunciation information of "fine".
An output unit 403, configured to output a next speech question according to the target text information and the target pronunciation information extracted by the extraction unit 402.
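Purely as an illustration of how the three units might be wired together, the class layout below is an assumption for this sketch, not a structure defined by the patent.

```python
# Hypothetical composition of the learning apparatus 400 from its three
# units; each unit is represented by a plain callable and the wiring mirrors
# the acquisition -> extraction -> output flow described above.

class LearningDevice:
    def __init__(self, acquisition_unit, extraction_unit, output_unit):
        self.acquire = acquisition_unit   # outputs a question, returns answer audio (or None)
        self.extract = extraction_unit    # returns (text_info, pronunciation_info)
        self.output = output_unit         # builds and speaks the next voice question

    def run_turn(self, question):
        answer_audio = self.acquire(question)
        if answer_audio is None:
            return None                   # no matching target voice answer captured
        text_info, pronunciation_info = self.extract(answer_audio)
        return self.output(text_info, pronunciation_info)
```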
It can be seen that, by implementing the learning device described in fig. 4, after a voice question is output and a target voice answer that the user inputs in response to the voice question and that matches it is received, the next voice question is further output according to the target text information and the target pronunciation information included in that answer. On the basis of a single question and a single answer, this forms a multi-turn interactive conversation mode that better matches conversations in real life, thereby improving the spoken language training effect.
EXAMPLE five
Referring to fig. 5, fig. 5 is a schematic structural diagram of another learning apparatus according to an embodiment of the present invention. Compared with the learning apparatus 400 shown in fig. 4, the learning apparatus 400 shown in fig. 5 is further optimized, and its output unit 403 includes:
the determination subunit 4031 is used to determine standard answer information matched with the voice question.
A determining subunit 4032, configured to determine whether the target text information matches the text information included in the standard answer information determined by the determining subunit 4031, and determine whether the target pronunciation information meets a preset pronunciation standard.
In the embodiment of the invention, when the spoken language training is applied to spoken English training, the preset pronunciation standard may include the pronunciation of each of a large number of English words. Optionally, the pronunciation of each English word may include both its British pronunciation and its American pronunciation, so as to meet various learning requirements of the user.
A first output subunit 4033, configured to output the next speech question according to the target text information and the text information included in the standard answer information when the determining subunit 4032 determines that the target text information does not match the text information included in the standard answer information and the target pronunciation information meets the preset pronunciation standard.
A second output subunit 4034, configured to output the next voice question according to the target pronunciation information and the preset pronunciation standard when the determining subunit 4032 determines that the target text information matches the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard.
A third output subunit 4035, configured to output the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation standard when the determining subunit 4032 determines that the target text information and the text information included in the standard answer information do not match and the target pronunciation information does not meet the preset pronunciation standard.
It can be seen that, by implementing the learning device described in fig. 5, after a voice question is output and a target voice answer that the user inputs in response to the voice question and that matches it is received, the next voice question is further output according to the target text information and the target pronunciation information included in that answer. On the basis of a single question and a single answer, this forms a multi-turn interactive conversation mode that better matches conversations in real life, thereby improving the spoken language training effect.
In addition, by implementing the learning apparatus described in fig. 5, the next voice question can be output in a way that targets the specific problems in the target voice answer input by the user, so that the assisted-learning effect of outputting the next voice question is better.
EXAMPLE six
Referring to fig. 6, fig. 6 is a schematic structural diagram of another learning apparatus according to an embodiment of the present invention. Compared with the learning apparatus 400 shown in fig. 5, in the learning apparatus 400 shown in fig. 6 the manner in which the first output subunit 4033 outputs the next speech question according to the target text information and the text information included in the standard answer information is specifically:
a first output subunit 4033, configured to determine first target semantic knowledge point information according to the target text information and the text information included in the standard answer information, and to output the next voice question based on the first target semantic knowledge point information.
As an optional implementation, the manner in which the first output subunit 4033 determines the first target semantic knowledge point information according to the target text information and the text information included in the standard answer information may include:
when the first output subunit 4033 detects that the target text information has a grammar error, the first output subunit 4033 determines a grammar knowledge point corresponding to the grammar error of the target text information;
when the first output subunit 4033 detects that a vocabulary error exists in the target text information, the first output subunit 4033 determines the vocabulary knowledge point corresponding to the vocabulary error in the target text information;
when the first output subunit 4033 detects that the matching degree of the target text information and the text information included in the standard answer information is lower than the preset matching degree, the first output subunit 4033 determines a question corresponding to the target text information;
the first output subunit 4033 determines the first target semantic knowledge point information based on the above grammar knowledge point, the vocabulary knowledge point, the question corresponding to the target text information, the voice question, and the standard answer information corresponding to the voice question.
For example, suppose the target text information is "I is 7 year old", the voice question is "How are you", and the standard answer information corresponding to the voice question is "I am fine". It is detected that a grammar error exists in the target text information, and the grammar knowledge point corresponding to the grammar error is determined to be subject-predicate collocation; it is detected that a vocabulary error exists in the target text information, and the vocabulary knowledge point corresponding to the vocabulary error is determined to be "years old"; and it is detected that the matching degree between the target text information and the text information included in the standard answer information is lower than the preset matching degree, so the question corresponding to the target text information is determined to be "How old are you". The learning device combines the grammar knowledge point, the vocabulary knowledge point, the question corresponding to the target text information, the voice question, and the standard answer information corresponding to the voice question, and determines that the first target semantic knowledge point information includes subject-predicate collocation, "years old", "How old are you", and "How are you, I am fine".
By implementing this optional implementation, the first target semantic knowledge point information determined according to the target text information and the text information included in the standard answer information includes not only the grammar knowledge point corresponding to the grammar error in the target text information, the vocabulary knowledge point corresponding to the vocabulary error in the target text information, and the question corresponding to the target text information, but also the voice question and the standard answer information corresponding to the voice question. The next voice question output according to the content of the first target semantic knowledge point information therefore provides greater help to the user's learning.
In this embodiment of the present invention, the manner in which the first output subunit 4033 outputs the next voice question based on the first target semantic knowledge point information may specifically be: the first output subunit 4033 outputs a next voice question containing the first target semantic knowledge point information. For example, when the first target semantic knowledge point information includes subject-predicate collocation, "years old", "How old are you", and "How are you, I am fine", the output next voice question may be "How are you? I am fine. He is 7 years old. How old are you?".
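As a rough illustration of how such first target semantic knowledge point information might be assembled, the sketch below uses toy error detectors tailored to the "I is 7 year old" example; the function names, the hard-coded corresponding question, and the rules are assumptions for illustration only, not the patent's actual detectors, which would rely on proper grammar and vocabulary analysis.

```python
# Illustrative sketch: building semantic knowledge point information from a
# grammar error, a vocabulary error, and a low text-match degree (toy rules).

import difflib

def detect_grammar_point(text: str):
    # Toy rule: "I is" indicates a subject-predicate collocation error
    return "subject-predicate collocation" if "I is" in text else None

def detect_vocab_point(text: str):
    # Toy rule: "year old" after a number should be "years old"
    return "years old" if "year old" in text else None

def match_degree(text: str, standard: str) -> float:
    return difflib.SequenceMatcher(None, text.lower(), standard.lower()).ratio()

def first_semantic_knowledge_points(target_text, voice_question, standard_answer,
                                    preset_match=0.6):
    points = []
    grammar = detect_grammar_point(target_text)
    vocab = detect_vocab_point(target_text)
    if grammar:
        points.append(grammar)
    if vocab:
        points.append(vocab)
    if match_degree(target_text, standard_answer) < preset_match:
        # The answer corresponds to a different question; record that question
        # (assumed lookup, hard-coded here for the example)
        points.append("How old are you")
    points.append(f"{voice_question}, {standard_answer}")
    return points

print(first_semantic_knowledge_points("I is 7 year old", "How are you", "I am fine"))
# -> ['subject-predicate collocation', 'years old', 'How old are you',
#     'How are you, I am fine']
```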
Optionally, the manner in which the second output subunit 4034 outputs the next voice question according to the target pronunciation information and the preset pronunciation standard is specifically as follows:
a second output subunit 4034, configured to determine first target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard, and to output the next voice question based on the first target pronunciation knowledge point information.
As an optional implementation, the manner in which the second output subunit 4034 determines the first target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard may include:
the second output subunit 4034 acquires pronunciation information corresponding to each word in the target pronunciation information;
the second output subunit 4034 compares the pronunciation information corresponding to each word with the standard pronunciation information corresponding to the word in the preset pronunciation standard to obtain a comparison score corresponding to each word, wherein a higher comparison score indicates a more standard pronunciation of the word;
the second output subunit 4034 selects, from among these words, the words whose comparison scores are lower than the preset comparison score, and determines first target pronunciation knowledge point information that includes the selected words.
For example, suppose the target pronunciation information is the spoken phrase "I am fine". The learning device acquires the pronunciation information of each word in "I am fine": first pronunciation information for "I", second pronunciation information for "am", and third pronunciation information for "fine". It compares the first pronunciation information with the standard pronunciation information of "I" in the preset pronunciation standard to obtain a first comparison score for "I" (e.g. 70 points out of 100), compares the second pronunciation information with the standard pronunciation information of "am" to obtain a second comparison score for "am" (e.g. 60 points out of 100), and compares the third pronunciation information with the standard pronunciation information of "fine" to obtain a third comparison score for "fine" (e.g. 30 points out of 100). Of the three words, only "fine" has a comparison score lower than 60 points, so "fine" is taken as the first target pronunciation knowledge point information. The next voice question output subsequently will contain this knowledge point, giving the user a deeper grasp of the pronunciation of "fine" and improving learning efficiency.
In this embodiment of the present invention, the manner in which the second output subunit 4034 outputs the next voice question based on the first target pronunciation knowledge point information may specifically be: the second output subunit 4034 outputs a next voice question containing the first target pronunciation knowledge point information. For example, when the first target pronunciation knowledge point information is "fine", the next voice question may be "I am fine, thank you, and you?".
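The per-word selection performed here can be sketched as follows. The comparison scores are assumed to come from some speech-assessment component, so the dictionary of scores below stands in for that component's output; this is an illustration, not the patent's scoring algorithm.

```python
# Illustrative sketch: keep the words whose comparison scores fall below the
# preset comparison score as the pronunciation knowledge point information.

def pronunciation_knowledge_points(word_scores: dict, preset_score: int = 60) -> list:
    """word_scores maps each spoken word to a 0-100 comparison score;
    a higher score means a more standard pronunciation."""
    return [word for word, score in word_scores.items() if score < preset_score]

# Assumed scores for the "I am fine" example above
scores = {"I": 70, "am": 60, "fine": 30}
print(pronunciation_knowledge_points(scores))   # -> ['fine']
```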
Further optionally, the manner in which the third output subunit 4035 outputs the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation standard is specifically as follows:
a third output subunit 4035, configured to determine second target semantic knowledge point information according to the target text information and the text information included in the standard answer information, to determine second target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard, and to output the next voice question based on the second target semantic knowledge point information and the second target pronunciation knowledge point information.
As an optional implementation, the manner in which the third output subunit 4035 determines the second target semantic knowledge point information according to the target text information and the text information included in the standard answer information, and determines the second target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard, may include:
when the third output subunit 4035 detects that the target text information has a grammar error, the third output subunit 4035 determines a grammar knowledge point corresponding to the grammar error of the target text information;
when the third output subunit 4035 detects that there is a vocabulary error in the target text information, the third output subunit 4035 determines a vocabulary knowledge point corresponding to the vocabulary error in the target text information;
when the third output subunit 4035 detects that the matching degree between the target text information and the text information included in the standard answer information is lower than the preset matching degree, the third output subunit 4035 determines a question corresponding to the target text information;
the third output subunit 4035 determines the second target semantic knowledge point information based on the above grammar knowledge point, the vocabulary knowledge point, the question corresponding to the target text information, the voice question, and the standard answer information corresponding to the voice question;
the third output subunit 4035 obtains pronunciation information corresponding to each word in the target pronunciation information;
the third output subunit 4035 compares the pronunciation information corresponding to each word with the standard pronunciation information corresponding to the word in the preset pronunciation standard to obtain a comparison score corresponding to each word, wherein a higher comparison score indicates a more standard pronunciation of the word;
the third output subunit 4035 selects, from among these words, the words whose comparison scores are lower than the preset comparison score, and determines second target pronunciation knowledge point information that includes the selected words.
For example, suppose the target text information is "I is 7 year old", the voice question is "How are you", and the standard answer information corresponding to the voice question is "I am fine". It is detected that a grammar error exists in the target text information, and the grammar knowledge point corresponding to the grammar error is determined to be subject-predicate collocation; it is detected that a vocabulary error exists in the target text information, and the vocabulary knowledge point corresponding to the vocabulary error is determined to be "years old"; and it is detected that the matching degree between the target text information and the text information included in the standard answer information is lower than the preset matching degree, so the question corresponding to the target text information is determined to be "How old are you". The learning device combines the grammar knowledge point, the vocabulary knowledge point, the question corresponding to the target text information, the voice question, and the standard answer information corresponding to the voice question, and determines that the second target semantic knowledge point information includes subject-predicate collocation, "years old", "How old are you", and "How are you, I am fine". Meanwhile, the target pronunciation information is the spoken form of "I is 7 year old". The learning device acquires the pronunciation information of each word in "I is 7 year old": first pronunciation information for "I", second pronunciation information for "is", third pronunciation information for "7", fourth pronunciation information for "year", and fifth pronunciation information for "old". It compares the first pronunciation information with the standard pronunciation information of "I" in the preset pronunciation standard to obtain a first comparison score for "I" (e.g. 70 points out of 100), compares the second pronunciation information with the standard pronunciation information of "is" to obtain a second comparison score for "is" (e.g. 60 points out of 100), compares the third pronunciation information with the standard pronunciation information of "7" to obtain a third comparison score for "7" (e.g. 80 points out of 100), compares the fourth pronunciation information with the standard pronunciation information of "year" to obtain a fourth comparison score for "year" (e.g. 40 points out of 100), and compares the fifth pronunciation information with the standard pronunciation information of "old" to obtain a fifth comparison score for "old" (e.g. 30 points out of 100). The learning device then selects, from the five words, the words "year" and "old", whose comparison scores are lower than 60 points, and takes them as the second target pronunciation knowledge point information.
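For this combined case, the two kinds of knowledge point information are merged into one follow-up question. The snippet below is a minimal, self-contained illustration using the values from the example above; the exact wording of the generated question is an assumption, not something prescribed by the patent.

```python
# Illustrative sketch: merge second target semantic knowledge point information
# and second target pronunciation knowledge point information into one
# follow-up voice question.

def build_next_question(semantic_points: list, pronunciation_points: list,
                        voice_question: str) -> str:
    hints = ", ".join(semantic_points + pronunciation_points)
    return f"Pay attention to: {hints}. {voice_question}?"

semantic = ["subject-predicate collocation", "years old",
            "How old are you", "How are you, I am fine"]
pronunciation = ["year", "old"]
print(build_next_question(semantic, pronunciation, "How are you"))
```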
It can be seen that, by implementing the learning device described in fig. 6, after a voice question is output and a target voice answer that the user inputs in response to the voice question is received, the device goes beyond a single question-and-answer exchange: it outputs the next voice question according to the target text information and the target pronunciation information included in the user's target voice answer, thereby forming a multi-turn interactive conversation mode that is closer to real-life conversation, and thus the spoken language training effect is improved.
In addition, by implementing the learning device described in fig. 6, the next voice question can be output specifically for the problems present in the target voice answer input by the user, so that the next voice question provides a better learning-assistance effect.
EXAMPLE seven
Referring to fig. 7, fig. 7 is a schematic structural diagram of another learning apparatus according to an embodiment of the present invention. As shown in fig. 7, the learning apparatus may include:
a memory 701 in which executable program code is stored;
a processor 702 coupled to the memory 701;
the processor 702 calls the executable program code stored in the memory 701 to execute any one of the spoken language training methods shown in fig. 1 to 3.
An embodiment of the present invention discloses a computer-readable storage medium storing a computer program, wherein the computer program enables a computer to execute any one of the spoken language training methods shown in fig. 1 to 3.
Embodiments of the present invention also disclose a computer program product, wherein, when the computer program product is run on a computer, the computer is caused to execute part or all of the steps of the method as in the above method embodiments.
The embodiment of the present invention also discloses an application publishing platform, wherein the application publishing platform is used for publishing a computer program product, and when the computer program product runs on a computer, the computer is caused to execute part or all of the steps of the method in the above method embodiments.
In various embodiments of the present invention, it should be understood that the sequence numbers of the above-mentioned processes do not imply an inevitable order of execution, and the execution order of the processes should be determined by their functions and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A, and that B can be determined from A. It should also be understood, however, that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated units, if implemented as software functional units and sold or used as a stand-alone product, may be stored in a computer-accessible memory. Based on such understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a memory and includes several requests for causing a computer device (which may be a personal computer, a server, a network device, or the like, and specifically may be a processor in the computer device) to execute part or all of the steps of the above-described methods of the embodiments of the present invention.
It will be understood by those skilled in the art that all or part of the steps in the methods of the embodiments described above may be implemented by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium. The storage medium includes a Read-Only Memory (ROM), a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, a magnetic disk memory, a tape memory, or any other computer-readable medium that can be used to carry or store data.
The spoken language training method and learning device disclosed in the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, for those skilled in the art, there may be variations in the specific implementation and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (6)

1. A method for spoken language training, comprising:
outputting a voice question, judging whether voiceprint features of voice information are matched with preset voiceprint features or not when the voice information is received, judging whether voice keywords matched with the voice question are contained in the voice information or not if the voiceprint features of the voice information are matched with the preset voiceprint features, and determining the voice information as a target voice answer matched with the voice question if the voice keywords are contained in the voice information;
extracting target text information and target pronunciation information included in the target voice answer; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer;
determining standard answer information matched with the voice question; judging whether the target text information is matched with the text information included in the standard answer information or not, and judging whether the target pronunciation information meets a preset pronunciation standard or not;
when the target text information is judged to be not matched with the text information included in the standard answer information and the target pronunciation information accords with the preset pronunciation standard, when a grammar error of the target text information is detected, a grammar knowledge point corresponding to the grammar error of the target text information is determined, when a vocabulary error of the target text information is detected, a vocabulary knowledge point corresponding to the vocabulary error of the target text information is determined, and when the matching degree of the target text information and the text information included in the standard answer information is detected to be lower than the preset matching degree, a question corresponding to the target text information is determined; determining first target semantic knowledge point information by taking the grammar knowledge points, the vocabulary knowledge points, the questions corresponding to the target text information, the voice questions and the standard answer information as basis; outputting a next voice question containing the first target semantic knowledge point information;
when the target text information is judged to be matched with the text information included in the standard answer information and the target pronunciation information does not accord with the preset pronunciation standard, outputting a next voice question according to the target pronunciation information and the preset pronunciation standard;
and when the target text information is judged to be not matched with the text information included in the standard answer information and the target pronunciation information is judged not to accord with the preset pronunciation standard, outputting the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information and the preset pronunciation standard.
2. The method according to claim 1, wherein outputting the next phonetic question according to the target pronunciation information and the preset pronunciation criteria comprises:
determining first target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard;
and outputting the next voice question based on the first target pronunciation knowledge point information.
3. The method according to any one of claims 1 to 2, wherein outputting the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation standard comprises:
determining second target semantic knowledge point information according to the text information included in the target text information and the standard answer information, and determining second target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard;
and outputting the next voice question based on the second target semantic knowledge point information and the second target pronunciation knowledge point information.
4. A learning device, comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for outputting a voice question and acquiring a target voice answer matched with the voice question and input by a user according to the voice question;
the extraction unit is used for extracting target text information and target pronunciation information included in the target voice answer; the target text information is text information corresponding to the target voice answer, and the target pronunciation information is pronunciation information corresponding to the target voice answer;
the output unit is used for outputting a next voice question according to the target text information and the target pronunciation information;
the acquiring of the target voice answer matched with the voice question and input by the user according to the voice question comprises:
when voice information is received, judging whether voiceprint features of the voice information are matched with preset voiceprint features, if so, judging whether voice keywords matched with the voice question are contained in the voice information, and if so, determining the voice information as a target voice answer matched with the voice question;
the output unit includes:
the determining subunit is used for determining standard answer information matched with the voice question;
the judging subunit is used for judging whether the target text information is matched with the text information included in the standard answer information or not and judging whether the target pronunciation information meets a preset pronunciation standard or not;
a first output subunit, configured to, when the determining subunit determines that the target text information is not matched with the text information included in the standard answer information and the target pronunciation information meets the preset pronunciation standard, determine, when it is detected that there is a grammar error in the target text information, a grammar knowledge point corresponding to the grammar error in the target text information, when it is detected that there is a vocabulary error in the target text information, a vocabulary knowledge point corresponding to the vocabulary error in the target text information, and when it is detected that a matching degree of the target text information with the text information included in the standard answer information is lower than a preset matching degree, determine a question corresponding to the target text information; determining first target semantic knowledge point information by taking the grammar knowledge points, the vocabulary knowledge points, the questions corresponding to the target text information, the voice questions and the standard answer information as basis; outputting a next voice question containing the first target semantic knowledge point information;
the second output subunit is configured to output a next voice question according to the target pronunciation information and the preset pronunciation standard when the judging subunit judges that the target text information matches the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard;
and the third output subunit is configured to output a next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation standard when the judgment subunit judges that the target text information is not matched with the text information included in the standard answer information and the target pronunciation information does not meet the preset pronunciation standard.
5. The learning device according to claim 4, wherein the second output subunit is configured to output the next phonetic question according to the target pronunciation information and the preset pronunciation criteria in a manner that:
the second output subunit is used for determining first target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard; and outputting the next voice question based on the first target pronunciation knowledge point information.
6. The learning apparatus according to any one of claims 4 to 5, wherein the third output subunit is configured to output the next voice question according to the target text information, the text information included in the standard answer information, the target pronunciation information, and the preset pronunciation criterion by:
the third output subunit is configured to determine second target semantic knowledge point information according to the target text information and the text information included in the standard answer information, and determine second target pronunciation knowledge point information according to the target pronunciation information and the preset pronunciation standard; and outputting the next voice question based on the second target semantic knowledge point information and the second target pronunciation knowledge point information.
CN201810918494.XA 2018-08-13 2018-08-13 Oral training method and learning equipment Active CN109035896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810918494.XA CN109035896B (en) 2018-08-13 2018-08-13 Oral training method and learning equipment

Publications (2)

Publication Number Publication Date
CN109035896A (en) 2018-12-18
CN109035896B (en) 2021-11-05

Family

ID=64634004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810918494.XA Active CN109035896B (en) 2018-08-13 2018-08-13 Oral training method and learning equipment

Country Status (1)

Country Link
CN (1) CN109035896B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583908A (en) * 2020-04-30 2020-08-25 北京一起教育信息咨询有限责任公司 Voice data analysis method and system

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101366065A (en) * 2005-11-30 2009-02-11 语文交流企业公司 Interactive language education system and method
CN103752023A (en) * 2013-12-28 2014-04-30 赵代红 Toy
CN105654785A (en) * 2016-03-18 2016-06-08 上海语知义信息技术有限公司 Personalized spoken foreign language learning system and method
CN106558252A (en) * 2015-09-28 2017-04-05 百度在线网络技术(北京)有限公司 By computer implemented spoken language exercise method and device
CN106952515A (en) * 2017-05-16 2017-07-14 宋宇 The interactive learning methods and system of view-based access control model equipment
CN107133303A (en) * 2017-04-28 2017-09-05 百度在线网络技术(北京)有限公司 Method and apparatus for output information
CN107578778A (en) * 2017-08-16 2018-01-12 南京高讯信息科技有限公司 A kind of method of spoken scoring
CN107818795A (en) * 2017-11-15 2018-03-20 苏州驰声信息科技有限公司 The assessment method and device of a kind of Oral English Practice
CN108154735A (en) * 2016-12-06 2018-06-12 爱天教育科技(北京)有限公司 Oral English Practice assessment method and device
CN108335543A (en) * 2018-03-20 2018-07-27 河南职业技术学院 A kind of English dialogue training learning system
CN108389440A (en) * 2018-03-15 2018-08-10 广东小天才科技有限公司 A kind of speech playing method, device and voice playing equipment based on microphone

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090070100A1 (en) * 2007-09-11 2009-03-12 International Business Machines Corporation Methods, systems, and computer program products for spoken language grammar evaluation
CN101551947A (en) * 2008-06-11 2009-10-07 俞凯 Computer system for assisting spoken language learning
CN103000052A (en) * 2011-09-16 2013-03-27 上海先先信息科技有限公司 Man-machine interactive spoken dialogue system and realizing method thereof
CN105304082B (en) * 2015-09-08 2018-12-28 北京云知声信息技术有限公司 A kind of speech output method and device
US9972308B1 (en) * 2016-11-08 2018-05-15 International Business Machines Corporation Splitting utterances for quick responses
CN206595039U (en) * 2017-01-25 2017-10-27 安徽江淮汽车集团股份有限公司 A kind of interactive system for vehicle-mounted voice
KR101869649B1 (en) * 2017-04-05 2018-06-21 이시원 User device, controlling method for learning managing system, computer readable non-volatile storage medium stored learning managing program
CN107393541B (en) * 2017-08-29 2021-05-07 百度在线网络技术(北京)有限公司 Information verification method and device
CN107679864A (en) * 2017-10-10 2018-02-09 珠海市魅族科技有限公司 Pay control method, device, computer installation and computer-readable recording medium
CN108122561A (en) * 2017-12-19 2018-06-05 广东小天才科技有限公司 A kind of spoken voice assessment method and electronic equipment based on electronic equipment

Also Published As

Publication number Publication date
CN109035896A (en) 2018-12-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant