CN111951827B

CN111951827B - Continuous reading identification correction method, device, equipment and readable storage medium

Info

Publication number: CN111951827B
Application number: CN201910406387.3A
Authority: CN
Inventors: 刘晨晨; 沈欣尧; 余津锐; 杨晓飞; 蒋成林; 顾怡炜; 张飞华; 徐旭栋; 李艳
Original assignee: Shanghai Liulishuo Information Technology Co ltd
Current assignee: Shanghai Liulishuo Information Technology Co ltd
Priority date: 2019-05-16
Filing date: 2019-05-16
Publication date: 2022-12-06
Anticipated expiration: 2039-05-16
Also published as: CN111951827A

Abstract

The invention discloses a continuous reading identification correction method, comprising: acquiring audio data recorded for a predetermined sentence; analyzing the audio data, and judging whether the actual pronunciation of each word pair in the predetermined sentence that can be read continuously is a correct continuous reading; and generating feedback information indicating whether the actual pronunciation is a correct continuous reading. The method automatically analyzes the recorded audio data and detects whether continuous reading is performed correctly, and the resulting feedback information helps the user understand the concept of continuous reading and thereby master continuous reading skills in spoken English. In addition, the application does not require a teacher to demonstrate and correct in person, so learning is not limited by time or place, the learning cost is reduced, and effective practice time is ensured. The application further provides a continuous reading identification correction apparatus, a device, and a computer-readable storage medium having the same technical effects.

Description

Continuous reading identification correction method, device, equipment and readable storage medium

Technical Field

The present invention relates to the field of speech technology, and in particular, to a method, an apparatus, a device, and a computer-readable storage medium for continuous reading recognition correction.

Background

With the development of science and technology, internet-based language learning applications have developed rapidly. In some language learning applications, the application provider sends learning materials to a client over the internet, and the user obtains the materials through the client and studies them. In language learning, besides grammar and vocabulary, pronunciation is one of the most important skills. In general, users can improve their pronunciation by reading aloud, reading after a model, and the like. In most cases, however, the user has no way of knowing whether his or her pronunciation is accurate.

Continuous reading (also called linking or read-through) often occurs in fluent spoken English. Within a sense group (i.e., a phrase or clause), if the former of two adjacent words ends with a consonant phoneme and the latter begins with a vowel phoneme, the consonant and the vowel are naturally joined and read as one syllable. The joined syllable is not read twice; it should simply follow on naturally, without inserting extra sounds or stressing it too heavily. For example, the phrase "not at all", when read continuously, sounds like a single word. Because Chinese has almost no continuous reading phenomenon, some English learners find the rules of continuous reading difficult to understand and master.

The traditional approach is in-person teaching and correction by a teacher: learners imitate repeatedly under the teacher's feedback and guidance until they master the skill as far as possible. However, spoken language requires continual practice; manual teaching and error correction are costly, and the learner's effective practice is limited by time and place.

Disclosure of Invention

The invention aims to provide a continuous reading identification correction method, a continuous reading identification correction device, continuous reading identification correction equipment and a computer readable storage medium, which solve the problems that the traditional method is high in learning cost and limited in time and space for effective exercise.

In order to solve the above technical problem, the present invention provides a continuous reading identification correction method, including:

acquiring audio data recorded for a predetermined sentence;

analyzing the audio data, and judging whether the actual pronunciation of each word pair in the predetermined sentence that can be read continuously is a correct continuous reading;

and generating feedback information indicating whether the actual pronunciation is a correct continuous reading.

Optionally, analyzing the audio data and judging whether the actual pronunciation of each word pair in the predetermined sentence that can be read continuously is a correct continuous reading comprises:

screening out the word pairs that can be read continuously according to the word pronunciations of the predetermined sentence and a preset continuous reading rule;

inserting a predetermined symbol into each screened word pair to construct a new word, constructing the corresponding phoneme sequence according to how the pair is pronounced when read continuously, and adding the pronunciation of the newly constructed word to a pronunciation dictionary;

analyzing the audio data, and extracting, according to the time boundaries of the words, the acoustic model output segment corresponding to a word pair that can be read continuously;

and inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual pronunciation of the word pair is a continuous reading.

Optionally, after obtaining the decoding result for judging whether the actual pronunciation of the word pair that can be read continuously in the predetermined sentence is a continuous reading, the method further comprises:

if the actual pronunciation of the word pair is judged to be a continuous reading, further judging whether the continuously read word meets a preset requirement;

and generating the feedback information indicating whether the actual pronunciation is a correct continuous reading comprises:

if the continuously read word meets the preset requirement, generating feedback information indicating that the actual pronunciation is a correct continuous reading; and if it does not, generating feedback information indicating that the actual pronunciation is not a correct continuous reading.

Optionally, the preset requirement comprises: the pronunciation score of the continuously read word is greater than a preset first threshold, and/or the phoneme duration of the continuously read word is less than a preset second threshold.

Optionally, after screening out the word pairs that can be read continuously, the method further comprises:

marking, with a first visual element on the display interface, the phonetic symbols and letters of the part to be read continuously in each such word pair, so as to indicate the range of the continuous reading and the pronunciation characteristics of the continuously read phonemes.

Optionally, after generating the feedback information indicating whether the actual pronunciation is a correct continuous reading, the method further comprises:

marking, with a second visual element on the display interface, whether the actual pronunciation is a correct continuous reading.

Optionally, after generating the feedback information indicating whether the actual pronunciation is a correct continuous reading, the method further comprises:

prompting the correct way to pronounce the continuous reading by text and/or voice.

The application also provides a continuous reading identification correcting device, including:

the acquisition module is used for acquiring audio data recorded aiming at a preset statement;

the judging module is used for analyzing the audio data and judging whether the actual pronunciation of the word pairs which can be continuously read in the preset sentence is correct or not;

and the generating module is used for generating feedback information for judging whether the actual pronunciation is correct and read continuously.

The application also provides a continuous reading identification correction device, which is applied to a server, and the device comprises:

a memory for storing a computer program;

a processor for implementing the following steps when executing the computer program: acquiring audio data recorded for a predetermined sentence; analyzing the audio data and judging whether the actual pronunciation of each word pair in the predetermined sentence that can be read continuously is a correct continuous reading; and generating feedback information indicating whether the actual pronunciation is a correct continuous reading.

The application also provides a continuous reading identification correction device, which is applied to a client, and the device comprises:

the audio acquisition device is used for inputting audio data aiming at a preset sentence;

the communication device is used for sending the audio data to a server, so that the server analyzes the audio data, judges whether the actual pronunciation of each word pair in the predetermined sentence that can be read continuously is a correct continuous reading, and generates feedback information indicating whether the actual pronunciation is a correct continuous reading; and for receiving the feedback information sent by the server;

and the display device is used for displaying the feedback information on a display interface.

The present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the continuous reading identification correction methods.

The continuous reading identification correction method provided by the invention comprises: acquiring audio data recorded for a predetermined sentence; analyzing the audio data, and judging whether the actual pronunciation of each word pair in the predetermined sentence that can be read continuously is a correct continuous reading; and generating feedback information indicating whether the actual pronunciation is a correct continuous reading. The method automatically analyzes the recorded audio data and detects whether continuous reading is performed correctly, and the resulting feedback information helps the user understand the concept of continuous reading and thereby master continuous reading skills in spoken English. In addition, the application does not require a teacher to demonstrate and correct in person, so learning is not limited by time or place, the learning cost is reduced, and effective practice time is ensured. The application further provides a continuous reading identification correction apparatus, a device, and a computer-readable storage medium having the same technical effects.

Drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow diagram of one embodiment of read-through identification correction provided herein;

FIG. 2 is a schematic diagram illustrating a process of determining whether actual pronunciations of word pairs that can be continuous in a predetermined sentence are correct for continuous reading in an embodiment of the present application;

FIG. 3 is a flow diagram of another embodiment of read-through identification correction provided herein;

FIG. 4 is a schematic view of a read-through message display;

FIG. 5 is a flow diagram of yet another embodiment of read-through identification correction as provided herein;

FIG. 6 is a schematic view of a visual presentation for feedback of user continuous reading correctness;

FIG. 7 is a block diagram of a read-through identification correction apparatus according to an embodiment of the present invention;

FIG. 8 is a block diagram of a read-through identification correction device applied to a server according to an embodiment of the present invention;

FIG. 9 is a block diagram of a read-through identification correction device applied to a client according to an embodiment of the present invention;

FIG. 10 is a block diagram of a read-through identification correction system according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the disclosure, reference will now be made in detail to the embodiments of the disclosure as illustrated in the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be implemented in other sequences than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.

The embodiments of the invention can be used in pronunciation learning scenarios, in particular pronunciation learning or pronunciation correction scenarios in language learning, where the languages include but are not limited to foreign languages such as English, French, German, and Japanese, and Chinese varieties such as Mandarin, Cantonese, and Sichuanese. The language learning scenario may be, for example, a pronunciation evaluation scenario or a pronunciation correction scenario in language learning software or on a language learning terminal, or may be another language learning scenario; the embodiments of the present invention are not limited in this respect.

As will be described in detail below, the user can learn pronunciation through the client. The client can display the content to be learned on the display interface and can play audio content to the user through an audio playing device such as a speaker. When the user practices pronunciation, the client can capture audio data of the user's speech through the audio acquisition device for the subsequent continuous reading identification correction operation. It can be understood that the continuous reading identification correction operation may be performed by either the client or the server; either choice is compatible with the present application.

The client in the embodiment of the present invention may include, but is not limited to: smart phones, tablet computers, MP4, MP3, PCs, PDAs, wearable devices, head-mounted display devices, and the like; the server may include, but is not limited to: a single web server, a server group of multiple web servers, or a cloud based on cloud computing consisting of a large number of computers or web servers.

With reference to the application scenario, a flowchart of a specific implementation of read-through recognition correction provided by the present application is shown in fig. 1, where the method specifically includes:

step S101: acquiring audio data input aiming at a preset statement;

The predetermined sentence comprises one or more sentences, each containing two or more words. In this embodiment, the predetermined sentence may be a sentence that calls for continuous reading skills. The user reads the predetermined sentence aloud; the speech is captured by the audio acquisition device of the client, and the corresponding audio data is obtained.

Step S102: analyzing the audio data and judging whether the actual pronunciation of the word pairs which can be continuous in the preset sentence is correct and continuous;

The audio data is analyzed to obtain the actual pronunciation of each word pair in the predetermined sentence that can be read continuously, and it is then judged whether that pronunciation is a correct continuous reading. This process may be executed by the client or by a background server; either choice is compatible with the present application.

Step S103: and generating feedback information for judging whether the actual pronunciation is correct and continuous reading.

Specifically, the feedback information may be presented to the user visually, optionally accompanied by a corresponding sound effect; this is not limited herein.

The continuous reading identification correction method provided by the invention comprises: acquiring audio data recorded for a predetermined sentence; analyzing the audio data, and judging whether the actual pronunciation of each word pair in the predetermined sentence that can be read continuously is a correct continuous reading; and generating feedback information indicating whether the actual pronunciation is a correct continuous reading. The method automatically analyzes the recorded audio data and detects whether continuous reading is performed correctly, and the resulting feedback information helps the user understand the concept of continuous reading and thereby master continuous reading skills in spoken English. In addition, the application does not require a teacher to demonstrate and correct in person, so learning is not limited by time or place, the learning cost is reduced, and effective practice time is ensured.

As a specific implementation manner, referring to fig. 2, the step S102 of determining whether the actual pronunciation of the word pair that can be continuous in the predetermined sentence is correct for continuous reading may specifically include:

step S1021: screening out word pairs which can be read continuously according to the word pronunciation of the preset sentence and a preset continuous reading rule;

Specifically, the adjacent word pairs in the predetermined sentence may be traversed in order, checking whether each pair of adjacent words satisfies the continuous reading rule. The continuous reading rule is preset condition information specifying when continuous reading is required. For example, when a word ending in the consonant t, d, s, or z is followed by the word "you", the continuous reading condition is met, and the word pair is a word pair that can be read continuously.
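The traversal described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the toy pronunciation dictionary, the ARPAbet-style phoneme symbols, and the names `TOY_LEXICON`, `RULES`, and `find_linkable_pairs` are all assumptions introduced here, and only the two rules mentioned in the text are encoded.

```python
# Hypothetical toy pronunciation dictionary (ARPAbet-style symbols).
TOY_LEXICON = {
    "did": ["D", "IH", "D"],
    "you": ["Y", "UW"],
    "turn": ["T", "ER", "N"],
    "over": ["OW", "V", "ER"],
    "not": ["N", "AA", "T"],
    "at": ["AE", "T"],
    "all": ["AO", "L"],
}

VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
          "IH", "IY", "OW", "OY", "UH", "UW"}


def consonant_then_vowel(w1, p1, w2, p2):
    """Former word ends with a consonant phoneme, latter begins with a vowel."""
    return p1[-1] not in VOWELS and p2[0] in VOWELS


def tdsz_then_you(w1, p1, w2, p2):
    """Former word ends in T, D, S or Z and the latter word is "you"."""
    return p1[-1] in {"T", "D", "S", "Z"} and w2 == "you"


# Ordered list of (rule name, predicate); the first matching rule wins.
RULES = [("consonant+vowel", consonant_then_vowel),
         ("t/d/s/z+you", tdsz_then_you)]


def find_linkable_pairs(sentence, lexicon=TOY_LEXICON):
    """Return (index, first word, second word, rule name) for each adjacent
    word pair that satisfies a rule. Raises KeyError for unknown words."""
    words = sentence.lower().split()
    pairs = []
    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        p1, p2 = lexicon[w1], lexicon[w2]
        for name, rule in RULES:
            if rule(w1, p1, w2, p2):
                pairs.append((i, w1, w2, name))
                break
    return pairs
```

Keeping the rules in a list makes it straightforward to record which continuous reading type matched, which matters later because the thresholds are said to differ by type.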

Step S1022: inserting preset symbols into the word pairs obtained by screening to construct new words, constructing corresponding phoneme sequences according to pronunciation modes during continuous reading, and adding pronunciations of the newly constructed words into a pronunciation dictionary;

A predetermined symbol is inserted between the two words of each screened word pair to construct a new word, and the corresponding phoneme sequence is constructed according to how the pair is pronounced when read continuously. The predetermined symbol may be @; for example, when the word pair to be read continuously is 'did' and 'you', the new word 'did @ @ you' is constructed, with its pronunciation being that of the pair when read continuously.

The pronunciation of the newly constructed word is then added to the pronunciation dictionary of the speech recognition system. The pronunciation dictionary is an essential component of a conventional speech recognition system, and the recognition system can only recognize words that are present in the pronunciation dictionary. The pronunciation dictionary may be constructed in advance, and in a continuous reading recognition scenario the pronunciations of newly constructed words may be added to it as required.
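A minimal sketch of this dictionary extension is shown below. The function name `add_linked_entry`, the exact spelling of the constructed word (a single joining symbol with no spaces), and the linked phoneme sequence used in the usage note are illustrative assumptions; the source does not legibly give the transcription it uses.

```python
LINK_SYMBOL = "@"  # the predetermined symbol suggested in the text


def add_linked_entry(lexicon, first, second, linked_phonemes, symbol=LINK_SYMBOL):
    """Construct a new "word" for a linkable pair and register its pronunciation.

    lexicon maps word -> list of phonemes and is modified in place.
    linked_phonemes is the phoneme sequence of the pair when read continuously;
    it must be supplied by the caller (for example from a rule table).
    """
    linked_word = f"{first}{symbol}{second}"
    lexicon[linked_word] = list(linked_phonemes)
    return linked_word
```

For instance, `add_linked_entry(lexicon, "did", "you", ["D", "IH", "JH", "UW"])` would register a hypothetical linked form in which /d/ and /j/ merge; that phoneme sequence is an assumption made for illustration.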

Step S1023: analyzing the audio data, and intercepting an acoustic model output segment corresponding to a word pair which can be read continuously according to the time boundary of the word;

The position of each phoneme is determined by the forced alignment of speech recognition, and the time boundaries of each syllable and word are then derived from the syllables that make up each word. According to the word time boundaries thus obtained, the acoustic model output segment corresponding to the word pair currently being checked is extracted.
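Assuming the forced alignment has already produced word boundaries in frame units, extracting the segment reduces to a slice. The `AlignedWord` record and `crop_pair_segment` function below are hypothetical names for this sketch; a real system would take the boundaries from its aligner and the frames from its acoustic model.

```python
from dataclasses import dataclass


@dataclass
class AlignedWord:
    word: str
    start_frame: int  # inclusive
    end_frame: int    # exclusive


def crop_pair_segment(frames, alignment, pair_index):
    """Return the acoustic model output frames spanning the word pair that
    starts at alignment[pair_index] and ends at alignment[pair_index + 1]."""
    first = alignment[pair_index]
    second = alignment[pair_index + 1]
    return frames[first.start_frame:second.end_frame]
```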

Step S1024: and inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual reading of the word pair capable of being read continuously in the preset sentence is correct or not.

The decoding network is a pre-constructed grammar network with only two branches: one consists of the two words read separately, such as 'did' and 'you', and the other consists of the corresponding newly added "word" that represents continuous reading, such as 'did @ @ you'. This grammar is then combined with the state transitions, context, and pronunciation rules to build the decoding network. The decoding network has exactly two possible outputs, representing continuous reading and non-continuous reading.

The acoustic model output segment is input into the constructed decoding network for decoding. If the decoding result is the two separate words, such as 'did you', the audio data was not read continuously at the corresponding word pair; if the decoding result is the word newly added to the dictionary to represent continuous reading, such as 'did @ @ you', the audio data was read continuously at the corresponding word pair.
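The two-branch decision can be illustrated with a greatly simplified stand-in for the decoding network. The sketch below assumes each frame of the acoustic model output is a dictionary from phoneme to log-probability, scores each branch with a plain left-to-right Viterbi pass in which every phoneme covers at least one frame, and picks the better branch. It ignores the state transitions and context modelling that the text says a full decoding network includes; the names `best_path_score` and `decode_pair` are hypothetical.

```python
import math


def best_path_score(frames, phonemes):
    """Best log-score of aligning `phonemes` left to right over `frames`,
    with each phoneme covering at least one frame."""
    if not phonemes or len(frames) < len(phonemes):
        return -math.inf
    neg = -math.inf
    # scores[j]: best score of a path that is in phoneme j at the current frame
    scores = [neg] * len(phonemes)
    scores[0] = frames[0].get(phonemes[0], neg)
    for frame in frames[1:]:
        new_scores = [neg] * len(phonemes)
        for j, ph in enumerate(phonemes):
            stay = scores[j]
            advance = scores[j - 1] if j > 0 else neg
            best_prev = max(stay, advance)
            if best_prev > neg:
                new_scores[j] = best_prev + frame.get(ph, neg)
        scores = new_scores
    return scores[-1]  # the path must end in the last phoneme


def decode_pair(frames, unlinked_phonemes, linked_phonemes):
    """Compare the two branches and return (label, scores)."""
    scores = {
        "not_linked": best_path_score(frames, unlinked_phonemes),
        "linked": best_path_score(frames, linked_phonemes),
    }
    label = max(scores, key=scores.get)
    return label, scores
```

Restricting the search to two branches is what keeps the output binary: whichever phoneme sequence explains the segment better determines whether the pair is reported as read continuously.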

Further, in another specific embodiment of the continuous reading identification correction method provided by the present application, after the decoding network determines that the audio data is continuously read on the corresponding word pair, further determination may be performed, so as to further improve the accuracy of the determination. A flowchart corresponding to this embodiment is shown in fig. 3, and the method specifically includes:

step S201: acquiring audio data input aiming at a preset statement;

step S202: screening out word pairs which can be read continuously according to the word pronunciation of the preset sentence and a preset continuous reading rule;

Optionally, after screening out the word pairs that can be read continuously, the method further includes: marking, with a first visual element on the display interface, the phonetic symbols and letters of the part to be read continuously in each such word pair, so as to indicate the range of the continuous reading and the pronunciation characteristics of the continuously read phonemes. The first visual element may be an arc. The sentence that calls for continuous reading skills is displayed on the display interface, with the phonetic symbols of the part to be read continuously and the corresponding letters joined by arcs, visually indicating the range of the continuous reading. In addition, the pronunciation of the continuously read part may be annotated directly beneath its phonetic symbols, helping the user grasp intuitively how the phonemes are pronounced during continuous reading.

As shown in the schematic diagram of FIG. 4, for the predetermined sentence "turn over", the two words are first detected to be a word pair that can be read continuously. An arc below the letters of the continuously read part indicates that continuous reading is required. The phonetic symbol line directly shows the pronunciation of the continuously read part, and an arc likewise marks the continuous reading in the phonetic symbols.

Step S203: inserting preset symbols into the word pairs obtained by screening to construct new words, constructing a corresponding phoneme sequence according to a pronunciation mode during continuous reading, and adding pronunciations of the newly constructed words into a pronunciation dictionary;

step S204: analyzing the audio data, and intercepting an acoustic model output segment corresponding to a word pair which can be read continuously according to the time boundary of the word;

step S205: inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual reading of the word pair capable of being read continuously in the preset sentence is correct or not;

step S206: if the actual pronunciation of the word pair which can be read continuously in the preset sentence is judged to be read continuously, further judging whether the read-continuously words meet the preset requirement;

The preset requirement may include: the pronunciation score of the continuously read word is greater than a preset first threshold, and/or the phoneme duration of the continuously read word is less than a preset second threshold.

The pronunciation score of the continuously read word is calculated by a preset pronunciation score calculation method. Speech data whose score is lower than the preset first threshold is judged as not read continuously. The preset first threshold is obtained by statistical analysis of a large amount of labeled data, and different continuous reading types have different preset first thresholds.

One possible pronunciation score calculation method is: calculating a posterior probability evaluation index of the actual pronunciation data corresponding to the continuously read word; calculating a duration evaluation index of the actual pronunciation data corresponding to the continuously read word; and inputting the posterior probability evaluation index and the duration evaluation index into a pre-established evaluation model to obtain the pronunciation score of the actual pronunciation data.

Calculating the posterior probability evaluation index of the actual pronunciation data corresponding to the continuously read word may specifically include: calculating a likelihood score for each phoneme; and dividing the likelihood score of the actual pronunciation data corresponding to the continuously read word by the total likelihood score over all phonemes to obtain the posterior probability evaluation index of that pronunciation data.
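The division described here can be sketched in the log domain, where it becomes a subtraction of a log-sum-exp. The sketch assumes per-frame log-likelihoods for every phoneme are available for the frames aligned to the target phoneme, and it averages over those frames; the averaging and the name `posterior_index` are assumptions, since the text only specifies the ratio.

```python
import math


def posterior_index(frame_log_likelihoods, target_phoneme):
    """Average log posterior of `target_phoneme` over its aligned frames.

    frame_log_likelihoods: list of dicts mapping phoneme -> log-likelihood.
    Per frame, the posterior is the target likelihood divided by the sum of
    the likelihoods of all phonemes (equal priors assumed).
    """
    if not frame_log_likelihoods:
        raise ValueError("at least one frame is required")
    total = 0.0
    for frame in frame_log_likelihoods:
        peak = max(frame.values())
        # Numerically stable log of the summed likelihoods.
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in frame.values()))
        total += frame[target_phoneme] - log_sum
    return total / len(frame_log_likelihoods)
```

The result is at most 0; values closer to 0 mean the target phoneme dominates the competing phonemes on its frames.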

Calculating the duration evaluation index of the actual pronunciation data corresponding to the continuously read word may specifically include: collecting, in advance, statistics of the duration of each phoneme in standard pronunciation data, and using a Gaussian model to establish the correspondence between phoneme duration and the duration evaluation index; determining the phoneme duration of the actual pronunciation data corresponding to the continuously read word; and determining, from the Gaussian model, the duration evaluation index corresponding to that phoneme duration.
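One way to realise such a Gaussian mapping is sketched below: fit a mean and standard deviation to reference durations of a phoneme, then map an observed duration to an index that equals 1 at the mean and decays with distance from it. Using the unnormalised Gaussian kernel as the index, and the names `fit_duration_model` and `duration_index`, are assumptions for illustration.

```python
import math
from statistics import mean, stdev


def fit_duration_model(reference_durations):
    """Estimate (mean, standard deviation) from reference phoneme durations
    in seconds. Needs at least two values that are not all identical."""
    return mean(reference_durations), stdev(reference_durations)


def duration_index(duration, mu, sigma):
    """Map a duration to (0, 1]: 1 at the mean, smaller further away."""
    z = (duration - mu) / sigma
    return math.exp(-0.5 * z * z)
```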

The pre-established evaluation model may be a linear regression model.
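With a linear regression model, the score is a weighted sum of the two indices plus a bias. The sketch below shows only the prediction step; the weights are placeholder values chosen for illustration, whereas in practice they would be fitted on labeled data.

```python
def pronunciation_score(posterior_idx, duration_idx, weights=(80.0, 10.0, 20.0)):
    """Linear evaluation model: bias + w_post * posterior + w_dur * duration.

    `weights` is (bias, w_post, w_dur); the defaults are hypothetical.
    """
    bias, w_post, w_dur = weights
    return bias + w_post * posterior_idx + w_dur * duration_idx
```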

The phoneme duration of the continuously read word may specifically be the duration of the last phoneme of the former word together with the first syllable of the latter word in the word pair, calculated from the phoneme boundaries obtained above. Speech data whose phoneme duration is lower than the preset second threshold is judged as read continuously; otherwise it is judged as not read continuously. The preset second threshold is likewise obtained by statistical analysis of a large amount of labeled data, and different continuous reading types have different preset second thresholds.

Step S207: if the continuously read word meets the preset requirement, generating feedback information indicating that the actual pronunciation is a correct continuous reading; and if it does not, generating feedback information indicating that the actual pronunciation is not a correct continuous reading.
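Steps S206 and S207 can be sketched as a small decision function. The per-type threshold values, the type names, and the function and class names below are hypothetical; the text states only that the thresholds come from statistical analysis of labeled data and differ by continuous reading type. The `check_score` and `check_duration` flags model the "and/or" in the preset requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkingThresholds:
    min_score: float     # preset first threshold (pronunciation score)
    max_duration: float  # preset second threshold (seconds)


# Hypothetical per-type thresholds, for illustration only.
THRESHOLDS = {
    "consonant+vowel": LinkingThresholds(min_score=60.0, max_duration=0.25),
    "t/d/s/z+you": LinkingThresholds(min_score=55.0, max_duration=0.30),
}


def is_correct_linking(decoded_as_linked, score, duration, link_type,
                       check_score=True, check_duration=True):
    """Return True only if the decoder reported continuous reading and every
    enabled check of the preset requirement passes."""
    if not decoded_as_linked:
        return False
    limits = THRESHOLDS[link_type]
    if check_score and not score > limits.min_score:
        return False
    if check_duration and not duration < limits.max_duration:
        return False
    return True


def feedback_message(correct):
    """Minimal textual feedback for step S207."""
    return "correct continuous reading" if correct else "not a correct continuous reading"
```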

As shown in FIG. 5, on the basis of any of the above embodiments, after generating the feedback information indicating whether the actual pronunciation is a correct continuous reading, the continuous reading identification correction method provided by the present application may further include step S104: displaying the feedback information to the user through visual elements to assist the user throughout the learning process. Specifically, whether the actual pronunciation is a correct continuous reading is marked by a second visual element on the display interface. For example, a connecting line between the words of a word pair that can be read continuously represents the continuous reading relation on the display interface; the line takes a preset first color when the pair is read continuously and correctly, and a preset second color, different from the first, when it is not. An unsuccessful continuous reading may also be shown visually by breaking the connecting line.

FIG. 6 is a schematic diagram of a visual presentation for feeding back whether the user's continuous reading is correct. In this embodiment the predetermined sentence is "turn over", and a large circle at the upper left of the interface indicates whether the user performed continuous reading: the circle turns green when the continuous reading is correct and red when it is not.

In addition, the correct way of pronouncing the continuous reading can be prompted by text and/or voice. For example, the display interface may show the phonetic symbols after continuous reading, the pronunciation technique used during continuous reading, or the sound changes that may occur.

In this embodiment, feedback information is presented to the user through the display interface, and may include but is not limited to: whether the continuous reading is correct, the range of the part to be read continuously, and the pronunciation characteristics of the continuously read phonemes. This embodiment uses visual elements to help the learner practice distinguishing continuous reading from non-continuous reading, strengthens understanding of the concept during practice, and allows the user's problems during practice to be located quickly. The shape of the visual element helps the user grasp intuitively how the words are read continuously.
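As an illustration only, the mapping from a judgment result to the second visual element could be sketched as follows. The names `LinkMark` and `mark_pair` and the two color values are hypothetical; the text only requires that the two preset colors differ and that an unsuccessful continuous reading may also be shown by a broken connecting line.

```python
from dataclasses import dataclass

# Hypothetical color choices; the description only requires two different preset colors.
FIRST_COLOR = "green"   # shown when the pair is read continuously correctly
SECOND_COLOR = "red"    # shown when it is not


@dataclass
class LinkMark:
    """Description of the connecting line drawn between the two words of a pair."""
    left_word: str
    right_word: str
    color: str
    connected: bool  # False means the connecting line is drawn broken


def mark_pair(left_word: str, right_word: str, read_correctly: bool) -> LinkMark:
    """Build the second visual element for one word pair that can be read continuously."""
    color = FIRST_COLOR if read_correctly else SECOND_COLOR
    return LinkMark(left_word, right_word, color, connected=read_correctly)


if __name__ == "__main__":
    print(mark_pair("turn", "over", True))
    print(mark_pair("turn", "over", False))
```

A rendering layer would then draw each `LinkMark` on the display interface; how it is drawn is outside the scope of this sketch.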

The continuous reading identification correction device provided by the embodiment of the present invention is described below. The device described below and the continuous reading identification correction method described above may be cross-referenced.

Fig. 7 is a block diagram of a read-through identification correction apparatus according to an embodiment of the present invention, where, referring to fig. 7, the read-through identification correction apparatus may include:

an obtaining module 100, configured to obtain audio data entered for a predetermined sentence;

a determining module 200, configured to analyze the audio data, and determine whether the actual pronunciation of a word pair that can be read continuously in the predetermined sentence is correctly read continuously;

a generating module 300, configured to generate feedback information indicating whether the actual pronunciation is correctly read continuously.

As a specific implementation manner, in the embodiment of the present application, the determining module 200 specifically includes:

the screening unit is used for screening out word pairs which can be read continuously according to the word pronunciation of the preset sentence and a preset continuous reading rule;

the adding unit is used for inserting preset symbols into the word pairs obtained by screening to construct new words, constructing corresponding phoneme sequences according to the pronunciation mode during continuous reading and adding the newly constructed word pronunciations into the pronunciation dictionary;

the analysis unit is used for analyzing the audio data and intercepting an acoustic model output segment corresponding to a word pair which can be read continuously according to the time boundary of the word;

and the judging unit is used for inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual pronunciation of the word pair which can be read continuously in the preset sentence is correct or not.
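The screening unit and the adding unit can be illustrated with a minimal sketch. It assumes the rule named in the claims ("consonant t, d, s and z ending + word you"), a toy ARPAbet-style lexicon, an underscore as the preset symbol, and the usual English sound change in which the final consonant merges with the initial /y/ of "you". The lexicon entries, `JOIN_SYMBOL`, `COALESCE` and both function names are assumptions made for illustration and are not specified in the text.

```python
# Hypothetical toy lexicon in ARPAbet-style phonemes (not taken from the source).
LEXICON = {
    "did": ["D", "IH", "D"],
    "you": ["Y", "UW"],
    "miss": ["M", "IH", "S"],
    "see": ["S", "IY"],
}

# Assumed sound change when a final t/d/s/z meets the initial /y/ of "you".
COALESCE = {"T": "CH", "D": "JH", "S": "SH", "Z": "ZH"}
JOIN_SYMBOL = "_"  # stands in for the "preset symbol"; the real symbol is not specified


def screen_pairs(words, lexicon):
    """Return the indices i for which (words[i], words[i + 1]) satisfies the rule."""
    hits = []
    for i in range(len(words) - 1):
        left, right = words[i].lower(), words[i + 1].lower()
        if right == "you" and left in lexicon and lexicon[left][-1] in COALESCE:
            hits.append(i)
    return hits


def add_linked_entries(words, lexicon):
    """Add one new word per screened pair to the lexicon and return the new keys."""
    new_words = []
    for i in screen_pairs(words, lexicon):
        left, right = words[i].lower(), words[i + 1].lower()
        merged = COALESCE[lexicon[left][-1]]
        # Drop the final consonant and the initial /y/, and insert the merged sound.
        phonemes = lexicon[left][:-1] + [merged] + lexicon[right][1:]
        key = left + JOIN_SYMBOL + right
        lexicon[key] = phonemes
        new_words.append(key)
    return new_words


if __name__ == "__main__":
    lexicon = dict(LEXICON)
    print(add_linked_entries("Did you see".split(), lexicon))
    print(lexicon["did_you"])
```

Keeping the unlinked words in the lexicon next to the new joined entry lets a decoder choose between the two readings of the same audio segment.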

As a specific implementation manner, in the embodiment of the present application, the determining module 200 is further configured to:

after obtaining the decoding result used to judge whether the actual pronunciation of the word pair that can be read continuously in the predetermined sentence is correctly read continuously, if the actual pronunciation of the word pair is judged to have been read continuously, further judge whether the continuously read words meet the preset requirements; if the continuously read words meet the preset requirements, generate feedback information indicating that the actual pronunciation is correctly read continuously; and if the continuously read words do not meet the preset requirements, generate feedback information indicating that the actual pronunciation is not correctly read continuously.

As a specific implementation manner, the preset requirements in the embodiment of the present application include: the pronunciation score of the read-through word is larger than a preset first threshold value, and/or the phoneme duration of the read-through word is smaller than a preset second threshold value.
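The two-stage judgment described above can be sketched as follows. The threshold values, the units (a score between 0 and 1, a duration in seconds) and the `require_both` switch that models the "and/or" wording are assumptions made for illustration; the text does not give concrete values.

```python
def meets_preset_requirements(score, duration_s, first_threshold, second_threshold,
                              require_both=True):
    """Check the pronunciation score and phoneme duration against the two thresholds."""
    score_ok = score > first_threshold
    duration_ok = duration_s < second_threshold
    if require_both:
        return score_ok and duration_ok   # the "and" reading of "and/or"
    return score_ok or duration_ok        # the "or" reading of "and/or"


def judge_continuous_reading(decoded_as_continuous, score, duration_s,
                             first_threshold=0.6, second_threshold=0.25,
                             require_both=True):
    """Return True only if the pair was decoded as read continuously and passes the check."""
    if not decoded_as_continuous:
        return False
    return meets_preset_requirements(score, duration_s, first_threshold,
                                     second_threshold, require_both)


if __name__ == "__main__":
    print(judge_continuous_reading(True, score=0.8, duration_s=0.18))
    print(judge_continuous_reading(True, score=0.8, duration_s=0.40))
```

The second check filters out cases where the decoder prefers the continuously read entry but the pronunciation is poor or the phonemes are drawn out too long to count as fluent continuous reading.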

As a specific implementation manner, the continuous reading identification correction apparatus provided by the present application may further include:

a first display module, configured to, after the word pairs that can be read continuously are screened out, mark on the display interface, with a first visual element, the phonetic symbols and letters of the part to be read continuously in each such word pair, so as to indicate the continuous reading range and the pronunciation characteristics of the continuously read phonemes.

On the basis of any of the above embodiments, the continuous reading identification correction apparatus provided by the present application may further include:

a second display module, configured to, after the feedback information indicating whether the actual pronunciation is correctly read continuously is generated, mark whether the actual pronunciation is correctly read continuously through a second visual element of the display interface;

and a prompting module, configured to, after the feedback information indicating whether the actual pronunciation is correctly read continuously is generated, prompt the correct way of pronouncing the continuous reading by text and/or voice.

The continuous reading identification correction apparatus of this embodiment is configured to implement the continuous reading identification correction method, and thus specific implementation of the continuous reading identification correction apparatus may be found in the foregoing embodiment parts of the continuous reading identification correction method, for example, the obtaining module 100, the determining module 200, and the generating module 300, which are respectively configured to implement steps S101, S102, and S103 in the continuous reading identification correction method.

The device provided by the present application can automatically analyze the recorded audio data and detect whether the continuous reading in it is correct, and the resulting feedback information helps the user understand the concept of continuous reading, so that the user can effectively master continuous reading skills in spoken English. In addition, the present application does not require a teacher to demonstrate and correct face to face, so learning is not limited in time or space, the learning cost is reduced, and effective learning time is ensured.

In addition, the present application further provides a continuous reading identification correction device, which is applied to the server 1, and as shown in fig. 8, the device includes:

a memory 11 for storing a computer program;

a processor 12 for implementing the following steps when executing the computer program: acquiring audio data recorded for a predetermined sentence; analyzing the audio data and judging whether the actual pronunciation of the word pairs that can be read continuously in the predetermined sentence is correctly read continuously; and generating feedback information indicating whether the actual pronunciation is correctly read continuously.

The memory 11 includes at least one type of readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, or an optical disk. In some embodiments the memory 11 may be an internal storage unit of the continuous reading identification correction device, for example a hard disk. In other embodiments the memory 11 may be an external storage device of the continuous reading identification correction device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card). Further, the memory 11 may include both an internal storage unit and an external storage device of the continuous reading identification correction device. The memory 11 can be used not only to store application software installed in the continuous reading identification correction device and various types of data, such as the code of the continuous reading identification correction program 01, but also to temporarily store data that has been output or is to be output.

The processor 12 may in some embodiments be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to execute program code or process data stored in the memory 11, for example to execute the continuous reading identification correction program 01.

Optionally, the processor 12 is configured to implement the following steps when executing the computer program: screening out word pairs which can be read continuously according to the word pronunciation of the preset sentence and a preset continuous reading rule; inserting preset symbols into the word pairs obtained by screening to construct new words, constructing corresponding phoneme sequences according to pronunciation modes during continuous reading, and adding pronunciations of the newly constructed words into a pronunciation dictionary; analyzing the audio data, and intercepting an acoustic model output segment corresponding to a word pair which can be read continuously according to the time boundary of the word; and inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual reading of the word pair capable of being read continuously in the preset sentence is correct or not.
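The interception step and the reading of the decoding result can be sketched as below. The sketch assumes that the acoustic model emits one output vector per frame at a fixed 10 ms frame shift, that word time boundaries (in seconds) come from a forced alignment, and that newly constructed words contain an underscore as the preset symbol. None of these values is fixed by the text, and the decoding network itself is not modeled here.

```python
FRAME_SHIFT_S = 0.01  # assumed frame shift of 10 ms; not specified in the text


def pair_boundary(word_times, i):
    """Time span covering words i and i + 1, given (start, end) boundaries per word."""
    return word_times[i][0], word_times[i + 1][1]


def intercept_segment(frames, start_s, end_s, frame_shift_s=FRAME_SHIFT_S):
    """Slice the per-frame acoustic model outputs that fall inside [start_s, end_s)."""
    first = int(round(start_s / frame_shift_s))
    last = int(round(end_s / frame_shift_s))
    return frames[first:last]


def decoded_as_continuous(decoded_words, join_symbol="_"):
    """A pair counts as read continuously if the decoder chose a newly constructed word."""
    return any(join_symbol in word for word in decoded_words)


if __name__ == "__main__":
    frames = list(range(100))  # stand-in for 100 frames of acoustic model output
    word_times = [(0.10, 0.30), (0.30, 0.52), (0.52, 0.90)]
    start_s, end_s = pair_boundary(word_times, 0)
    segment = intercept_segment(frames, start_s, end_s)
    print(len(segment), segment[0], segment[-1])
    print(decoded_as_continuous(["did_you"]), decoded_as_continuous(["did", "you"]))
```

Restricting decoding to the intercepted segment means the decoder only has to choose between the separate words and the newly constructed word for that one pair, rather than re-decode the whole sentence.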

Optionally, the processor 12 is configured to implement the following steps when executing the computer program:

after obtaining the decoding result used to judge whether the actual pronunciation of the word pair that can be read continuously in the predetermined sentence is correctly read continuously, if the actual pronunciation of the word pair is judged to have been read continuously, further judging whether the continuously read words meet the preset requirements; if the continuously read words meet the preset requirements, generating feedback information indicating that the actual pronunciation is correctly read continuously; and if the continuously read words do not meet the preset requirements, generating feedback information indicating that the actual pronunciation is not correctly read continuously.

It can be understood that, in the embodiment of the present application, the server may include but is not limited to: a single web server, a server group of multiple web servers, or a cloud-computing-based cloud consisting of a large number of computers or web servers.

In addition, the present application also provides a continuous reading identification correction device, which is applied to the client 2, as shown in fig. 9, the device includes:

an audio acquisition device 21, configured to record audio data for a predetermined sentence;

a communication device 22, configured to send the audio data to a server, so that the server analyzes the audio data, determines whether the actual pronunciation of a word pair that can be read continuously in the predetermined sentence is correctly read continuously, and generates feedback information indicating whether the actual pronunciation is correctly read continuously; and to receive the feedback information sent by the server;

and a display device 23, configured to display the feedback information on a display interface.

Optionally, the display device 23 in the continuous reading identification and correction device provided in the embodiment of the present application may be specifically configured to: after the word pairs which can be read continuously are screened out, the phonetic symbols and letters of the parts to be read continuously in the word pairs which can be read continuously are marked by the first visual elements through the display interface so as to prompt the reading range and the pronunciation characteristics of the reading phoneme.

Optionally, the display device 23 in the continuous reading identification correction device provided in the embodiment of the present application may be specifically configured to: after the feedback information indicating whether the actual pronunciation is correctly read continuously is generated, mark whether the actual pronunciation is correctly read continuously through a second visual element of the display interface.

Optionally, the continuous reading identification correction device provided in the embodiment of the present application may further include: a prompting device, configured to, after the feedback information indicating whether the actual pronunciation is correctly read continuously is generated, prompt the correct way of pronouncing the continuous reading by text and/or voice.

It can be understood that the client in the embodiment of the present application may include, but is not limited to: smart phones, tablet computers, MP4 players, MP3 players, PCs, PDAs, wearable devices, head-mounted display devices, and the like.

Further, the present application also provides a continuous reading identification correction system. As shown in fig. 10, the system includes any one of the servers 1 and any one of the clients 2 described above. The user carries out pronunciation learning through the client: the client shows the content to be learned on the display interface and can also play it to the user as speech through an audio playback device such as a speaker. While the user practices pronunciation, the client collects the audio data of the user's pronunciation through the audio acquisition device and sends it to the server, which performs the continuous reading identification correction. After the server has analyzed the audio data and obtained the feedback information, it sends the feedback information to the client. The feedback information is then shown on the display device of the client, providing visual auxiliary information for the user.
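The exchange between client and server can be sketched in-process as follows. The JSON message layout, the hex encoding of the audio, and the function names `server_handle` and `client_round_trip` are hypothetical; the text does not specify a transport or message format, and the analysis itself is passed in as a stand-in function.

```python
import json


def server_handle(request_json, analyze):
    """Server side: analyze the received audio and return the feedback information as JSON."""
    request = json.loads(request_json)
    audio = bytes.fromhex(request["audio_hex"])
    pairs = analyze(request["sentence"], audio)
    return json.dumps({"sentence": request["sentence"], "pairs": pairs})


def client_round_trip(sentence, audio_bytes, send):
    """Client side: send recorded audio, receive the feedback, return lines to display."""
    request_json = json.dumps({"sentence": sentence, "audio_hex": audio_bytes.hex()})
    reply = json.loads(send(request_json))
    lines = []
    for pair in reply["pairs"]:
        verdict = "read continuously" if pair["correct"] else "not read continuously"
        lines.append(f'{pair["left"]} + {pair["right"]}: {verdict}')
    return lines


if __name__ == "__main__":
    # Stand-in analysis that always reports one correctly read pair.
    def fake_analyze(sentence, audio):
        return [{"left": "turn", "right": "over", "correct": True}]

    print(client_round_trip("turn over", b"\x00\x01",
                            lambda message: server_handle(message, fake_analyze)))
```

In a deployed system the `send` callable would wrap a network request, and the returned lines would be replaced by the visual elements of the display interface.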

Furthermore, the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the above-mentioned continuous reading identification correction methods.

The continuous reading identification correction device, the continuous reading identification correction system and the computer readable storage medium provided by the application correspond to the method. It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

To sum up, the present application can automatically analyze the recorded audio data and detect whether the continuous reading in it is correct, and the resulting feedback information helps the user understand the concept of continuous reading, so that the user can effectively master continuous reading skills in spoken English. In addition, the present application does not require a teacher to demonstrate and correct face to face, so learning is not limited in time or space, the learning cost is reduced, and effective learning time is ensured.

In the present specification, the embodiments are described in a progressive manner, and each embodiment focuses on differences from other embodiments, and the same or similar parts between the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.

Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the components and steps of the various examples have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), internal memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The method, apparatus, device and computer-readable storage medium for continuous reading identification correction provided by the present invention are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method and its core concepts. It should be noted that those skilled in the art can make various improvements and modifications to the present invention without departing from its principle, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims

1. A method for read-through recognition correction, comprising:

acquiring audio data input aiming at a preset sentence;

through traversing word pairs in a preset sentence, when two adjacent words in the word pairs are detected to meet the continuous reading rule of 'consonant t, d, s and z ending + word you', determining the word pairs as word pairs which can be read continuously;

inserting preset symbols into the word pairs obtained by screening to construct new words, constructing corresponding phoneme sequences according to the pronunciation mode during continuous reading, and adding the phoneme sequences of the newly constructed word pronunciations into a pronunciation dictionary;

determining the position of each phoneme in the audio data through forced segmentation alignment of speech recognition, and then finding out the time boundary of each syllable and each word according to the syllable of each word; intercepting an acoustic model output segment corresponding to the current detection read-through word pair according to the obtained time boundary of the word;

inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual pronunciation of the word pair that can be read continuously in the predetermined sentence is correctly read continuously; wherein the acoustic model output segment is input into the constructed decoding network for decoding, and if the decoding result is separate words that are not read continuously, this indicates that the audio data is not read continuously on the corresponding word pair, and if the decoding result is a word newly added to the dictionary, this indicates that the audio data is read continuously on the corresponding word pair;

2. The method for continuous reading identification correction according to claim 1, wherein after said obtaining a decoding result for judging whether the actual reading of the word pair that can be read-through in the predetermined sentence is correct, further comprising:

if the continuous reading words meet the preset requirements, generating feedback information for judging the correct continuous reading of the actual pronunciation; if the continuous reading words do not meet the preset requirements, generating feedback information for judging that the actual pronunciation is not correct for continuous reading; wherein the preset requirements include: the pronunciation score of the read-through word is larger than a preset first threshold value, and/or the phoneme duration of the read-through word is smaller than a preset second threshold value.

3. The method of read-through recognition correction according to claim 1, further comprising, after said determining that said word pair is a word pair that can be read-through:

the phonetic symbols and letters of the part to be read continuously in the word pair which can be read continuously are marked by the first visual element through the display interface so as to prompt the reading range and the pronunciation characteristics of the reading phoneme.

4. The readthrough recognition correction method of any one of claims 1 to 3, further comprising, after the generating feedback information that judges whether the actual reading is correct for readthrough:

and marking whether the actual reading is correct and continuous reading or not through a second visual element of the display interface.

5. The readthrough recognition correction method of claim 4, further comprising, after the generating feedback information that determines whether the actual reading is correct for readthrough:

6. A read-through recognition correction device, comprising:

the acquisition module is used for acquiring audio data input aiming at a preset sentence;

the judging module is used for determining that the word pair can be read continuously when detecting that two adjacent words in the word pair meet the continuous reading rule of 'consonant t, d, s and z ending + word you' by traversing the word pair in a preset sentence;

determining the position of each phoneme in the audio data through forced segmentation alignment of voice recognition, and then finding out the time boundary of each syllable and each word according to the syllable of each word; intercepting an acoustic model output segment corresponding to the current detection read-through word pair according to the obtained time boundary of the word;

7. A continuous reading identification correction device, applied to a server, the device comprising:

a memory for storing a computer program;

a processor for implementing the following steps when executing the computer program: acquiring audio data input aiming at a preset statement; through traversing word pairs in a preset sentence, when two adjacent words in the word pairs are detected to meet the continuous reading rule of 'consonant t, d, s and z ending + word you', determining the word pairs as word pairs which can be read continuously; inserting preset symbols into the word pairs obtained by screening to construct new words, constructing corresponding phoneme sequences according to the pronunciation mode during continuous reading, and adding the phoneme sequences of the newly constructed word pronunciations into a pronunciation dictionary; determining the position of each phoneme in the audio data through forced segmentation alignment of speech recognition, and then finding out the time boundary of each syllable and each word according to the syllable of each word; intercepting an acoustic model output segment corresponding to the current detection read-through word pair according to the obtained time boundary of the word; inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual reading of the word pair capable of being read continuously in the preset sentence is correct or not; inputting the acoustic model output segment into a constructed decoding network for decoding, if the decoding result is an independent word which is not read-through, indicating that the audio data is not read-through on the corresponding word pair, and if the decoding result is a word which is newly added into the dictionary in the text and indicates that the audio data is read-through on the corresponding word pair; and generating feedback information for judging whether the actual pronunciation is correct or not.

8. A read-through recognition correction device, applied to a client, the device comprising:

communication means for determining a word pair in a predetermined sentence as a word pair that can be read consecutively when it is detected that two adjacent words in the word pair satisfy a consecutive reading rule of "consonant t, d, s, z end + word you"; inserting preset symbols into the word pairs obtained by screening to construct new words, constructing corresponding phoneme sequences according to the pronunciation mode during continuous reading, and adding the phoneme sequences of the newly constructed word pronunciations into a pronunciation dictionary; determining the position of each phoneme in the audio data through forced segmentation alignment of speech recognition, and then finding out the time boundary of each syllable and each word according to the syllable of each word; intercepting an acoustic model output segment corresponding to the current detection read-through word pair according to the obtained time boundary of the word; inputting the acoustic model output segment into a pre-constructed decoding network to obtain a decoding result for judging whether the actual reading of the word pair capable of being read continuously in the preset sentence is correct or not; inputting the acoustic model output segment into a constructed decoding network for decoding, if the decoding result is an independent word which is not read-through, indicating that the audio data is not read-through on the corresponding word pair, and if the decoding result is a word which is newly added into the dictionary in the text and indicates that the audio data is read-through on the corresponding word pair;

and the display device is used for displaying the information whether the continuous reading is correct or not on the display interface.

9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the read-through recognition correction method according to any one of claims 1 to 5.