CN112309371A - Intonation detection method, apparatus, device and computer readable storage medium - Google Patents
- Publication number
- CN112309371A (application number CN201910696870.XA)
- Authority
- CN
- China
- Prior art keywords
- intonation
- preset
- audio data
- change
- actual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B19/00—Teaching not covered by other main groups of this subclass
- G09B19/06—Foreign languages
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/04—Electrically-operated educational appliances with audible presentation of the material to be studied
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
Abstract
The application discloses an intonation detection method, apparatus, device, and computer-readable storage medium. The method comprises: acquiring audio data recorded for a predetermined sentence; analyzing the audio data to determine the actual intonation change in the predetermined sentence; and comparing the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information indicating whether the current intonation of the predetermined sentence is correct. The method can automatically analyze the recorded audio data, determine whether the actual intonation change matches the preset intonation, and feed back to the user whether the intonation is correct. This helps the user understand the concept of intonation change and effectively master intonation changes in spoken English. At the same time, no live demonstration or correction by a human teacher is required; the limits of time and space are overcome, practice can be done anytime and anywhere, and learning cost is saved.
Description
Technical Field
The present application relates to the field of speech technology, and more particularly, to an intonation detection method, apparatus, device, and computer-readable storage medium.
Background
With the development of science and technology, Internet-based language learning applications have developed rapidly. In such applications, the provider delivers learning materials to a client over the Internet, and the user studies them through the client. For language learning, besides grammar and vocabulary, pronunciation is one of the most important skills. In general, a user can improve pronunciation by reading aloud, reciting, and similar exercises. In most cases, however, the user cannot know whether the pronunciation is accurate.
When learning English, learners whose native language is Chinese tend to raise the pitch directly on the last word or syllable. Contrary to this common intuition, English intonation rises or falls starting from the stressed syllable of a key word in the sentence, and many learners still fail to produce correct intonation even after understanding the principle, because they are not used to the sustained rising or falling reading style. Moreover, correcting intonation problems traditionally requires feedback from a human teacher, so the learner's effective practice is limited in time and space.
Disclosure of Invention
The present application is directed to an intonation detection method, apparatus, device, and computer-readable storage medium, so as to solve the problems of low learning efficiency and the limited time and space for effective practice in conventional approaches.
In order to achieve the above object, the present application provides an intonation detection method, including:
acquiring audio data recorded for a predetermined sentence;
analyzing the audio data to determine the actual intonation change in the predetermined sentence;
and comparing the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information indicating whether the current intonation of the predetermined sentence is correct.
Optionally, the analyzing the audio data to determine the actual intonation change in the predetermined sentence includes:
analyzing the audio data to detect the vowel portions in the audio data;
determining the vibration frequency of the vowel portions and calculating the rate of change of the vibration frequency;
determining the actual intonation change in the predetermined sentence based on the rate of change.
Optionally, the analyzing the audio data to detect the vowel portions in the audio data includes:
performing forced segmentation and alignment on the audio data through speech recognition to obtain the vowel portions in the audio data.
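The forced-alignment step above can be sketched as follows. This is a minimal illustration assuming the recognizer has already produced a phoneme-level alignment as `(phoneme, start_sec, end_sec)` tuples; the tuple format, the ARPAbet-style vowel set, and the example timings are assumptions for illustration, not details from the patent.

```python
# Sketch: pick out vowel segments from a phoneme-level forced alignment.
# The patent only states that speech recognition force-segments and aligns
# the audio to isolate vowel portions; the data shapes here are assumed.

VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}

def vowel_segments(alignment):
    """Return the (start, end) intervals of the vowel phonemes."""
    return [(start, end) for phoneme, start, end in alignment
            if phoneme.rstrip("012") in VOWELS]  # drop stress digits

# Hypothetical alignment for the word "record":
alignment = [("R", 0.00, 0.05), ("IH0", 0.05, 0.12), ("K", 0.12, 0.18),
             ("AO1", 0.18, 0.30), ("R", 0.30, 0.34), ("D", 0.34, 0.40)]
```

Only the returned vowel intervals would be passed on to the frequency analysis; consonant intervals are discarded.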
Optionally, before the acquiring the audio data entered for the predetermined sentence, the method further includes:
and marking the preset intonation change through a first visual element of a display interface.
Optionally, after generating feedback information for characterizing whether the current intonation of the predetermined sentence is correct, the method further includes:
when the actual intonation change is consistent with the preset intonation change, indicating through a second visual element of the display interface that the intonation change of the predetermined sentence is correct;
and when the actual intonation change is inconsistent with the preset intonation change, indicating through a third visual element of the display interface that the intonation change of the predetermined sentence is incorrect.
Optionally, after generating feedback information for characterizing whether the current intonation of the predetermined sentence is correct, the method further includes:
and prompting the feedback information through a specific sound effect.
In order to achieve the above object, the present application provides an intonation detection apparatus, including:
an acquisition module, configured to acquire audio data recorded for a predetermined sentence;
a determining module, configured to analyze the audio data and determine the actual intonation change in the predetermined sentence;
and a generating module, configured to compare the actual intonation change with a preset intonation change corresponding to the predetermined sentence and generate feedback information indicating whether the current intonation of the predetermined sentence is correct.
In order to achieve the above object, the present application provides an intonation detection device applied to a server, the device including:
a memory for storing a computer program;
a processor for implementing the steps of any of the aforementioned disclosed intonation detection methods when executing said computer program.
In order to achieve the above object, the present application provides an intonation detection device applied to a client, the device including:
an audio acquisition device, configured to record audio data for a predetermined sentence;
a communication device, configured to send the audio data to a server, so that the server analyzes the audio data, determines the actual intonation change in the predetermined sentence, and compares the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information indicating whether the current intonation of the predetermined sentence is correct;
and a display device, configured to display the feedback information on a display interface.
To achieve the above object, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the intonation detection methods disclosed above.
According to the above scheme, the intonation detection method provided by the present application includes: acquiring audio data recorded for a predetermined sentence; analyzing the audio data to determine the actual intonation change in the predetermined sentence; and comparing the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information indicating whether the current intonation of the predetermined sentence is correct. The method can automatically analyze the recorded audio data, determine whether the actual intonation change matches the preset intonation, and feed back to the user whether the intonation is correct. This helps the user understand the concept of intonation change and effectively master intonation changes in spoken English. At the same time, no live demonstration or correction by a human teacher is required; the limits of time and space are overcome, practice can be done anytime and anywhere, and learning cost is saved.
The application also discloses an intonation detection apparatus, device, and computer-readable storage medium, which achieve the same technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed for describing them are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an intonation detection method disclosed in an embodiment of the present application;
Fig. 2 is a schematic diagram of the visual presentation prompting a user to practice intonation on a display interface;
Fig. 3 is a flowchart of the process for determining the actual intonation change in a predetermined sentence;
Fig. 4 is a flowchart of another intonation detection method disclosed in an embodiment of the present application;
Fig. 5 is a schematic diagram of the visual feedback on a user's intonation practice on a display interface;
Fig. 6 is a block diagram of an intonation detection apparatus according to an embodiment of the present application;
Fig. 7 is a block diagram of an intonation detection device applied to a server according to an embodiment of the present application;
Fig. 8 is a block diagram of an intonation detection device applied to a client according to an embodiment of the present application;
Fig. 9 is a block diagram of an intonation detection system according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the solutions of the present application, the technical solutions in the embodiments are described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that descriptions involving "first", "second", and the like in the present application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, but such combinations must be realizable by a person skilled in the art; when the combined solutions are contradictory or cannot be realized, the combination should be considered not to exist and falls outside the scope of protection of the present application.
The embodiments of the present application can be used in pronunciation learning scenarios, especially pronunciation learning or pronunciation correction in language learning, where the languages include but are not limited to foreign languages such as English, French, German, and Japanese, as well as Chinese dialects such as Cantonese and Sichuanese. The language learning scenario may be, for example, a pronunciation evaluation or pronunciation correction scenario in language learning software or a language learning terminal, or another language learning scenario; the embodiments of the present application are not limited in this respect.
The application scenario of the embodiments of the present application is as follows: a user performs pronunciation learning through a client; the client displays the content to be learned on a display interface and can play audio content to the user through an audio playback device such as a speaker. When the user practices pronunciation, the client collects the user's audio data through an audio acquisition device for subsequent intonation detection. It is understood that the intonation detection may be performed by either the client or the server; this does not affect the implementation of the present application.
The client in the embodiments of the present application may include, but is not limited to: smart phones, tablet computers, MP4/MP3 players, PCs, PDAs, wearable devices, head-mounted display devices, and the like; the server may include, but is not limited to: a single network server, a server group of multiple network servers, or a cloud-computing-based cloud consisting of a large number of computers or network servers.
With reference to the above application scenarios, Fig. 1 shows a flowchart of a specific implementation of the intonation detection method provided by the present application. The method specifically includes:
S101: acquiring audio data recorded for a predetermined sentence;
In this embodiment, the predetermined sentence is a sentence used to practice intonation. It may include one or more sentences, each comprising one or more sense groups, where each sense group is at least one word. The user reads the predetermined sentence aloud through the client, and the audio acquisition device collects the audio data corresponding to that speech.
Specifically, as a preferred implementation, the embodiment may mark the predetermined sentence with visual elements to prompt the user with the correct intonation change. As shown in Fig. 2, the accent words in the predetermined sentence may be shown in bold and their stressed syllables further enlarged, while the rise and fall of intonation is indicated by inclined arrows above the words, each arrow starting from the stressed syllable of an accent word and extending to the end of the rise or fall.
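A plain-text rendering of this visual cue might look like the following sketch. The markup conventions (double asterisks for bold, upper case for the enlarged stressed syllable, ↗/↘ arrows for rising/falling intonation) are illustrative stand-ins for the richer on-screen rendering of Fig. 2, and the function name is hypothetical.

```python
# Sketch: mark the accent word and stressed syllable of a sentence and
# append an arrow for the intonation direction, mimicking Fig. 2 in text.

def mark_sentence(words, accent_index, stressed_syllable, rising):
    marked = list(words)
    word = marked[accent_index]
    # Emphasize the stressed syllable within the accent word.
    word = word.replace(stressed_syllable, stressed_syllable.upper())
    marked[accent_index] = f"**{word}**"
    arrow = "↗" if rising else "↘"
    return " ".join(marked) + " " + arrow

print(mark_sentence(["Can", "we", "record", "it?"], 2, "cord", rising=True))
# → Can we **reCORD** it? ↗
```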
S102: analyzing the audio data to determine the actual intonation change in the predetermined sentence;
In this embodiment, the actual intonation change during the user's reading of the predetermined sentence is obtained by analyzing the audio data. This process may be executed by the client or by a background server; either choice does not affect the implementation of the present application.
As a specific implementation manner, referring to fig. 3, in the present application, the analyzing the audio data to determine the actual intonation change in the predetermined sentence may include:
S1021: analyzing the audio data to detect the vowel portions in the audio data;
In English pronunciation, the consonant portions have no obvious periodicity, while the frequency of sound vibration can be detected in the vowel portions. Specifically, the audio data may be force-segmented and aligned using speech recognition to obtain the vowel portions in the audio data.
S1022: determining the vibration frequency of the vowel part and calculating the change rate of the vibration frequency;
In this embodiment, the vibration frequency of the vowel portion is calculated at each time step, for example every 0.01 second, and the rate of change of the vibration frequency is determined. Specifically, the rate of change can be estimated by a least-squares (minimum mean square error) fit, yielding the corresponding slope.
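A sketch of this step, assuming an autocorrelation pitch estimator. The estimator, sample rate, pitch range, and window length are illustrative choices; the patent only specifies one frequency estimate every 0.01 second and a least-squares slope fit.

```python
import numpy as np

SR = 16000               # sample rate in Hz (illustrative)
HOP = int(0.01 * SR)     # one frequency estimate every 0.01 s, per the text
WIN = int(0.03 * SR)     # analysis window long enough for low pitches

def frame_freq(frame, sr=SR, fmin=75, fmax=400):
    """Autocorrelation pitch estimate for one analysis window."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))   # strongest lag in the pitch range
    return sr / lag

def freq_track_and_slope(signal):
    """Frequency every 0.01 s, plus its least-squares slope (Hz per frame)."""
    freqs = [frame_freq(signal[i:i + WIN])
             for i in range(0, len(signal) - WIN + 1, HOP)]
    # np.polyfit degree 1 is exactly the minimum-mean-square-error line.
    slope = np.polyfit(np.arange(len(freqs)), freqs, 1)[0]
    return freqs, slope
```

A rising pitch contour yields a positive slope and a falling contour a negative one, which feeds directly into the decision in S1023.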
S1023: determining an actual intonation change in the predetermined sentence based on the rate of change.
It will be appreciated that once the rate of change of the vibration frequency is determined, the actual intonation change can be derived from it. For example, if the rate of change is less than zero, the audio data recorded by the user is judged to be falling in tone; otherwise it is judged to be rising.
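The sign test described here can be written as a one-line classifier; the optional dead zone `tol` is an illustrative addition, not part of the patent.

```python
# Sketch of S1023: map the sign of the frequency slope to an intonation
# label. Per the description, a negative rate of change means falling
# tone; otherwise the tone is judged to be rising.

def intonation_direction(slope, tol=0.0):
    return "falling" if slope < -tol else "rising"
```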
S103: and comparing the actual intonation change with a preset intonation change corresponding to the preset statement to generate feedback information for representing whether the current intonation of the preset statement is correct or not.
After the actual intonation change corresponding to the user's recorded audio data is determined, it is compared with the preset intonation change. If any inconsistency exists, the current intonation change is considered incorrect.
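The comparison step might be sketched as follows, assuming the actual and preset intonation changes are each represented as one rising/falling label per sense group. The per-sense-group granularity and the feedback dictionary shape are assumptions; the patent only states that the two changes are compared to generate the feedback information.

```python
# Sketch of S103: compare the detected direction of each sense group
# against the preset annotation and build the feedback information.

def build_feedback(actual, preset):
    """actual/preset: lists such as ["rising", "falling"], one per sense group."""
    mismatches = [i for i, (a, p) in enumerate(zip(actual, preset)) if a != p]
    return {"correct": not mismatches, "mismatched_groups": mismatches}
```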
In addition, the feedback information may be displayed to the user visually, optionally accompanied by a corresponding sound effect; this embodiment does not limit the form of feedback.
According to the above scheme, the intonation detection method provided by the present application includes: acquiring audio data recorded for a predetermined sentence; analyzing the audio data to determine the actual intonation change in the predetermined sentence; and comparing the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information indicating whether the current intonation of the predetermined sentence is correct. The method can automatically analyze the recorded audio data, determine whether the actual intonation change matches the preset intonation, and feed back to the user whether the intonation is correct. This helps the user understand the concept of intonation change and effectively master intonation changes in spoken English. At the same time, no live demonstration or correction by a human teacher is required; the limits of time and space are overcome, practice can be done anytime and anywhere, and learning cost is saved.
An embodiment of the present application discloses another intonation detection method; compared with the previous embodiment, this embodiment further explains and optimizes the technical scheme. Referring to Fig. 4, specifically:
S201: acquiring audio data recorded for a predetermined sentence;
S202: analyzing the audio data to determine the actual intonation change in the predetermined sentence;
S203: comparing the actual intonation change with a preset intonation change corresponding to the predetermined sentence;
S204: when the actual intonation change is consistent with the preset intonation change, indicating through a second visual element of a display interface that the intonation change of the predetermined sentence is correct;
S205: when the actual intonation change is inconsistent with the preset intonation change, indicating through a third visual element of the display interface that the intonation change of the predetermined sentence is incorrect.
The second visual element and the third visual element may be different geometric patterns, or the same geometric pattern distinguished by different colors or other characteristics. For example, the geometric pattern may be a circle: when the overall pronunciation intonation is correct, the circle turns green and a preset first sound effect is played to indicate that the overall intonation is correct; when the overall pronunciation intonation is incorrect, the circle turns red and shakes, or a preset second sound effect is played, to indicate the overall intonation error.
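The mapping from the comparison result to the visual and audible cues described above could be sketched as a small lookup; the asset names are placeholders for whatever the client actually uses.

```python
# Sketch: map the correctness of the overall intonation to the visual
# and audible feedback cues described in the embodiment (green circle +
# first sound effect when correct; red shaking circle + second effect
# when not). "effect_1"/"effect_2" are hypothetical asset names.

def feedback_cue(correct):
    if correct:
        return {"shape": "circle", "color": "green", "sound": "effect_1"}
    return {"shape": "circle", "color": "red", "sound": "effect_2",
            "animate": "shake"}
```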
Referring to Fig. 5, which shows the visual presentation of feedback on the user's intonation practice on the display interface: in this embodiment, the predetermined sentence is "Can we record it?". The large circle at the upper left of the interface indicates whether the user's overall pronunciation intonation is correct — the circle turning green means correct, turning red means incorrect. The key words in the predetermined sentence are shown in bold, the stressed syllables are further enlarged, and the rise and fall of intonation is indicated by inclined arrows above the words, each arrow starting from the stressed syllable of a key word and extending to the end of the rise or fall. For example, the accent word "record" is displayed in bold, and its stressed syllable "cord" is further enlarged and highlighted.
In this embodiment, feedback is presented to the user through the display interface and may include, but is not limited to: whether the overall intonation is correct, the actual stress pattern of strong and weak syllables in each word, and the standard stress pattern. The concept of word stress is hard for learners to grasp and needs continual reinforcement; in the initial stage especially, learners find it difficult to judge stress accurately from standard audio alone and need direct, explicit explanation. This embodiment therefore uses visual elements to make the stress of the practice content explicit, reinforcing the concept during practice and quickly locating the learner's problems. Visually, size contrast — enlarging and shrinking words, or more abstract geometric shapes of different sizes — helps the user intuitively understand the intonation changes of the words.
Furthermore, by playing back the audio data recorded during the user's practice together with the standard demonstration audio, the present application helps the user clearly identify the problems in their pronunciation and gives them the opportunity to improve further by imitating the standard demonstration.
In the following, an intonation detection apparatus according to an embodiment of the present application is introduced; the intonation detection apparatus described below and the intonation detection method described above may be referred to correspondingly.
Fig. 6 is a block diagram of an intonation detection apparatus according to an embodiment of the present application. Referring to Fig. 6, the intonation detection apparatus may include:
an obtaining module 100, configured to obtain audio data recorded for a predetermined sentence;
a determining module 200, configured to analyze the audio data and determine an actual intonation change in the predetermined sentence;
a generating module 300, configured to compare the actual intonation change with a preset intonation change corresponding to the predetermined sentence, and generate feedback information for representing whether the current intonation of the predetermined sentence is correct.
As a specific implementation manner, in the embodiment of the present application, the determining module 200 may specifically include:
a vowel detection unit, configured to analyze the audio data and detect the vowel portions in the audio data;
a frequency determining unit for determining a vibration frequency of the vowel portion and calculating a rate of change of the vibration frequency;
and the intonation determining unit is used for determining the actual intonation change in the preset sentence based on the change rate.
As a specific implementation manner, the vowel detection unit in the embodiment of the present application is specifically configured to: perform forced segmentation and alignment on the audio data through speech recognition to obtain the vowel portions in the audio data.
As a specific implementation manner, the embodiment of the present application may further include:
an identification module, configured to mark the preset intonation change through a first visual element of a display interface before the audio data recorded for the predetermined sentence is acquired.
As a specific implementation manner, the embodiment of the present application may further include:
a first indicating module, configured to indicate through a second visual element of the display interface that the intonation change of the predetermined sentence is correct when the actual intonation change is consistent with the preset intonation change;
and a second indicating module, configured to indicate through a third visual element of the display interface that the intonation change of the predetermined sentence is incorrect when the actual intonation change is inconsistent with the preset intonation change.
As a specific implementation manner, the embodiment of the present application may further include:
a prompting module, configured to prompt the user with the feedback information through a specific sound effect after the feedback information indicating whether the current intonation of the predetermined sentence is correct is generated.
The intonation detection apparatus of this embodiment implements the intonation detection method described above, so its specific implementation can be found in the earlier method examples. For instance, the obtaining module 100, the determining module 200, and the generating module 300 implement steps S101, S102, and S103 respectively; details are given in the corresponding descriptions above and are not repeated here.
The present application can automatically analyze the recorded audio data, determine whether the actual intonation change matches the preset intonation, and feed back to the user whether the intonation is correct. This helps the user understand the concept of intonation change and effectively master intonation changes in spoken English. At the same time, no live demonstration or correction by a human teacher is required; the limits of time and space are overcome, practice can be done anytime and anywhere, and learning cost is saved.
In addition, the present application further provides an intonation detection device applied to the server 1. As shown in Fig. 7, the device includes:
a memory 11 for storing a computer program;
a processor 12, configured to implement the following steps when executing the computer program: acquiring audio data recorded for a predetermined sentence; analyzing the audio data to determine the actual intonation change in the predetermined sentence; and comparing the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information indicating whether the current intonation of the predetermined sentence is correct.
The memory 11 includes at least one type of readable storage medium, which includes a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, and the like. The memory 11 may in some embodiments be an internal storage unit of the intonation detection device, for example a hard disk. The memory 11 may also be an external storage device of the intonation detection device in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and so on. Further, the memory 11 may also include both an internal storage unit of the intonation detection apparatus and an external storage apparatus. The memory 11 may be used not only to store application software installed in the intonation detection apparatus and various types of data, such as the code of the intonation detection program 01, etc., but also to temporarily store data that has been output or is to be output.
In some embodiments, the processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run the program code stored in the memory 11 or to process data, for example to execute the intonation detection program 01.
Optionally, the processor 12 is configured to implement the following steps when executing the computer program: analyzing the audio data, and detecting to obtain a vowel part in the audio data; determining the vibration frequency of the vowel part and calculating the change rate of the vibration frequency; determining an actual intonation change in the predetermined sentence based on the rate of change.
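The frequency-based steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it estimates the vibration (fundamental) frequency of each vowel frame by autocorrelation and classifies the intonation from the sign of the rate of change; the frame data below are synthetic sine tones, and the function names are my own.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=60.0, fmax=400.0):
    """Estimate the vibration (fundamental) frequency of one frame
    by picking the autocorrelation peak in the plausible pitch range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)      # lag bounds for fmax..fmin
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

def intonation_of(frames, sr):
    """Classify the vowel as rising or falling from the F0 rate of change."""
    f0 = [estimate_f0(f, sr) for f in frames]
    rate = (f0[-1] - f0[0]) / len(f0)            # mean change per frame
    return "rising" if rate > 0 else "falling"

sr = 16000
t = np.arange(0, 0.03, 1 / sr)
# Three synthetic vowel frames with increasing pitch (120 -> 180 Hz)
frames = [np.sin(2 * np.pi * f * t) for f in (120, 150, 180)]
print(intonation_of(frames, sr))  # rising
```

Real systems would typically use a robust pitch tracker with voicing detection and smoothing rather than raw per-frame autocorrelation, but the principle (F0 trend over the vowel) is the same.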
Optionally, the processor 12 is configured to implement the following steps when executing the computer program: performing forced segmentation and alignment on the audio data through speech recognition to obtain the vowel part in the audio data.
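A hedged sketch of the vowel-extraction step that follows forced alignment: the aligner itself (a speech-recognition toolkit) is assumed to return phoneme time spans, and both the ARPAbet vowel labels and the example alignment below are illustrative, not real aligner output.

```python
# ARPAbet-style vowel labels (illustrative subset)
VOWELS = {"AA", "AE", "AH", "AO", "EH", "ER", "IH", "IY", "OW", "UH", "UW"}

def vowel_segments(alignment):
    """Keep only the time spans whose phoneme label is a vowel.

    `alignment` is a list of (phoneme, start_sec, end_sec) tuples,
    as a forced aligner might produce after segmenting the audio."""
    return [(ph, start, end) for ph, start, end in alignment if ph in VOWELS]

# Hypothetical alignment of the word "hello"
alignment = [
    ("HH", 0.00, 0.08), ("EH", 0.08, 0.21),
    ("L", 0.21, 0.30), ("OW", 0.30, 0.52),
]
print(vowel_segments(alignment))  # the EH and OW spans
```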
Optionally, the processor 12 is configured to implement the following steps when executing the computer program: marking the preset intonation change through a first visual element of the display interface before acquiring the audio data recorded for the predetermined sentence.
Optionally, the processor 12 is configured to implement the following steps when executing the computer program: when the actual intonation change is consistent with the preset intonation change, indicating through a second visual element of the display interface that the intonation change of the predetermined sentence is correct; and when the actual intonation change is inconsistent with the preset intonation change, indicating through a third visual element of the display interface that the intonation change of the predetermined sentence is incorrect.
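The branching above can be sketched as a simple mapping from the comparison result to a display element; the element names are placeholders of my own, since the text does not specify concrete UI assets.

```python
# Placeholder names for the visual elements described in the text
FIRST_ELEMENT = "arrow_hint"    # marks the preset intonation before recording
SECOND_ELEMENT = "green_check"  # shown when the intonation change is correct
THIRD_ELEMENT = "red_cross"     # shown when the intonation change is incorrect

def feedback_element(actual_change, preset_change):
    """Pick the visual element by comparing actual vs. preset change."""
    return SECOND_ELEMENT if actual_change == preset_change else THIRD_ELEMENT

print(feedback_element("rising", "rising"))   # green_check
print(feedback_element("falling", "rising"))  # red_cross
```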
Optionally, the processor 12 is configured to implement the following steps when executing the computer program: after generating the feedback information characterizing whether the current intonation of the predetermined sentence is correct, prompting the user with the feedback information through a specific sound effect.
It can be understood that the server in the embodiments of the present application may include, but is not limited to: a single web server, a server group composed of multiple web servers, or a cloud-computing-based cloud consisting of a large number of computers or web servers.
In addition, the present application further provides an intonation detection apparatus, applied to the client 2. As shown in fig. 8, the apparatus includes:
the audio acquisition device 21, configured to record audio data for a predetermined sentence;
the communication device 22, configured to send the audio data to a server, so that the server analyzes the audio data to determine the actual intonation change in the predetermined sentence, and compares the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information characterizing whether the current intonation of the predetermined sentence is correct;
and the display device 23 is used for displaying the feedback information on a display interface.
Optionally, in the intonation detection apparatus provided in the embodiments of the present application, the display device 23 may be further configured to: mark the preset intonation change through a first visual element of the display interface before the audio data recorded for the predetermined sentence is acquired.
It can be understood that the client in the embodiments of the present application may include, but is not limited to: smart phones, tablet computers, MP4 players, MP3 players, personal computers (PCs), personal digital assistants (PDAs), wearable devices, head-mounted display devices, and the like.
Further, the present application also provides an intonation detection system, as shown in fig. 9. The system includes any one of the above-mentioned servers 1 and any one of the above-mentioned clients 2. A user can carry out pronunciation learning through the client: the client displays the content to be learned on the display interface, and can also play the corresponding audio content to the user through an audio playback device such as a speaker. When the user practices pronunciation, the client collects the audio data of the user's speech through the audio acquisition device and transmits it to the server, which performs the intonation detection process. After the server analyzes the audio data and obtains the feedback information, it sends the feedback information to the client. The client then displays the feedback information through its display device, providing the user with intuitive auxiliary information.
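A minimal sketch of this client/server round trip, with direct function calls standing in for the network transport and a stubbed analysis step; all names are illustrative, not part of the disclosed system.

```python
def server_detect(audio_data, preset_change, analyze):
    """Server side: analyze the audio and compare with the preset change."""
    actual = analyze(audio_data)
    return {"actual": actual, "correct": actual == preset_change}

def client_round_trip(audio_data, preset_change, analyze, display):
    """Client side: send the audio, receive feedback, show it on the display."""
    # In the real system this call would cross the network to the server.
    feedback = server_detect(audio_data, preset_change, analyze)
    display(feedback)  # e.g. render on the client's display interface
    return feedback

shown = []
fb = client_round_trip(b"...", "rising", lambda _: "rising", shown.append)
print(fb["correct"])  # True
```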
Furthermore, the present application also provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the intonation detection methods disclosed in the foregoing embodiments.
The intonation detection device, the intonation detection system and the computer-readable storage medium provided by the application correspond to the intonation detection method. It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
To sum up, the present application can automatically analyze the recorded audio data, determine whether the actual intonation change in it matches the preset intonation change, and feed back to the user information indicating whether the intonation is correct. This helps the user understand the concept of intonation change and effectively master intonation changes in spoken English. At the same time, the application no longer requires a teacher to teach or correct in person, overcoming the limitations of time and space: the user can practice anytime and anywhere, saving learning costs.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Claims (10)
1. An intonation detection method, comprising:
acquiring audio data recorded for a predetermined sentence;
analyzing the audio data to determine an actual intonation change in the predetermined sentence; and
comparing the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information characterizing whether the current intonation of the predetermined sentence is correct.
2. The intonation detection method according to claim 1, wherein said analyzing said audio data to determine actual intonation changes in said predetermined sentence comprises:
analyzing the audio data, and detecting to obtain a vowel part in the audio data;
determining the vibration frequency of the vowel part and calculating the change rate of the vibration frequency;
determining an actual intonation change in the predetermined sentence based on the rate of change.
3. The intonation detection method according to claim 2, wherein said analyzing said audio data to detect a vowel portion of said audio data comprises:
performing forced segmentation and alignment on the audio data through speech recognition to obtain the vowel part in the audio data.
4. The intonation detection method according to any one of claims 1 to 3, wherein before said acquiring audio data recorded for a predetermined sentence, the method further comprises:
and marking the preset intonation change through a first visual element of a display interface.
5. The intonation detection method according to claim 4, wherein after generating the feedback information characterizing whether the current intonation of the predetermined sentence is correct, the method further comprises:
when the actual intonation change is consistent with the preset intonation change, indicating through a second visual element of the display interface that the intonation change of the predetermined sentence is correct; and
when the actual intonation change is inconsistent with the preset intonation change, indicating through a third visual element of the display interface that the intonation change of the predetermined sentence is incorrect.
6. The intonation detection method according to claim 5, wherein after generating the feedback information characterizing whether the current intonation of the predetermined sentence is correct, the method further comprises:
and prompting the feedback information through a specific sound effect.
7. An intonation detection apparatus, comprising:
an acquisition module, configured to acquire audio data recorded for a predetermined sentence;
a determining module, configured to analyze the audio data and determine an actual intonation change in the predetermined sentence; and
a generating module, configured to compare the actual intonation change with a preset intonation change corresponding to the predetermined sentence and generate feedback information characterizing whether the current intonation of the predetermined sentence is correct.
8. An intonation detection device, applied to a server, the device comprising:
a memory for storing a computer program;
a processor for implementing the steps of the intonation detection method according to any one of claims 1 to 6 when executing said computer program.
9. An intonation detection device, applied to a client, the device comprising:
an audio acquisition device, configured to record audio data for a predetermined sentence;
a communication device, configured to send the audio data to a server, so that the server analyzes the audio data to determine an actual intonation change in the predetermined sentence, and compares the actual intonation change with a preset intonation change corresponding to the predetermined sentence to generate feedback information characterizing whether the current intonation of the predetermined sentence is correct; and
and the display device is used for displaying the feedback information on a display interface.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the intonation detection method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910696870.XA CN112309371A (en) | 2019-07-30 | 2019-07-30 | Intonation detection method, apparatus, device and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112309371A true CN112309371A (en) | 2021-02-02 |
Family
ID=74485234
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910696870.XA Pending CN112309371A (en) | 2019-07-30 | 2019-07-30 | Intonation detection method, apparatus, device and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112309371A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH02203396A (en) * | 1989-02-01 | 1990-08-13 | Sharp Corp | Feature extraction system for voice |
US20040006461A1 (en) * | 2002-07-03 | 2004-01-08 | Gupta Sunil K. | Method and apparatus for providing an interactive language tutor |
US20040006468A1 (en) * | 2002-07-03 | 2004-01-08 | Lucent Technologies Inc. | Automatic pronunciation scoring for language learning |
CN101354889A (en) * | 2008-09-18 | 2009-01-28 | 北京中星微电子有限公司 | Method and apparatus for tonal modification of voice |
CN101739870A (en) * | 2009-12-03 | 2010-06-16 | 深圳先进技术研究院 | Interactive language learning system and method |
CN103310273A (en) * | 2013-06-26 | 2013-09-18 | 南京邮电大学 | Method for articulating Chinese vowels with tones and based on DIVA model |
CN104485116A (en) * | 2014-12-04 | 2015-04-01 | 上海流利说信息技术有限公司 | Voice quality evaluation equipment, voice quality evaluation method and voice quality evaluation system |
CN107507610A (en) * | 2017-09-28 | 2017-12-22 | 河南理工大学 | A kind of Chinese tone recognition method based on vowel fundamental frequency information |
CN109272992A (en) * | 2018-11-27 | 2019-01-25 | 北京粉笔未来科技有限公司 | A kind of spoken language assessment method, device and a kind of device for generating spoken appraisal model |
- 2019-07-30 CN CN201910696870.XA patent/CN112309371A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110085261B (en) | Pronunciation correction method, device, equipment and computer readable storage medium | |
US8065142B2 (en) | Synchronization of an input text of a speech with a recording of the speech | |
US10546508B2 (en) | System and method for automated literacy assessment | |
CN110136747A (en) | A kind of method, apparatus, equipment and storage medium for evaluating phoneme of speech sound correctness | |
CN110136748A (en) | A kind of rhythm identification bearing calibration, device, equipment and storage medium | |
CN104464757A (en) | Voice evaluation method and device | |
CN109448704A (en) | Construction method, device, server and the storage medium of tone decoding figure | |
CN109166569B (en) | Detection method and device for phoneme mislabeling | |
WO2019205383A1 (en) | Electronic device, deep learning-based music performance style identification method, and storage medium | |
JP2002132287A (en) | Speech recording method and speech recorder as well as memory medium | |
CN111325031B (en) | Resume analysis method and device | |
CN109408175B (en) | Real-time interaction method and system in general high-performance deep learning calculation engine | |
KR102414626B1 (en) | Foreign language pronunciation training and evaluation system | |
CN106356053A (en) | Method and device for testing recognition accuracy of voice input method and electronic equipment | |
CN112309429A (en) | Method, device and equipment for explosion loss detection and computer readable storage medium | |
CN112116181B (en) | Classroom quality model training method, classroom quality evaluation method and classroom quality evaluation device | |
CN110097874A (en) | A kind of pronunciation correction method, apparatus, equipment and storage medium | |
CN110085260A (en) | A kind of single syllable stress identification bearing calibration, device, equipment and medium | |
CN110890095A (en) | Voice detection method, recommendation method, device, storage medium and electronic equipment | |
CN111951827B (en) | Continuous reading identification correction method, device, equipment and readable storage medium | |
CN112309371A (en) | Intonation detection method, apparatus, device and computer readable storage medium | |
CN111128237B (en) | Voice evaluation method and device, storage medium and electronic equipment | |
CN109949813A (en) | A kind of method, apparatus and system converting speech into text | |
CN111128181B (en) | Recitation question evaluating method, recitation question evaluating device and recitation question evaluating equipment | |
CN108573713A (en) | Speech recognition equipment, audio recognition method and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210202 |