CN112863263B - Korean pronunciation correction system based on big data mining technology - Google Patents
- Publication number
- CN112863263B CN112863263B CN202110060609.8A CN202110060609A CN112863263B CN 112863263 B CN112863263 B CN 112863263B CN 202110060609 A CN202110060609 A CN 202110060609A CN 112863263 B CN112863263 B CN 112863263B
- Authority
- CN
- China
- Prior art keywords
- pronunciation
- tongue
- korean
- signal
- audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/04—Electrically-operated educational appliances with audible presentation of the material to be studied
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B19/00—Teaching not covered by other main groups of this subclass
- G09B19/06—Foreign languages
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
- G10L2025/906—Pitch tracking
Abstract
The invention relates to a Korean pronunciation correction system based on big data mining technology. Sensors detect the formant frequencies and the positional changes of the tongue and jaw during pronunciation to determine the jaw articulation parameters related to pitch. Magnetic resonance imaging and electropalatography data capture the three-dimensional vocal-tract geometry of approximant consonants during pronunciation, and the system guides dynamic adjustment of the learner's mandible, tongue, and larynx movements according to the actual phoneme string and the standard pronunciation.
Description
Technical Field
The invention relates to the field of language learning, in particular to a Korean pronunciation correction system based on a big data mining technology.
Background Art
For historical reasons, Korean has been strongly influenced by Chinese, and the two languages share many similarities. This similarity is a great convenience for Koreans learning Chinese, but it also produces substantial negative transfer. Although many Korean pronunciations resemble Chinese ones, most obviously in Sino-Korean words, the articulation methods and places of articulation in fact differ considerably. These differences confront Korean students with obstacles that are hard to overcome when learning Chinese and complicate the teaching of Chinese phonetics to Korean speakers. It is therefore necessary to study the consonant differences between Chinese and Korean pronunciation and to develop corresponding teaching strategies.
Consonants are sounds formed by significant obstruction of the airflow at the place of articulation. Consonants in Chinese and Korean differ in articulation method, place of articulation, and articulatory strength. The consonant systems of Mandarin Chinese and Korean do not correspond one to one. Some sounds exist in Mandarin but not in Korean, such as f; some sounds appear to share a place and manner of articulation but are in fact pronounced differently, such as the Korean counterparts of g and k; and some sounds exist in Korean but not in Chinese. Korean also has tense consonants, which are distinguished from the lax consonants by a stronger airflow. In addition, the Korean consonant system contains a laryngeal sound, a nasal sound, and a flap (the original Korean glyphs are not reproduced here) that do not exist in Chinese: the nasal is silent at the beginning of a syllable, the laryngeal sound is pronounced like h, and the flap, when it occurs as a syllable final, is pronounced similarly to r.
In the learning process, learners often depend heavily on their native language. Learners generally prefer to approach the second language through their first, substituting similar native sounds for target-language sounds or reasoning about the target language in native-language terms, which causes errors. (1) Similar sounds cause errors: because Mandarin and Korean are inherently similar, substitution is common, as with the approximations above, e.g. replacing g and k with similar Korean sounds. (2) Sounds absent from the native language are replaced by native sounds, e.g. substituting the Korean laryngeal sound for h, or a Korean pronunciation for l or r. (3) Sound changes within Korean itself cause errors. Learning Mandarin through native-language phonological reasoning therefore also leads to errors.
In summary, understanding the relationship between articulatory characteristics and acoustic signals is crucial to solving the articulatory inversion problem.
Disclosure of Invention
The invention provides a Korean pronunciation correction system based on big data mining technology that detects and automatically corrects spoken Korean pronunciation errors and provides technical support for students learning Korean.
A Korean pronunciation correction system based on big data mining technology comprises an audio signal acquisition module, a data analysis module, a correction module, a control module, a terminal module, and a cloud module. The signal transmission device comprises a vocal cord vibration sensor and an electromagnetic sensor. The electromagnetic sensor, a wearable permanent-magnet tracer, captures the movements of the tongue and jaw for speech recognition; a magnetic sensor array tracks the tongue's movement wirelessly, and ultrasonic imaging measures the coordinates and curvature of the tongue to represent it during speech. Meanwhile, the formant frequencies of vowels in the articulation model are estimated from the combination of the mandible, tongue, and larynx. The data analysis module optimizes the first two formants of Korean vowels and consonants through the following steps:
S1. For vowels, the first formant, denoted F1, is inversely proportional to the tongue height h, i.e. F1 ∝ 1/h.
The second formant, denoted F2, is inversely proportional to the horizontal advancement l of the tongue during vowel production, i.e. F2 ∝ 1/l.
The oral cavity is treated as a tube model acting as a resonator, and the model is modified accordingly. In the resulting expressions, β1 and β2 (β1, β2 ∈ ℝ) are the constants that best fit the formant response of the tongue-based vowel articulation system, and c is the speed of sound, c = 340 m/s.
S2. Determine the values of β1 and β2, which are computed from formant values of the oral system measured experimentally with the permanent-magnet tracer. To improve accuracy, a loss function between the estimated formants and those of the tongue articulation system is evaluated with the mean squared error; the partial derivatives of the loss with respect to β1 and β2 are then computed and used to update their current values.
S3. First- and second-formant expressions are given for each of the lax, tense, and aspirated consonants.
In these formulas, γ1 and γ2 are the constants that best fit the formant response of the tongue-based consonant articulation system, c is the speed of sound, B is the burst release time, and Duration is the duration of the pronunciation;
S4. The simplified tongue-based oral system is cascaded with the laryngeal system to give the calculation formula of the vocal tract system. The transfer function of the k-th formant frequency of the vocal tract system is written V_k(z), and the transfer functions of the formant frequencies of the laryngeal and tongue systems are written L_k(z).
A1 and A2 represent the formant frequencies of the laryngeal and tongue articulation systems respectively, T represents the duration of each formant, z denotes the formant bandwidth, and F_ik denotes the terms obtained for the different values of i and k.
S5. The correction module acquires the formant frequencies and the positional changes of the tongue and jaw through the sensors to determine the jaw articulation parameter related to pitch. During pronunciation, acoustic and electromyographic analyses are performed; magnetic resonance imaging and electropalatography data capture the three-dimensional vocal-tract geometry of approximant consonants; and the movements of the learner's mandible, tongue, and larynx are dynamically adjusted according to the actual phoneme string and the standard pronunciation.
Furthermore, introducing error-elimination calculation enables high-precision correction of spoken pronunciation. First, data processing and error calculation are performed as follows:
where the error E is the error threshold, H is the extreme value of the vibration trough, C is the effective period law of the audio, D is a constant frequency parameter, and PAH is the standard amplitude of Korean speech;
the collected Korean spoken utterances are normalized:
where η_E is the discrete function value in the Korean pronunciation process, n is the weight of the discrete function value, T represents the hop count between two audio nodes, and d_ij represents the shortest path between audio nodes i and j;
the pronunciation is corrected as follows:
V_i = R · U_i · (A^T · S^(-1))^(-1)
where A^T is the natural skewness of the audio, a parameter for measuring the note; S^(-1) is the combination of audio attributes, a function parameter for audio proofreading; R is the weight applied to the high-grade audio; U_i is the measurement of the audio; and V_i is the audio error protection limit.
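Since the patent gives only symbol names for these parameters, the correction formula can at least be exercised as scalar arithmetic. The sketch below is purely illustrative; every value is made up and carries no claimed unit or range:

```python
# Purely illustrative scalar reading of V_i = R * U_i * (A^T * S^-1)^-1.
# All parameter values are invented for demonstration only.
A_T = 0.6      # "natural skewness" of the audio (hypothetical value)
S = 1.2        # audio-attribute combination parameter (hypothetical value)
R = 1.5        # weight applied to the high-grade audio (hypothetical value)
U_i = 0.9      # measurement of audio frame i (hypothetical value)

# For scalars, (A_T * S^-1)^-1 collapses to S / A_T.
V_i = R * U_i * (A_T / S) ** -1
print(round(V_i, 3))   # 1.5 * 0.9 * 2.0 = 2.7
```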
Further, the vocal cord vibration sensor comprises a speech-signal acquisition sensor array, and the frequency-domain feature for Korean speech signal detection is V(t, θ), that is:
where ω_i(θ) is the instantaneous time-domain weighting vector of the i-th Korean pronunciation output, x_i(t) is the instantaneous time-domain signal component of the Korean pronunciation output, θ is a speech-signal parameter, * denotes the conjugation operator, and the sensor index runs up to a maximum of M;
Time-domain matching and filtering of the speech signal are performed with an adaptive beamforming method. The frequency-domain characteristic of the output signal is:
V(t, θ) = x^H(t) · ω(θ)
where H denotes the complex conjugate transpose;
The weighting vector and the components of the instantaneous time-domain signal of the Korean speech output can be expressed as:

x(t) = [x_1(t), x_2(t), …, x_M(t)]^T

ω(θ) = [ω_1(θ), ω_2(θ), …, ω_M(θ)]^T;
Adaptive filtering and blind source separation are combined to decompose the speech signal and obtain the FM components of Korean speech detection, output as:

T_m(θ) = (m − 1) · T_0(θ);
where T_0(θ) represents the initial FM component. Combined with the sensor-array signal processing method, the signal model for Korean pronunciation error detection is obtained as:
where g_m are the calculation coefficients and n_m(t) is an auxiliary parameter.
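The beamformer output form V(t, θ) = x^H(t)·ω(θ) above can be sketched numerically. The example below assumes a hypothetical 4-element uniform linear array with simple delay-and-sum weights; the array geometry, spacing, and analysis frequency are illustrative choices, not taken from the patent:

```python
import numpy as np

# Hypothetical 4-sensor uniform linear array; spacing and frequency are assumptions.
M, d, c, f = 4, 0.05, 340.0, 1000.0   # sensors, spacing (m), speed of sound (m/s), Hz

def steering(theta):
    # Per-sensor phase of a plane wave arriving from direction theta (radians)
    tau = d * np.arange(M) * np.sin(theta) / c
    return np.exp(-2j * np.pi * f * tau)

def beam_output(x, theta):
    # V(t, theta) = x^H(t) * omega(theta), with delay-and-sum weights omega = steering / M
    return np.vdot(x, steering(theta) / M)   # np.vdot conjugates its first argument

x = steering(np.deg2rad(30.0))                # snapshot x(t) from a source at 30 degrees
print(abs(beam_output(x, np.deg2rad(30.0))))  # steered at the source: magnitude 1.0
print(abs(beam_output(x, np.deg2rad(-60.0)))) # steered away: attenuated
```

Steering toward the true source direction aligns the sensor phases and yields unit gain; mismatched directions partially cancel, which is the time-domain matching the text describes.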
Furthermore, the audio signal acquisition module comprises a signal transmission device, an audio signal modulator, a demodulator and a voice acquisition device.
Furthermore, the audio signal modulator modulates a low-frequency digital signal into a high-frequency digital signal by digital signal processing and transmits it. The modulator and demodulator are used as a pair: the modulator converts the digital signal into a high-frequency signal for transmission, and the demodulator restores it to the original signal.
Further, the demodulator restores the low-frequency digital signal that was modulated onto the high-frequency digital signal.
Furthermore, the control module consists of a program counter, an instruction register, an instruction decoder, a timing generator, and an operation controller, and is used to issue commands and to coordinate and direct the operation of the whole system.
Further, the terminal module comprises a client UI module and a visualization module, and the client UI module is suitable for collecting terminal user information.
Further, the cloud module comprises a signal receiving module and a database of Korean standard pronunciations and of the oral and laryngeal systems.
During pronunciation, the sensors detect the formant frequencies and the positional changes of the tongue and jaw to determine the jaw articulation parameters related to pitch. Acoustic and electromyographic analyses are performed, magnetic resonance imaging and electropalatography data are used to capture the three-dimensional vocal-tract geometry of approximant consonants, and the movements of the learner's mandible, tongue, and larynx are dynamically adjusted according to the actual phoneme string and the standard pronunciation.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The Korean pronunciation error correction system recognizes spoken Korean and detects and automatically corrects spoken pronunciation errors. Spoken pronunciation is the first step in learning Korean and the foundation of the whole process. The first task in learning Korean is memorizing words, and the first task in memorizing a word is memorizing its pronunciation. Correct spoken pronunciation habits also greatly improve listening comprehension: a learner with idiosyncratic pronunciation may fail to recognize even familiar words in another speaker's correct pronunciation, which hinders spoken interaction in Korean. Accurate Korean pronunciation is therefore very important for students' listening skills.
The system hardware architecture is constructed according to the requirements of the Korean spoken language pronunciation error automatic error correction system, and comprises an audio signal acquisition module, a data analysis module, a correction module, a control module, a terminal module and a cloud module.
The audio signal modulator is a device that modulates a low-frequency digital signal into a high-frequency digital signal by digital signal processing and transmits it; it is used together with a demodulator, which restores the digital signal to the original signal. The main function of the voice collector is to capture spoken Korean pronunciation. The controller mainly comprises a program counter, an instruction register, an instruction decoder, a timing generator, and an operation controller; it issues commands and coordinates and directs the operation of the entire system, acting as the "decision-making center".
A traditional spoken-pronunciation correction system extracts features from the speech signal, recognizes the information, and compares the extracted voiceprint with a standard voiceprint, but it does not correct pronunciation on the basis of the articulation mechanism. The present invention studies the speech system and, through a signal transmission device worn at the neck, lets users sense and detect the muscle movement patterns of their own vocal organs (including the lips, jaw, tongue, and teeth) during phonation so as to correct and adjust it. The speech system records the activity of the articulation system (including the facial muscles), detects speech-signal synthesis using electromagnetic signals, and determines the acoustic properties of the articulation map by describing the articulation trajectories of the mandible, lips, tongue body, and tongue tip.
The vocal cord vibration device is located at the larynx and captures sensor signals, which are sent to the control system to detect the periodic vibrations associated with phonation. Meanwhile, the electromagnetic sensor is attached to the face and records pulses, while the tongue-and-ear interface is a wearable system that captures the movements of the tongue and jaw for speech recognition.
In the present invention, the tongue's behavior in vowel production is considered the primary factor in generating speech through the mouth. A wearable permanent-magnet tracer is fixed on the tongue, and a magnetic sensor array tracks the tongue's movement wirelessly; the wearable system is physically non-invasive. Ultrasonic imaging measures the coordinates and curvature of the tongue to represent it during speech, while the formant frequencies of vowels in the articulation model are estimated from the combination of the mandible, tongue, and larynx. Vowel formant frequency values were statistically derived from recorded speech of ten thousand Korean speakers and associated with their tongue curvatures, obtained by ultrasonically analyzing the resonance mechanism of the oral vocal-tract system. From the relationship between the tongue coordinates and the formant frequencies it is concluded that the first formant frequency depends on the tongue height and the second formant depends on the horizontal advancement of the tongue.
During pronunciation, the sensors detect the formant frequencies and the positional changes of the tongue and jaw to determine the jaw articulation parameters related to pitch. Acoustic and electromyographic analyses are performed, and magnetic resonance imaging and electropalatography data are used to capture the three-dimensional vocal-tract geometry of approximant consonants.
The first formant is inversely proportional to tongue height, and the second formant frequency is related to the size of the front cavity, that is, the degree of tongue advancement, based on the displayed tongue and lip positions. Formant frequencies are speaker-dependent and vary with gender and age. The invention derives an optimized statistical formula for vowel formant frequencies from the accumulated vowel results and extends it to consonants; all of this research is based on mapping tongue motion during vowel and consonant articulation. The tongue-based oral-cavity statistical model provided by the invention is linked to the laryngeal model and compared in detail with speech generated by the vocal-tract model. The algorithm is based on a formant expression and is applicable to vowel and consonant generation across age groups and genders.
The invention provides an optimized statistical relationship for the first two formants of Korean vowels and consonants, defines an age- and gender-independent speech generation system using human tongue motion, and associates the tongue articulation system with a known laryngeal model.
When the vocal cords close abruptly, a pulse-like excitation at the vibration source closes the glottis; at this stage the subglottal and supraglottal regions are separated, the effective length of the vocal tract is reduced, and resonance is produced only by the supraglottal portion. This variation in vocal-tract length changes the dominant resonances of the spectrum. Extracting the resonance frequencies and their associated bandwidths accurately is difficult, because they vary continuously with the changing vocal-tract shape, not only across pitch periods but also within a pitch period (i.e., from the closed phase to the open phase of the glottis), so resonance bandwidths must be estimated carefully from short speech segments. When the speech spectrum is decomposed into amplitude and phase components, the prominent resonance locations and their associated bandwidths are called formants. During vowels, the first two formants of the oral system are inversely proportional to tongue height and tongue advancement, respectively. Statistical estimation is performed by mapping tongue direction features using a vocal-tract synthesizer and vowel-space theory. The vocal-tract shape and the vowel quadrilateral are displayed in pairs for each vowel. In vowel-space theory the pattern is a quadrilateral whose horizontal axis l represents tongue advancement (front, central, back), describing how far forward the tongue is raised during vowel articulation, and whose sloping axis h represents tongue height (close, mid, open).
The first formant, denoted F1, is inversely proportional to the tongue height h for vowel production, i.e. F1 ∝ 1/h.
The second formant, denoted F2, is inversely proportional to the horizontal advancement l of the tongue for vowel production, i.e. F2 ∝ 1/l.
The oral cavity is treated as a tube model and assumed to be a resonator; correcting the model yields expressions in which β1 and β2 (β1, β2 ∈ ℝ) are the constants that best fit the formant response of the tongue-based vowel articulation system, and c is the speed of sound, c = 340 m/s.
The next step is to determine the values of β1 and β2, which are computed from formant values of the oral system measured experimentally with the permanent-magnet tracer. To improve accuracy, a loss function between the estimated formants and those of the tongue articulation system is evaluated with the mean squared error; the partial derivatives of the loss are then computed and used to update the current values of β1 and β2.
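The gradient-descent update of the constants can be sketched under an assumed model. The text states only that the first formant is inversely proportional to tongue height, so the sketch below fits a hypothetical F1_est = β1·c/h to synthetic measurements by minimizing the mean squared error; the model form, data values, and learning rate are all assumptions:

```python
import numpy as np

# Assumed model: F1_est = beta1 * c / h (the patent's exact formula is not reproduced here).
c = 340.0                                   # speed of sound, m/s (as in the text)
h = np.array([0.02, 0.03, 0.04, 0.05])      # hypothetical tongue heights (m)
f1_measured = 0.025 * c / h                 # synthetic "measured" formants (true beta1 = 0.025)

beta1, lr = 0.01, 1e-9                      # initial guess and learning rate (assumptions)
for _ in range(200):
    residual = beta1 * c / h - f1_measured  # estimation error per sample
    grad = np.mean(2.0 * residual * c / h)  # partial derivative of the MSE w.r.t. beta1
    beta1 -= lr * grad                      # gradient-descent update

print(beta1)                                # converges to ~0.025
```

The same loop extends to β2 with the advancement l in place of h; with real tracer measurements the loss would compare estimated against measured formants exactly as the text describes.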
For consonant articulation, the position and movement of the tongue are represented by the relationship between the tongue height h and the horizontal advancement l of the consonant. In a manner similar to vowels, a relationship is established between the tongue height h of the consonant quadrilateral and the horizontal advancement l of the tongue, and a statistical formula for the consonant oral-cavity formants is obtained and optimized by gradient descent. Consonants are described and distinguished by a place-and-manner system, on the basis of which they are divided into three distinct groups: lax, tense, and aspirated. In terms of the acoustic properties of consonants, the first and second formants are affected by the size of the constriction, the manner of articulation (tongue height) and the burst (sudden release of air), the position of the tongue, voicing, and tongue advancement.
The first formants of the relaxing tone, the relaxing tone and the air supply tone are respectively expressed as follows:
the second formants of the lax, tense and aspirated consonants are respectively expressed as:
in the formula, γ1 and γ2 are the constants that best fit the formant response of the given tongue consonant pronunciation system, c is the speed of sound, B is the burst release time, and Duration is the duration of pronunciation.
After the formants of the complete set of vowels and consonants are established, the present invention proposes a new method of quantifying speech intelligibility from these results, based on the differences between the first two formants of the tongue pronunciation system.
The vocal tract model includes the lungs (glottal source), the larynx, and a single-conduit oral cavity. The lungs act as the motive force, providing airflow to the larynx. The larynx regulates the airflow from the lungs and provides a periodic or noisy airflow source. The output is therefore a modulated airflow obtained by spectrally shaping that source. A calculation formula for the vocal tract system is developed by cascading the simplified tongue-based oral system (tongue articulatory system) with the laryngeal system; the transfer function of the vocal tract formant frequencies is denoted V(z)k, and the transfer functions of the formant frequencies of the laryngeal system and the tongue are denoted L(z)k and O(z)k:
A1 and A2 represent the formant frequencies of the laryngeal and tongue articulatory systems respectively, T represents the duration of each formant, z represents the bandwidth of the formant, and Fik denotes the terms for differing values of i and k.
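The cascade of the laryngeal and oral subsystems can be illustrated with standard two-pole digital formant resonators, multiplying the subsystem responses just as the text multiplies the transfer functions. The resonator form below is a common approximation, not necessarily the patent's exact expressions:

```python
# Cascade of the laryngeal and tongue (oral) systems as digital resonators.
# Each formant is modelled as the standard two-pole resonator
# H(z) = 1 / (1 - 2 r cos(w0) z^-1 + r^2 z^-2); the overall vocal-tract
# response is the product of the subsystem responses, as in the text.

import cmath
import math

def resonator_response(freq_hz, formant_hz, bandwidth_hz, fs=16000):
    """Magnitude response of one two-pole formant resonator at freq_hz."""
    r = math.exp(-math.pi * bandwidth_hz / fs)  # pole radius from bandwidth
    w0 = 2 * math.pi * formant_hz / fs          # pole angle from frequency
    z = cmath.exp(1j * 2 * math.pi * freq_hz / fs)
    h = 1.0 / (1 - 2 * r * math.cos(w0) / z + (r ** 2) / z ** 2)
    return abs(h)

def cascade_response(freq_hz, formants):
    """Cascaded response: product of each subsystem's resonator response."""
    resp = 1.0
    for f0, bw in formants:
        resp *= resonator_response(freq_hz, f0, bw)
    return resp

# The cascaded gain peaks near the formant frequencies (500 Hz and 1500 Hz
# here) and drops between them.
peak = cascade_response(500, [(500, 80), (1500, 120)])
off = cascade_response(1000, [(500, 80), (1500, 120)])
print(peak > off)  # True
```

Formant frequencies and bandwidths here are illustrative placeholders for the A and z values defined above.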
In addition, the formant bandwidth obtained by short-time processing approximates the instantaneous bandwidth of each formant, so formants can be extracted from the instantaneous bandwidth as well as from the amplitude component. The formant bandwidth is determined by decomposing the speech signal through a bank of band-pass filters and then demodulating each band to obtain an amplitude envelope and an instantaneous-frequency signal. The bandwidths of the formants are then extracted from these instantaneous-frequency signals using an energy separation algorithm; the bandwidth values are normalized with respect to their maximum and plotted as histogram curves, and the bandwidth at the dominant resonance frequency of the spectral response is extracted from short speech segments to highlight the variation in bandwidth across vowel and consonant segments.
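The demodulation step for one band-pass channel can be sketched with the DESA-2 energy separation estimator, one common realization of the "energy separation algorithm" named above; the patent does not specify which variant it uses:

```python
# Energy-separation (Teager) demodulation of one band-pass channel into an
# instantaneous frequency and amplitude, using the standard DESA-2 estimator.
# This is offered as a plausible sketch of the step described in the text,
# not the patent's exact algorithm.

import math

def teager(x, n):
    """Discrete Teager energy operator Psi[x](n) = x(n)^2 - x(n-1)x(n+1)."""
    return x[n] ** 2 - x[n - 1] * x[n + 1]

def desa2(x, n):
    """Instantaneous frequency (rad/sample) and amplitude of x at index n."""
    z = [x[k + 1] - x[k - 1] for k in range(1, len(x) - 1)]  # central difference
    psi_x = teager(x, n)
    psi_z = teager(z, n - 1)  # z is offset from x by one sample
    omega = 0.5 * math.acos(1 - psi_z / (2 * psi_x))
    amp = 2 * psi_x / math.sqrt(psi_z)
    return omega, amp

# On a pure sinusoid, DESA-2 recovers the frequency and amplitude exactly.
OMEGA, A = 0.3, 1.5
x = [A * math.cos(OMEGA * n) for n in range(64)]
w, a = desa2(x, 10)
print(round(w, 4), round(a, 4))  # ~0.3 and ~1.5
```

Applied frame-by-frame to each filter-bank output, the spread of these instantaneous-frequency estimates gives the bandwidth values the text then normalizes and plots.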
The vocal cord vibration sensor comprises a voice-signal acquisition sensor array, and the frequency domain of Korean voice-signal feature detection is V(t, θ), namely:
in the formula, ωi(θ) represents the instantaneous time-domain signal weighting vector of the i-th Korean pronunciation output, xi(t) represents the instantaneous time-domain signal component of the Korean pronunciation output, θ is the speech-signal parameter, φ represents the conjugation operator, and m indexes the sensors, of which there are M in total.
And performing time domain matching and filtering on the voice signals by adopting an adaptive beam forming method. The frequency domain characteristics of the output signal are as follows:
V(t,θ) = x^H(t)ω(θ)
in the formula, H represents a complex conjugate transpose.
The weight vector and components of the instantaneous time-domain signal of the korean speech output can be expressed as:
x(t) = [x1(t), x2(t), …, xM(t)]^T
ω(θ) = [ω1(θ), ω2(θ), …, ωM(θ)]^T
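The inner-product output V(t,θ) = x^H(t)ω(θ) can be illustrated with a simple steered-array sketch. The uniform-linear-array, half-wavelength-spacing geometry and the steering-weight form are assumptions for illustration; the text specifies only the inner-product form of the output:

```python
# Steered-array (delay-and-sum style) beamformer output
# V(t, theta) = x^H(t) w(theta) for an M-sensor array, assuming a uniform
# linear array with half-wavelength spacing (an illustrative geometry).

import cmath
import math

def steering_weights(theta, m_sensors):
    """Phase weights for a uniform linear array steered to angle theta."""
    return [cmath.exp(-1j * math.pi * m * math.sin(theta)) / m_sensors
            for m in range(m_sensors)]

def beam_output(x, theta):
    """V(t, theta) = sum_m conj(x_m) * w_m(theta), i.e. the x^H w product."""
    w = steering_weights(theta, len(x))
    return sum(xm.conjugate() * wm for xm, wm in zip(x, w))

# A plane wave arriving from 30 degrees: the output magnitude is largest
# when the steering angle matches the arrival angle.
M, doa = 8, math.pi / 6
x = [cmath.exp(-1j * math.pi * m * math.sin(doa)) for m in range(M)]
on_target = abs(beam_output(x, doa))
off_target = abs(beam_output(x, 0.0))
print(on_target > off_target)  # True
```

An adaptive beamformer, as used in the text, would compute ω(θ) from the data covariance instead of from geometry alone, but the output has the same x^H ω form.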
Combining adaptive filtering and blind source separation, the voice signal is decomposed to obtain the FM components of Korean voice detection, output as follows:
Tm(θ)=(m-1)T0(θ)
in the formula, T0(θ) represents the initial FM component. Combined with the sensor-array signal processing method, the following signal model for Korean pronunciation-error detection is obtained:
in the formula, gm is a calculation coefficient and nm(t) is an auxiliary parameter.
Speech error detection
After the learner pronounces according to the system's prompts, the system combines the standard pronunciation dictionary with the pronunciation rules to form a phoneme detection network. Meanwhile, the formant frequency and the position changes of the tongue and jaw are obtained through sensors to determine the jaw pronunciation parameters related to pitch. During pronunciation, acoustic and electromyographic analyses are performed, the three-dimensional vocal-tract geometry of approximant consonants is captured using magnetic resonance imaging and electropalatography data, and the learner is guided to dynamically adjust the movements of the lower jaw, tongue and throat according to the actual phoneme string and the standard pronunciation.
Introducing error-elimination calculation enables high-precision spoken-pronunciation correction. First, data processing and error calculation are performed as follows:
in the formula, E is the error, H is the error threshold, B is the extreme value of the vibration trough, C is the effective period law of the audio, D is the constant frequency parameter, and PAH is the standard amplitude of the Korean speech.
By the above method, the collected Korean spoken utterances are normalized:
in the formula, ηE is the discrete function value in the Korean pronunciation process, n is the weight of the discrete function value, T represents the number of hops between two audio nodes, and dij represents the shortest path between audio node i and node j.
The pronunciation is corrected as follows:
Vi = RUi(A^T S^−1)^−1
in the formula, A^T is the natural skewness of the audio, a parameter for measuring notes; S^−1 is the combination of audio attributes, a function parameter for audio proofreading; R is the lifting weight of the high-grade audio; Ui is the audio measurement; and Vi is the audio error protection limit.
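Evaluated literally, the correction rule is a small matrix expression. The sketch below treats A and S as square matrices and all numeric values as hypothetical placeholders, since the text does not fix their shapes or contents:

```python
# Direct evaluation of the correction rule V_i = R * U_i * (A^T S^-1)^-1
# from the text, treating A (audio skewness) and S (attribute combination)
# as square matrices. All values below are hypothetical placeholders.

import numpy as np

def correct_pronunciation(R, U_i, A, S):
    """Apply V_i = R * U_i * (A^T S^-1)^-1 to one audio measurement U_i."""
    core = np.linalg.inv(A.T @ np.linalg.inv(S))  # (A^T S^-1)^-1 = S (A^T)^-1
    return R * U_i @ core

A = np.array([[2.0, 0.0], [0.0, 4.0]])  # hypothetical skewness matrix
S = np.array([[1.0, 0.0], [0.0, 2.0]])  # hypothetical attribute combination
U = np.array([1.0, 1.0])                # one audio measurement vector
print(correct_pronunciation(0.5, U, A, S))
```

The identity (A^T S^−1)^−1 = S (A^T)^−1 used in the comment is just the matrix-inverse product rule, so the computation needs two small inversions at most.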
Through this research on the vocal tract and oral cavity models, Korean spoken-pronunciation errors are automatically corrected on the basis of pronunciation phonemes, providing technical support for students learning Korean.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent changes and modifications made to the above embodiment according to the technical spirit of the present invention are within the scope of the technical solution of the present invention.
Claims (8)
1. A Korean pronunciation correction system based on big data mining technology, characterized by comprising an audio signal acquisition module, a data analysis module, a correction module, a control module, a terminal module and a cloud module, wherein a signal transmission device comprises a vocal cord vibration sensor and an electromagnetic sensor, the electromagnetic sensor is used for capturing the movement of the tongue and chin in speech recognition, the electromagnetic sensor is a wearable permanent-magnet tracer whose tongue movement is wirelessly tracked by a magnetic sensor array, ultrasonic imaging measurement of the coordinates and curvature positions of the tongue is performed to represent the tongue during speech, the formant frequencies of vowels in the pronunciation model are estimated based on the combination of the lower jaw, tongue and throat, and the data analysis module optimizes the first two formants of Korean vowels and consonants through the following specific steps:
S1. For vowels, the first formant, denoted F1, is inversely proportional to the tongue height h:
the second formant, denoted F2, is, for vowel production, inversely proportional to the horizontal-axis advance l of the tongue:
the mouth is considered as a tubular model and as a resonator, and the model is modified to obtain:
β1 and β2 are the constants that best fit the formant response of the given tongue vowel pronunciation system, β1, β2 ∈ R; c is the speed of sound, c = 340 m/s;
S2. The values of β1 and β2 are determined; they are calculated from the formant values of the oral system obtained experimentally with the permanent-magnet tracer, and to improve accuracy a loss function between the formants of the estimation system and those of the tongue pronunciation system is computed using the mean squared error:
the partial derivatives of the loss function are computed and used to update the current values of β1 and β2:
S3. The first formants of the lax, tense and aspirated consonants are respectively expressed as:
the second formants of the lax, tense and aspirated consonants are respectively expressed as:
in the formula, γ1 and γ2 are the constants that best fit the formant response of the given tongue consonant pronunciation system, c is the speed of sound, B is the burst release time, and Duration is the duration of pronunciation;
S4. The simplified tongue-based oral system is cascaded with the laryngeal system to give a calculation formula for the vocal tract system, where the transfer function of the vocal tract formant frequencies is denoted V(z)k, and the transfer functions of the formant frequencies of the laryngeal system and the tongue are denoted L(z)k and O(z)k:
A1 and A2 represent the formant frequencies of the laryngeal and tongue articulatory systems respectively, T represents the duration of each formant, z represents the bandwidth of the formant, and Fik denotes the terms for differing values of i and k;
S5. The correction module acquires the formant frequency and the position changes of the tongue and chin through sensors to determine the chin pronunciation parameters related to pitch; during pronunciation, acoustic and electromyographic analyses are performed, the three-dimensional vocal-tract geometry of approximant consonants is captured using magnetic resonance imaging and electropalatography data, and the learner is guided to dynamically adjust the movements of the lower jaw, tongue and throat according to the actual phoneme string and the standard pronunciation;
the audio signal acquisition module comprises a signal transmission device, an audio signal modulator, a demodulator and a voice acquisition device.
2. The Korean pronunciation correction system based on big data mining technology as claimed in claim 1, wherein introducing error-elimination calculation enables high-precision spoken-pronunciation correction; data processing and error calculation are performed first, as follows:
in the formula, E is the error, H is the error threshold, B is the extreme value of the vibration trough, C is the effective period law of the audio, D is the constant frequency parameter, and PAH is the standard amplitude of the Korean speech;
combining adaptive filtering and blind source separation, the voice signal is decomposed to obtain the FM components of Korean voice detection, output as follows:
Tm(θ)=(m-1)T0(θ);
in the formula, T0(θ) represents the initial FM component and Tm(θ) is the m-th FM component;
and (3) carrying out normalized calculation on the collected Korean spoken pronunciation:
in the formula, ηE is the discrete function value in the Korean pronunciation process, n is the weight of the discrete function value, T represents the number of hops between two audio nodes, and dij represents the shortest path between audio node i and node j;
the pronunciation is corrected as follows:
Vi = RUi(A^T S^−1)^−1
in the formula, A^T is the natural skewness of the audio, a parameter for measuring notes; S^−1 is the combination of audio attributes, a function parameter for audio proofreading; R is the lifting weight of the high-grade audio; Ui is the audio measurement; and Vi is the audio error protection limit.
3. The system of claim 1, wherein the vocal cord vibration sensor comprises a voice-signal acquisition sensor array, and the frequency domain of Korean voice-signal feature detection is V(t, θ), that is:
in the formula, ωi(θ) represents the instantaneous time-domain signal weighting vector of the i-th Korean pronunciation output, xi(t) represents the instantaneous time-domain signal component of the Korean pronunciation output, θ is the speech-signal parameter, φ represents the conjugation operator, and m indexes the sensors, of which there are M in total;
the time domain matching and filtering are carried out on the voice signals by adopting a self-adaptive beam forming method, and the frequency domain characteristics of the output signals are as follows:
V(t,θ) = x^H(t)ω(θ)
in the formula, H represents complex conjugate transpose;
the weight vector and components of the instantaneous time-domain signal of the korean speech output can be expressed as:
x(t) = [x1(t), x2(t), …, xM(t)]^T
ω(θ) = [ω1(θ), ω2(θ), …, ωM(θ)]^T;
combining with the signal processing method of the sensor array, a signal model for detecting the pronunciation error of the Korean language is obtained as follows:
in the formula, gm is a calculation coefficient and nm(t) is an auxiliary parameter.
4. The system of claim 3, wherein the audio modulator modulates a low-frequency digital signal into a high-frequency digital signal by digital signal processing techniques and transmits it, the audio modulator being used in pair with the demodulator, which restores the digital signal to the original signal.
5. The system of claim 4, wherein the demodulator recovers a low frequency digital signal modulated in a high frequency digital signal.
6. The korean pronunciation correction system based on big data mining technology as claimed in claim 1, wherein the control module is composed of a program counter, an instruction register, an instruction decoder, a timing generator and an operation controller for issuing commands, coordinating and directing the operation of the whole system.
7. The system of claim 1, wherein the terminal module comprises a client UI module, a visualization module, and the client UI module is adapted to collect information of a terminal user.
8. The system of claim 1, wherein the cloud module comprises a signal receiving module and a database of standard Korean pronunciations and of the oral and throat systems.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110060609.8A CN112863263B (en) | 2021-01-18 | 2021-01-18 | Korean pronunciation correction system based on big data mining technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112863263A CN112863263A (en) | 2021-05-28 |
CN112863263B true CN112863263B (en) | 2021-12-07 |
Family
ID=76005979
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110060609.8A Active CN112863263B (en) | 2021-01-18 | 2021-01-18 | Korean pronunciation correction system based on big data mining technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112863263B (en) |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20150024180A (en) * | 2013-08-26 | 2015-03-06 | 주식회사 셀리이노베이션스 | Pronunciation correction apparatus and method |
CN104732977B (en) * | 2015-03-09 | 2018-05-11 | 广东外语外贸大学 | A kind of online spoken language pronunciation quality evaluating method and system |
CN105261246B (en) * | 2015-12-02 | 2018-06-05 | 武汉慧人信息科技有限公司 | A kind of Oral English Practice error correction system based on big data digging technology |
KR20180115602A (en) * | 2017-04-13 | 2018-10-23 | 인하대학교 산학협력단 | Imaging Element and Apparatus for Recognition Speech Production and Intention Using Derencephalus Action |
KR20190066314A (en) * | 2017-12-05 | 2019-06-13 | 순천향대학교 산학협력단 | Pronunciation and vocal practice device and method for deaf and dumb person |
CN108922563B (en) * | 2018-06-17 | 2019-09-24 | 海南大学 | Based on the visual verbal learning antidote of deviation organ morphology behavior |
CN112185186B (en) * | 2020-09-30 | 2022-07-01 | 北京有竹居网络技术有限公司 | Pronunciation correction method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||