CN114515138A - Language disorder assessment and correction system - Google Patents
- Publication number
- CN114515138A CN114515138A CN202210011831.3A CN202210011831A CN114515138A CN 114515138 A CN114515138 A CN 114515138A CN 202210011831 A CN202210011831 A CN 202210011831A CN 114515138 A CN114515138 A CN 114515138A
- Authority
- CN
- China
- Prior art keywords
- module
- audio
- user
- pronunciation
- audio signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/40—Detecting, measuring or recording for evaluating the nervous system
- A61B5/48—Other medical applications
- A61B5/4803—Speech analysis specially adapted for diagnostic purposes
Abstract
The invention relates to the technical field of language disorders and discloses a language disorder assessment and correction system comprising an input module, a recording module, a storage module, a processing module, a comparison module, a display module and a playing module. The input module receives external audio signals; the recording module records them; the storage module stores the audio signals; the processing module converts the data to be compared into a comparable type; the comparison module compares data of the same type to generate a comparison similarity; the display module shows the comparison similarity to the user; and the playing module lets the user click to play audio. The system addresses the problems of the prior art, in which the correction of language disorders lacks pertinence and rigour and the assessment systems are costly, difficult to operate and geographically limited.
Description
Technical Field
The invention relates to the technical field of language disorder, in particular to a language disorder evaluation and correction system.
Background
Language is the most important human communication tool: it is the means by which human beings think and express thought, the most basic information carrier of human society, and an important medium for the exchange of ideas; human interaction cannot be separated from language.
No product on the current market assesses the condition of patients with language disorders; such assessment relies on manual, face-to-face evaluation, in which professional skill varies from person to person and no unified standard exists, so the accuracy of the assessment result is affected. On the other hand, existing speech-guidance systems on the market require complex devices: a correction device, a projection device, earphones, a headset and other hardware must be prepared, and the correction device itself combines a recording unit, a playing unit, a display unit, a storage unit and other units, so the equipment is cumbersome and must be set up for each individual user. Such an auxiliary device is not only expensive but offers little advantage over one-to-one professional guidance; moreover, the correction process it supports lacks pertinence and rigour. These are the defects of the traditional assessment system: high cost, difficult operation and geographic limitation. A language disorder assessment and correction system is therefore needed to solve these problems.
Disclosure of Invention
The invention provides a language disorder assessment and correction system that dispenses with complex equipment and manual operations, fills the gap in digitalised assessment, allows the person being assessed to carry out language assessment and correction at any time and place, and solves the prior-art problems of a correction process that lacks pertinence and rigour and an assessment system that is costly, difficult to operate and geographically limited.
The invention provides the following technical scheme: a language disorder assessment and correction system comprises an input module, a recording module, a storage module, a processing module, a comparison module, a display module and a playing module. The input module receives external audio signals; the recording module records them; the storage module stores the audio signals; the processing module converts the data to be compared into a comparable type; the comparison module compares data of the same type to generate a comparison similarity; the display module shows results to the user; and the playing module lets the user click to play audio. The storage module holds standard pronunciation audio in a folder in advance; the database stores only the paths of the standard pronunciation audio, which are presented to the user one by one through a loop-traversal method. When the user clicks to play an item, the stored standard pronunciation audio is heard; after hearing it, the user reads after it. The recording module records the user's speech and stores it through the storage module as the external audio signal; at the same time the clicked-and-played audio is recorded as the internal audio signal. The recording module records the correct pronunciations of phonemes and syllables, including single-vowel finals, compound finals and consonant initials, which are stored through the storage module as video files with picture annotation and added pronunciation.
Preferably, the external audio signal and the internal audio signal are processed by fourier transform and a window function to obtain an energy spectrum of an audio segment, the energy spectrum of the audio segment is extracted into chroma feature vectors of each frame, and a set of the chroma feature vectors is formed, so as to obtain respective feature matrices.
Preferably, the feature matrices of the external audio signal and the internal audio signal are compared with each other by a DTW algorithm to determine the similarity between the two signals.
Preferably, the picture-annotation-and-pronunciation video file contains a graphic and text description of the speech correction method, accompanied by a spoken explanation of the correction operation and a demonstration audio-video recording of the actual practice.
Preferably, the storage module further stores speech-flow training materials, which are used for Mandarin speech-flow training.
Preferably, the phonemes comprise vowels and consonants, and the syllables comprise initials and finals.
Preferably, the finals and initials comprise six single-vowel finals, eighteen compound and nasal finals, and twenty-one consonant initials.
The invention has the following beneficial effects:
Aimed at the manual assessment of language disorders prevailing on the current market, the invention fills the gap in digitalised assessment.
The invention can be operated by the user on a mobile phone for online assessment and correction, eliminating the drawbacks of dedicated equipment, complex manual operation and fixed-place assessment, and enabling the person being assessed to carry out language assessment and correction at any time and place.
After the self-service assessment, an assessment report is produced according to the person's performance, and a targeted correction scheme can then be provided by the subsequent language correction system, so that correction proceeds more effectively.
The system can assess at any time during the correction process to determine the user's progress, and the correction process is recorded as data; the system can adjust the correction plan according to each assessment and the user's needs. The closed loop of assessment, correction, feedback and re-assessment monitors the user's speech production in real time, improving the pronunciation accuracy of people with language disorders.
The online assessment and correction mode greatly eases the assessment process for the person being assessed and the family: assessment can be done at home without travelling to a designated place, making appointments, queuing or consulting. When an epidemic is raging, this mode also improves safety and meets the requirements of national epidemic control.
Drawings
FIG. 1 is a diagram of data collection in accordance with the present invention;
FIG. 2 is a data processing diagram of the present invention;
FIG. 3 is a diagram of syllables of the present invention;
FIG. 4 is an exemplary diagram of a phrase of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Referring to fig. 1 to 4, a language disorder evaluating and correcting system includes an input module, a recording module, a storage module, a processing module, a comparison module, a display module, and a playing module.
The input module receives external audio signals; the recording module records them; the storage module stores the audio signals; the processing module converts the data to be compared into a comparable type; the comparison module compares data of the same type to generate a comparison similarity; the display module shows results to the user; and the playing module lets the user click to play audio. An external audio signal entered through the input module is recorded by the recording module and stored; the standard pronunciation audio held by the storage module is passed to the processing module, where the external and internal audio signals are processed and converted into data structures of the same type; the comparison module then compares them to form a comparison similarity, which is recorded and stored, and a test report is shown on the display module. The storage module holds standard pronunciation audio in a folder in advance; the database stores only the paths of the standard pronunciation audio, which are presented to the user one by one through a loop-traversal method. When the user clicks to play an item, the stored standard pronunciation audio is heard; after hearing it, the user reads after it. The recording module records the user's speech and stores it through the storage module as the external audio signal; at the same time the clicked-and-played audio is recorded as the internal audio signal. The recording module records the correct pronunciations of phonemes and syllables, which are stored through the storage module as video files with picture annotation and added pronunciation.
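The description above stores only file paths in the database and presents them to the user by loop traversal. As an illustration only (the patent names no implementation), a minimal Python sketch of this indexing-and-listing step might look as follows; the folder name, table layout and `.wav` format are assumptions:

```python
import sqlite3
from pathlib import Path

# Hypothetical folder holding the pre-recorded standard-pronunciation clips.
AUDIO_DIR = Path("standard_pronunciations")

def index_audio_paths(db_path="audio.db"):
    """Store only the file paths (not the audio itself) in the database,
    as the description specifies."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS audio (label TEXT, path TEXT)")
    for wav in sorted(AUDIO_DIR.glob("*.wav")):
        conn.execute("INSERT INTO audio VALUES (?, ?)", (wav.stem, str(wav)))
    conn.commit()
    return conn

def list_for_playback(conn):
    """Loop over the stored paths so the UI can present them one by one."""
    return [(label, path) for label, path in
            conn.execute("SELECT label, path FROM audio ORDER BY label")]
```

A UI layer would bind each returned `(label, path)` pair to a play button; only the path is read from the database when the user clicks.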
The external audio signal and the internal audio signal are processed by a Fourier transform and a window function to obtain the energy spectrum of an audio segment; the energy spectrum is then reduced to a chroma feature vector for each frame, and the set of chroma feature vectors forms the feature matrix of each signal.
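The feature-extraction step just described (windowed Fourier transform, per-frame energy spectrum, chroma vectors, feature matrix) can be sketched in Python/NumPy as below. The frame size, hop length, Hanning window and 12-bin pitch-class mapping are illustrative assumptions; the patent fixes none of these parameters:

```python
import numpy as np

def chroma_features(signal, sr=16000, frame=2048, hop=512):
    """Windowed FFT -> per-frame energy spectrum -> 12-bin chroma vectors.
    Returns a feature matrix of shape (n_frames, 12)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(signal) - frame) // hop
    chroma = np.zeros((n_frames, 12))
    freqs = np.fft.rfftfreq(frame, 1.0 / sr)
    # Map each FFT bin to one of 12 pitch classes (A4 = 440 Hz reference).
    valid = freqs > 0
    pitch_class = np.zeros(len(freqs), dtype=int)
    pitch_class[valid] = (
        np.round(12 * np.log2(freqs[valid] / 440.0)).astype(int) % 12)
    for i in range(n_frames):
        seg = signal[i * hop:i * hop + frame] * window
        energy = np.abs(np.fft.rfft(seg)) ** 2      # energy spectrum
        for pc in range(12):
            chroma[i, pc] = energy[valid & (pitch_class == pc)].sum()
    # Normalise each frame so the comparison is loudness-independent.
    norms = np.linalg.norm(chroma, axis=1, keepdims=True)
    return chroma / np.maximum(norms, 1e-12)
```

Each row of the returned matrix is one frame's chroma vector; stacking the rows gives the "feature matrix" the description compares with DTW.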
The feature matrices of the external audio signal and the internal audio signal are compared with each other through a DTW algorithm to judge the similarity of the two signals.
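The patent names the DTW algorithm for comparing the two feature matrices but gives no formulation. Below is a minimal dynamic-time-warping sketch under assumed choices (cosine frame distance, length-normalised alignment cost); both choices are illustrative, not taken from the patent:

```python
import numpy as np

def dtw_distance(A, B):
    """Dynamic time warping between two feature matrices (frames x dims).
    Lower cost means the two pronunciations align more closely."""
    def cost(a, b):
        # Cosine distance between two frame vectors.
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return 1.0 - (a @ b) / denom if denom else 1.0
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost(A[i - 1], B[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)   # length-normalised alignment cost

def similarity(A, B):
    """Map DTW cost to a 0..1 similarity score (1 = identical)."""
    return 1.0 - min(dtw_distance(A, B), 1.0)
```

DTW is chosen here because the user's recording and the standard audio generally differ in duration; the warping path aligns frames before the costs are summed.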
The picture-annotation-and-pronunciation video file contains a graphic and text description of the speech correction method, accompanied by a spoken explanation of the correction operation and a demonstration audio-video recording of the actual practice.
The storage module also stores speech-flow training materials, which are used for Mandarin speech-flow training.
The phonemes comprise vowels and consonants, and the syllables comprise initials and finals.
The finals and initials comprise six single-vowel finals, eighteen compound and nasal finals, and twenty-one consonant initials.
The working principle is as follows: the standard pronunciation audio is stored in a folder in advance, and the database stores only its paths, which are presented to the user one by one through a loop-traversal method. When the user clicks to play an item, the stored standard pronunciation audio is heard; the user then reads after it. The system records the user's speech and stores it as the external audio signal, while the clicked-and-played audio is recorded as the internal audio signal. Both signals are processed by a Fourier transform and a window function to obtain the energy spectrum of an audio segment; the energy spectrum is reduced to a chroma feature vector per frame, and the set of chroma feature vectors forms each signal's feature matrix. The DTW algorithm then compares the two feature matrices to judge the similarity of the signals. After the DTW comparison, with the external audio signal as the reference, the similarity is recorded and the resulting information is stored; the accuracy of the user's pronunciation is judged by whether the similarity reaches a given value, and finally an assessment report is formed and displayed to the user.
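The working principle judges pronunciation accuracy by whether the similarity "reaches a given value" and then forms a report. A small illustrative sketch of that thresholding-and-report step follows; the 0.8 cut-off and the report fields are assumptions, not specified by the patent:

```python
def assessment_report(scores, threshold=0.8):
    """Judge each recorded item's similarity score against a threshold
    and build a simple per-item assessment report."""
    report = {item: {"similarity": round(s, 3), "correct": s >= threshold}
              for item, s in scores.items()}
    # Overall accuracy across all assessed items (computed before the
    # summary key is added, so only item entries are counted).
    report["accuracy"] = sum(r["correct"] for r in report.values()) / len(scores)
    return report
```

In the closed loop described above, items marked not correct would be fed to the correction stage and re-assessed until they pass the threshold.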
The language disorder assessment content comprises:
1. Single-vowel finals: a, o, e, i, u, ü
2. Compound finals: ai, ei, ui, ao, ou, iu, ie, üe, er
Front-nasal finals: an, en, in, un, ün
Back-nasal finals: ang, eng, ing, ong
3. Consonant initials: bilabials b, p, m; labiodental f; dental sibilants z, c, s; alveolars d, t, n, l; retroflexes zh, ch, sh, r; palatals j, q, x; velars g, k, h.
Note: the correct pronunciations of the 6 single-vowel finals, the 18 compound and nasal finals and the 21 consonant initials are recorded and stored as video files with picture annotation and pronunciation, which the user can consult while reading after the model.
The language disorder correction contents include:
Step one, dysarthria correction: phonemes (vowels and consonants) are corrected first, then syllables (composed of initials and finals); the assessment system re-evaluates syllable pronunciation until the pronunciation meets the standard.
1. Correction of finals and initials: for the 6 single-vowel finals, the 18 compound and nasal finals and the 21 consonant initials, a graphic and text description of the specific correction method for each final and initial is recorded; with the accompanying spoken explanation of the correction operation and the demonstration audio-video, the user learns to correct the wrong pronunciation.
2. Correction of syllables: Chinese character pronunciations are formed from the 21 consonant initials combined with the finals; for each set of identically pronounced syllables, one pronunciation is selected as the correction object. A graphic and text description of the syllable correction method is recorded, and with the accompanying spoken explanation and demonstration audio-video the user learns to correct the wrong pronunciation.
3. The corrected pronunciation is continuously compared with the standard pronunciation and re-evaluated by the assessment system until the pronunciation is completely correct.
Step two, Mandarin speech-flow training: phrases, then short sentences, then short passages and conversations, progressing from simple to difficult.
Mandarin speech-flow training: the correction system plays the speech-flow training materials for the user to read after; phrase training is first carried out on the corrected syllables to improve the user's fluency, and once these are mastered, short sentences and short passages are trained until normal conversation is fluent and standard.
Claims (7)
1. A language disorder assessment and correction system, characterized by comprising an input module, a recording module, a storage module, a processing module, a comparison module, a display module and a playing module, wherein the input module is used for inputting external audio signals, the recording module is used for recording the external audio signals, the storage module is used for storing the audio signals, the processing module is used for converting the data to be compared into a comparable type, the comparison module is used for comparing data of the same type to generate a comparison similarity, the display module is used for displaying results to the user, and the playing module is used for the user to click and play audio; the storage module stores standard pronunciation audio in a folder in advance, and a database stores only the paths of the standard pronunciation audio, which are presented to the user one by one through a loop-traversal method; when the user clicks to play the corresponding audio, the stored standard pronunciation audio is heard, after which the user reads after it; the recording module records the user's speech and stores it through the storage module as the external audio signal, while the clicked-and-played audio is recorded as the internal audio signal; the recording module records the correct pronunciations of phonemes and syllables, which are stored through the storage module as video files with picture annotation and added pronunciation.
2. The system of claim 1, wherein the external audio signal and the internal audio signal are processed by a Fourier transform and a window function to obtain the energy spectrum of an audio segment, the energy spectrum is reduced to a chroma feature vector for each frame, and the set of chroma feature vectors forms the feature matrix of each signal.
3. The system of claim 2, wherein the feature matrices of the external audio signal and the internal audio signal are compared by a DTW algorithm to judge the similarity of the two signals.
4. The system of claim 1, wherein the picture-annotation-and-pronunciation video file contains a graphic and text description of the speech correction method, accompanied by a spoken explanation of the correction operation and a demonstration audio-video recording.
5. The system of claim 1, wherein the storage module further stores speech-flow training materials, which are used for Mandarin speech-flow training.
6. The system of claim 1, wherein the phonemes comprise vowels and consonants, and the syllables comprise initials and finals.
7. The system of claim 6, wherein the finals and initials comprise six single-vowel finals, eighteen compound and nasal finals, and twenty-one consonant initials.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210011831.3A CN114515138A (en) | 2022-01-06 | 2022-01-06 | Language disorder assessment and correction system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114515138A (en) | 2022-05-20
Family
ID=81595938
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210011831.3A Pending CN114515138A (en) | 2022-01-06 | 2022-01-06 | Language disorder assessment and correction system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114515138A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070112568A1 (en) * | 2003-07-28 | 2007-05-17 | Tim Fingscheidt | Method for speech recognition and communication device |
JP2008040344A (en) * | 2006-08-09 | 2008-02-21 | Yamaha Corp | Speech evaluating device |
CN106531182A (en) * | 2016-12-16 | 2017-03-22 | 上海斐讯数据通信技术有限公司 | Language learning system |
CN108520650A (en) * | 2018-03-27 | 2018-09-11 | 深圳市神经科学研究院 | A kind of intelligent language training system and method |
CN108922563A (en) * | 2018-06-17 | 2018-11-30 | 海南大学 | Based on the visual verbal learning antidote of deviation organ morphology behavior |
CN110223688A (en) * | 2019-06-08 | 2019-09-10 | 安徽中医药大学 | A kind of self-evaluating system of compressed sensing based hepatolenticular degeneration disfluency |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tran et al. | Improvement to a NAM-captured whisper-to-speech system | |
US6853971B2 (en) | Two-way speech recognition and dialect system | |
US7280964B2 (en) | Method of recognizing spoken language with recognition of language color | |
US20090037171A1 (en) | Real-time voice transcription system | |
JP6705956B1 (en) | Education support system, method and program | |
WO2004063902A2 (en) | Speech training method with color instruction | |
JPH075807A (en) | Device for training conversation based on synthesis | |
TW201214413A (en) | Modification of speech quality in conversations over voice channels | |
CN106328146A (en) | Video subtitle generation method and apparatus | |
Rose | Crosslinguistic corpus of hesitation phenomena: a corpus for investigating first and second language speech performance. | |
KR20150076128A (en) | System and method on education supporting of pronunciation ussing 3 dimensional multimedia | |
JP2003186379A (en) | Program for voice visualization processing, program for voice visualization figure display and for voice and motion image reproduction processing, program for training result display, voice-speech training apparatus and computer system | |
Stemberger et al. | Phonetic transcription for speech-language pathology in the 21st century | |
JP2013088552A (en) | Pronunciation training device | |
Nakai et al. | Viewing speech in action: speech articulation videos in the public domain that demonstrate the sounds of the International Phonetic Alphabet (IPA) | |
JP2003228279A (en) | Language learning apparatus using voice recognition, language learning method and storage medium for the same | |
KR20140087956A (en) | Apparatus and method for learning phonics by using native speaker's pronunciation data and word and sentence and image data | |
JP2844817B2 (en) | Speech synthesis method for utterance practice | |
CN114515138A (en) | Language disorder assessment and correction system | |
Johnson | An integrated approach for teaching speech spectrogram analysis to engineering students | |
JPH0756494A (en) | Pronunciation training device | |
JP2873830B2 (en) | Automatic conversation practice device | |
Deshpande et al. | Speech Coach: A framework to evaluate and improve speech delivery | |
Lavagetto | Multimedia Telephone for Hearing-Impaired People | |
CN112951208B (en) | Method and device for speech recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||