CN114515138A - Language disorder assessment and correction system - Google Patents

Language disorder assessment and correction system

Info

Publication number
CN114515138A
CN114515138A (application CN202210011831.3A)
Authority
CN
China
Prior art keywords
module
audio
user
pronunciation
audio signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210011831.3A
Other languages
Chinese (zh)
Inventor
林兆勋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou Xingkanglang Language Education Technology Co ltd
Original Assignee
Fuzhou Xingkanglang Language Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou Xingkanglang Language Education Technology Co ltd filed Critical Fuzhou Xingkanglang Language Education Technology Co ltd
Priority to CN202210011831.3A
Publication of CN114515138A
Legal status: Pending

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B5/40: Detecting, measuring or recording for evaluating the nervous system
    • A61B5/48: Other medical applications
    • A61B5/4803: Speech analysis specially adapted for diagnostic purposes

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Veterinary Medicine (AREA)
  • Physics & Mathematics (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Neurology (AREA)
  • Neurosurgery (AREA)
  • Physiology (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to the technical field of language disorders and discloses a language disorder assessment and correction system comprising an input module, a recording module, a storage module, a processing module, a comparison module, a display module and a playing module. The input module is used to input external audio signals; the recording module records the external audio signals; the storage module stores the audio signals; the processing module converts the data to be compared into a comparable type; the comparison module compares data converted into the same type and generates a comparison similarity; the display module presents the comparison similarity to the user; and the playing module lets the user click to play audio. The system addresses the prior-art problems of a correction process for language disorders that is neither targeted nor rigorous, and of assessment systems that are costly, difficult to operate and limited by region.

Description

Language disorder assessment and correction system
Technical Field
The invention relates to the technical field of language disorders, and in particular to a language disorder assessment and correction system.
Background
Language is the most important human communication tool, a means by which human beings think and express thought, the most basic information carrier of human society, and an important medium for the exchange of ideas between people; human interaction cannot do without it.
Products currently on the market do not assess the condition of patients with language disorders; assessment relies on manual, face-to-face evaluation. Because professional skill varies from assessor to assessor and no unified standard exists, the accuracy of the assessment result is affected. On the other hand, existing speech guidance systems on the market use complicated equipment: a correction device, a projection device, earphones, a headset and other hardware must be prepared, and the correction device itself is assembled from a recording unit, a playing unit, a display unit, a storage unit and other unit devices, so the equipment is cumbersome and must be set up for each person. Such an auxiliary device is not only expensive but offers no great advantage over one-to-one professional guidance by trained staff; moreover, the correction process for a language disorder is not targeted and the procedure is not rigorous. These are the defects of the traditional assessment system, namely high cost, difficult operation, regional limitation and the like, so a language disorder assessment and correction system is needed to solve these problems.
Disclosure of Invention
The invention provides a language disorder assessment and correction system which dispenses with complicated equipment and manual operations, fills the gap left by the lack of a digitalized assessment system, allows the assessed person to carry out language assessment and correction at any time and any place, and solves the prior-art problems of an untargeted correction process, a non-rigorous procedure, and an assessment system that is costly, difficult to operate and limited by region.
The invention provides the following technical scheme: a language disorder assessment and correction system comprises an input module, a recording module, a storage module, a processing module, a comparison module, a display module and a playing module. The input module is used to input external audio signals; the recording module records the external audio signals; the storage module stores the audio signals; the processing module converts the data to be compared into a comparable type; the comparison module compares data converted into the same type and generates a comparison similarity; the display module presents results to the user; and the playing module lets the user click to play audio. The storage module stores standard pronunciation audio in a folder in advance, the database stores only the respective paths of the standard pronunciation audio files, and those paths are presented to the user one by one through a loop-traversal method so that the user can click them. When the user clicks to play the corresponding audio, the stored standard pronunciation audio is heard; after hearing it, the user reads it aloud in imitation. The recording module records the user's voice, which the storage module saves as the external audio signal, while the clicked and played audio is recorded as the internal audio signal. The recording module also records the correct pronunciation of the simple finals, compound finals and consonant initials, and the storage module saves them as video files combining picture annotation and pronunciation.
Preferably, the external audio signal and the internal audio signal are processed by a Fourier transform and a window function to obtain the energy spectrum of an audio segment; the energy spectrum is reduced to a chroma feature vector for each frame, and the set of chroma feature vectors forms the feature matrix of each signal.
Preferably, the feature matrices of the external audio signal and the internal audio signal are compared by a DTW (dynamic time warping) algorithm to determine the similarity between the two signals.
Preferably, the video file combining picture annotation and pronunciation contains a graphic and text description of the speech correction method, accompanied by a spoken explanation of the correction operation and an audio-visual demonstration of the actual practice.
Preferably, the storage module further stores speech flow training materials, which are used for Mandarin Chinese speech flow training.
Preferably, the phonemes comprise vowels and consonants, and the syllables comprise initials and finals.
Preferably, the vowels and consonants include six simple finals, eighteen compound finals and twenty-one consonant initials.
The invention has the following beneficial effects:
Aimed at the manual language disorder assessment prevailing in the current market, the invention fills the gap left by the lack of a digitalized assessment system.
The invention can be operated independently by the user on a mobile phone for online assessment and correction, avoiding dedicated equipment, complex manual operation and fixed-place assessment, and allowing the assessed person to carry out language assessment and correction at any time and any place.
After the assessed person completes the self-service assessment, an assessment report is produced from the assessment results, and the subsequent language correction system can then provide a targeted correction scheme so that language correction proceeds more effectively.
The system can assess at any time during the correction process, determining the user's correction progress and recording the correction process as data, and it can adjust the correction plan according to each assessment result and the user's needs; the closed loop of assessment, correction, feedback and re-assessment detects the user's vocalization in real time, thereby improving the pronunciation accuracy of the person with the language disorder.
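Purely as an illustrative sketch of how such a closed loop might adjust the plan after each assessment round (the pass threshold, item names and function are assumptions, not taken from the invention), in Python:

def update_correction_plan(plan, scores, pass_threshold=0.8):
    """Keep items whose latest similarity is still below the threshold; drop mastered ones.

    plan:   list of items still being corrected, e.g. ["zh", "ch", "sh", "r"]
    scores: dict mapping item -> latest comparison similarity in [0, 1]
    """
    record = dict(scores)                                  # the round is recorded as data
    remaining = [item for item in plan if scores.get(item, 0.0) < pass_threshold]
    advanced = [item for item in plan if item not in remaining]
    return remaining, advanced, record

# Example round: "zh" and "sh" pass and move on to syllable/phrase training,
# while "ch" and "r" stay in the correction plan for further practice.
plan = ["zh", "ch", "sh", "r"]
scores = {"zh": 0.91, "ch": 0.62, "sh": 0.85, "r": 0.55}
remaining, advanced, record = update_correction_plan(plan, scores)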
The online assessment and correction mode greatly eases the assessment process for the assessed person and the family: assessment can be carried out at home, without bringing the family to a designated place, making appointments, queuing or consulting. When an epidemic is raging, this mode also improves safety and meets national epidemic prevention and control requirements.
Drawings
FIG. 1 is a diagram of data collection in accordance with the present invention;
FIG. 2 is a data processing diagram of the present invention;
FIG. 3 is a diagram of syllables of the present invention;
FIG. 4 is an exemplary diagram of a phrase of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings.
Referring to FIGS. 1 to 4, a language disorder assessment and correction system includes an input module, a recording module, a storage module, a processing module, a comparison module, a display module, and a playing module.
The input module is used to input external audio signals; the recording module records the external audio signals; the storage module stores the audio signals; the processing module converts the data to be compared into a comparable type; the comparison module compares data of the same type to generate a comparison similarity; the display module presents results to the user; and the playing module lets the user click to play audio. An external audio signal entering the input module is recorded by the recording module and the recorded audio is stored; the standard pronunciation audio signal held by the storage module is passed to the processing module; the external audio signal and the internal audio signal are processed and converted into data structures of the same type; the comparison module compares them to form a comparison similarity; the information is recorded and stored; and a test report is shown on the display module. The storage module stores standard pronunciation audio in a folder in advance, the database stores only the respective paths of these audio files, and the paths are presented to the user one by one through a loop-traversal method. When the user clicks to play the corresponding audio, the stored standard pronunciation audio is heard; after hearing it, the user reads it aloud in imitation. The recording module records the voice spoken by the user, which the storage module saves as the external audio signal; at the same time, the clicked and played audio is recorded as the internal audio signal. The recording module also records the correct pronunciation of the phonemes and syllables, and the storage module saves them as video files combining picture annotation and pronunciation.
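As a minimal illustrative sketch only (not part of the original disclosure), the path-only database and loop-traversal presentation described above could be organised roughly as follows in Python; the folder name, table layout and function names are assumptions:

import sqlite3
from pathlib import Path

AUDIO_DIR = Path("standard_pronunciations")   # assumed folder of pre-stored standard audio

def build_path_database(db_file="paths.db"):
    """Store only the file paths of the standard pronunciation audio, not the audio itself."""
    con = sqlite3.connect(db_file)
    con.execute("CREATE TABLE IF NOT EXISTS standard_audio (id INTEGER PRIMARY KEY, path TEXT)")
    con.execute("DELETE FROM standard_audio")
    for wav in sorted(AUDIO_DIR.glob("*.wav")):
        con.execute("INSERT INTO standard_audio (path) VALUES (?)", (str(wav),))
    con.commit()
    return con

def list_items_for_user(con):
    """Loop-traverse the stored paths and present them one by one for the user to click."""
    for item_id, path in con.execute("SELECT id, path FROM standard_audio ORDER BY id"):
        yield item_id, Path(path).stem        # e.g. shown to the user as "a", "o", "zh", ...

A click handler in the playing module would then play the file at the selected path, record the user's imitation as the external audio signal, and log the played file as the internal audio signal.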
The external audio signal and the internal audio signal are processed by a Fourier transform and a window function to obtain the energy spectrum of an audio segment; the energy spectrum is reduced to a chroma feature vector for each frame, the chroma feature vectors form a vector set, and each signal finally yields its own feature matrix.
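The following sketch shows one plausible reading of this step in Python with NumPy; the frame length, hop size and pitch-class mapping are illustrative assumptions, not values taken from the patent:

import numpy as np

def chroma_feature_matrix(signal, sr, frame_len=2048, hop=512):
    """Window and Fourier-transform each frame, take its energy spectrum,
    and fold it into a 12-dimensional chroma feature vector per frame."""
    window = np.hanning(frame_len)                         # the window function
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    valid = freqs > 20.0                                   # ignore DC and sub-audible bins
    midi = 69 + 12 * np.log2(freqs[valid] / 440.0)
    pitch_class = np.mod(np.round(midi).astype(int), 12)   # map each FFT bin to a pitch class
    chroma = np.zeros((12, n_frames))
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len] * window
        energy = np.abs(np.fft.rfft(frame)) ** 2           # energy spectrum of this frame
        e = energy[valid]
        for pc in range(12):
            chroma[pc, t] = e[pitch_class == pc].sum()
        norm = np.linalg.norm(chroma[:, t])
        if norm > 0:
            chroma[:, t] /= norm                           # one chroma feature vector per frame
    return chroma                                          # feature matrix, shape (12, n_frames)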
The feature matrices of the external audio signal and the internal audio signal are compared with each other through a DTW algorithm to judge the similarity of the two signals.
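A minimal DTW comparison of the two feature matrices might look like the following sketch; the cosine local cost and the distance-to-similarity mapping are illustrative choices, since the text does not fix them:

import numpy as np

def dtw_similarity(ref, test):
    """ref, test: feature matrices of shape (12, T_ref) and (12, T_test)."""
    ref_n = ref / (np.linalg.norm(ref, axis=0, keepdims=True) + 1e-9)
    test_n = test / (np.linalg.norm(test, axis=0, keepdims=True) + 1e-9)
    cost = 1.0 - ref_n.T @ test_n                  # cosine distance between every frame pair
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):                      # classic DTW accumulation
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    avg_cost = acc[n, m] / (n + m)                 # normalise by an upper bound on path length
    return max(0.0, 1.0 - avg_cost)                # illustrative similarity score in [0, 1]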
The video file combining picture annotation and pronunciation contains a graphic and text description of the speech correction method, accompanied by a spoken explanation of the correction operation and an audio-visual demonstration of the actual practice.
The storage module also stores speech flow training materials, which are used for Mandarin Chinese speech flow training.
The phonemes comprise vowels and consonants, and the syllables comprise initials and finals.
The vowels and consonants include six simple finals, eighteen compound finals and twenty-one consonant initials.
The working principle is as follows: the standard pronunciation audio is stored in a folder in advance, the database stores only the paths of those files, and the paths are presented to the user one by one through a loop-traversal method. When the user clicks to play the corresponding audio, the stored standard pronunciation audio is heard; after hearing it, the user reads it aloud in imitation. The system records the voice spoken by the user and stores it as the external audio signal, while the clicked and played audio is recorded as the internal audio signal. The external audio signal and the internal audio signal are processed by a Fourier transform and a window function to obtain the energy spectrum of an audio segment; the energy spectrum is reduced to a chroma feature vector for each frame, the chroma feature vectors form a vector set, and each signal finally yields its feature matrix. The feature matrices of the external and internal audio signals are then compared by the DTW algorithm to judge the similarity of the two signals. After the DTW comparison, with the external audio signal as the reference, the similarity is recorded accordingly and a series of information is stored and generated; whether the user's pronunciation is accurate is judged by whether the similarity reaches a given value, and finally an assessment report is formed and displayed to the user.
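Putting these steps together, a short sketch of the pass/fail judgement against a given similarity value, assuming both recordings are already loaded as mono float arrays at the same sample rate and reusing the hypothetical chroma_feature_matrix and dtw_similarity helpers sketched above:

def evaluate_pronunciation(user_audio, standard_audio, sr, required_similarity=0.8):
    """Compare the user's recording (external signal) with the played standard
    pronunciation (internal signal) and report whether it reaches the given value."""
    user_features = chroma_feature_matrix(user_audio, sr)          # external audio signal
    standard_features = chroma_feature_matrix(standard_audio, sr)  # internal audio signal
    similarity = dtw_similarity(standard_features, user_features)
    return {
        "similarity": round(similarity, 3),
        "accurate": similarity >= required_similarity,             # pronunciation judged correct?
    }

The required_similarity threshold here is an assumed placeholder for the "given value" mentioned above.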
The language disorder assessment content comprises:
1. Simple finals: a, o, e, i, u, ü
2. Compound finals: ai, ei, ui, ao, ou, iu, ie, üe, er
Front nasal finals: an, en, in, un, ün
Back nasal finals: ang, eng, ing, ong
3. Consonant initials: labial sounds b, p, m; labiodental sound f; tongue-tip front sounds z, c, s; tongue-tip middle sounds d, t, n, l; tongue-tip back sounds zh, ch, sh, r; tongue-blade sounds j, q, x; tongue-root sounds g, k, h.
Note: the correct pronunciations of the 6 simple finals, the 18 compound finals and the 21 consonant initials are recorded and stored as video files combining picture annotation and pronunciation, and the user can read along with reference to the correct pronunciation.
The language disorder correction content includes:
Step one, dysarthria correction: phonemes (vowels and consonants), then syllables (each composed of an initial and a final); the assessment system re-assesses syllable pronunciation until the pronunciation meets the standard.
1. Correction of vowels and consonants: for the 6 simple finals, the 18 compound finals and the 21 consonant initials, a graphic and text description of the specific speech correction method for each final and initial is recorded, and, together with a spoken explanation of the correction operation and an audio-visual demonstration of the actual practice, the user learns and corrects the wrong pronunciation habit.
2. Correction of syllables: Chinese character pronunciations are formed from the 21 consonant initials combined with all the finals, and for each set of homophones one pronunciation is selected as the correction object. A graphic and text description of the syllable correction method is recorded, and, together with a spoken explanation of the correction operation and an audio-visual demonstration of the actual practice, the user learns and corrects the wrong pronunciation habit.
3. The assessment system continuously compares and assesses the corrected pronunciation against the standard pronunciation until the pronunciation is completely correct.
Step two, Mandarin Chinese speech flow training: phrases, then short sentences, then short texts and conversations, progressing from simple to difficult (see the sketch after this step).
Mandarin Chinese speech flow training: the user reads along with the speech flow training materials provided by the correction system; phrase training is carried out on the corrected syllables to improve the user's fluency of speech, and once proficient the user trains on short sentences and short texts until normal conversation is smooth and standard.
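Purely as an illustration of this simple-to-difficult progression, assuming fluency is judged per stage (the stage names and function are hypothetical, not part of the disclosure):

TRAINING_STAGES = ["phrases", "short sentences", "short texts", "conversation"]

def next_stage(current, fluent):
    """Advance to the next, more difficult stage only once the current stage is fluent."""
    i = TRAINING_STAGES.index(current)
    if not fluent:
        return current                                     # keep practising the current stage
    return TRAINING_STAGES[min(i + 1, len(TRAINING_STAGES) - 1)]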

Claims (7)

1. A language disorder assessment and correction system, characterized in that it comprises: an input module, a recording module, a storage module, a processing module, a comparison module, a display module and a playing module, wherein the input module is used for inputting external audio signals, the recording module is used for recording the external audio signals, the storage module is used for storing the audio signals, the processing module is used for converting the data to be compared into a comparable type, the comparison module is used for comparing the data converted into the same type to generate a comparison similarity, the display module is used for displaying results to the user, and the playing module is used for the user to click and play audio; the storage module stores standard pronunciation audio in a folder in advance, a database stores only the respective paths of the standard pronunciation audio files, and the paths are presented to the user one by one through a loop-traversal method; when the user clicks to play the corresponding audio, the stored standard pronunciation audio is heard, and after hearing it the user reads it aloud in imitation; the recording module records the voice spoken by the user, which is stored through the storage module as the external audio signal, while the clicked and played audio is recorded as the internal audio signal; the recording module records the correct pronunciation of the phonemes and syllables, which the storage module stores as video files combining picture annotation and pronunciation.
2. The system of claim 1, wherein: the external audio signal and the internal audio signal are processed by a Fourier transform and a window function to obtain the energy spectrum of an audio segment; the energy spectrum is reduced to a chroma feature vector for each frame, and the set of chroma feature vectors forms the feature matrix of each signal.
3. The system of claim 2, wherein: the feature matrices of the external audio signal and the internal audio signal are compared through a DTW algorithm to judge the similarity of the two signals.
4. The system of claim 1, wherein: the video file combining picture annotation and pronunciation contains a graphic and text description of the speech correction method, accompanied by a spoken explanation of the correction operation and an audio-visual demonstration of the actual practice.
5. The system of claim 1, wherein: the storage module further stores speech flow training materials, which are used for Mandarin Chinese speech flow training.
6. The system of claim 1, wherein: the phonemes comprise vowels and consonants, and the syllables comprise initials and finals.
7. The system of claim 6, wherein: the vowels and consonants comprise six simple finals, eighteen compound finals and twenty-one consonant initials.
CN202210011831.3A 2022-01-06 2022-01-06 Language disorder assessment and correction system Pending CN114515138A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210011831.3A CN114515138A (en) 2022-01-06 2022-01-06 Language disorder assessment and correction system


Publications (1)

Publication Number Publication Date
CN114515138A 2022-05-20

Family

ID=81595938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210011831.3A Pending CN114515138A (en) 2022-01-06 2022-01-06 Language disorder assessment and correction system

Country Status (1)

Country Link
CN (1) CN114515138A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070112568A1 (en) * 2003-07-28 2007-05-17 Tim Fingscheidt Method for speech recognition and communication device
JP2008040344A (en) * 2006-08-09 2008-02-21 Yamaha Corp Speech evaluating device
CN106531182A (en) * 2016-12-16 2017-03-22 上海斐讯数据通信技术有限公司 Language learning system
CN108520650A (en) * 2018-03-27 2018-09-11 深圳市神经科学研究院 A kind of intelligent language training system and method
CN108922563A (en) * 2018-06-17 2018-11-30 海南大学 Based on the visual verbal learning antidote of deviation organ morphology behavior
CN110223688A (en) * 2019-06-08 2019-09-10 安徽中医药大学 A kind of self-evaluating system of compressed sensing based hepatolenticular degeneration disfluency

Similar Documents

Publication Publication Date Title
Tran et al. Improvement to a NAM-captured whisper-to-speech system
US6853971B2 (en) Two-way speech recognition and dialect system
US7280964B2 (en) Method of recognizing spoken language with recognition of language color
US20090037171A1 (en) Real-time voice transcription system
JP6705956B1 (en) Education support system, method and program
WO2004063902A2 (en) Speech training method with color instruction
JPH075807A (en) Device for training conversation based on synthesis
TW201214413A (en) Modification of speech quality in conversations over voice channels
CN106328146A (en) Video subtitle generation method and apparatus
Rose Crosslinguistic corpus of hesitation phenomena: a corpus for investigating first and second language speech performance.
KR20150076128A (en) System and method on education supporting of pronunciation ussing 3 dimensional multimedia
JP2003186379A (en) Program for voice visualization processing, program for voice visualization figure display and for voice and motion image reproduction processing, program for training result display, voice-speech training apparatus and computer system
Stemberger et al. Phonetic transcription for speech-language pathology in the 21st century
JP2013088552A (en) Pronunciation training device
Nakai et al. Viewing speech in action: speech articulation videos in the public domain that demonstrate the sounds of the International Phonetic Alphabet (IPA)
JP2003228279A (en) Language learning apparatus using voice recognition, language learning method and storage medium for the same
KR20140087956A (en) Apparatus and method for learning phonics by using native speaker's pronunciation data and word and sentence and image data
JP2844817B2 (en) Speech synthesis method for utterance practice
CN114515138A (en) Language disorder assessment and correction system
Johnson An integrated approach for teaching speech spectrogram analysis to engineering students
JPH0756494A (en) Pronunciation training device
JP2873830B2 (en) Automatic conversation practice device
Deshpande et al. Speech Coach: A framework to evaluate and improve speech delivery
Lavagetto Multimedia Telephone for Hearing-Impaired People
CN112951208B (en) Method and device for speech recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination