CN112885168A - Immersive speech feedback training system based on AI - Google Patents

Immersive speech feedback training system based on AI

Info

Publication number
CN112885168A
Authority
CN
China
Prior art keywords
information
module
voice
shape coefficient
scoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110081356.2A
Other languages
Chinese (zh)
Other versions
CN112885168B (en)
Inventor
范虹
刘蓝冰
尉泽民
严晓波
茹文亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaoxing Peoples Hospital
Original Assignee
Shaoxing Peoples Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaoxing Peoples Hospital filed Critical Shaoxing Peoples Hospital
Priority to CN202110081356.2A priority Critical patent/CN112885168B/en
Publication of CN112885168A publication Critical patent/CN112885168A/en
Application granted granted Critical
Publication of CN112885168B publication Critical patent/CN112885168B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/06Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B21/00 Teaching, or communicating with, the blind, deaf or mute
    • G09B21/06Devices for teaching lip-reading

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses an AI-based immersive speech feedback training system comprising an ability rating module, a grading learning module, a standard library, a movie playing module, a voice recognition module, an image acquisition module, a data receiving module, a data processing module, a learning scoring module and a scoring display module. The ability rating module rates the speech ability of a patient with a language disorder and generates speech rating information, which is sent to the grading learning module; the grading learning module receives the rating information and calls learning movies of the corresponding level from the standard library, which stores training movie information of different levels. The movie playing module receives the movie information of the corresponding level that the grading learning module directs the standard library to send, and starts playing once it arrives. The invention better assists and promotes the rehabilitation training of people with language disorders.

Description

Immersive speech feedback training system based on AI
Technical Field
The invention relates to the field of language training, in particular to an immersive speech feedback training system based on AI.
Background
Developmental speech and language disorder refers to a disturbance of the normal pattern of language acquisition early in development, manifested as delays and abnormalities in pronunciation, language comprehension, or the development of expressive language that affect learning, occupation, and social function. These conditions are not caused by neurological or speech-mechanism abnormalities, sensory impairment, intellectual disability, or environmental factors; during rehabilitation of a language disorder, a speech feedback training system is used to assist the training.
Existing speech feedback training systems offer only a single function, so their training effect is poor and they cannot meet users' needs, which hampers their use. An AI-based immersive speech feedback training system is therefore provided.
Disclosure of Invention
The technical problem to be solved by the invention is that existing speech feedback training systems offer only a single function, which makes their training effect poor, fails to meet users' needs, and hampers their use. The invention addresses this by providing an AI-based immersive speech feedback training system.
The invention solves this technical problem through the following technical scheme: the system comprises an ability rating module, a grading learning module, a standard library, a movie playing module, a voice recognition module, an image acquisition module, a data receiving module, a data processing module, a learning scoring module and a scoring display module;
the ability rating module rates the speech ability of a patient with a language disorder and generates speech rating information, which is sent to the grading learning module; the grading learning module receives the rating information and calls learning movies of the corresponding level from the standard library, which stores training movie information of different levels;
the movie playing module receives the movie information of the corresponding level that the grading learning module directs the standard library to send, and starts playing once it arrives; the voice recognition module then collects the speech produced by the patient, while the image acquisition module collects the mouth movement information produced as the patient speaks;
the patient's speech and the accompanying mouth movement information are both sent to the data receiving module, which processes them to generate voice comparison information and action comparison information;
the voice comparison information and the action comparison information are both sent to the learning scoring module, which processes them to generate training scoring information; the training scoring information is sent to the scoring display module, which displays the training scores.
Preferably, the ability rating module rates the patient's ability as follows:
Step one: the ability rating module is preset with text content of different levels, namely primary text, intermediate text, advanced text and normal text, whose difficulty increases in the order: primary text < intermediate text < advanced text < normal text;
Step two: at least x groups of text are selected in sequence from each of the primary, intermediate, advanced and normal text, in order of increasing difficulty, where x ≥ 5;
Step three: the selected x groups of text from each level are displayed, and the patient reads aloud, in turn, the x groups of primary, intermediate, advanced and normal text; the readings are labelled, from the lowest level to the highest, K1, K2, K3 and K4;
Step four: the preset pronunciation information for the selected x groups of primary, intermediate, advanced and normal text is extracted and labelled, in the same level order, M1, M2, M3 and M4;
Step five: similarity matching of K1 with M1 yields similarity Km1; likewise K2 with M2 yields Km2, K3 with M3 yields Km3, and K4 with M4 yields Km4;
Step six: when one of Km1, Km2, Km3 and Km4 exceeds the preset value, the patient is judged to belong to the corresponding level; when two or more exceed the preset value, the highest such level is taken as the final result.
Preferably, the movie playing module plays the audio synchronously while playing the video.
Preferably, the standard library stores training movie information of different levels, including mouth shape coefficient information corresponding to the text; the data processing module converts the mouth movement information collected by the image acquisition module into real-time mouth shape coefficient information and compares it with the pre-stored mouth shape coefficient information to obtain the action comparison information.
Preferably, the mouth shape coefficients comprise a first mouth shape coefficient and a second mouth shape coefficient, obtained as follows:
Step one: the key point of the upper lip is marked as point A1 and the two corners of the upper lip as points A2 and A3; an arc segment L1 is fitted through points A1, A2 and A3;
Step two: the key point of the lower lip is marked as point B1 and the two corners of the lower lip as points B2 and B3; an arc segment L2 is fitted through points B1, B2 and B3;
Step three: point A1 is connected to point A2 to obtain line segment L3; the lengths of arc segments L1 and L2 and of line segment L3 are measured;
Step four: the first mouth shape coefficient L_ratio is obtained by the formula L_ratio = (L1 + L2) / (L1 − L2), and the length of L3 is the second mouth shape coefficient;
the data processing module compares the real-time mouth shape coefficient information with the pre-stored mouth shape coefficient information as follows:
S1: the real-time first mouth shape coefficient, real-time second mouth shape coefficient, preset first mouth shape coefficient and preset second mouth shape coefficient are extracted and labelled P1, P2, Q1 and Q2 respectively;
S2: the difference Pq1_diff between the real-time first mouth shape coefficient P1 and the preset first mouth shape coefficient Q1 is calculated, then the difference Pq2_diff between the real-time second mouth shape coefficient P2 and the preset second mouth shape coefficient Q2 is calculated;
S3: the sum of the absolute values of Pq1_diff and Pq2_diff is calculated as Pq_sum, giving the action comparison information Pq_sum.
preferably, the specific processing procedure of the data processing module for processing the voice comparison information is as follows:
SS 1: extracting standard voice information of the film and television information in the pre-storage library, performing voiceprint processing on the standard voice information to obtain standard voiceprint, and marking the standard voiceprint as FSign board
SS 2: the voice information of the preset character content read by the language barrier patient and acquired by the voice identification module is subjected to culture-filling processing to obtain real-time voiceprints, and the real-time voiceprints are marked as FFruit of Chinese wolfberryI.e. voice comparison information FFruit of Chinese wolfberry
SS 3: obtaining the real-time voiceprint FFruit of Chinese wolfberryAnd standard voiceprint FSign boardComparing the similarity to obtain a similarity FRatio of
Preferably, the learning scoring module processes the voice comparison information and the action comparison information into training scoring information as follows:
S01: the voice comparison information and action comparison information obtained above are extracted and labelled M and N respectively;
S02: to emphasize the importance of the voice comparison, a correction weight U1 is assigned to the voice comparison information and a correction weight U2 to the action comparison information, with U1 > U2 and U1 + U2 = 1;
S03: the training score information Mn_sum is obtained by the formula Mn_sum = M × U1 + N × U2.
preferably, the scoring display module ranks all the received training scoring information from high to low, and displays the personnel information corresponding to the first three maximum training scoring information after being amplified by a preset font.
Compared with the prior art, the invention has the following advantages: the AI-based immersive speech feedback training system evaluates the patient's level of language disorder before speech training begins, so the system can provide training content suited to the patient; arranging the content from easy to difficult effectively improves the user experience and avoids the frustration caused by training content that is too hard. By playing movie content and sound synchronously, the patient can watch the mouth shape of each pronunciation and pronounce by imitating it, which speeds up rehabilitation training. At the same time, the dual analysis of mouth shape and pronunciation allows the patient's rehabilitation progress to be evaluated more accurately, and the differentiated settings meet the varied needs of different patients, making the system well worth wide application.
Drawings
FIG. 1 is a system block diagram of the present invention.
Detailed Description
The following examples are given for the detailed implementation and specific operation of the present invention, but the scope of the present invention is not limited to the following examples.
As shown in fig. 1, the present embodiment provides a technical solution: an AI-based immersive speech feedback training system comprising an ability rating module, a grading learning module, a standard library, a movie playing module, a voice recognition module, an image acquisition module, a data receiving module, a data processing module, a learning scoring module and a scoring display module;
the ability rating module rates the speech ability of a patient with a language disorder and generates speech rating information, which is sent to the grading learning module; the grading learning module receives the rating information and calls learning movies of the corresponding level from the standard library, which stores training movie information of different levels;
the movie playing module receives the movie information of the corresponding level that the grading learning module directs the standard library to send, and starts playing once it arrives; while playing, it enlarges and shows a close-up of the speaker's mouth in the picture, making it easier for the patient to imitate the mouth shape; the voice recognition module then collects the speech produced by the patient, while the image acquisition module collects the mouth movement information produced as the patient speaks;
the patient's speech and the accompanying mouth movement information are both sent to the data receiving module, which processes them to generate voice comparison information and action comparison information;
the voice comparison information and the action comparison information are both sent to the learning scoring module, which processes them to generate training scoring information; the training scoring information is sent to the scoring display module, which displays the training scores.
The ability rating module rates the patient's ability as follows:
Step one: the ability rating module is preset with text content of different levels, namely primary text, intermediate text, advanced text and normal text, whose difficulty increases in the order: primary text < intermediate text < advanced text < normal text;
Step two: at least x groups of text are selected in sequence from each of the primary, intermediate, advanced and normal text, in order of increasing difficulty, where x ≥ 5;
Step three: the selected x groups of text from each level are displayed, and the patient reads aloud, in turn, the x groups of primary, intermediate, advanced and normal text; the readings are labelled, from the lowest level to the highest, K1, K2, K3 and K4;
Step four: the preset pronunciation information for the selected x groups of primary, intermediate, advanced and normal text is extracted and labelled, in the same level order, M1, M2, M3 and M4;
Step five: similarity matching of K1 with M1 yields similarity Km1; likewise K2 with M2 yields Km2, K3 with M3 yields Km3, and K4 with M4 yields Km4;
Step six: when one of Km1, Km2, Km3 and Km4 exceeds the preset value, the patient is judged to belong to the corresponding level; when two or more exceed the preset value, the highest such level is taken as the final result;
evaluating the patient's level of language disorder before speech training begins lets the system provide suitable training content; the easy-to-difficult arrangement effectively improves the user experience and avoids the frustration caused by training content that is too hard. A minimal sketch of the level-rating rule follows.
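The following Python sketch illustrates only the decision rule of steps five and six; the similarity values and the preset threshold are hypothetical placeholders, since the patent does not specify how the pronunciation similarity itself is computed.

def rate_level(similarities, preset_value=0.6):
    # similarities: [Km1, Km2, Km3, Km4], ordered from primary to normal.
    # preset_value is an assumed placeholder, not specified in the patent.
    levels = ["primary", "intermediate", "advanced", "normal"]
    passed = [lvl for lvl, km in zip(levels, similarities) if km > preset_value]
    # When two or more similarities exceed the preset value,
    # the highest such level is taken as the final result.
    return passed[-1] if passed else "below primary"

print(rate_level([0.82, 0.71, 0.44, 0.30]))  # -> intermediate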
While the movie playing module plays the video, the audio is played synchronously, which effectively prevents the pronunciation errors that audio-video desynchronization would cause; synchronized movie content and sound also let the patient watch the mouth shape of each pronunciation and pronounce by imitating it, speeding up rehabilitation training.
The standard library stores training movie information of different levels, including mouth shape coefficient information corresponding to the text; the data processing module converts the mouth movement information collected by the image acquisition module into real-time mouth shape coefficient information and compares it with the pre-stored mouth shape coefficient information to obtain the action comparison information; the mouth shape coefficients are designed to better evaluate the patient's rehabilitation training state.
The mouth shape coefficients comprise a first mouth shape coefficient and a second mouth shape coefficient, obtained as follows:
Step one: the key point of the upper lip is marked as point A1 and the two corners of the upper lip as points A2 and A3; an arc segment L1 is fitted through points A1, A2 and A3;
Step two: the key point of the lower lip is marked as point B1 and the two corners of the lower lip as points B2 and B3; an arc segment L2 is fitted through points B1, B2 and B3;
Step three: point A1 is connected to point A2 to obtain line segment L3; the lengths of arc segments L1 and L2 and of line segment L3 are measured;
Step four: the first mouth shape coefficient L_ratio is obtained by the formula L_ratio = (L1 + L2) / (L1 − L2), and the length of L3 is the second mouth shape coefficient;
using two mouth shape coefficients further improves the judgment accuracy; a geometric sketch follows.
the specific process of comparing the real-time mouth shape coefficient information with the pre-stored mouth shape coefficient information by the data processing module is as follows:
s1: extracting a real-time first mouth shape coefficient, a real-time second mouth shape coefficient, a preset first mouth shape coefficient and a preset second mouth shape coefficient, marking the real-time first mouth shape coefficient as P1, marking the real-time second mouth shape coefficient as P2, marking the preset first mouth shape coefficient as Q1 and marking the preset second mouth shape coefficient as Q2;
s2: the difference Pq1 between the real-time first shape factor P1 and the preset first shape factor labeled Q1 is calculatedDifference (D)Then, the difference Pq2 between the real-time second shape coefficient P2 and the preset second shape coefficient Q2 is calculatedDifference (D)
S3: calculate Pq1Difference (D)Absolute value of (1) and Pq2Difference (D)Of absolute value of (Pq)Andobtaining the action comparison information PqAnd
the action comparison information can be better acquired through the setting.
The data processing module generates the voice comparison information as follows:
SS1: the pre-stored standard speech of the movie information is extracted and voiceprint-processed to obtain the standard voiceprint, labelled F_std;
SS2: the speech collected by the voice recognition module as the patient reads the preset text is voiceprint-processed to obtain the real-time voiceprint, labelled F_real, i.e. the voice comparison information F_real;
SS3: the real-time voiceprint F_real is compared for similarity with the standard voiceprint F_std to obtain the similarity F_ratio.
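The patent does not name a voiceprint algorithm. As a stand-in, the sketch below derives a crude utterance-level voiceprint from mean MFCC features (using librosa, an assumed dependency) and scores F_ratio with cosine similarity; both choices are illustrative assumptions rather than the patented method.

import librosa
import numpy as np

def voiceprint(wav_path):
    # Assumed stand-in voiceprint: the mean MFCC vector of the utterance.
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)

def voice_similarity(f_std, f_real):
    # F_ratio as the cosine similarity of standard and real-time voiceprints.
    return float(np.dot(f_std, f_real) /
                 (np.linalg.norm(f_std) * np.linalg.norm(f_real)))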
The learning scoring module processes the voice comparison information and the action comparison information into training scoring information as follows:
S01: the voice comparison information and action comparison information obtained above are extracted and labelled M and N respectively;
S02: to emphasize the importance of the voice comparison, a correction weight U1 is assigned to the voice comparison information and a correction weight U2 to the action comparison information, with U1 > U2 and U1 + U2 = 1;
S03: the training score information Mn_sum is obtained by the formula Mn_sum = M × U1 + N × U2.
Through this dual analysis of mouth shape and pronunciation, the patient's rehabilitation training progress can be evaluated more accurately, and the differentiated settings meet the varied needs of different patients, making the system well worth wide application. A sketch of the weighted score follows.
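A minimal sketch of step S03, assuming an illustrative weight U1 = 0.6 (the patent fixes only U1 > U2 and U1 + U2 = 1):

def training_score(M, N, U1=0.6):
    # U1 > U2 emphasizes the voice comparison; U2 = 1 - U1.
    assert 0.5 < U1 < 1.0
    U2 = 1.0 - U1
    return M * U1 + N * U2  # Mn_sum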
The scoring display module ranks all received training scores from high to low and displays, in an enlarged preset font, the information of the three patients with the highest training scores;
displaying this ranking lets a patient see the recovery status of fellow patients, which can build confidence in rehabilitation training and thus speed up recovery. An illustrative ranking step follows.
In conclusion, when the system is used, the ability rating module rates the patient's speech ability and generates speech rating information, which is sent to the grading learning module; the grading learning module receives it and calls learning movies of the corresponding level from the standard library, which stores training movie information of different levels. Evaluating the patient's level of language disorder before training begins lets the system provide suitable content, and the easy-to-difficult arrangement improves the user experience and avoids the frustration caused by content that is too hard. The movie playing module receives the movie information of the corresponding level that the grading learning module directs the standard library to send and starts playing once it arrives; because movie content and sound are played synchronously, the patient can watch the mouth shape of each pronunciation and pronounce by imitating it, speeding up rehabilitation. The voice recognition module collects the patient's speech while the image acquisition module collects the accompanying mouth movement information; both are sent to the data receiving module, which processes them to generate voice comparison information and action comparison information. These are sent to the learning scoring module, which processes them into training scoring information; the dual analysis of mouth shape and pronunciation evaluates the patient's rehabilitation progress more accurately, and the differentiated settings meet the varied needs of different patients, making the system well worth wide application. The training scoring information is sent to the scoring display module, which displays the training scores.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (8)

1. An AI-based immersive speech feedback training system, characterized by comprising an ability rating module, a grading learning module, a standard library, a movie playing module, a voice recognition module, an image acquisition module, a data receiving module, a data processing module, a learning scoring module and a scoring display module;
the ability rating module rates the speech ability of a patient with a language disorder and generates speech rating information, which is sent to the grading learning module; the grading learning module receives the rating information and calls learning movies of the corresponding level from the standard library, which stores training movie information of different levels;
the movie playing module receives the movie information of the corresponding level that the grading learning module directs the standard library to send, and starts playing once it arrives; the voice recognition module then collects the speech produced by the patient, while the image acquisition module collects the mouth movement information produced as the patient speaks;
the patient's speech and the accompanying mouth movement information are both sent to the data receiving module, which processes them to generate voice comparison information and action comparison information;
the voice comparison information and the action comparison information are both sent to the learning scoring module, which processes them to generate training scoring information; the training scoring information is sent to the scoring display module, which displays the training scores.
2. The AI-based immersive speech feedback training system of claim 1, wherein the ability rating module rates the patient's ability as follows:
Step one: the ability rating module is preset with text content of different levels, namely primary text, intermediate text, advanced text and normal text, whose difficulty increases in the order: primary text < intermediate text < advanced text < normal text;
Step two: at least x groups of text are selected in sequence from each of the primary, intermediate, advanced and normal text, in order of increasing difficulty, where x ≥ 5;
Step three: the selected x groups of text from each level are displayed, and the patient reads aloud, in turn, the x groups of primary, intermediate, advanced and normal text; the readings are labelled, from the lowest level to the highest, K1, K2, K3 and K4;
Step four: the preset pronunciation information for the selected x groups of primary, intermediate, advanced and normal text is extracted and labelled, in the same level order, M1, M2, M3 and M4;
Step five: similarity matching of K1 with M1 yields similarity Km1; likewise K2 with M2 yields Km2, K3 with M3 yields Km3, and K4 with M4 yields Km4;
Step six: when one of Km1, Km2, Km3 and Km4 exceeds the preset value, the patient is judged to belong to the corresponding level; when two or more exceed the preset value, the highest such level is taken as the final result.
3. The AI-based immersive speech feedback training system of claim 1, wherein the movie playing module plays the audio synchronously while playing the video.
4. The AI-based immersive speech feedback training system of claim 1, wherein the standard library stores training movie information of different levels, including mouth shape coefficient information corresponding to the text; the data processing module converts the mouth movement information collected by the image acquisition module into real-time mouth shape coefficient information and compares it with the pre-stored mouth shape coefficient information to obtain the action comparison information.
5. The AI-based immersive speech feedback training system of claim 4, wherein the mouth shape coefficients comprise a first mouth shape coefficient and a second mouth shape coefficient, obtained as follows:
Step one: the key point of the upper lip is marked as point A1 and the two corners of the upper lip as points A2 and A3; an arc segment L1 is fitted through points A1, A2 and A3;
Step two: the key point of the lower lip is marked as point B1 and the two corners of the lower lip as points B2 and B3; an arc segment L2 is fitted through points B1, B2 and B3;
Step three: point A1 is connected to point A2 to obtain line segment L3; the lengths of arc segments L1 and L2 and of line segment L3 are measured;
Step four: the first mouth shape coefficient L_ratio is obtained by the formula L_ratio = (L1 + L2) / (L1 − L2), and the length of L3 is the second mouth shape coefficient;
the data processing module compares the real-time mouth shape coefficient information with the pre-stored mouth shape coefficient information as follows:
S1: the real-time first mouth shape coefficient, real-time second mouth shape coefficient, preset first mouth shape coefficient and preset second mouth shape coefficient are extracted and labelled P1, P2, Q1 and Q2 respectively;
S2: the difference Pq1_diff between the real-time first mouth shape coefficient P1 and the preset first mouth shape coefficient Q1 is calculated, then the difference Pq2_diff between the real-time second mouth shape coefficient P2 and the preset second mouth shape coefficient Q2 is calculated;
S3: the sum of the absolute values of Pq1_diff and Pq2_diff is calculated as Pq_sum, giving the action comparison information Pq_sum.
6. The AI-based immersive speech feedback training system of claim 1, wherein the data processing module generates the voice comparison information as follows:
SS1: the pre-stored standard speech of the movie information is extracted and voiceprint-processed to obtain the standard voiceprint, labelled F_std;
SS2: the speech collected by the voice recognition module as the patient reads the preset text is voiceprint-processed to obtain the real-time voiceprint, labelled F_real, i.e. the voice comparison information F_real;
SS3: the real-time voiceprint F_real is compared for similarity with the standard voiceprint F_std to obtain the similarity F_ratio.
7. The AI-based immersive speech feedback training system of claim 1, wherein the learning scoring module processes the voice comparison information and the action comparison information into training scoring information as follows:
S01: the voice comparison information and action comparison information obtained above are extracted and labelled M and N respectively;
S02: to emphasize the importance of the voice comparison, a correction weight U1 is assigned to the voice comparison information and a correction weight U2 to the action comparison information, with U1 > U2 and U1 + U2 = 1;
S03: the training score information Mn_sum is obtained by the formula Mn_sum = M × U1 + N × U2.
8. The AI-based immersive speech feedback training system of claim 1, wherein the scoring display module ranks all received training scores from high to low and displays, in an enlarged preset font, the information of the three patients with the highest training scores.
CN202110081356.2A 2021-01-21 2021-01-21 Immersive speech feedback training system based on AI Active CN112885168B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110081356.2A CN112885168B (en) 2021-01-21 2021-01-21 Immersive speech feedback training system based on AI

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110081356.2A CN112885168B (en) 2021-01-21 2021-01-21 Immersive speech feedback training system based on AI

Publications (2)

Publication Number Publication Date
CN112885168A true CN112885168A (en) 2021-06-01
CN112885168B CN112885168B (en) 2022-09-09

Family

ID=76051484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110081356.2A Active CN112885168B (en) 2021-01-21 2021-01-21 Immersive speech feedback training system based on AI

Country Status (1)

Country Link
CN (1) CN112885168B (en)


Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0713046U (en) * 1993-07-19 1995-03-03 武盛 豊永 Dictation word processor
US6356868B1 (en) * 1999-10-25 2002-03-12 Comverse Network Systems, Inc. Voiceprint identification system
US20080004879A1 (en) * 2006-06-29 2008-01-03 Wen-Chen Huang Method for assessing learner's pronunciation through voice and image
CN101751809A (en) * 2010-02-10 2010-06-23 长春大学 Deaf children speech rehabilitation method and system based on three-dimensional head portrait
CN102063903A (en) * 2010-09-25 2011-05-18 中国科学院深圳先进技术研究院 Speech interactive training system and speech interactive training method
KR20140075994A (en) * 2012-12-12 2014-06-20 주홍찬 Apparatus and method for language education by using native speaker's pronunciation data and thought unit
CN105982641A (en) * 2015-01-30 2016-10-05 上海泰亿格康复医疗科技股份有限公司 Speech and language hypoacousie multi-parameter diagnosis and rehabilitation apparatus and cloud rehabilitation system
US9548048B1 (en) * 2015-06-19 2017-01-17 Amazon Technologies, Inc. On-the-fly speech learning and computer model generation using audio-visual synchronization
CN109872714A (en) * 2019-01-25 2019-06-11 广州富港万嘉智能科技有限公司 A kind of method, electronic equipment and storage medium improving accuracy of speech recognition
CN111081080A (en) * 2019-05-29 2020-04-28 广东小天才科技有限公司 Voice detection method and learning device
CN110349565A (en) * 2019-07-02 2019-10-18 长春大学 A kind of auxiliary word pronunciation learning method and its system towards hearing-impaired people
CN110379221A (en) * 2019-08-09 2019-10-25 陕西学前师范学院 A kind of pronunciation of English test and evaluation system
CN110853624A (en) * 2019-11-29 2020-02-28 杭州南粟科技有限公司 Speech rehabilitation training system
CN112233679A (en) * 2020-10-10 2021-01-15 安徽讯呼信息科技有限公司 Artificial intelligence speech recognition system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
吴大江: "Research on Lip-Reading Recognition Based on Deep Learning" (基于深度学习的唇读识别研究), China Masters' Theses Full-text Database (Information Science and Technology) *
吴大江: "Research on Lip-Reading Recognition Based on Deep Learning" (基于深度学习的唇读识别研究), China Masters' Theses Full-text Database (Information Science and Technology), 31 December 2018 (2018-12-31) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113744880A (en) * 2021-09-08 2021-12-03 邵阳学院 Child language barrier degree management and analysis system
CN113744880B (en) * 2021-09-08 2023-11-17 邵阳学院 Child language barrier degree management analysis system
CN114306871A (en) * 2021-12-30 2022-04-12 首都医科大学附属北京天坛医院 Artificial intelligence-based aphasia patient rehabilitation training method and system

Also Published As

Publication number Publication date
CN112885168B (en) 2022-09-09

Similar Documents

Publication Publication Date Title
CN112885168B (en) Immersive speech feedback training system based on AI
WO2020215966A1 (en) Remote teaching interaction method, server, terminal and system
CN111563487B (en) Dance scoring method based on gesture recognition model and related equipment
Davies et al. Facial composite production: A comparison of mechanical and computer-driven systems.
Wood Willingness to communicate and second language speech fluency: An idiodynamic investigation
WO2019095447A1 (en) Guided teaching method having remote assessment function
WO2016192395A1 (en) Singing score display method, apparatus and system
CN111709358A (en) Teacher-student behavior analysis system based on classroom video
CN111833672B (en) Teaching video display method, device and system
CN106021496A (en) Video search method and video search device
US20140302469A1 (en) Systems and Methods for Providing a Multi-Modal Evaluation of a Presentation
Chen et al. Utilizing multimodal cues to automatically evaluate public speaking performance
CN114898861A (en) Multi-modal depression detection method and system based on full attention mechanism
TW202042172A (en) Intelligent teaching consultant generation method, system and device and storage medium
CN110490173B (en) Intelligent action scoring system based on 3D somatosensory model
CN106228996A (en) Vocality study electron assistant articulatory system
CN108074440A (en) The error correction method and system of a kind of piano performance
CN109739354A (en) A kind of multimedia interaction method and device based on sound
Mozaffari et al. Guided learning of pronunciation by visualizing tongue articulation in ultrasound image sequences
CN116312552A (en) Video speaker journaling method and system
CN111554303A (en) User identity recognition method and storage medium in song singing process
CN111611854A (en) Classroom condition evaluation method based on pattern recognition
CN113888757A (en) Examination paper intelligent analysis method, examination paper intelligent analysis system and storage medium based on benchmarking evaluation
TWI771632B (en) Learning support device, learning support method, and recording medium
CN112949554B (en) Intelligent children accompanying education robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant