CN112949554B - Intelligent children accompanying education robot - Google Patents

Intelligent children accompanying education robot

Info

Publication number
CN112949554B
CN112949554B (application CN202110304626.1A)
Authority
CN
China
Prior art keywords
mouth
opening angle
chinese characters
sentences
mouth opening
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110304626.1A
Other languages
Chinese (zh)
Other versions
CN112949554A (en)
Inventor
阳传红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Zhongkai Zhichuang Technology Co ltd
Original Assignee
Hunan Zhongkai Zhichuang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Zhongkai Zhichuang Technology Co ltd filed Critical Hunan Zhongkai Zhichuang Technology Co ltd
Priority to CN202110304626.1A priority Critical patent/CN112949554B/en
Priority to PCT/CN2021/098302 priority patent/WO2022198798A1/en
Publication of CN112949554A publication Critical patent/CN112949554A/en
Application granted granted Critical
Publication of CN112949554B publication Critical patent/CN112949554B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to the technical field of robots and discloses an intelligent child accompanying education robot for training and interacting with a child's mouth shape during pronunciation. When executing the corresponding computer program, the robot's processor implements the following steps: during training, audio and video data are acquired synchronously, and the face image data stream is sliced in step with the audio slices, which ensures the accuracy of the image-stream slicing. Meanwhile, since the beginning and end of each Chinese character's pronunciation carry transitions from and to the adjacent sounds, the middle segment of each image data slice, which is the most expressive of the articulation, is selected for the sequential calculation of mouth opening angles. Based on the resulting continuous mouth opening angle data sequence and the standard opening angle data sequence corresponding to the training text, comparative analysis is performed on the trend of mouth opening angle changes between adjacent Chinese characters and between adjacent sentences, ensuring the validity and reliability of the final judgment result.

Description

Intelligent children accompanying education robot
Technical Field
The invention relates to the technical field of robots, in particular to an intelligent child accompanying education robot.
Background
As face recognition, voice and image recognition, video interaction, and big data analysis technologies continue to mature, they couple tightly with the main application scenarios of home robots and offer users a good experience. At the same time, technical progress keeps driving down robot production costs, making large-scale production feasible.
The year 2019 is regarded as the breakout year for child companion robots, which have since become familiar to the public and entered a period of explosive growth. Unit prices of child companion robots range from a few hundred to tens of thousands of RMB. Early childhood education hinges on content and interaction patterns. The voice dialogue of traditional children's toys is mostly limited to telling stories and playing nursery rhymes, with no real interactivity. Intelligent robots, by contrast, add more human-centered functions and rich interaction suited to children's behavioral habits, such as voice conversation, storytelling, reciting classical poetry, singing children's songs, and interactive games. They overturn traditional early education, promote children's abilities in expression, logic, music, art, and more, and serve as attentive companions and family tutors.
At present, voice recognition and interaction technology is relatively mature. However, in activities that widely interest children, such as public speaking and hosting, the mouth shape during pronunciation matters greatly: different Chinese characters correspond to different mouth shapes, and even the same character takes on different mouth shapes in different usage scenarios owing to polyphonic readings, emotional color, tone, and so on. Current robots lack the ability to train and interact with children's pronunciation mouth shapes.
Disclosure of Invention
The invention aims to disclose an intelligent child accompanying education robot for training and interacting with a child's mouth shape during pronunciation.
To achieve the above object, the invention discloses an intelligent child accompanying education robot comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
calling a training text and displaying it to the user on a display screen, the training text comprising at least two sentences with different overall mouth-shape change amplitudes, each sentence comprising at least two Chinese characters with different mouth shapes;
collecting the user's synchronized audio data stream and a face image data stream containing the mouth shape;
slicing the audio data stream according to the distribution of Chinese characters and punctuation marks in the training text and the standard audio features corresponding to each Chinese character, to obtain timestamp information for each audio data slice corresponding one-to-one to a single Chinese character;
slicing the face image data stream according to the timestamp information of each audio data slice, and establishing a mapping between each image data slice and the corresponding Chinese character;
for each image data slice, screening the image data frames in the middle 1/3 time period, identifying and extracting open-mouth contour information from each screened frame, and determining the coordinate position of each feature point from the mouth contour information, the feature points at least comprising points A and B at the inner corners of the mouth and points C and D at the middle of the inner edges of the upper and lower lips; calculating the mouth opening angle of each image frame from the coordinates of points A, B, C, and D; and taking the average of the mouth opening angles calculated within a slice as the final mouth opening angle value of the Chinese character mapped to that slice;
forming, in time order, the calculated final mouth opening angle values into a mouth opening angle data sequence corresponding to the training text;
comparing and analyzing the actual mouth opening angle data sequence against the standard opening angle data sequence corresponding to the training text, in terms of the trend of mouth opening angle changes between adjacent Chinese characters and between adjacent sentences, judging the single characters and sentences whose mouth shape is to be corrected, and outputting the judgment result to the user through the display screen; wherein the overall mouth opening angle of a single sentence is the mean or root mean square of the absolute changes between the adjacent opening angles of the Chinese characters it governs, and the mouth opening angle is any one of ∠CAD, ∠CBD, ∠ACB, or ∠ADB in the quadrilateral with vertices A, C, B, and D.
Preferably, in the comparative analysis based on the mouth opening angle trend, the trend between adjacent sentences is compared first to obtain the sentences whose mouth shape is to be corrected, and the Chinese characters whose mouth shape is to be corrected are then found within those sentences from the trend between adjacent characters; finally, whether the remaining sentences are also scanned for characters to be corrected, again from the trend between adjacent characters, is determined by whether a corresponding request is received from the user.
Preferably, the robot processor of the present invention, when executing the computer program, further performs the steps of:
and calculating the correlation between the actual mouth flare angle data sequence and the standard flare angle data sequence, and giving an evaluation result corresponding to the whole training text according to the correlation calculation result. For example: and the evaluation result is specifically grade and score calculation according to the statistical relevance value range and gradient.
Preferably, the training text is downloaded remotely through a network, and when it is downloaded, the standard opening angle data sequence information corresponding to the training text, the standard audio feature information corresponding to each Chinese character, and the standard mouth-shape commentary videos for single characters and sentences are downloaded synchronously; the intelligent child accompanying education robot then performs the comparative analysis locally on a single machine; and the processor, when executing the computer program, further implements the following step:
after the single characters and sentences whose mouth shape is to be corrected have been judged, the standard mouth-shape commentary videos corresponding to those characters and sentences are preloaded into the memory, so that the corresponding corrective content can be played in real time upon the user's selection instruction.
The invention has the following beneficial effects:
In the training process, audio and video data are acquired synchronously, and the face image data stream is sliced in step with the audio slices, which ensures the accuracy of the image-stream slicing. Meanwhile, since the beginning and end of each Chinese character's pronunciation carry transitions from and to the adjacent sounds, the middle segment of each image data slice, which is the most expressive of the articulation, is selected for the sequential calculation of mouth opening angles. Based on the resulting continuous mouth opening angle data sequence and the standard opening angle data sequence corresponding to the training text, comparative analysis is performed on the trend of mouth opening angle changes between adjacent Chinese characters and between adjacent sentences, ensuring the validity and reliability of the final judgment result.
The present invention will be described in further detail below with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
Fig. 1 is a flowchart of the steps implemented when the processor of an intelligent child accompanying education robot executes the corresponding computer program, according to a preferred embodiment of the present invention.
Detailed Description
The embodiments of the invention will be described in detail below with reference to the drawings, but the invention can be implemented in many different ways as defined and covered by the claims.
Example 1
The embodiment discloses an intelligent child accompanying education robot comprising a memory, a processor, and a computer program stored in the memory and executable on the processor. As shown in Fig. 1, the robot's processor of this embodiment implements the following steps when executing the computer program:
and step S1, calling a training text and displaying the training text to a user through a display screen, wherein the training text comprises at least two sentences with different mouth overall change amplitudes, and each sentence comprises at least two Chinese characters with different mouth shapes.
In this embodiment, the training text is content carefully arranged by experts in relevant fields such as acoustics and lip-language expressiveness (i.e., content selected so that adjacent Chinese characters and adjacent sentences show clear mouth opening angle trends), on which the training effect can be readily evaluated and tracked. It can be downloaded from a cloud server based on a client/server (C/S) architecture.
Preferably, the training text of this embodiment is downloaded remotely via a network; along with the text, the standard opening angle data sequence corresponding to the training text, the standard audio feature information corresponding to each Chinese character, and the standard mouth-shape commentary videos for single characters and sentences are downloaded synchronously, so that the robot can perform the comparison, analysis, and other data processing of the subsequent steps locally on a single machine. Preferably, the standard opening angle data sequence for the training text can also be obtained by recording experts in relevant fields such as acoustics and lip-language expressiveness and calibrating the data in the background. As a variant, the standard opening angle data sequence in this step can also be computed by converting the audio information into mouth-shape marker points, as done in multimodal interaction.
Step S2: acquiring the user's synchronized audio data stream and a face image data stream containing the mouth shape.
In this step, the audio data stream can be collected through a microphone, and the face image data stream through the video-recording function of the camera module.
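For illustration, a minimal capture sketch follows. It is a sketch only, assuming OpenCV for the camera and the sounddevice library for the microphone; every name in it (record_session, duration_s, fps, sr) is an assumption of this sketch rather than part of the patented design, and starting both recordings together merely approximates the synchronization the later steps rely on.

```python
import cv2                # camera capture (assumed dependency)
import sounddevice as sd  # microphone capture (assumed dependency)

def record_session(duration_s: float, fps: float = 30.0, sr: int = 16000):
    """Record roughly synchronized audio and video for duration_s seconds.

    A production system would align both streams against shared
    timestamps; starting them together suffices for a short sketch.
    """
    # Non-blocking audio recording into a pre-allocated NumPy buffer.
    audio = sd.rec(int(duration_s * sr), samplerate=sr, channels=1)
    cap = cv2.VideoCapture(0)  # default camera
    frames = []
    while len(frames) < int(duration_s * fps):
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    sd.wait()  # block until the audio buffer is filled
    return audio[:, 0], frames
```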
Step S3: slicing the audio data stream according to the distribution of Chinese characters and punctuation marks in the training text and the standard audio features corresponding to each Chinese character, and obtaining timestamp information for each audio data slice corresponding one-to-one to a single Chinese character.
In this step, the Chinese characters and punctuation marks of the training text are known, and so are the standard audio features corresponding to each Chinese character; by combining these with the spectrum-analysis and slicing techniques used in existing speech recognition when converting voice into Chinese characters, the timestamp information of the audio data slice corresponding to each character can be obtained quickly.
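To make the slicing itself concrete, the sketch below cuts an audio array into per-character segments once an upstream alignment has produced (start, end) timestamps; the existence of such timestamps, and all names here, are assumptions of the sketch.

```python
import numpy as np

def slice_audio(audio: np.ndarray, sample_rate: int,
                char_timestamps: list[tuple[float, float]]) -> list[np.ndarray]:
    """Cut a mono audio stream into one slice per Chinese character.

    char_timestamps holds (start_s, end_s) pairs, one per character,
    as produced by an assumed upstream alignment step.
    """
    slices = []
    for start_s, end_s in char_timestamps:
        start = int(start_s * sample_rate)
        end = int(end_s * sample_rate)
        slices.append(audio[start:end])
    return slices
```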
Step S4: slicing the face image data stream according to the timestamp information of each audio data slice, and establishing the mapping between each image data slice and the corresponding Chinese character.
In this step, the acquired audio data stream and face image data stream are synchronized; therefore, slicing the face image data stream by the timestamp information of each audio slice also yields an accurate one-to-one mapping to the corresponding Chinese characters.
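Because the two streams share a time base, the image slicing can reuse the audio timestamps directly, scaled by the frame rate. A sketch under the same assumptions as above:

```python
def map_chars_to_frames(frames: list, fps: float, characters: str,
                        char_timestamps: list[tuple[float, float]]):
    """Return (character, frame_slice) pairs, one per Chinese character.

    The audio and video streams are assumed synchronized, so the audio
    timestamps index the video stream after scaling by fps.
    """
    mapping = []
    for ch, (start_s, end_s) in zip(characters, char_timestamps):
        first, last = int(start_s * fps), int(end_s * fps)
        mapping.append((ch, frames[first:last]))
    return mapping
```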
Step S5: for each image data slice, screening the image data frames in the middle 1/3 time period, identifying and extracting open-mouth contour information from each screened frame, and determining the coordinate position of each feature point from the mouth contour information, the feature points at least comprising points A and B at the inner corners of the mouth and points C and D at the middle of the inner edges of the upper and lower lips; calculating the mouth opening angle of each image frame from the coordinates of points A, B, C, and D; and taking the average of the mouth opening angles calculated within a slice as the final mouth opening angle value of the Chinese character mapped to that slice.
In this step, screening the image data frames in the middle 1/3 of each image data slice means dividing the slice into three equal parts, discarding the head and tail, and using the middle segment of the image data stream, which is the most expressive of the articulation, for the sequential mouth opening angle calculation. In general, a human face has 68 feature points, of which only 20 are key feature points of the mouth. During training, points A and B are generally symmetric about the center point O of the mouth contour, as are points C and D; that is, in the quadrilateral with vertices A, C, B, and D, side AC is approximately equal in length to side BC, and side AD to side BD. Accordingly, the mouth opening angle can be defined as any one of ∠CAD, ∠CBD, ∠ACB, or ∠ADB. Extracting mouth contour information from a face image is a technique well known to those skilled in the art and is not described in detail here.
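The geometry of this step can be sketched briefly. Taking ∠CAD as the opening angle (any of the four listed angles works the same way), the angle at corner A between the rays toward C and D follows from the dot product, and averaging over the middle third of a slice's frames gives the character's final value. The function and variable names are illustrative, not from the patent.

```python
import numpy as np

def opening_angle(a, b, c, d) -> float:
    """Mouth opening angle, here the angle CAD: the angle at inner mouth
    corner A between the rays to C (inner upper lip) and D (inner lower
    lip). Point B is accepted for uniformity but unused in this variant."""
    a, c, d = np.asarray(a, float), np.asarray(c, float), np.asarray(d, float)
    v1, v2 = c - a, d - a
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def char_final_angle(frame_landmarks) -> float:
    """Final opening angle for one character: the mean over the middle
    third of the slice's frames, head and tail discarded.
    frame_landmarks is a list of (A, B, C, D) coordinate tuples."""
    n = len(frame_landmarks)
    middle = frame_landmarks[n // 3 : 2 * n // 3] or frame_landmarks
    return float(np.mean([opening_angle(*lm) for lm in middle]))
```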
Step S6: forming, in time order, the calculated final mouth opening angle values into a mouth opening angle data sequence corresponding to the training text.
Step S7: comparing and analyzing the actual mouth opening angle data sequence against the standard opening angle data sequence corresponding to the training text, in terms of the trend of mouth opening angle changes between adjacent Chinese characters and between adjacent sentences, judging the single characters and sentences whose mouth shape is to be corrected, and outputting the judgment result to the user through the display screen.
In this step, the overall mouth opening angle of a single sentence is the mean or root mean square of the absolute changes between the adjacent opening angles of the Chinese characters it governs. In this embodiment, the comparative analysis of the mouth opening angle trend specifically comprises: establishing a two-dimensional coordinate system with the time-ordered sequence of single Chinese characters or sentences on the abscissa and the opening angle of the corresponding character or the overall sentence on the ordinate, and comparing the actually sampled opening-angle trend curve (usually a polyline connecting the sampled points) against the standard trend curve in this coordinate system, thereby obtaining the single characters or sentences to be corrected.
Optionally, in this step, different thresholds can be set for the comparative analysis of the mouth opening angle trend between adjacent sentences and for that between adjacent Chinese characters within a sentence; when the actual trend computed from the samples deviates from the standard trend by more than the set proportional-deviation threshold, the corresponding Chinese character or sentence is judged to need correction, as in the sketch that follows.
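A minimal sketch of the sentence-level metric and the threshold check, assuming a relative-deviation threshold (the patent leaves the threshold values configurable, and the names here are illustrative):

```python
import numpy as np

def sentence_angle(char_angles, use_rms: bool = False) -> float:
    """Overall opening angle of a sentence: the mean (or root mean
    square) of the absolute changes between adjacent characters'
    opening angles."""
    deltas = np.abs(np.diff(char_angles))
    return float(np.sqrt(np.mean(deltas ** 2)) if use_rms else np.mean(deltas))

def flag_deviations(actual, standard, max_rel_dev: float = 0.3):
    """Indices of adjacent-pair transitions whose actual trend deviates
    from the standard trend by more than max_rel_dev (an assumed,
    configurable proportion). Index i is the transition i -> i+1."""
    actual_trend = np.diff(actual)
    standard_trend = np.diff(standard)
    flagged = []
    for i, (a, s) in enumerate(zip(actual_trend, standard_trend)):
        denom = max(abs(s), 1e-6)  # guard against near-zero standard trends
        if abs(a - s) / denom > max_rel_dev:
            flagged.append(i)
    return flagged
```

At sentence level, actual and standard would hold the per-sentence overall angles; at character level, the per-character final angles within one sentence.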
Preferably, in the comparative analysis based on the mouth opening angle trend, the trend between adjacent sentences is compared first to obtain the sentences whose mouth shape is to be corrected, and the Chinese characters whose mouth shape is to be corrected are then found within those sentences from the trend between adjacent characters; finally, whether the remaining sentences are also scanned for characters to be corrected is determined by whether a corresponding request is received from the user (i.e., the user responds by clicking to generate the corresponding instruction; otherwise the subsequent step is skipped). In this way, different needs of different users can be served: the sentences of general concern and the key characters to be corrected within them are located and responded to quickly, while memory load and CPU resource consumption are reduced effectively.
Preferably, the robot processor of the present invention, when executing the computer program, further performs the steps of:
and step S8, calculating the correlation between the actual mouth opening angle data sequence and the standard opening angle data sequence, and giving the evaluation result corresponding to the whole training text according to the correlation calculation result. For example: and the evaluation result is specifically grade and score calculation according to the statistical relevance value range and gradient. Alternatively, the calculation method of the specific correlation may employ a pearson correlation coefficient method. And
Step S9: after the single characters and sentences whose mouth shape is to be corrected have been judged, the standard mouth-shape commentary videos corresponding to them are preloaded into the memory, so that the corresponding corrective content can be played in real time upon the user's selection instruction.
In summary, the technical solution disclosed in the embodiment of the present invention has at least the following beneficial effects:
In the training process, audio and video data are acquired synchronously, and the face image data stream is sliced in step with the audio slices, which ensures the accuracy of the image-stream slicing. Meanwhile, since the beginning and end of each Chinese character's pronunciation carry transitions from and to the adjacent sounds, the middle segment of each image data slice, which is the most expressive of the articulation, is selected for the sequential calculation of mouth opening angles. Based on the resulting continuous mouth opening angle data sequence and the standard opening angle data sequence corresponding to the training text, comparative analysis is performed on the trend of mouth opening angle changes between adjacent Chinese characters and between adjacent sentences, ensuring the validity and reliability of the final judgment result.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. An intelligent child accompanying education robot comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
calling a training text and displaying it to the user on a display screen, the training text comprising at least two sentences with different overall mouth-shape change amplitudes, each sentence comprising at least two Chinese characters with different mouth shapes;
collecting the user's synchronized audio data stream and a face image data stream containing the mouth shape;
slicing the audio data stream according to the distribution of Chinese characters and punctuation marks in the training text and the standard audio features corresponding to each Chinese character, to obtain timestamp information for each audio data slice corresponding one-to-one to a single Chinese character;
slicing the face image data stream according to the timestamp information of each audio data slice, and establishing a mapping between each image data slice and the corresponding Chinese character;
for each image data slice, screening the image data frames in the middle 1/3 time period, identifying and extracting open-mouth contour information from each screened frame, and determining the coordinate position of each feature point from the mouth contour information, the feature points at least comprising points A and B at the inner corners of the mouth and points C and D at the middle of the inner edges of the upper and lower lips; calculating the mouth opening angle of each image frame from the coordinates of points A, B, C, and D; and taking the average of the mouth opening angles calculated within a slice as the final mouth opening angle value of the Chinese character mapped to that slice;
forming, in time order, the calculated final mouth opening angle values into a mouth opening angle data sequence corresponding to the training text;
comparing and analyzing the actual mouth opening angle data sequence against the standard opening angle data sequence corresponding to the training text, in terms of the trend of mouth opening angle changes between adjacent Chinese characters and between adjacent sentences, judging the single characters and sentences whose mouth shape is to be corrected, and outputting the judgment result to the user through the display screen; wherein the overall mouth opening angle of a single sentence is the mean or root mean square of the absolute changes between the adjacent opening angles of the Chinese characters it governs, and the mouth opening angle is any one of ∠CAD, ∠CBD, ∠ACB, or ∠ADB in the quadrilateral with vertices A, C, B, and D;
wherein the comparative analysis of the mouth opening angle trend specifically comprises: establishing a two-dimensional coordinate system with the time-ordered sequence of single Chinese characters or sentences on the abscissa and the opening angle of the corresponding character or the overall sentence on the ordinate, and comparing the actually sampled opening-angle trend curve against the standard trend curve in this coordinate system, thereby obtaining the single characters or sentences to be corrected; and wherein, in the comparative analysis of the trend between adjacent sentences or between adjacent Chinese characters within a sentence, different thresholds are set respectively, and a Chinese character or sentence is judged to need correction when its actual trend, computed from the samples, deviates from the standard trend by more than the set proportional-deviation threshold.
2. The intelligent child accompanying education robot of claim 1, wherein, in the comparative analysis based on the mouth opening angle trend, the trend between adjacent sentences is compared first to obtain the sentences whose mouth shape is to be corrected, and the Chinese characters whose mouth shape is to be corrected are then found within those sentences from the trend between adjacent characters; and wherein whether the remaining sentences are also scanned for characters to be corrected, again from the trend between adjacent characters, is determined by whether a corresponding request is received from the user.
3. The intelligent child accompanying education robot of claim 2, wherein the processor, when executing the computer program, further implements the following step:
calculating the correlation between the actual mouth opening angle data sequence and the standard opening angle data sequence, and giving an evaluation result for the whole training text according to the correlation result.
4. The intelligent child accompanying education robot of claim 3, wherein the evaluation result is specifically a grade and score computed from the statistical ranges and gradients of the correlation values.
5. The intelligent child accompanying education robot of any one of claims 1 to 4, wherein the training text is downloaded remotely through a network, and when it is downloaded, the standard opening angle data sequence information corresponding to the training text, the standard audio feature information corresponding to each Chinese character, and the standard mouth-shape commentary videos for single characters and sentences are downloaded synchronously; the intelligent child accompanying education robot performs the comparative analysis locally on a single machine; and the processor, when executing the computer program, further implements the following step:
after the single characters and sentences whose mouth shape is to be corrected have been judged, the standard mouth-shape commentary videos corresponding to those characters and sentences are preloaded into the memory, so that the corresponding corrective content can be played in real time upon the user's selection instruction.
CN202110304626.1A 2021-03-22 2021-03-22 Intelligent children accompanying education robot Active CN112949554B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110304626.1A CN112949554B (en) 2021-03-22 2021-03-22 Intelligent children accompanying education robot
PCT/CN2021/098302 WO2022198798A1 (en) 2021-03-22 2021-06-04 Intelligent children accompanying education robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110304626.1A CN112949554B (en) 2021-03-22 2021-03-22 Intelligent children accompanying education robot

Publications (2)

Publication Number Publication Date
CN112949554A CN112949554A (en) 2021-06-11
CN112949554B (en) 2022-02-08

Family

ID=76227595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110304626.1A Active CN112949554B (en) 2021-03-22 2021-03-22 Intelligent children accompanying education robot

Country Status (2)

Country Link
CN (1) CN112949554B (en)
WO (1) WO2022198798A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115359059B (en) * 2022-10-20 2023-01-31 一道新能源科技(衢州)有限公司 Solar cell performance test method and system


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9548048B1 (en) * 2015-06-19 2017-01-17 Amazon Technologies, Inc. On-the-fly speech learning and computer model generation using audio-visual synchronization
CN109034037A (en) * 2018-07-19 2018-12-18 江苏黄金屋教育发展股份有限公司 On-line study method based on artificial intelligence
CN112001323A (en) * 2020-08-25 2020-11-27 成都威爱新经济技术研究院有限公司 Digital virtual human mouth shape driving method based on pinyin or English phonetic symbol reading method

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106940939A (en) * 2017-03-16 2017-07-11 牡丹江师范学院 Oral English Teaching servicing unit and its method
CN107424450A (en) * 2017-08-07 2017-12-01 英华达(南京)科技有限公司 Pronunciation correction system and method
CN108492641A (en) * 2018-03-26 2018-09-04 贵州西西沃教育科技股份有限公司 A kind of English phonetic learning system
CN109389098A (en) * 2018-11-01 2019-02-26 重庆中科云丛科技有限公司 A kind of verification method and system based on lip reading identification
CN111951629A (en) * 2019-05-16 2020-11-17 上海流利说信息技术有限公司 Pronunciation correction system, method, medium and computing device
CN111950327A (en) * 2019-05-16 2020-11-17 上海流利说信息技术有限公司 Mouth shape correcting method, mouth shape correcting device, mouth shape correcting medium and computing equipment
CN111429885A (en) * 2020-03-02 2020-07-17 北京理工大学 Method for mapping audio clip to human face-mouth type key point
CN112037788A (en) * 2020-09-10 2020-12-04 中航华东光电(上海)有限公司 Voice correction fusion technology

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Development of Novel Lip-Reading Recognition Algorithm; Bor-Shing Lin et al.; IEEE Access; 2017-01-09; vol. 5; pp. 794-801 *
Recognizing collaborators using a flexible approach based on face and voice biometrics; Jesús Salvador Martínez-Delgado et al.; 2013 10th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE); 2013-12-02; pp. 324-359 *
A Survey of Research Progress in Lip Reading (唇读研究进展综述); Zhang Zeliang et al.; Computer Engineering and Design (计算机工程与设计); 2014-06-30; vol. 35, no. 6; pp. 2135-2141 *
A Study on Teaching with Chinese-Error Videos on YouTube (国YouTube汉语偏误视频教学研究); 朴宝拉; China Masters' Theses Full-text Database, Philosophy and Humanities; 2020-10-15; vol. 2020, no. 10; F084-74 *

Also Published As

Publication number Publication date
WO2022198798A1 (en) 2022-09-29
CN112949554A (en) 2021-06-11

Similar Documents

Publication Publication Date Title
CN108665492A (en) A kind of Dancing Teaching data processing method and system based on visual human
CN104183171A (en) Electronic music-based system and method for precisely judging instrument performance level
CN106898363A (en) A kind of vocality study electron assistant articulatory system
Sargin et al. Analysis of head gesture and prosody patterns for prosody-driven head-gesture animation
CN109583443B (en) Video content judgment method based on character recognition
CN106448701A (en) Vocal integrated training system
CN107436921A (en) Video data handling procedure, device, equipment and storage medium
CN109326162A (en) A kind of spoken language exercise method for automatically evaluating and device
CN112001323A (en) Digital virtual human mouth shape driving method based on pinyin or English phonetic symbol reading method
CN109582952A (en) Poem generation method, device, computer equipment and medium
CN110600033A (en) Learning condition evaluation method and device, storage medium and electronic equipment
CN107578004A (en) Learning method and system based on image recognition and interactive voice
CN112949554B (en) Intelligent children accompanying education robot
Kagirov et al. TheRuSLan: Database of Russian sign language
JP2023552854A (en) Human-computer interaction methods, devices, systems, electronic devices, computer-readable media and programs
CN113610680A (en) AI-based interactive reading material personalized recommendation method and system
TWI294107B (en) A pronunciation-scored method for the application of voice and image in the e-learning
CN110245253A (en) A kind of Semantic interaction method and system based on environmental information
CN114936787A (en) Online student teaching intelligent analysis management cloud platform based on artificial intelligence
Naert et al. Lsf-animal: A motion capture corpus in french sign language designed for the animation of signing avatars
CN113837907A (en) Man-machine interaction system and method for English teaching
KR20180012192A (en) Infant Learning Apparatus and Method Using The Same
KR102042503B1 (en) Method for providing advertisement using video-type avatar, and computer-readable recording medium with providing program of the same
Stefanov et al. A kinect corpus of swedish sign language signs
CN116168134B (en) Digital person control method, digital person control device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant