CN111638781B - AR-based pronunciation guide method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111638781B
CN111638781B (application CN202010414224.2A)
Authority
CN
China
Prior art keywords
user
pronunciation
sounded
foreign language
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010414224.2A
Other languages
Chinese (zh)
Other versions
CN111638781A (en)
Inventor
崔颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd filed Critical Guangdong Genius Technology Co Ltd
Priority: CN202010414224.2A
Publication of CN111638781A (application)
Application granted
Publication of CN111638781B (grant)
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63Querying
    • G06F16/632Query formulation
    • G06F16/634Query by example, e.g. query by humming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/203D [Three Dimensional] animation
    • G06T13/2053D [Three Dimensional] animation driven by audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/203D [Three Dimensional] animation
    • G06T13/403D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00Teaching not covered by other main groups of this subclass
    • G09B19/04Speaking
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/06Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Abstract

The embodiment of the application discloses an AR-based pronunciation guide method and device, electronic equipment, and a storage medium, wherein the method comprises the following steps: acquiring a foreign language sentence to be pronounced, wherein the sentence comprises a plurality of words to be pronounced; loading and displaying the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the words in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the diagram; tracking the mouth position of the user in the displayed real-time portrait of the user; and, when a target word among the plurality of words to be pronounced is prompted to be pronounced, loading and displaying the standard pronunciation mouth shape of the target word at the user's mouth position in AR form. The method draws the child's attention to the mouth shape and scale of each word in a foreign language sentence during pronunciation, so that the child's intonation when reading the sentence aloud is more accurate and expressive.

Description

AR-based pronunciation guide method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an AR-based pronunciation guide method, apparatus, electronic device, and storage medium.
Background
At present, in the process of a child reading a foreign language sentence aloud (such as an English sentence), how to make the child focus on the mouth shape and the scale of each word while pronouncing it, so that the child's intonation when reading the sentence is more accurate and expressive, is a hot topic continuously discussed by many parents and teachers.
Disclosure of Invention
The embodiment of the application discloses an AR-based pronunciation guide method and device, electronic equipment and storage medium, which enable a child to pay attention to the mouth shape and scale of each word in a foreign language sentence while pronouncing it, so that the child's intonation when reading the sentence aloud is more accurate and expressive.
An embodiment of the present application in a first aspect discloses an AR-based pronunciation guide method, which includes:
acquiring a foreign language sentence to be pronounced, wherein the foreign language sentence to be pronounced comprises a plurality of words to be pronounced;
loading and displaying the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the plurality of words to be pronounced in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the diagram;
tracking the mouth position of a user in a displayed real-time portrait of the user;
when a target word among the plurality of words to be pronounced is prompted to be pronounced, loading and displaying the standard pronunciation mouth shape of the target word at the mouth position of the user in AR form.
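By way of illustration only, the four steps of the first aspect can be pictured as the Python skeleton below; the class and method names (PronunciationGuide, show_scale_ladder, and so on) are hypothetical and are not part of the claimed method.

    # Hypothetical skeleton of the four-step flow of the first aspect.
    from dataclasses import dataclass

    @dataclass
    class Word:
        text: str    # a word to be pronounced
        scale: str   # the scale step assigned to the word, e.g. "mi"

    class PronunciationGuide:
        def acquire_sentence(self) -> list[Word]:
            """Step 1: obtain the foreign language sentence to be pronounced
            as an ordered list of words to be pronounced."""
            raise NotImplementedError

        def show_scale_ladder(self, words: list[Word]) -> None:
            """Step 2: splice the words' scales in pronunciation order and
            display each word next to its own scale step."""
            raise NotImplementedError

        def track_mouth(self, frame) -> tuple[int, int, int, int]:
            """Step 3: return the user's mouth region in the displayed
            real-time portrait."""
            raise NotImplementedError

        def overlay_mouth_shape(self, frame, mouth_box, target: Word) -> None:
            """Step 4: display the standard pronunciation mouth shape of the
            prompted target word at the tracked mouth position in AR form."""
            raise NotImplementedError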
In combination with the first aspect of the embodiments of the present application, in some optional embodiments, after loading and displaying the standard pronunciation mouth shape of the target word at the mouth position of the user, the method further includes:
picking up the user's pronunciation of the target word;
comparing the user's pronunciation of the target word with the standard pronunciation of the target word to obtain the user's pronunciation assessment result for the target word;
controlling the target word and its scale step in the scale ladder diagram to respectively display colors corresponding to the pronunciation assessment result.
With reference to the first aspect of the embodiments of the present application, in some optional embodiments, the method further includes:
after the user finishes pronouncing the plurality of words to be pronounced, detecting whether the foreign language sentence to be pronounced is associated with an object to be unlocked;
if the foreign language sentence to be pronounced is associated with an object to be unlocked, acquiring the unlocking permission parameters configured for the object to be unlocked, wherein the unlocking permission parameters at least comprise a specified number of accurately pronounced words;
counting, according to the user's pronunciation assessment result for each word to be pronounced, the total number of accurately pronounced words among the plurality of words to be pronounced;
comparing whether the total number exceeds the specified number, and if so, unlocking the object to be unlocked.
With reference to the first aspect of the embodiments of the present application, in some optional embodiments, the unlocking permission parameters further include a specified intonation matching degree, and after the total number is compared and found to exceed the specified number, the method further includes:
matching the picked-up intonation of the user's pronunciation of the foreign language sentence to be pronounced against the standard intonation of the sentence to obtain the intonation matching degree of the foreign language sentence to be pronounced;
comparing whether the intonation matching degree of the foreign language sentence to be pronounced is greater than or equal to the specified intonation matching degree, and if so, executing the step of unlocking the object to be unlocked.
With reference to the first aspect of the embodiments of the present application, in some optional embodiments, the obtaining of a foreign language sentence to be pronounced includes:
acquiring a voice signal sent by an external supervisory device, wherein the voice signal contains the identifier of a selected learning module and the identity of a specified object;
checking whether the identity of the specified object matches the identity of the user in the displayed real-time portrait of the user, and if so, querying the learning module corresponding to the user according to the identifier of the selected learning module contained in the voice signal;
detecting the foreign language sentence selected by the user from the learning module as the foreign language sentence to be pronounced.
A second aspect of an embodiment of the present application discloses an AR-based pronunciation guide apparatus, including:
a first acquisition unit, configured to acquire a foreign language sentence to be pronounced, wherein the foreign language sentence to be pronounced comprises a plurality of words to be pronounced;
a first loading unit, configured to load and display the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the plurality of words to be pronounced in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the diagram;
a position tracking unit, configured to track the mouth position of the user in a displayed real-time portrait of the user;
a second loading unit, configured to load and display the standard pronunciation mouth shape of a target word at the mouth position of the user in AR form when the target word among the plurality of words to be pronounced is prompted to be pronounced.
With reference to the second aspect of the embodiments of the present application, in some optional embodiments, the pronunciation guide apparatus further includes:
a pronunciation pick-up unit, configured to pick up the user's pronunciation of the target word after the second loading unit loads and displays the standard pronunciation mouth shape of the target word at the mouth position of the user;
a pronunciation assessment unit, configured to compare the user's pronunciation of the target word with the standard pronunciation of the target word to obtain the user's pronunciation assessment result for the target word;
a color control unit, configured to control the target word and its scale step in the scale ladder diagram to respectively display colors corresponding to the pronunciation assessment result.
With reference to the second aspect of the embodiments of the present application, in some optional embodiments, the pronunciation guide apparatus further includes:
an object detection unit, configured to detect whether the foreign language sentence to be pronounced is associated with an object to be unlocked after the user finishes pronouncing the plurality of words to be pronounced;
a second acquisition unit, configured to acquire the unlocking permission parameters configured for the object to be unlocked when the object detection unit detects that the foreign language sentence to be pronounced is associated with the object to be unlocked, wherein the unlocking permission parameters at least comprise a specified number of accurately pronounced words;
a quantity counting unit, configured to count, according to the user's pronunciation assessment result for each word to be pronounced, the total number of accurately pronounced words among the plurality of words to be pronounced;
a first comparing unit, configured to compare whether the total number exceeds the specified number;
an object unlocking unit, configured to unlock the object to be unlocked when the first comparing unit determines that the total number exceeds the specified number.
In combination with the second aspect of the embodiments of the present application, in some optional embodiments, the unlocking permission parameters further include a specified intonation matching degree, and the pronunciation guide apparatus further includes:
an intonation matching unit, configured to match, after the first comparing unit determines that the total number exceeds the specified number, the picked-up intonation of the user's pronunciation of the foreign language sentence to be pronounced against the standard intonation of the sentence, so as to obtain the intonation matching degree of the foreign language sentence to be pronounced;
a second comparing unit, configured to compare whether the intonation matching degree of the foreign language sentence to be pronounced is greater than or equal to the specified intonation matching degree, and if so, to trigger the object unlocking unit to unlock the object to be unlocked.
With reference to the second aspect of the embodiments of the present application, in some optional embodiments, the first acquisition unit includes:
an acquisition subunit, configured to acquire a voice signal sent by an external supervisory device, wherein the voice signal contains the identifier of a selected learning module and the identity of a specified object;
a query subunit, configured to check whether the identity of the specified object matches the identity of the user in the displayed real-time portrait of the user, and if so, to query the learning module corresponding to the user according to the identifier of the selected learning module contained in the voice signal;
a detection subunit, configured to detect the foreign language sentence selected by the user from the learning module as the foreign language sentence to be pronounced.
A third aspect of embodiments of the present application discloses an electronic device comprising an AR-based pronunciation guide apparatus as described in the second aspect of embodiments of the present application or any of the alternative embodiments of the second aspect.
A fourth aspect of the present application discloses an electronic device, including:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform all or part of the steps of the AR-based pronunciation guide method described in the first aspect of the embodiments of the present application or any alternative embodiment of the first aspect.
A fifth aspect of the embodiments of the present application discloses a computer-readable storage medium storing computer instructions that, when executed, cause a computer to perform all or part of the steps of the AR-based pronunciation guide method described in the first aspect of the embodiments of the present application or any alternative embodiment of the first aspect.
Compared with the prior art, the embodiment of the application has the following beneficial effects:
In the embodiment of the application, the plurality of words to be pronounced contained in the foreign language sentence to be pronounced can be loaded and displayed in a scale ladder diagram formed by splicing the scales of those words in their pronunciation order, with each word displayed next to its own scale step in the diagram; and when a target word among the plurality of words to be pronounced is prompted to be pronounced, the standard pronunciation mouth shape of the target word is loaded and displayed in AR form at the mouth position of the user tracked in the displayed real-time portrait of the user. The child is thus led to pay attention to the mouth shape and scale of each word in the foreign language sentence while pronouncing it, so that the child's intonation when reading the sentence aloud is more accurate and expressive.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a first embodiment of an AR-based pronunciation guidance method disclosed in embodiments of the present application;
FIG. 2 is an interface schematic of a screen disclosed in an embodiment of the present application;
FIG. 3 is a flow chart of a second embodiment of an AR-based pronunciation guidance method disclosed in embodiments of the present application;
FIG. 4 is a flow chart of a third embodiment of an AR-based pronunciation guidance method disclosed in embodiments of the present application;
FIG. 5 is a flow chart of a fourth embodiment of an AR-based pronunciation guidance method disclosed in embodiments of the present application;
FIG. 6 is a schematic structural diagram of a first embodiment of an AR-based pronunciation guide device disclosed in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a second embodiment of an AR-based pronunciation guide device disclosed in an embodiment of the present application;
Fig. 8 is a schematic structural view of a first embodiment of an electronic device disclosed in an embodiment of the present application;
fig. 9 is a schematic structural view of a second embodiment of an electronic device disclosed in an embodiment of the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It should be noted that the terms "comprises" and "comprising," along with any variations thereof, in the embodiments of the present application are intended to cover non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus, but may include other steps or elements not expressly listed.
The embodiment of the application discloses an AR-based pronunciation guide method and device, electronic equipment and storage medium, which enable a child to pay attention to the mouth shape and scale of each word in a foreign language sentence while pronouncing it, so that the child's intonation when reading the sentence aloud is more accurate and expressive. The following detailed description is made with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a flowchart illustrating a first embodiment of an AR-based pronunciation guide method according to an embodiment of the present application. The method described in fig. 1 is applicable to various electronic devices such as educational devices (e.g., home education devices, classroom electronic devices), computers (e.g., student tablets, personal PCs), mobile phones, and intelligent home devices (e.g., intelligent televisions, intelligent speakers, intelligent robots), which is not limited in this embodiment. The method is described below with the electronic device as the execution subject. As shown in fig. 1, the AR-based pronunciation guide method may include the following steps:
101. The electronic equipment acquires a foreign language sentence to be pronounced, wherein the foreign language sentence to be pronounced comprises a plurality of words to be pronounced.
For example, when the foreign language sentence to be uttered acquired by the electronic device is an english sentence, a plurality of words to be uttered included in the foreign language sentence to be uttered may be each english word included in the english sentence.
Also for example, when the foreign language sentence to be uttered obtained by the electronic device is a russian sentence, the plurality of words to be uttered included in the foreign language sentence to be uttered may be each russian word included in the russian sentence.
Further, for example, when the foreign language sentence to be uttered obtained by the electronic device is a French sentence, the plurality of words to be uttered included in the foreign language sentence to be uttered may be each French word included in the French sentence.
In some embodiments, the electronic device may capture, via an image capturing device (e.g., a camera), the foreign language sentence to be uttered that is selected by the user (e.g., clicked by the user) in the learning module. The learning module may be a learning page corresponding to the user (e.g., a paper learning page or an electronic learning page), or a learning chapter contained in such a learning page.
For example, the electronic device may locate a learning module selected by the user's finger, a stylus, or voice, and use it as the learning module corresponding to the user. The electronic device may use an image capturing device (such as a camera) to capture the learning module selected by the user's finger or stylus; alternatively, the electronic device may employ a sound pick-up device (such as a microphone) to pick up the learning module selected by the user's voice. In some embodiments, the image capturing device (such as a camera) may be disposed on a finger ring worn on the user's finger; when the ring detects that the finger wearing it is straightened, the ring may start the image capturing device to capture the learning module selected by the user's finger and transmit the captured result to the electronic device, so that the electronic device can determine the learning module corresponding to the user, as sketched below. By implementing this embodiment, the power consumption incurred by the electronic device itself capturing the learning module selected by the user's finger can be reduced, so that the battery endurance of the electronic device can be improved.
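The ring-triggered capture described above can be sketched as a small polling loop; all device interfaces here (Ring, finger_straightened, capture, send_to_device) are hypothetical placeholders, since the embodiment does not specify an API.

    # Power-saving sketch: the ring's camera runs only while the wearer's
    # finger is straightened. All device APIs below are hypothetical.
    import time

    class Ring:
        def finger_straightened(self) -> bool: ...   # cheap motion-sensor check
        def capture(self) -> bytes: ...              # camera powered only here
        def send_to_device(self, image: bytes) -> None: ...

    def run(ring: Ring, poll_s: float = 0.2) -> None:
        while True:
            if ring.finger_straightened():
                ring.send_to_device(ring.capture())  # device locates the module
            time.sleep(poll_s)                       # camera stays off otherwise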
In other examples, the electronic device may obtain a learning module selected for the user by another external supervisory device and use it as the learning module corresponding to the user. For example, the electronic device may establish in advance a communication connection with a wrist-worn supervision device worn by a supervisor of the user (such as a classroom teacher or a parent). The supervisor holds a finger of the palm on whose wrist the supervision device is worn against the root of the ear, so that the ear canal forms a sealed sound cavity, and may then utter, at a volume below a certain threshold, a voice signal for selecting a learning module for the user; the voice signal is conducted as a vibration signal through the bone medium of the palm into the wrist-worn supervision device, which transmits it to the electronic device. In this embodiment, the user's supervisor (such as a classroom teacher or a parent) can flexibly select a learning module for the user without disturbing the people around while doing so.
102. The electronic equipment loads and displays the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the plurality of words to be pronounced in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the ladder diagram.
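As a minimal sketch of step 102, the Python snippet below splices per-word scale steps in pronunciation order and prints each word next to its own step; the scale values assigned to the words are made-up examples, and the text rendering merely stands in for the on-screen ladder diagram.

    # Illustrative text rendering of a scale ladder diagram.
    SCALE_ORDER = ["do", "re", "mi", "fa", "sol", "la", "si"]

    def render_scale_ladder(words_with_scales: list[tuple[str, str]]) -> str:
        """One row per scale level (highest on top); each word appears in the
        column for its position in the pronunciation order."""
        lines = []
        for level in reversed(SCALE_ORDER):
            row = ""
            for word, scale in words_with_scales:   # pronunciation order
                cell = f"{scale}:{word}" if scale == level else ""
                row += f"{cell:<12}"
            if row.strip():
                lines.append(row.rstrip())
        return "\n".join(lines)

    print(render_scale_ladder(
        [("I", "mi"), ("like", "sol"), ("to", "mi"), ("walk", "fa"),
         ("to", "mi"), ("the", "re"), ("office", "do")]))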
103. The electronic device tracks the user's mouth position from the presented real-time representation of the user.
The electronic device may capture a real-time portrait of the user with an image capturing device (such as a camera) and output it to a screen (such as a display screen provided by the electronic device, or an external display screen to which the electronic device is communicatively coupled) for presentation. Further, the electronic device may use face recognition technology to locate the user's mouth in the real-time portrait presented on the screen.
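A minimal sketch of such mouth tracking, assuming OpenCV's bundled Haar face detector and a fixed geometric estimate of the mouth region (the embodiment does not prescribe a particular face recognition algorithm):

    # Mouth tracking sketch: detect the face, then estimate the mouth box
    # geometrically. A production system would use facial landmarks instead.
    import cv2

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def track_mouth(frame):
        """Return (x, y, w, h) of the estimated mouth region, or None."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3,
                                              minNeighbors=5)
        if len(faces) == 0:
            return None
        x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest face
        # The mouth occupies roughly the central lower third of the face box.
        return (x + w // 4, y + 2 * h // 3, w // 2, h // 4)

    cap = cv2.VideoCapture(0)           # the device's camera
    ok, frame = cap.read()
    if ok:
        print("mouth box:", track_mouth(frame))
    cap.release()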
104. When the electronic equipment prompts that a target word in the plurality of words to be uttered needs to be uttered, the electronic equipment loads and displays the standard pronunciation mouth shape of the target word at the mouth position of the user in an AR form.
Taking the interface schematic diagram of the screen shown in fig. 2 as an example, the foreign language sentence to be pronounced acquired by the electronic device is "I like to walk to the office", which includes 7 words to be pronounced: "I", "like", "to", "walk", "to", "the", "office". Further, the electronic device may load and display these 7 words in a scale ladder diagram displayed on the screen, in which the scales of the 7 words are spliced in their pronunciation order; the word "I" is displayed next to its scale step in the ladder diagram, the word "like" next to its scale step, the first word "to" next to its scale step, the word "walk" next to its scale step, the second word "to" next to its scale step, the word "the" next to its scale step, and the word "office" next to its scale step. Further, the electronic device can capture the real-time portrait of the user through an image capturing device (such as a camera) and output it to the screen for display; on this basis, the electronic device can locate the mouth position in the user's real-time portrait through face recognition and motion capture technology and follow the user's mouth position in real time. Further, when a target word among the 7 words is prompted to be pronounced, the electronic device may load and display the standard pronunciation mouth shape of that target word at the user's mouth position in AR form. For example, when the target word "walk" is prompted to be pronounced, the electronic device may load and display the standard pronunciation mouth shape of "walk" at the user's mouth position in AR form.
It will be appreciated that loading and displaying the standard pronunciation mouth shape of the target word at the user's mouth position in AR form may mean loading and displaying, at the user's mouth position in AR form, the changing process of the standard pronunciation mouth shape of the target word (i.e. an animation process).
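This animation process can be understood as per-frame alpha blending of the current mouth-shape frame onto the tracked mouth region. The sketch below assumes pre-rendered BGRA mouth-shape frames (e.g. loaded with cv2.imread(..., cv2.IMREAD_UNCHANGED)) and is only an illustration, not the claimed rendering pipeline.

    # Blend one frame of the standard-pronunciation mouth-shape animation
    # onto the live image at the tracked mouth position.
    import numpy as np
    import cv2

    def overlay_mouth_shape(frame, mouth_box, shape_bgra):
        """frame: BGR camera image; mouth_box: (x, y, w, h);
        shape_bgra: one BGRA frame of the mouth-shape animation."""
        x, y, w, h = mouth_box
        shape = cv2.resize(shape_bgra, (w, h))
        alpha = shape[:, :, 3:4].astype(np.float32) / 255.0
        roi = frame[y:y + h, x:x + w].astype(np.float32)
        blended = alpha * shape[:, :, :3].astype(np.float32) + (1 - alpha) * roi
        frame[y:y + h, x:x + w] = blended.astype(np.uint8)
        return frame

    # Playing the whole animation is then just blending frame i of the
    # mouth-shape sequence onto camera frame i while the word is prompted.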
Therefore, by implementing the AR-based pronunciation guide method described in fig. 1, the child can pay attention to the mouth shape and scale of each word pronounced in a foreign language sentence, so that the child's intonation when reading the sentence aloud is more accurate and expressive.
In addition, by implementing the AR-based pronunciation guide method described in fig. 1, the power consumption incurred by the electronic device capturing the learning module selected by the user's finger can be reduced, so that the battery endurance of the electronic device can be improved.
In addition, by implementing the AR-based pronunciation guide method described in fig. 1, the user's supervisor (such as a classroom teacher or a parent) can flexibly select a learning module for the user without disturbing the people around.
Referring to fig. 3, fig. 3 is a flowchart illustrating a second embodiment of an AR-based pronunciation guidance method disclosed in an embodiment of the present application. The method of fig. 3 is described with the electronic device as the execution subject. As shown in fig. 3, the AR-based pronunciation guide method may include the following steps:
301. The electronic equipment acquires a foreign language sentence to be pronounced, wherein the foreign language sentence to be pronounced comprises a plurality of words to be pronounced.
For example, the implementation of step 301 may refer to step 101, which is not described herein.
302. The electronic equipment loads and displays the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the plurality of words to be pronounced in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the ladder diagram.
303. The electronic device tracks the user's mouth position from the presented real-time representation of the user.
304. When the electronic equipment prompts that a target word in the plurality of words to be uttered needs to be uttered, the electronic equipment loads and displays the standard pronunciation mouth shape of the target word at the mouth position of the user in an AR form.
305. The electronic device picks up the user's pronunciation of the target word.
Wherein the electronic device can pick up the user's pronunciation of the target word through the microphone.
306. And the electronic equipment compares the pronunciation of the target word by the user with the standard pronunciation of the target word to obtain a pronunciation assessment result of the target word by the user.
Illustratively, the user's pronunciation assessment results for the target word can be categorized into two categories, accurate and inaccurate.
307. The electronic device controls the target word and its scale step in the scale ladder diagram to display colors corresponding to the pronunciation assessment result, respectively.
For example, if the user's pronunciation assessment result for the target word is accurate, the electronic device may control the target word and its scale step in the scale ladder diagram to display black, the color corresponding to an accurate result; conversely, if the user's pronunciation assessment result for the target word is inaccurate, the electronic device may control the target word and its scale step in the scale ladder diagram to display gray, the color corresponding to an inaccurate result. For instance, if the user's pronunciation assessment result for the target word "walk" is inaccurate, the electronic device may control the word "walk" and its scale step in the scale ladder diagram to display gray.
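One plausible way to realize the comparison of steps 305 to 307, sketched below, is to extract MFCC features from both recordings and measure their dynamic-time-warping distance; the embodiment only requires some comparison yielding an accurate or inaccurate result, and the file names and distance threshold here are illustrative assumptions.

    # Pronunciation assessment sketch: MFCC + DTW distance, mapped to a color.
    import librosa
    import numpy as np

    def assess(user_wav: str, standard_wav: str,
               threshold: float = 120.0) -> str:
        yu, sr = librosa.load(user_wav, sr=16000)
        ys, _ = librosa.load(standard_wav, sr=16000)
        mu = librosa.feature.mfcc(y=yu, sr=sr, n_mfcc=13)
        ms = librosa.feature.mfcc(y=ys, sr=sr, n_mfcc=13)
        D, wp = librosa.sequence.dtw(X=mu, Y=ms)   # accumulated cost matrix
        cost = D[-1, -1] / len(wp)                 # path-normalised distance
        return "accurate" if cost < threshold else "inaccurate"

    COLORS = {"accurate": "black", "inaccurate": "gray"}  # per the example

    result = assess("user_walk.wav", "standard_walk.wav")
    print(result, "->", COLORS[result])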
By implementing steps 305 to 307, the human-machine interaction of the pronunciation assessment process can be improved, so that students are better guided through pronunciation assessment of foreign language words and the accuracy of their pronunciation is improved.
Therefore, by implementing the AR-based pronunciation guidance method described in fig. 3, the child can pay attention to the mouth shape and scale of each word pronounced in a foreign language sentence, so that the child's intonation when reading the sentence aloud is more accurate and expressive.
In addition, by implementing the AR-based pronunciation guide method described in fig. 3, the power consumption incurred by the electronic device capturing the learning module selected by the user's finger can be reduced, so that the battery endurance of the electronic device can be improved.
In addition, by implementing the AR-based pronunciation guidance method described in fig. 3, the user's supervisor (such as a classroom teacher or a parent) can flexibly select a learning module for the user without disturbing the people around.
Referring to fig. 4, fig. 4 is a flowchart illustrating a third embodiment of an AR-based pronunciation guide method according to an embodiment of the present application. The method of fig. 4 is described with the electronic device as the execution subject. As shown in fig. 4, the AR-based pronunciation guide method may include the following steps:
401. The electronic equipment acquires a foreign language sentence to be pronounced, wherein the foreign language sentence to be pronounced comprises a plurality of words to be pronounced.
For example, the implementation of the above step 401 may refer to the previous step 101, which is not described herein in detail in the embodiments of the present application.
402. The electronic equipment loads and displays the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the plurality of words to be pronounced in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the ladder diagram.
403. The electronic device tracks the user's mouth position from the presented real-time representation of the user.
404. When the electronic equipment prompts that a target word in the plurality of words to be uttered needs to be uttered, the electronic equipment loads and displays the standard pronunciation mouth shape of the target word at the mouth position of the user in an AR form.
405. The electronic device picks up the user's pronunciation of the target word.
406. And the electronic equipment compares the pronunciation of the target word by the user with the standard pronunciation of the target word to obtain a pronunciation assessment result of the target word by the user.
407. The electronic device controls the target word and its scale step in the scale ladder diagram to display colors corresponding to the pronunciation assessment result, respectively.
408. After the user finishes pronouncing the plurality of words to be pronounced, the electronic equipment detects whether the foreign language sentence to be pronounced is associated with an object to be unlocked; if so, steps 409 to 411 are executed; if not, the process is ended.
The object to be unlocked may be a foreign language sentence to be unlocked, an APP to be unlocked, an electronic screen to be unlocked, an intelligent door lock to be unlocked, and the like, which is not limited in the embodiment of the present application.
409. The electronic equipment acquires the unlocking permission parameters configured for the object to be unlocked, wherein the unlocking permission parameters at least comprise a specified number of accurately pronounced words.
The unlocking permission parameters of the object to be unlocked may be configured by the electronic device, or by a wrist-worn supervision device worn by a supervisor (such as a classroom teacher or a parent) of the user of the electronic device.
410. The electronic equipment counts, according to the user's pronunciation assessment result for each word to be pronounced, the total number of accurately pronounced words among the plurality of words to be pronounced.
411. The electronic device compares whether the total number exceeds the specified number; if so, step 412 is executed; if not, the process is ended.
412. The electronic equipment unlocks the object to be unlocked.
By implementing steps 408 to 412, students can be guided through pronunciation assessment of foreign language words, improving both the accuracy of their pronunciation and the security of unlocking the object associated with the foreign language sentence to be pronounced.
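Steps 408 to 412 reduce to a simple gating check, sketched below; the parameter names and the unlock callback are hypothetical.

    # Unlock gating sketch for steps 408-412.
    def try_unlock(assessments: dict[str, str], specified_number: int,
                   unlock) -> bool:
        """assessments maps each word to be pronounced to its assessment
        result; unlock is a callable that opens the associated object."""
        total = sum(1 for r in assessments.values() if r == "accurate")
        if total > specified_number:    # "exceeds", per step 411
            unlock()
            return True
        return False

    try_unlock({"I": "accurate", "like": "accurate", "walk": "accurate",
                "the": "inaccurate", "office": "accurate"},
               specified_number=3,
               unlock=lambda: print("object unlocked"))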
Therefore, by implementing the AR-based pronunciation guidance method described in fig. 4, the child can pay attention to the mouth shape and scale of each word pronounced in a foreign language sentence, so that the child's intonation when reading the sentence aloud is more accurate and expressive.
In addition, by implementing the AR-based pronunciation guide method described in fig. 4, the power consumption incurred by the electronic device capturing the learning module selected by the user's finger can be reduced, so that the battery endurance of the electronic device can be improved.
In addition, by implementing the AR-based pronunciation guidance method described in fig. 4, the user's supervisor (such as a classroom teacher or a parent) can flexibly select a learning module for the user without disturbing the people around.
Referring to fig. 5, fig. 5 is a flowchart illustrating a fourth embodiment of an AR-based pronunciation guidance method disclosed in an embodiment of the present application. The method of fig. 5 is described with the electronic device as the execution subject. The electronic device is located in an indoor environment and has established in advance a communication connection with an external supervisory device located in the same indoor environment, the external supervisory device being a wrist-worn supervision device worn by a supervisor (such as a parent) of the user of the electronic device. As shown in fig. 5, the AR-based pronunciation guide method may include the following steps:
501. The electronic equipment acquires a voice signal sent by the external supervisory device, wherein the voice signal contains the identifier of the selected learning module and the identity of a specified object.
502. The electronic equipment checks whether the identity of the specified object matches the identity of the user in the displayed real-time portrait of the user; if so, steps 503 to 511 are executed; if not, the process is ended.
503. The electronic equipment queries the learning module corresponding to the user according to the identifier of the selected learning module contained in the voice signal.
In some examples, a supervisor (e.g., a parent) of the user of the electronic device may utter, at a volume below a certain threshold, a voice signal for selecting a learning module for the user; the voice signal may include the identifier of the selected learning module (e.g., a chapter number) and the identity of a specified object (e.g., facial feature information of the specified object). Further, the electronic device may check whether the identity of the specified object (e.g., its facial feature information) matches the identity of the user (e.g., the user's facial feature information) in the real-time portrait of the user displayed on the screen; if so, steps 503 to 511 are executed; if not, the process is ended.
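A minimal sketch of the identity check of step 502, assuming the open-source face_recognition package for the facial feature comparison (the embodiment does not name a specific library):

    # Identity check sketch for step 502.
    import face_recognition
    import numpy as np

    def identity_matches(specified_encoding: np.ndarray, live_frame) -> bool:
        """specified_encoding: facial features of the specified object,
        carried with the voice signal; live_frame: RGB image of the user's
        displayed real-time portrait."""
        live_encodings = face_recognition.face_encodings(live_frame)
        if not live_encodings:
            return False
        return bool(face_recognition.compare_faces(
            [specified_encoding], live_encodings[0], tolerance=0.6)[0])

    # On a match, the device looks up the learning module by the identifier
    # contained in the voice signal (step 503).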
According to this embodiment, a parent can select different learning modules for different children in the home according to each child's learning progress, which improves the flexibility and convenience of selecting separate learning modules for each child.
504. The electronic equipment detects the foreign language sentence selected by the user from the learning module as the foreign language sentence to be pronounced, wherein the foreign language sentence to be pronounced comprises a plurality of words to be pronounced.
505. The electronic equipment loads and displays the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the plurality of words to be pronounced in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the ladder diagram.
506. The electronic device tracks the user's mouth position from the presented real-time representation of the user.
507. When the electronic equipment prompts that a target word in the plurality of words to be uttered needs to be uttered, the electronic equipment loads and displays the standard pronunciation mouth shape of the target word at the mouth position of the user in an AR form.
508. The electronic device picks up the user's pronunciation of the target word.
509. And the electronic equipment compares the pronunciation of the target word by the user with the standard pronunciation of the target word to obtain a pronunciation assessment result of the target word by the user.
510. The electronic device controls the target word and its scale step in the scale ladder diagram to display colors corresponding to the pronunciation assessment result, respectively.
511. After the user finishes pronouncing the plurality of words to be pronounced, the electronic equipment detects whether the foreign language sentence to be pronounced is associated with an object to be unlocked; if so, steps 512 to 514 are executed; if not, the process is ended.
512. The electronic equipment acquires the unlocking permission parameters configured for the object to be unlocked, wherein the unlocking permission parameters at least comprise a specified number of accurately pronounced words and a specified intonation matching degree.
The unlocking permission parameters of the object to be unlocked may be configured by a wrist-worn supervision device worn by a supervisor (such as a parent) of the user of the electronic device.
513. The electronic equipment counts, according to the user's pronunciation assessment result for each word to be pronounced, the total number of accurately pronounced words among the plurality of words to be pronounced.
514. The electronic device compares whether the total number exceeds the specified number; if so, steps 515 to 516 are executed; if not, the process is ended.
515. The electronic equipment matches the picked-up intonation of the user's pronunciation of the foreign language sentence to be pronounced against the standard intonation of the sentence to obtain the intonation matching degree of the foreign language sentence to be pronounced.
516. The electronic device compares whether the intonation matching degree of the foreign language sentence to be pronounced is greater than or equal to the specified intonation matching degree; if so, step 517 is executed; if not, the process is ended.
517. The electronic equipment unlocks the object to be unlocked.
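One plausible realization of the intonation matching of steps 515 to 516, sketched below, extracts pitch contours with librosa's pYIN tracker and compares them with dynamic time warping; the conversion of the DTW cost into a matching degree in (0, 1], the file names, and the 0.8 threshold are illustrative assumptions.

    # Intonation matching sketch for steps 515-516.
    import librosa
    import numpy as np

    def intonation_match_degree(user_wav: str, standard_wav: str) -> float:
        def pitch_contour(path):
            y, sr = librosa.load(path, sr=16000)
            f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                    fmax=librosa.note_to_hz("C7"), sr=sr)
            return np.nan_to_num(f0).reshape(1, -1)  # unvoiced frames -> 0 Hz

        fu = pitch_contour(user_wav)
        fs = pitch_contour(standard_wav)
        D, wp = librosa.sequence.dtw(X=fu, Y=fs)
        cost = D[-1, -1] / len(wp)        # path-normalised pitch distance (Hz)
        return 1.0 / (1.0 + cost / 50.0)  # squash into (0, 1]

    if intonation_match_degree("user.wav", "standard.wav") >= 0.8:
        print("intonation matches: unlock the object (step 517)")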
In some application scenarios, the object to be unlocked may be an intelligent door lock installed for the indoor environment. In this application scenario, the manner in which the electronic device unlocks the object to be unlocked in step 517 may be as follows:
The electronic equipment determines the current spatial position of the user of the electronic equipment based on an indoor image shot by the internal camera of the intelligent door lock to be unlocked.
The electronic equipment can then check whether the user's current spatial position matches the three-dimensional position, relative to the internal camera of the intelligent door lock, that the user's supervisor (such as a parent) has specifically configured for that monitored object (the user), and if so, control the intelligent door lock to unlock. When the user is at the three-dimensional position configured for that monitored object relative to the internal camera of the intelligent door lock, the supervisor can directly observe the user in the indoor environment. Optionally, different monitored objects may be assigned different three-dimensional positions relative to the internal camera of the intelligent door lock. In this way, the user of the electronic equipment can only use the electronic equipment to unlock the intelligent door lock at a spatial position specifically configured by the supervisor and visible to the supervisor, so that the supervisor can intuitively know which monitored object is unlocking the intelligent door lock. This improves the visibility of the user when unlocking the intelligent door lock and helps prevent accidents (such as a child being abducted) caused by the user secretly unlocking the intelligent door lock without the supervisor's knowledge.
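The spatial check described above amounts to testing whether the user's estimated position is within a tolerance of the position configured for that monitored object; a minimal sketch, with made-up coordinates and tolerance:

    # Spatial-position check sketch: allow unlocking only when the user
    # stands at the position configured for that monitored object, relative
    # to the door lock's internal camera. Values are illustrative.
    import math

    def position_matches(user_xyz, configured_xyz,
                         tolerance_m: float = 0.5) -> bool:
        return math.dist(user_xyz, configured_xyz) <= tolerance_m

    # e.g. the child must stand about 2 m in front of the camera, where the
    # supervisor can observe them directly:
    if position_matches(user_xyz=(-0.4, 0.0, 2.1),
                        configured_xyz=(-0.5, 0.0, 2.0)):
        print("position visible to supervisor: allow unlock")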
By implementing the AR-based pronunciation guidance method described in fig. 5, the child can pay attention to the mouth shape and scale of each word pronounced in a foreign language sentence, so that the child's intonation when reading the sentence aloud is more accurate and expressive.
In addition, for a child in the indoor environment who wants to unlock the intelligent door lock and go out, the AR-based pronunciation guiding method described in fig. 5 requires that more than the specified number of words in the foreign language sentence be pronounced accurately, and that the intonation of the child's reading match the standard intonation of the sentence, which serves to motivate the child to practice accurate pronunciation of the foreign language sentence.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a first embodiment of an AR-based pronunciation guide device according to an embodiment of the present application. The AR-based pronunciation guide apparatus may include:
a first acquisition unit 601, configured to acquire a foreign language sentence to be uttered, wherein the foreign language sentence to be uttered includes a plurality of words to be uttered;
a first loading unit 602, configured to load and display the plurality of words to be pronounced in a displayed scale ladder diagram formed by splicing the scales of the plurality of words to be pronounced in their pronunciation order, wherein each word to be pronounced is displayed next to its own scale step in the ladder diagram;
A position tracking unit 603 for tracking the mouth position of the user from the displayed real-time representation of the user;
and a second loading unit 604, configured to load, in AR form, a standard pronunciation mouth shape for displaying a target word at a mouth position of the user when a certain target word of the plurality of words to be pronounced is prompted to be pronounced.
In some embodiments, the first acquisition unit 601 may capture, via an image capturing device (such as a camera), the foreign language sentence to be uttered that is selected by the user (e.g., clicked by the user) in the learning module. The learning module may be a learning page corresponding to the user (e.g., a paper learning page or an electronic learning page), or a learning chapter contained in such a learning page.
The AR-based pronunciation guide apparatus may be applied to an electronic device, for example. The electronic device may locate a learning module selected by the user's finger, a stylus, or voice, and use it as the learning module corresponding to the user. For example, the electronic device may use an image capturing device (such as a camera) to capture the learning module selected by the user's finger or stylus; alternatively, the electronic device may employ a sound pick-up device (such as a microphone) to pick up the learning module selected by the user's voice. In some embodiments, the image capturing device (such as a camera) may be disposed on a finger ring worn on the user's finger; when the ring detects that the finger wearing it is straightened, the ring may start the image capturing device to capture the learning module selected by the user's finger and transmit the captured result to the electronic device, so that the electronic device can determine the learning module corresponding to the user. By implementing this embodiment, the power consumption incurred by the electronic device itself capturing the learning module selected by the user's finger can be reduced, so that the battery endurance of the electronic device can be improved.
In other examples, the electronic device may obtain a learning module selected for the user by another external supervisory device and use it as the learning module corresponding to the user. For example, the electronic device may establish in advance a communication connection with a wrist-worn supervision device worn by a supervisor of the user (such as a classroom teacher or a parent). The supervisor holds a finger of the palm on whose wrist the supervision device is worn against the root of the ear, so that the ear canal forms a sealed sound cavity, and may then utter, at a volume below a certain threshold, a voice signal for selecting a learning module for the user; the voice signal is conducted as a vibration signal through the bone medium of the palm into the wrist-worn supervision device, which transmits it to the electronic device. In this embodiment, the user's supervisor (such as a classroom teacher or a parent) can flexibly select a learning module for the user without disturbing the people around while doing so.
The AR-based pronunciation guide apparatus described in fig. 6 at least enables a child to pay attention to the mouth shape and scale of each word pronounced in a foreign language sentence, so that the child's intonation when reading the sentence aloud is more accurate and expressive.
Referring also to fig. 7, fig. 7 is a schematic structural diagram of a second embodiment of an AR-based pronunciation guide device according to an embodiment of the present application. The AR-based pronunciation guide device shown in fig. 7 is obtained by optimizing the AR-based pronunciation guide device shown in fig. 6. In addition to the units shown in fig. 6, the AR-based pronunciation guide apparatus further includes:
a pronunciation pick-up unit 605, configured to pick up the user's pronunciation of the target word after the second loading unit 604 loads and displays the standard pronunciation mouth shape of the target word at the mouth position of the user;
a pronunciation assessment unit 606, configured to compare the user's pronunciation of the target word with the standard pronunciation of the target word to obtain the user's pronunciation assessment result for the target word;
a color control unit 607, configured to control the target word and the musical scale of the target word in the scale ladder diagram to respectively display colors corresponding to the pronunciation assessment result.
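To make the colour-feedback step concrete, the following Python sketch maps a pronunciation assessment result to display colours for a word and its adjacent musical scale; the score thresholds, colour names, and the chart object's methods are assumptions of the sketch, not values taken from the disclosure.

```python
def assessment_color(score: float) -> str:
    """Map a pronunciation score in [0.0, 1.0] to a display colour."""
    if score >= 0.8:
        return "green"   # accurate pronunciation
    if score >= 0.5:
        return "yellow"  # close to the standard pronunciation
    return "red"         # inaccurate pronunciation

def color_word_and_scale(chart, word: str, score: float) -> None:
    """Tint both the target word and its musical scale in the scale
    ladder diagram with the colour of the assessment result."""
    color = assessment_color(score)
    chart.set_word_color(word, color)   # hypothetical chart API
    chart.set_scale_color(word, color)  # the scale adjacent to the word
```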
Optionally, the pronunciation guide apparatus shown in fig. 7 further includes:
an object detection unit 608, configured to detect, after the user finishes pronouncing the plurality of words to be sounded, whether the foreign language sentence to be sounded is associated with an object to be unlocked;
a second obtaining unit 609, configured to obtain unlocking permission parameters configured for the object to be unlocked when the object detection unit 608 detects that the foreign language sentence to be sounded is associated with the object to be unlocked, wherein the unlocking permission parameters at least comprise a specified number of words with accurate pronunciation;
a number counting unit 610, configured to count, according to the user's pronunciation assessment result for each word to be sounded, the total number of accurately pronounced words among the plurality of words to be sounded;
a first comparing unit 611, configured to compare whether the total number exceeds the specified number;
an object unlocking unit 612, configured to unlock the object to be unlocked when the comparison result of the first comparing unit 611 indicates that the total number exceeds the specified number.
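The word-count gate performed by units 608 to 612 reduces to a count-and-compare, sketched below under the assumption that each per-word assessment record carries an is_accurate flag:

```python
def unlock_by_word_count(assessments, specified_number: int, unlock) -> bool:
    """Count the accurately pronounced words among the words to be sounded
    and unlock only when the total exceeds the specified number taken from
    the unlocking permission parameters."""
    total_accurate = sum(1 for a in assessments if a["is_accurate"])
    if total_accurate > specified_number:
        unlock()  # e.g. reveal the associated object to be unlocked
        return True
    return False
```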
Optionally, the unlocking permission parameters further include a specified intonation matching degree, and the pronunciation guide apparatus further includes:
an intonation matching unit 613, configured to match, after the first comparing unit 611 determines that the total number exceeds the specified number, the picked-up intonation of the user's pronunciation of the foreign language sentence to be sounded with the standard intonation of the foreign language sentence to be sounded, so as to obtain the intonation matching degree of the foreign language sentence to be sounded;
a second comparing unit 614, configured to compare whether the intonation matching degree of the foreign language sentence to be sounded is greater than or equal to the specified intonation matching degree, and if so, trigger the object unlocking unit 612 to perform the operation of unlocking the object to be unlocked.
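When the unlocking permission parameters also carry a specified intonation matching degree, the check gains a second gate. In the sketch below, match_intonation() is a hypothetical helper standing in for whatever intonation comparison the device performs:

```python
def unlock_with_intonation_gate(assessments, params, user_audio,
                                standard_intonation, match_intonation,
                                unlock) -> bool:
    """Two-stage unlock: word-accuracy count first, intonation match second."""
    total_accurate = sum(1 for a in assessments if a["is_accurate"])
    if total_accurate <= params["specified_number"]:
        return False  # first gate failed: not enough accurately pronounced words

    degree = match_intonation(user_audio, standard_intonation)
    if degree < params["specified_intonation_degree"]:
        return False  # second gate failed: intonation not close enough to standard

    unlock()
    return True
```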
Optionally, the first obtaining unit 601 includes:
an acquisition subunit 6011, configured to acquire a voice signal sent by an external supervisory device, wherein the voice signal contains the identifier of the selected learning module and the identity of a specified object;
a query subunit 6012, configured to check whether the identity of the specified object matches the identity of the user in the displayed real-time portrait of the user, and if so, query the learning module corresponding to the user according to the identifier of the selected learning module contained in the voice signal;
a detection subunit 6013, configured to detect the foreign language sentence selected by the user from the learning module as the foreign language sentence to be sounded.
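A sketch of how the three subunits could cooperate; the face_matches() helper, the message fields, and wait_for_user_selection() are hypothetical stand-ins for the matching, parsing, and selection mechanisms the disclosure leaves open:

```python
from typing import Optional

def obtain_sentence_to_sound(voice_signal: dict, live_portrait,
                             modules: dict, face_matches) -> Optional[str]:
    """Acquisition: read the selected module's identifier and the specified
    object's identity out of the supervisor's voice signal.
    Query: proceed only if the specified object is the user in the portrait.
    Detection: the sentence the user then selects becomes the sentence to sound."""
    module_id = voice_signal["module_id"]
    object_identity = voice_signal["object_identity"]

    if not face_matches(object_identity, live_portrait):
        return None  # the command was addressed to a different learner

    module = modules[module_id]
    return module.wait_for_user_selection()
```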
Referring to fig. 8, fig. 8 is a schematic structural diagram of a first embodiment of an electronic device disclosed in an embodiment of the present application. As shown in fig. 8, the electronic device may include any of the AR-based pronunciation guide apparatuses of the above embodiments.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a second embodiment of an electronic device disclosed in an embodiment of the present application. As shown in fig. 9, the electronic device may include:
a memory 901 storing executable program code; and
a processor 902 coupled to the memory 901;
wherein the processor 902 invokes the executable program code stored in the memory 901 to perform all or part of the steps of the AR-based pronunciation guide method described above.
It should be noted that, in this embodiment of the present application, the electronic device shown in fig. 9 may further include components not shown, such as a speaker module, a display screen, a light projection module, a battery module, a wireless communication module (such as a mobile communication module, a Wi-Fi module, or a Bluetooth module), a sensor module (such as a proximity sensor), an input module (such as a microphone or keys), and a user interface module (such as a charging interface, an external power supply interface, a card slot, or a wired earphone interface).
Embodiments of the present invention disclose a computer-readable storage medium having stored thereon computer instructions that, when executed, cause a computer to perform all or part of the steps of the AR-based pronunciation guide method described above.
Those of ordinary skill in the art will appreciate that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing associated hardware. The program may be stored in a computer-readable storage medium, including a Read-Only Memory (ROM), a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disk memory, magnetic disk memory, tape memory, or any other computer-readable medium that can be used to carry or store data.
The AR-based pronunciation guide method and apparatus, electronic device, and storage medium disclosed in the embodiments of the present invention are described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method and its core idea. Meanwhile, those skilled in the art may vary the specific embodiments and the scope of application in accordance with the ideas of the present invention. In view of the above, the contents of this description should not be construed as limiting the present invention.

Claims (11)

1. An AR-based pronunciation guide method, the method comprising:
acquiring a foreign language sentence to be sounded, wherein the foreign language sentence to be sounded comprises a plurality of words to be sounded;
loading and displaying the plurality of words to be sounded in a displayed scale ladder diagram formed by sequentially splicing the musical scales of the plurality of words to be sounded according to the pronunciation order of the plurality of words to be sounded, wherein any one of the words to be sounded is displayed at the musical scale adjacent to that word in the scale ladder diagram;
tracking a mouth position of a user from a displayed real-time portrait of the user;
when a certain target word among the plurality of words to be sounded is prompted to be sounded, loading and displaying a standard pronunciation mouth shape of the target word in AR form at the mouth position of the user, wherein the standard pronunciation mouth shape is located within the real-time portrait of the user;
after the user finishes pronouncing the plurality of words to be sounded, detecting whether the foreign language sentence to be sounded is associated with an object to be unlocked, wherein the object to be unlocked comprises a foreign language sentence to be unlocked;
if the foreign language sentence to be sounded is associated with the object to be unlocked, acquiring unlocking permission parameters configured for the object to be unlocked, wherein the unlocking permission parameters at least comprise a specified number of words with accurate pronunciation;
counting, according to the user's pronunciation assessment result for each word to be sounded, the total number of accurately pronounced words among the plurality of words to be sounded; and
comparing whether the total number exceeds the specified number, and if so, unlocking the object to be unlocked.
2. The pronunciation guide method of claim 1, wherein after loading and displaying the standard pronunciation mouth shape of the target word at the mouth position of the user, the method further comprises:
picking up the user's pronunciation of the target word;
comparing the user's pronunciation of the target word with the standard pronunciation of the target word to obtain the user's pronunciation assessment result for the target word; and
controlling the target word and the musical scale of the target word in the scale ladder diagram to respectively display colors corresponding to the pronunciation assessment result.
3. The pronunciation guide method of claim 2, wherein the unlocking permission parameters further comprise a specified intonation matching degree, and after comparing that the total number exceeds the specified number, the method further comprises:
matching the picked-up intonation of the user's pronunciation of the foreign language sentence to be sounded with the standard intonation of the foreign language sentence to be sounded to obtain the intonation matching degree of the foreign language sentence to be sounded; and
comparing whether the intonation matching degree of the foreign language sentence to be sounded is greater than or equal to the specified intonation matching degree, and if so, performing the step of unlocking the object to be unlocked.
4. The pronunciation guide method according to any one of claims 1 to 3, wherein the obtaining a foreign language sentence to be sounded comprises:
acquiring a voice signal sent by an external supervisory device, wherein the voice signal comprises an identifier of the selected learning module and an identity of a specified object;
checking whether the identity of the specified object matches the identity of the user in the displayed real-time portrait of the user, and if so, querying the learning module corresponding to the user according to the identifier of the selected learning module contained in the voice signal; and
detecting the foreign language sentence selected by the user from the learning module as the foreign language sentence to be sounded.
5. An AR-based pronunciation guide apparatus, the apparatus comprising:
a first obtaining unit, configured to obtain a foreign language sentence to be sounded, wherein the foreign language sentence to be sounded comprises a plurality of words to be sounded;
a first loading unit, configured to load and display the plurality of words to be sounded in a displayed scale ladder diagram formed by sequentially splicing the musical scales of the plurality of words to be sounded according to the pronunciation order of the plurality of words to be sounded, wherein any one of the words to be sounded is displayed at the musical scale adjacent to that word in the scale ladder diagram;
a position tracking unit, configured to track a mouth position of a user from a displayed real-time portrait of the user;
a second loading unit, configured to, when a certain target word among the plurality of words to be sounded is prompted to be sounded, load and display a standard pronunciation mouth shape of the target word in AR form at the mouth position of the user, wherein the standard pronunciation mouth shape is located within the real-time portrait of the user;
an object detection unit, configured to detect, after the user finishes pronouncing the plurality of words to be sounded, whether the foreign language sentence to be sounded is associated with an object to be unlocked, wherein the object to be unlocked comprises a foreign language sentence to be unlocked;
a second obtaining unit, configured to obtain unlocking permission parameters configured for the object to be unlocked when the object detection unit detects that the foreign language sentence to be sounded is associated with the object to be unlocked, wherein the unlocking permission parameters at least comprise a specified number of words with accurate pronunciation;
a number counting unit, configured to count, according to the user's pronunciation assessment result for each word to be sounded, the total number of accurately pronounced words among the plurality of words to be sounded;
a first comparing unit, configured to compare whether the total number exceeds the specified number; and
an object unlocking unit, configured to unlock the object to be unlocked when the comparison result of the first comparing unit indicates that the total number exceeds the specified number.
6. The pronunciation guide apparatus of claim 5, further comprising:
a pronunciation pick-up unit, configured to pick up the user's pronunciation of the target word after the second loading unit loads and displays the standard pronunciation mouth shape of the target word at the mouth position of the user;
a pronunciation assessment unit, configured to compare the user's pronunciation of the target word with the standard pronunciation of the target word to obtain the user's pronunciation assessment result for the target word; and
a color control unit, configured to control the target word and the musical scale of the target word in the scale ladder diagram to respectively display colors corresponding to the pronunciation assessment result.
7. The pronunciation guide apparatus of claim 6, wherein the unlocking permission parameters further comprise a specified intonation matching degree, the pronunciation guide apparatus further comprising:
an intonation matching unit, configured to match, after the first comparing unit determines that the total number exceeds the specified number, the picked-up intonation of the user's pronunciation of the foreign language sentence to be sounded with the standard intonation of the foreign language sentence to be sounded, so as to obtain the intonation matching degree of the foreign language sentence to be sounded; and
a second comparing unit, configured to compare whether the intonation matching degree of the foreign language sentence to be sounded is greater than or equal to the specified intonation matching degree, and if so, trigger the object unlocking unit to perform the operation of unlocking the object to be unlocked.
8. The pronunciation guide apparatus according to any one of claims 5 to 7, wherein the first obtaining unit comprises:
an acquisition subunit, configured to acquire a voice signal sent by an external supervisory device, wherein the voice signal comprises an identifier of the selected learning module and an identity of a specified object;
a query subunit, configured to check whether the identity of the specified object matches the identity of the user in the displayed real-time portrait of the user, and if so, query the learning module corresponding to the user according to the identifier of the selected learning module contained in the voice signal; and
a detection subunit, configured to detect the foreign language sentence selected by the user from the learning module as the foreign language sentence to be sounded.
9. An electronic device comprising the pronunciation guide apparatus according to any one of claims 5 to 7.
10. An electronic device, comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform all or part of the steps of the pronunciation guide method of any one of claims 1 to 4.
11. A computer readable storage medium having stored thereon computer instructions which, when executed, cause a computer to perform all or part of the steps of the pronunciation guide method of any of claims 1 to 4.
CN202010414224.2A 2020-05-15 2020-05-15 AR-based pronunciation guide method and device, electronic equipment and storage medium Active CN111638781B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010414224.2A CN111638781B (en) 2020-05-15 2020-05-15 AR-based pronunciation guide method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111638781A CN111638781A (en) 2020-09-08
CN111638781B true CN111638781B (en) 2024-03-19

Family

ID=72330225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010414224.2A Active CN111638781B (en) 2020-05-15 2020-05-15 AR-based pronunciation guide method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111638781B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11272154A (en) * 1998-03-18 1999-10-08 Nobuyoshi Nakamura Storage medium for conversation teaching material
KR20030081891A (en) * 2002-04-15 2003-10-22 윤춘호 Computer network-based interactive multimedia learning system and method thereof
CN101727765A (en) * 2009-11-03 2010-06-09 无敌科技(西安)有限公司 Face simulation pronunciation system and method thereof
CN103136972A (en) * 2011-11-21 2013-06-05 学习时代公司 Computer-based language immersion teaching for young learners
KR20140132255A (en) * 2013-05-07 2014-11-17 정상기 Unlocking method for smart device with learning foreign word
KR20140133056A (en) * 2013-05-09 2014-11-19 중앙대학교기술지주 주식회사 Apparatus and method for providing auto lip-synch in animation
CN105070118A (en) * 2015-07-30 2015-11-18 广东小天才科技有限公司 Method of correcting pronunciation aiming at language class learning and device of correcting pronunciation aiming at language class learning
CN106530858A (en) * 2016-12-30 2017-03-22 武汉市马里欧网络有限公司 AR-based Children's English learning system and method
CN107564341A (en) * 2017-08-08 2018-01-09 广东小天才科技有限公司 A kind of character teaching method and user terminal based on user terminal
KR101822026B1 (en) * 2016-08-31 2018-01-26 주식회사 뮤엠교육 Language Study System Based on Character Avatar
CN108769535A (en) * 2018-07-04 2018-11-06 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and computer equipment
CN109064799A (en) * 2018-08-31 2018-12-21 苏州竹原信息科技有限公司 A kind of Language Training system and method based on virtual reality
CN109637286A (en) * 2019-01-16 2019-04-16 广东小天才科技有限公司 A kind of Oral Training method and private tutor's equipment based on image recognition
CN110166787A (en) * 2018-07-05 2019-08-23 腾讯数码(天津)有限公司 Augmented reality data dissemination method, system and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10679396B2 (en) * 2017-07-13 2020-06-09 Visyn Inc. Holographic multi avatar training system interface and sonification associative training

Also Published As

Publication number Publication date
CN111638781A (en) 2020-09-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant