CN112185187B - Learning method and intelligent device for social language - Google Patents

Learning method and intelligent device for social language

Info

Publication number
CN112185187B
CN112185187B
Authority
CN
China
Prior art keywords
target
learning
social
video
user
Prior art date
Legal status
Active
Application number
CN201910596768.2A
Other languages
Chinese (zh)
Other versions
CN112185187A (en)
Inventor
杨昊民
Current Assignee
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd filed Critical Guangdong Genius Technology Co Ltd
Priority to CN201910596768.2A priority Critical patent/CN112185187B/en
Publication of CN112185187A publication Critical patent/CN112185187A/en
Application granted granted Critical
Publication of CN112185187B publication Critical patent/CN112185187B/en

Classifications

    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 5/00 - Electrically-operated educational appliances
    • G09B 5/06 - Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B 5/065 - Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 - Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q 50/10 - Services
    • G06Q 50/20 - Education
    • G06Q 50/205 - Education administration or guidance
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 7/00 - Electrically-operated teaching apparatus or devices working with questions and answers
    • G09B 7/02 - Electrically-operated teaching apparatus or devices working with questions and answers of the type wherein the student is expected to construct an answer to the question which is presented or wherein the machine gives an answer to the question presented by a student
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/225 - Feedback of the input speech

Abstract

The invention belongs to the field of intelligent products and discloses a learning method and an intelligent device for social language. The method comprises the following steps: acquiring a target social scene selected by a user and a target role in that scene; selecting a target learning video set corresponding to the target social scene from a pre-constructed video library; selecting, within the target learning video set, a target learning video corresponding to the target role and acquiring the standard audio of the target role, the target learning video itself not containing the target role's audio information; playing the target learning video; collecting reply information input by the user while the video plays; and judging, according to the standard audio, whether the reply information is correct, and issuing a prompt message if it is not. By playing the corresponding learning videos, the method and device let the user practice and learn social language in a specific scene and a specific role, improving the user's social language ability.

Description

Learning method and intelligent device for social language
Technical Field
The invention belongs to the technical field of intelligent products, and particularly relates to a social language learning method and an intelligent device.
Background
In modern society, with the rapid development of the economy, people interact frequently, social language has become increasingly important, and good language expression ability is regarded as an essential skill of modern people. For children, social language ability matters just as much. Cultivating it from an early age helps children adapt to different environments and coordinate their relationships with other people and with the group, and it also benefits their physical and mental health.
With the rapid development of intelligent terminals and network technology, more and more language learning devices have emerged, such as repeaters and point readers. However, these devices are all aimed at learning foreign languages such as English; at present there is no language learning device that guides children in learning social language, so children's social language ability cannot be well cultivated and improved.
Disclosure of Invention
The invention aims to provide a social language learning method and an intelligent device that, by playing a learning video, let a user practice and learn social language in a specific scene and a specific role, so as to improve the user's social language ability.
The technical scheme provided by the invention is as follows:
in one aspect, a social language learning method is provided, including:
acquiring a target social scene selected by a user and a target role in the target social scene;
selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
selecting a target learning video corresponding to the target role in the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
playing the target learning video;
when the target learning video is played, acquiring reply information input by a user;
and judging whether the reply information is correct or not according to the standard audio, and if not, sending a prompt message.
Further preferably, the video library is constructed by the following method:
obtaining a conversation video corresponding to each social scene;
deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and constructing the video library according to the learning video set corresponding to each social scene.
Further preferably, the method further comprises the following steps:
when the reply information is incorrect, determining the playing time of a place with a reply error in the conversation video according to the standard audio;
and controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Further preferably, the determining, according to the standard audio, whether the reply message is correct specifically includes:
identifying semantics of the reply information;
identifying semantics of the standard audio;
comparing the semantics of the reply information with the semantics of the standard audio;
when the semantics of the reply information is the same as the semantics of the standard audio, the reply information is correct; otherwise, the reply message is incorrect.
Further preferably, the method further comprises the following steps:
scoring the reply information according to the judgment result;
and pushing a learning video set corresponding to the social scene to the user according to the score.
In another aspect, a smart device is also provided, including:
the acquisition module is used for acquiring a target social scene selected by a user and a target role in the target social scene;
the searching module is used for selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
the searching module is further configured to select a target learning video corresponding to the target role in the target learning video set, and acquire a standard audio of the target role in the target learning video, where the target learning video does not include audio information of the target role;
the playing module is used for playing the target learning video;
the collection module is used for collecting reply information input by the user while the target learning video is played;
the judging module is used for judging whether the reply information is correct or not according to the standard audio;
and the prompt module is used for sending out prompt information when the reply information is incorrect.
Further preferably, the device further comprises a construction module;
the building module comprises:
the acquisition unit is used for acquiring the conversation videos corresponding to the social scenes;
the processing unit is used for deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
the generating unit is used for generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and the storage unit is used for constructing the video library according to the learning video set corresponding to each social scene.
Further preferably, the method further comprises the following steps:
the determining module is used for determining the playing time of a place with wrong reply in the conversation video according to the standard audio when the reply information is incorrect;
and the control module is used for controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Further preferably, the judging module includes:
the identification unit is used for identifying the semantics of the reply information;
the recognition unit is used for recognizing the semantics of the standard audio;
the comparison unit is used for comparing the semantics of the reply information with the semantics of the standard audio;
the judging unit is used for judging that the reply information is correct when the semantics of the reply information is the same as the semantics of the standard audio; otherwise, judging that the reply information is incorrect.
Further preferably, the method further comprises the following steps:
the scoring module is used for scoring the reply information according to the judgment result;
and the pushing module is used for pushing a learning video set corresponding to the social scene to the user according to the scores.
Compared with the prior art, the social language learning method and the intelligent device provided by the invention have the following beneficial effects: according to the method and the device, the user can have a conversation with other roles in the learning video through the playing of the learning video, so that the user can practice and learn the social language of a specific scene and a specific role, and when the user replies an error, the user is reminded, so that the user can know the place where the error is replied and think how to reply, and the social language ability of the user is improved.
Drawings
The above features, technical features, advantages and implementations of the social language learning method and the intelligent device will be further described below in a clearly understandable manner, with reference to the accompanying drawings and preferred embodiments.
FIG. 1 is a flow diagram illustrating an embodiment of a social language learning method of the present invention;
FIG. 2 is a flow chart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 3 is a flow chart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 4 is a flow chart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 6 is a block diagram illustrating the structure of one embodiment of an intelligent device of the present invention.
Description of the reference numerals
10. an acquisition module; 20. a search module; 30. a playing module; 40. a collection module; 50. a judgment module; 60. a prompt module.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will be made with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, other drawings and embodiments can be derived from them without inventive effort.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention, and they do not represent the actual structure as a product. In addition, in order to make the drawings concise and understandable, components having the same structure or function in some of the drawings are only schematically illustrated or only labeled. In this document, "one" means not only "only one" but also a case of "more than one".
Fig. 1 is a flowchart of an embodiment of a social language learning method provided in the present invention, where the social language learning method includes:
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s600, judging whether the reply message is correct or not according to the standard audio, and if not, sending a prompt message.
Specifically, the social language learning method can be applied to various intelligent terminals, such as a family education machine, a tablet computer, an intelligent desk lamp and the like.
When children need to learn social language, a target social scene needing to be learned and a target role in the target social scene can be selected on the intelligent terminal.
A social scene refers to a conversation with a particular person in a particular setting. For example: the dialogue scene of asking a teacher about a problem; the dialogue scene of asking a friend for help while studying; the conversation scene of sharing a toy with another child; the conversation scene of playing with friends; the conversation scene of greeting an acquaintance met on the road; and so on.
Each conversation scene includes a plurality of roles: two roles form a one-to-one conversation, while three or more roles form a multi-person conversation. The user can select any one of the roles as the target role for social language learning. Of course, to better cultivate and improve the user's social language ability, it is better to select a role consistent with the user's own identity. For example, in a conversation scene with a teacher, when the user is a child, the child role is preferably selected, so that the user learns and masters social language consistent with his or her own identity.
After the intelligent terminal obtains the target social scene selected by the user and the target role in that scene, it can select the target learning video set corresponding to the target social scene from the pre-constructed video library. Each social scene corresponds to one or more learning video sets; that is, the same social scene may admit several ways of expression. When the target social scene corresponds to a plurality of learning video sets, if the user has learned none of them, one is selected at random as the target learning video set; if the user has already learned some of them, a set the user has not yet learned can be selected at random as the target learning video set.
The learning video set includes a plurality of learning videos, each of which is used to learn the social language of one role. After the target learning video set is selected, the target learning video corresponding to the user-selected target role is chosen from the learning videos in the set, and the target learning video is then played.
The target learning video does not contain the audio information of the target role; it contains only the audio information of the other roles. While the target learning video plays, the dialogue audio of the other roles is played normally, and whenever it is the target role's turn to answer, the user inputs reply information, so that the user converses with the other roles in the video as the target role. The dialogue of the other roles both defines the social scene and guides the user's replies, leading the user to respond and communicate.
For example, suppose the target learning video includes a role A and a role B and the user has selected role A as the target role. The video pictures of both roles play normally, the audio of role B is played, and the user, guided by role B's dialogue, inputs reply information in answer to role B's audio.
When the user inputs reply information as role A, the reply information is collected, and whether it is correct is judged according to role A's standard audio. If the reply is correct, praise information is output; if it is incorrect, prompt information is output to point out where the answer went wrong, so that the user thinks about how to reply and the user's social language ability improves.
In this method, by playing the learning video, the user converses with the other roles in it, practicing and learning the social language of a specific scene and a specific role. When the user replies incorrectly, the user is reminded, learns where the reply went wrong, and thinks about how to reply, which improves the user's social language ability. At the same time, the method reveals which social language the user has mastered and which not, so that the user can reinforce the weak points.
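The flow of steps S100 through S600 can be sketched as a short Python session loop. Everything here is an illustrative assumption rather than the patented implementation: a learning video set is modelled as a plain dict mapping each role to its list of standard lines, and `get_reply` and `judge` stand in for the terminal's audio capture and comparison logic.

```python
import random


def select_target_video_set(video_sets, learned_indices):
    """S200: when a scene has several learning video sets, randomly pick one the
    user has not studied yet; fall back to any set once all have been learned."""
    unlearned = [s for i, s in enumerate(video_sets) if i not in learned_indices]
    return random.choice(unlearned or video_sets)


def run_session(video_sets, target_role, get_reply, judge, learned_indices=()):
    """Sketch of S300-S600. Each video set is modelled as {role: list of that
    role's standard lines}; get_reply collects the user's reply for one turn."""
    video_set = select_target_video_set(video_sets, set(learned_indices))
    standard_lines = video_set[target_role]   # S300: target role's standard audio
    error_turns = []
    for turn, expected in enumerate(standard_lines):
        reply = get_reply(turn)               # S500: collect the user's reply
        if not judge(reply, expected):        # S600: judge against the standard
            error_turns.append(turn)          # a prompt message would fire here
    return error_turns
```

The returned list of erroneous turns is what the prompt and replay steps of the later embodiments would consume.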
Fig. 2 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
s010 obtains conversation videos corresponding to the social scenes;
s020 deleting audio information of a current role from a conversation video corresponding to a current social scene to form a learning video corresponding to the current role;
s030, generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
s040, constructing a video library according to the learning video sets corresponding to the social scenes;
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s600, judging whether the reply message is correct or not according to the standard audio, and if not, sending a prompt message.
Specifically, the video library includes the learning video sets corresponding to the social scenes, and each social scene corresponds to one or more learning video sets. The learning video set of each social scene is constructed as follows: the conversation videos corresponding to each social scene are obtained first, and several conversation videos may be obtained for one scene. A conversation video may be shot with real actors following standard social expression, or produced as an animation following standard social expression.
The audio information of each role is extracted from the conversation video, and deleting one role's audio generates that role's learning video. For example, if the conversation video includes a role A and a role B, deleting role A's audio from it generates role A's learning video, and deleting role B's audio generates role B's learning video; the two learning videos form a learning video set. If the conversation video includes three roles, three learning videos are generated, and these three likewise form a learning video set.
A learning video set is generated for each social scene from its conversation videos, and all the generated sets are stored in a newly created video library, completing its construction. An association between each learning video set and its social scene is established in the library, so that after the user's target social scene is obtained, the corresponding target learning video set can be looked up through this association.
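The construction steps S010 to S040 can be sketched as follows. The `LearningVideo` model and the representation of a conversation video as a role-to-audio-track dict are assumptions made for illustration; real audio editing is reduced here to deleting one key.

```python
from dataclasses import dataclass


@dataclass
class LearningVideo:
    """A conversation video with one role's audio track deleted (illustrative)."""
    scene: str
    muted_role: str        # the role the learner will voice
    remaining_audio: dict  # role -> audio track for every other role


def build_video_library(conversation_videos):
    """Sketch of S010-S040: conversation_videos maps each social scene to a
    list of dialogue videos, each modelled as {role: audio track}."""
    library = {}
    for scene, videos in conversation_videos.items():
        for role_audio in videos:
            # S020: delete each role's audio in turn to form its learning video
            video_set = [
                LearningVideo(scene, role,
                              {r: a for r, a in role_audio.items() if r != role})
                for role in role_audio
            ]
            # S030/S040: group the per-role videos into a set stored under the scene
            library.setdefault(scene, []).append(video_set)
    return library
```

Looking up a target social scene in the returned dict corresponds to the association between scenes and learning video sets described above.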
Fig. 3 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
s010 obtains conversation videos corresponding to the social scenes;
s020 deleting audio information of a current role from a conversation video corresponding to a current social scene to form a learning video corresponding to the current role;
s030, generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
s040, constructing a video library according to the learning video sets corresponding to the social scenes;
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s600, judging whether the reply message is correct or not according to the standard audio, and if not, sending a prompt message;
s710, when the reply message is incorrect, determining the playing time of a place with a reply error in the conversation video according to the standard audio;
s720, controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Specifically, the user may reply many times while conversing with the other roles in the target learning video, so the reply information may include multiple sentences. Each sentence of the reply information is compared with the corresponding sentence in the standard audio to judge whether any place is incorrect; if there is one, the playing time of the erroneous place in the conversation video corresponding to the target social scene is determined from the standard audio, the playing time comprising a playing start time and a playing end time.
The learning video set corresponding to the target social scene may also store the conversation video corresponding to that scene. The conversation video is controlled to play according to the determined playing time, so that the correct reply corresponding to the erroneous one is played for the user to imitate; for children in the lower grades, imitation is itself a very important part of learning.
When the user replies incorrectly, only the correct reply corresponding to the erroneous reply is played. This saves the user's time while still allowing imitation learning, improving children's social language ability and helping them communicate better with others.
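Steps S710 and S720 can be sketched like this, under the assumption that the library stores, for each of the target role's turns, the start and end playing times of the corresponding line in the full conversation video; `player` is a hypothetical playback callback, not an API from the patent.

```python
def segments_to_replay(turn_timings, error_turns):
    """Sketch of S710: turn_timings is assumed metadata giving, for each turn
    of the target role, the (start, end) playing time of the correct line in
    the full conversation video; only the erroneous turns are selected."""
    return [turn_timings[turn] for turn in sorted(error_turns)]


def play_correct_replies(player, turn_timings, error_turns):
    """Sketch of S720: play only the correct-reply segments, skipping the
    turns the user already answered correctly."""
    for start, end in segments_to_replay(turn_timings, error_turns):
        player(start, end)
```

Because only the erroneous windows are passed to the player, the correctly answered portions of the conversation video are never replayed, matching the time-saving behaviour described above.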
Fig. 4 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s610, recognizing the semantics of the reply information;
s620, identifying the semantics of the standard audio;
s630, comparing the semantics of the reply message with the semantics of the standard audio;
s640, when the semantics of the reply information is the same as the semantics of the standard audio, the reply information is correct; otherwise, the reply message is incorrect;
and S650, if incorrect, sending a prompt message.
Specifically, to judge whether the reply information is correct, it is compared with the standard audio: the semantics of the reply information and the semantics of the standard audio are recognized and then compared. When the two semantics are the same, the reply information is correct; when they differ, it is incorrect.
When the reply information contains several sentences, the semantics of each sentence is compared with the semantics of the corresponding sentence in the standard audio, to judge which sentences were answered correctly and which not. For an incorrect reply, prompt information is sent to remind the user; afterwards, the voice information input anew by the user, which is the user's renewed reply at the erroneous place, is collected again, and whether it is correct is judged against the standard audio of that place. If the user still fails after several attempts, the correct reply corresponding to the erroneous one in the conversation video can be played; only that part is played, and the places already answered correctly are skipped. Of course, if the user wishes to watch the complete conversation video, it is played.
In this embodiment, whether the reply message is correct or not is determined according to the semantics of the reply message and the semantics of the standard audio, so that the situation that the semantics are the same but are determined incorrectly due to different characters is prevented, and the accuracy of determination is further improved.
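The semantic comparison of steps S630 and S640 can be sketched as follows. The bag-of-words representation, the Jaccard similarity, and the `judge_reply` threshold are stand-in assumptions: the embodiment does not fix a particular semantic recognition model, and a real system would use speech recognition plus a trained semantic model.

```python
def semantics(text):
    """Stand-in semantic representation: a normalized bag of words.
    A production system would use ASR plus a semantic model instead."""
    return frozenset(text.lower().split())

def judge_reply(reply_text, standard_text, threshold=0.8):
    """Return True when the reply's semantics match the standard audio's.
    `threshold` is a hypothetical similarity cut-off (S640)."""
    r, s = semantics(reply_text), semantics(standard_text)
    if not r or not s:
        return False
    similarity = len(r & s) / len(r | s)  # Jaccard similarity as a proxy
    return similarity >= threshold

print(judge_reply("thank you teacher", "Thank you teacher"))  # True
print(judge_reply("go away", "thank you teacher"))            # False
```

Because the comparison is semantic rather than literal, a reply worded differently from the standard audio can still be accepted when its meaning matches.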
Fig. 5 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
S100, acquiring a target social scene selected by a user and a target role in the target social scene;
S200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
S300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
S400, playing the target learning video;
S500, collecting reply information input by a user when the target learning video is played;
S600, judging whether the reply information is correct or not according to the standard audio, and if not, sending a prompt message;
S810, scoring the reply information according to the judgment result;
S820, pushing a learning video set of the corresponding social scene to the user according to the score.
Specifically, after whether the reply information is correct has been judged against the standard audio, the reply information can be scored according to the judgment result. For example, if the reply information contains five sentences, of which three are replied correctly and two incorrectly, the score of the reply information is 60.
A higher score indicates that the user has mastered the social language of the social scene; a lower score indicates that the user has not yet mastered it. If the user has mastered the social language of the scene, a learning video set of a more difficult social scene can be pushed to the user; if not, other learning video sets corresponding to the same social scene, or learning video sets of social scenes associated with it, can be pushed instead.
In this embodiment, the learning video set pushed to the user is chosen according to the score of the reply information, so that the pushed set better matches the user's current level and better helps the user learn the social language.
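Steps S810 and S820 can be sketched as below. The percentage-based score matches the example above (three of five sentences correct gives 60); the pass mark of 80 and the function names are illustrative assumptions, since the embodiment does not specify a mastery threshold.

```python
def score_reply(sentence_results):
    """Score as the percentage of correctly replied sentences (S810).
    sentence_results: one boolean per sentence of the reply information."""
    if not sentence_results:
        return 0
    return round(100 * sum(sentence_results) / len(sentence_results))

def pick_next_set(score, harder_scene_sets, same_scene_sets, pass_mark=80):
    """Push a harder scene's sets when mastered, otherwise more sets for
    the same (or an associated) scene (S820). `pass_mark` is assumed."""
    return harder_scene_sets if score >= pass_mark else same_scene_sets

# Three of five sentences correct -> score 60, as in the example above.
score = score_reply([True, True, True, False, False])
print(score)                                             # 60
print(pick_next_set(score, ["harder"], ["same-scene"]))  # ['same-scene']
```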
It should be understood that, in the foregoing embodiments, the sequence numbers of the steps do not mean the execution sequence, and the execution sequence of the steps should be determined by functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
Fig. 6 is a schematic block diagram of a structure of an embodiment of an intelligent device provided in the present invention, the intelligent device including:
the acquiring module 10 is configured to acquire a target social scene selected by a user and a target role in the target social scene;
the searching module 20 is configured to select a target learning video set corresponding to the target social scene from a pre-constructed video library, where the target learning video set includes a plurality of learning videos for learning social languages of different roles;
the searching module 20 is further configured to select a target learning video corresponding to the target role in the target learning video set, and acquire a standard audio of the target role in the target learning video, where the target learning video does not include audio information of the target role;
the playing module 30 is used for playing the target learning video;
the acquisition module 40 is used for acquiring reply information input by a user when the target learning video is played;
the judging module 50 is configured to judge whether the reply message is correct according to the standard audio;
and the prompt module 60 is configured to send a prompt message when the reply message is incorrect.
Specifically, the intelligent device of the invention can be various intelligent terminals, such as a family education machine, a tablet computer, an intelligent desk lamp and the like.
When children need to learn social language, a target social scene needing to be learned and a target role in the target social scene can be selected on the intelligent terminal.
A social scene refers to a conversation with a particular kind of person in a particular setting. Examples include: a conversation scene when asking a teacher about a problem; a conversation scene when asking a friend for help while studying; a conversation scene when sharing a toy with another child; a conversation scene when playing with friends; a conversation scene when meeting an acquaintance on the road; a conversation scene when making a request of a teacher; and so on.
Each conversation scene includes a plurality of roles: two roles form a one-to-one conversation, while three or more roles form a multi-person conversation. The user can select any one of the roles as the target role for social language learning. Of course, to better develop and improve the user's social language ability, it is preferable to select a role consistent with the user's own identity. For example, in a conversation scene with a teacher, when the user is a child, it is preferable to select the child role, so that the user learns and masters social language consistent with his or her own identity.
After the intelligent terminal obtains the target social scene selected by the user and the target role in that scene, a target learning video set corresponding to the target social scene can be selected from a pre-constructed video library. Each social scene corresponds to one or more learning video sets; that is, the same social scene may have multiple forms of expression. When the target social scene corresponds to a plurality of learning video sets, if the user has learned none of them, one is selected at random as the target learning video set; if the user has already learned some of them, one of the sets the user has not yet learned can be selected at random as the target learning video set.
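The selection logic just described can be sketched as follows. The data layout, a scene keyed to a list of `(set_id, videos)` pairs, is a hypothetical representation chosen for illustration.

```python
import random

def pick_target_set(video_library, scene, learned_ids):
    """Select a target learning video set for a scene (S200).
    video_library maps scene -> list of (set_id, video_set) pairs;
    learned_ids contains the set ids the user has already studied."""
    candidates = video_library[scene]
    unlearned = [c for c in candidates if c[0] not in learned_ids]
    # Prefer a set the user has not yet learned; fall back to any set.
    return random.choice(unlearned or candidates)

library = {"ask-teacher": [("s1", ["video_a1", "video_b1"]),
                           ("s2", ["video_a2", "video_b2"])]}
set_id, videos = pick_target_set(library, "ask-teacher", learned_ids={"s1"})
print(set_id)  # s2 -- the only unlearned set
```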
The learning video set includes a plurality of learning videos, each of which may be used to learn a social language of a character. After the target learning video set is selected, according to the target role selected by the user, the corresponding target learning video is selected from the plurality of learning videos in the target learning video set, and then the target learning video is played.
The target learning video does not contain the audio information of the target character; it contains only the audio information of the other characters. When the target learning video is played, the conversation audio of the other characters is played normally, and whenever the target character is due to answer, the user inputs reply information, so that the user converses with the other characters in the video in the target character's place. The conversation information of the other characters both delimits the social scene and guides the user's replies, leading the user to respond and communicate.
For example, it is assumed that the target learning video includes a character a and a character B, the target character selected by the user is the character a, the video pictures of the character a and the character B are normally played in the target learning video, and the audio information of the character B is played, and the user inputs the reply information for the audio of the character B under the guidance of the dialog of the character B.
When the user inputs reply information as character A, the reply information is collected and judged against character A's standard audio. If the reply information is correct, praise information is output; if it is incorrect, prompt information is output to point out where the answer went wrong, prompting the user to think about how to reply and thereby improving the user's social language ability.
According to this method, by playing the learning video the user can converse with its other roles, practising and learning the social language of a specific scene and a specific role. When the user replies incorrectly, a reminder lets the user know where the error lies and think about how to reply, improving the user's social language ability. At the same time, it can be judged which social language the user has mastered and which not, so that learning can be further reinforced.
Preferably, the system further comprises a construction module; the construction module comprises:
the acquisition unit is used for acquiring the conversation videos corresponding to the social scenes;
the processing unit is used for deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
the storage unit is used for generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and the storage unit is used for constructing the video library according to the learning video set corresponding to each social scene.
Specifically, the video library includes the learning video sets corresponding to the social scenes, each social scene corresponding to one or more learning video sets. The learning video set for each social scene is constructed as follows: first, the conversation videos corresponding to each social scene are obtained, and several conversation videos may be obtained per scene. A conversation video may be shot with real actors according to standard social expression, or produced as an animation according to standard social expression.
The audio information of each character is extracted from the conversation video, and deleting the audio information of one character yields that character's learning video. For example, if the conversation video includes characters A and B, deleting character A's audio generates character A's learning video, and deleting character B's audio generates character B's learning video; together the two form a learning video set. If the conversation video includes three roles, three learning videos can be generated, which likewise form a learning video set.
A learning video set is generated for each social scene from its conversation videos, and all generated learning video sets are then stored in a newly created video library, completing its construction. An association between each learning video set and its social scene is also established in the video library, so that once the user's target social scene is obtained, the corresponding target learning video set can be found through this association.
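The construction steps above can be sketched as below. Audio tracks are represented as simple dictionaries for illustration only; a real implementation would operate on actual video and audio streams.

```python
def build_video_library(conversation_videos):
    """Build the video library: for every conversation video, derive one
    learning video per role by deleting that role's audio track.
    conversation_videos maps scene -> list of {role: audio_track} dicts."""
    library = {}
    for scene, videos in conversation_videos.items():
        video_sets = []
        for audio_by_role in videos:
            learning_set = {}
            for role in audio_by_role:
                # The learning video for `role` keeps every track but role's.
                learning_set[role] = {r: a for r, a in audio_by_role.items()
                                      if r != role}
            video_sets.append(learning_set)
        library[scene] = video_sets
    return library

convo = {"share-toy": [{"A": "audio_A", "B": "audio_B"}]}
lib = build_video_library(convo)
print(lib["share-toy"][0]["A"])  # {'B': 'audio_B'} -- role A's audio removed
```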
Preferably, the method further comprises the following steps:
the determining module is used for determining the playing time of a place with wrong reply in the conversation video according to the standard audio when the reply information is incorrect;
and the control module is used for controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Specifically, the user may reply several times while conversing with the other characters in the target learning video, so the reply information may include a plurality of sentences. Each sentence of the reply information is compared with the corresponding sentence in the standard audio to judge whether any part is incorrect; if so, the playing time of the erroneous part within the conversation video corresponding to the target social scene is determined from the standard audio, the playing time including a playing start time and a playing end time.
A conversation video corresponding to the target social scene can also be stored in the learning video set corresponding to the target social scene. The conversation video is then controlled to play according to the determined playing time, so that the correct reply corresponding to the incorrect reply is played for the user to imitate and learn from.
When the user replies incorrectly, playing only the correct reply corresponding to the error saves the user's time while still allowing imitation learning, improving the child's social language ability for better communication with others.
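The segment-replay behaviour can be sketched as follows. The sentence-indexed timing table is an assumed representation of the playing start and end times derived from the standard audio; the optional full-video fallback mirrors the case where the user asks to see the complete conversation video.

```python
def replay_segments(wrong_sentences, standard_timing, full_video=False):
    """Return the (start, end) spans of the conversation video to replay.
    standard_timing maps sentence index -> (start_s, end_s) in the video,
    derived from the standard audio; wrong_sentences lists the indices
    the user answered incorrectly."""
    if full_video:  # user asked for the whole conversation video
        spans = sorted(standard_timing.values())
        return [(spans[0][0], spans[-1][1])]
    # Play only the correct replies corresponding to the wrong answers.
    return [standard_timing[i] for i in sorted(wrong_sentences)]

timing = {0: (0.0, 2.5), 1: (2.5, 6.0), 2: (6.0, 9.0)}
print(replay_segments({1}, timing))        # [(2.5, 6.0)]
print(replay_segments({1}, timing, True))  # [(0.0, 9.0)]
```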
Preferably, the judging module 50 includes:
the identification unit is used for identifying the semantics of the reply information;
the recognition unit is used for recognizing the semantics of the standard audio;
the comparison unit is used for comparing the semantics of the reply information with the semantics of the standard audio;
the judging unit is used for judging that the reply information is correct when the semantics of the reply information is the same as the semantics of the standard audio; otherwise, judging that the reply information is incorrect.
Specifically, to judge whether the reply information is correct by comparing it with the standard audio, the semantics of the reply information and the semantics of the standard audio are first identified. The two are then compared: when the semantics of the reply information is the same as that of the standard audio, the reply information is correct; when the semantics differ, the reply information is incorrect.
When the reply information contains a plurality of sentences, the semantics of each sentence is compared with the semantics of the corresponding sentence in the standard audio, so as to judge which sentences were replied correctly and which incorrectly. For an incorrect reply, prompt information can be sent to remind the user; after the reminder, the voice information input by the user can be collected again, this re-input being the user's renewed reply at the place of the error, and whether it is correct is again judged against the standard audio at that place. When the user still fails to reply correctly after several attempts, the correct reply corresponding to the erroneous reply can be played from the conversation video; during this playback, only the part corresponding to the erroneous reply is played, and the parts that were answered correctly are skipped. Of course, if the user wishes to see the complete conversation video, it is played.
In this embodiment, whether the reply information is correct is determined from the semantics of the reply information and of the standard audio. This prevents a reply whose wording differs from the standard audio but whose meaning is the same from being judged incorrect, and thus improves the accuracy of the judgment.
Preferably, the method further comprises the following steps:
the scoring module is used for scoring the reply information according to the judgment result;
and the pushing module is used for pushing a learning video set corresponding to the social scene to the user according to the scores.
Specifically, after whether the reply information is correct has been judged against the standard audio, the reply information can be scored according to the judgment result. For example, if the reply information contains five sentences, of which three are replied correctly and two incorrectly, the score of the reply information is 60.
A higher score indicates that the user has mastered the social language of the social scene; a lower score indicates that the user has not yet mastered it. If the user has mastered the social language of the scene, a learning video set of a more difficult social scene can be pushed to the user; if not, other learning video sets corresponding to the same social scene, or learning video sets of social scenes associated with it, can be pushed instead.
In this embodiment, the learning video set pushed to the user is chosen according to the score of the reply information, so that the pushed set better matches the user's current level and better helps the user learn the social language.
It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (6)

1. A method for learning social language, comprising:
acquiring a target social scene selected by a user and a target role in the target social scene;
selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
selecting a target learning video corresponding to the target role in the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
playing the target learning video;
when the target learning video is played, acquiring reply information input by a user, wherein the reply information is used for making the user as a target role to have a conversation with other roles in the target learning video;
judging whether the reply information is correct or not according to the standard audio, and if not, sending a prompt message;
the construction method of the video library comprises the following steps:
obtaining a conversation video corresponding to each social scene;
deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
constructing the video library according to the learning video set corresponding to each social scene;
the judging whether the reply message is correct according to the standard audio specifically comprises:
identifying semantics of the reply information;
identifying semantics of the standard audio;
comparing the semantics of the reply information with the semantics of the standard audio;
when the semantics of the reply information is the same as the semantics of the standard audio, the reply information is correct; otherwise, the reply message is incorrect.
2. The method for learning social language according to claim 1, further comprising:
when the reply information is incorrect, determining the playing time of a place with a reply error in the conversation video according to the standard audio;
and controlling the conversation video corresponding to the target social scene to be played according to the playing time.
3. The method for learning social language according to claim 1, further comprising:
grading the reply information according to a judgment result;
and pushing a learning video set corresponding to the social scene to the user according to the score.
4. A smart device, comprising:
the acquisition module is used for acquiring a target social scene selected by a user and a target role in the target social scene;
the searching module is used for selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
the searching module is further configured to select a target learning video corresponding to the target role in the target learning video set, and acquire a standard audio of the target role in the target learning video, where the target learning video does not include audio information of the target role;
the playing module is used for playing the target learning video;
the acquisition module is used for acquiring reply information input by a user when the target learning video is played, and the reply information is used for making the user as a target role to have a conversation with other roles in the target learning video;
the judging module is used for judging whether the reply information is correct or not according to the standard audio;
the prompt module is used for sending out prompt information when the reply information is incorrect;
wherein the building block comprises:
the device comprises an acquisition unit, a display unit and a display unit, wherein the acquisition unit is used for acquiring conversation videos corresponding to all social scenes;
the processing unit is used for deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
the storage unit is used for generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
the storage unit is used for constructing the video library according to the learning video set corresponding to each social scene;
the judging module comprises:
the identification unit is used for identifying the semantics of the reply information;
the recognition unit is used for recognizing the semantics of the standard audio;
the comparison unit is used for comparing the semantics of the reply information with the semantics of the standard audio;
the judging unit is used for judging that the reply information is correct when the semantics of the reply information is the same as the semantics of the standard audio; otherwise, judging that the reply information is incorrect.
5. The intelligent device according to claim 4, further comprising:
the determining module is used for determining the playing time of a place with wrong reply in the conversation video according to the standard audio when the reply information is incorrect;
and the control module is used for controlling the conversation video corresponding to the target social scene to be played according to the playing time.
6. The intelligent device according to claim 4, further comprising:
the scoring module is used for scoring the reply information according to the judgment result;
and the pushing module is used for pushing a learning video set corresponding to the social scene to the user according to the score.
CN201910596768.2A 2019-07-02 2019-07-02 Learning method and intelligent device for social language Active CN112185187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910596768.2A CN112185187B (en) 2019-07-02 2019-07-02 Learning method and intelligent device for social language

Publications (2)

Publication Number Publication Date
CN112185187A CN112185187A (en) 2021-01-05
CN112185187B true CN112185187B (en) 2022-05-20

Family

ID=73915990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910596768.2A Active CN112185187B (en) 2019-07-02 2019-07-02 Learning method and intelligent device for social language

Country Status (1)

Country Link
CN (1) CN112185187B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112669658A (en) * 2021-01-25 2021-04-16 杨涵予 English learning real scene dialogue training device
CN112560512B (en) * 2021-02-24 2021-08-06 南京酷朗电子有限公司 Intelligent voice analysis and auxiliary communication method for foreign language learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5930757A (en) * 1996-11-21 1999-07-27 Freeman; Michael J. Interactive two-way conversational apparatus with voice recognition
CN208796480U (en) * 2018-05-16 2019-04-26 侯院凤 A kind of multi-functional early education robot

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8009966B2 (en) * 2002-11-01 2011-08-30 Synchro Arts Limited Methods and apparatus for use in sound replacement with automatic synchronization to images
US20070015121A1 (en) * 2005-06-02 2007-01-18 University Of Southern California Interactive Foreign Language Teaching
TW200811769A (en) * 2006-08-25 2008-03-01 Inventec Besta Co Ltd Method of scenario language learning
CA2682000A1 (en) * 2007-03-28 2008-10-02 Breakthrough Performancetech, Llc Systems and methods for computerized interactive training
CN101123043A (en) * 2007-09-13 2008-02-13 无敌科技(西安)有限公司 Scenario foreign language learning method
US20130073964A1 (en) * 2011-09-20 2013-03-21 Brian Meaney Outputting media presentations using roles assigned to content
CN103117057B (en) * 2012-12-27 2015-10-21 安徽科大讯飞信息科技股份有限公司 The application process of a kind of particular person speech synthesis technique in mobile phone cartoon is dubbed
US10283013B2 (en) * 2013-05-13 2019-05-07 Mango IP Holdings, LLC System and method for language learning through film
CN103426335A (en) * 2013-08-26 2013-12-04 苏州跨界软件科技有限公司 Social language learning method
US10272349B2 (en) * 2016-09-07 2019-04-30 Isaac Davenport Dialog simulation
CN106952515A (en) * 2017-05-16 2017-07-14 宋宇 The interactive learning methods and system of view-based access control model equipment
CN207425137U (en) * 2017-08-30 2018-05-29 湖南安全技术职业学院 English study training device based on the dialogue of VR real scenes
CN108040289A (en) * 2017-12-12 2018-05-15 天脉聚源(北京)传媒科技有限公司 A kind of method and device of video playing

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5930757A (en) * 1996-11-21 1999-07-27 Freeman; Michael J. Interactive two-way conversational apparatus with voice recognition
CN208796480U (en) * 2018-05-16 2019-04-26 侯院凤 A kind of multi-functional early education robot

Also Published As

Publication number Publication date
CN112185187A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN105512228B (en) A kind of two-way question and answer data processing method and system based on intelligent robot
Fay et al. How to bootstrap a human communication system
CN109643325B (en) Recommending friends in automatic chat
US10853716B2 (en) Systems and methods for a mathematical chat bot
CN108509591B (en) Information question-answer interaction method and system, storage medium, terminal and intelligent knowledge base
WO2018227782A1 (en) Network-based online interactive language learning system
CN109801527B (en) Method and apparatus for outputting information
CN112185187B (en) Learning method and intelligent device for social language
CN108763548A (en) Collect method, apparatus, equipment and the computer readable storage medium of training data
CN112685550B (en) Intelligent question-answering method, intelligent question-answering device, intelligent question-answering server and computer readable storage medium
CN104462122A (en) Test question data processing method and device
CN109460503B (en) Answer input method, answer input device, storage medium and electronic equipment
CN108831229B (en) Chinese automatic grading method
CN110796911A (en) Language learning system capable of automatically generating test questions and language learning method thereof
CN108306813B (en) Session message processing method, server and client
CN109800301B (en) Weak knowledge point mining method and learning equipment
CN117271753B (en) Intelligent property question-answering method and related products
CN113361396A (en) Multi-modal knowledge distillation method and system
CN110852073A (en) Language learning system and learning method for customizing learning content for user
Zapata-Paulini et al. Development and evaluation of a didactic tool with augmented reality for Quechua language learning in preschoolers
CN111078982B (en) Electronic page retrieval method, electronic device and storage medium
CN114661196B (en) Problem display method and device, electronic equipment and storage medium
CN116403583A (en) Voice data processing method and device, nonvolatile storage medium and vehicle
CN110750633A (en) Method and device for determining answer of question
CN108664842B (en) Lip movement recognition model construction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant