CN112185187A - Learning method and intelligent device for social language - Google Patents
- Publication number
- CN112185187A (application CN201910596768.2A)
- Authority
- CN
- China
- Prior art keywords
- target
- learning
- social
- video
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/06—Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
- G09B5/065—Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B7/00—Electrically-operated teaching apparatus or devices working with questions and answers
- G09B7/02—Electrically-operated teaching apparatus or devices working with questions and answers of the type wherein the student is expected to construct an answer to the question which is presented or wherein the machine gives an answer to the question presented by a student
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Abstract
The invention belongs to the field of intelligent products, and discloses a social language learning method and an intelligent device. The method comprises the following steps: acquiring a target social scene selected by a user and a target role in the target social scene; selecting a target learning video set corresponding to the target social scene from a pre-constructed video library; selecting a target learning video corresponding to the target role in the target learning video set, and acquiring the standard audio of the target role in the target learning video, wherein the target learning video does not contain the audio information of the target role; playing the target learning video; collecting reply information input by the user while the target learning video is played; and judging whether the reply information is correct according to the standard audio, and sending a prompt message if not. By playing the corresponding learning videos, the method and the device enable the user to practice and learn the social language of a specific scene in a specific role, improving the user's social language ability.
Description
Technical Field
The invention belongs to the technical field of intelligent products, and particularly relates to a social language learning method and an intelligent device.
Background
In modern society, with the rapid development of the economy, people interact frequently, social language has become increasingly important, and good language expression ability is considered an essential skill of modern people. For children, social language ability is equally important. Cultivating children's social language ability from an early age not only helps them adapt to various environments and coordinate their relationships with other people and with the collective, but also benefits their physical and mental health.
With the rapid development of intelligent terminals and network technologies, more and more language learning devices have emerged, such as repeaters and point readers. However, these devices are all directed at learning foreign languages such as English; at present, there is no language learning device that guides children in learning social language, so children's social language ability cannot be well cultivated and improved.
Disclosure of Invention
The invention aims to provide a social language learning method and an intelligent device, which can be used for a user to practice and learn social language in a specific scene and a specific role by playing a learning video so as to improve the social language ability of the user.
The technical scheme provided by the invention is as follows:
in one aspect, a social language learning method is provided, including:
acquiring a target social scene selected by a user and a target role in the target social scene;
selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
selecting a target learning video corresponding to the target role in the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
playing the target learning video;
when the target learning video is played, acquiring reply information input by a user;
and judging whether the reply information is correct or not according to the standard audio, and if not, sending a prompt message.
Further preferably, the video library is constructed by the following method:
obtaining a conversation video corresponding to each social scene;
deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and constructing the video library according to the learning video set corresponding to each social scene.
Further preferably, the method further comprises the following steps:
when the reply information is incorrect, determining the playing time of a place with a reply error in the conversation video according to the standard audio;
and controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Further preferably, the determining, according to the standard audio, whether the reply message is correct specifically includes:
identifying semantics of the reply information;
identifying semantics of the standard audio;
comparing the semantics of the reply information with the semantics of the standard audio;
when the semantics of the reply information is the same as the semantics of the standard audio, the reply information is correct; otherwise, the reply message is incorrect.
Further preferably, the method further comprises the following steps:
scoring the reply information according to the judgment result;
and pushing a learning video set corresponding to the social scene to the user according to the score.
In another aspect, a smart device is also provided, including:
the acquisition module is used for acquiring a target social scene selected by a user and a target role in the target social scene;
the searching module is used for selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
the searching module is further configured to select a target learning video corresponding to the target role in the target learning video set, and acquire a standard audio of the target role in the target learning video, where the target learning video does not include audio information of the target role;
the playing module is used for playing the target learning video;
the collection module is used for collecting reply information input by the user when the target learning video is played;
the judging module is used for judging whether the reply information is correct or not according to the standard audio;
and the prompt module is used for sending out prompt information when the reply information is incorrect.
Further preferably, the intelligent device also comprises a construction module;
the building module comprises:
the acquisition unit is used for acquiring the conversation videos corresponding to the social scenes;
the processing unit is used for deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
the generating unit is used for generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and the storage unit is used for constructing the video library according to the learning video set corresponding to each social scene.
Further preferably, the method further comprises the following steps:
the determining module is used for determining the playing time of a place with wrong reply in the conversation video according to the standard audio when the reply information is incorrect;
and the control module is used for controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Further preferably, the judging module includes:
the identification unit is used for identifying the semantics of the reply information;
the recognition unit is used for recognizing the semantics of the standard audio;
the comparison unit is used for comparing the semantics of the reply information with the semantics of the standard audio;
the judging unit is used for judging that the reply information is correct when the semantics of the reply information is the same as the semantics of the standard audio; otherwise, judging that the reply information is incorrect.
Further preferably, the method further comprises the following steps:
the scoring module is used for scoring the reply information according to the judgment result;
and the pushing module is used for pushing a learning video set corresponding to the social scene to the user according to the scores.
Compared with the prior art, the social language learning method and the intelligent device provided by the invention have the following beneficial effects: by playing a learning video, the user can converse with the other roles in the video, practicing and learning the social language of a specific scene in a specific role. When the user replies incorrectly, the user is reminded, so that the user knows where the reply went wrong and can think about how to reply, improving the user's social language ability.
Drawings
The above features, technical features, advantages and implementation manners of the social language learning method and the intelligent device will be further described in the following detailed description of preferred embodiments in a clearly understandable manner and with reference to the accompanying drawings.
FIG. 1 is a flow diagram illustrating an embodiment of a social language learning method of the present invention;
FIG. 2 is a flow chart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 3 is a flow chart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 4 is a flow chart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for learning social language according to another embodiment of the present invention;
FIG. 6 is a block diagram illustrating the structure of an embodiment of an intelligent device of the present invention.
Description of the reference numerals
10. An acquisition module; 20. a search module; 30. a playing module; 40. a collection module; 50. a judging module; 60. a prompt module.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will be made with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, other drawings and embodiments can be derived from them without inventive effort.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention, and they do not represent the actual structure of a product. In addition, to keep the drawings concise and understandable, components having the same structure or function in some of the drawings are only schematically illustrated or only partially labeled. In this document, "one" means not only "only one" but may also mean "more than one".
Fig. 1 is a flowchart of an embodiment of a social language learning method provided in the present invention, where the social language learning method includes:
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s600, judging whether the reply message is correct or not according to the standard audio, and if not, sending a prompt message.
Specifically, the social language learning method can be applied to various intelligent terminals, such as a family education machine, a tablet computer, an intelligent desk lamp and the like.
When children need to learn social language, a target social scene needing to be learned and a target role in the target social scene can be selected on the intelligent terminal.
A social scene refers to a setting in which one converses with a particular person. Examples include: the conversation scene when asking a teacher about a problem; the conversation scene when asking a friend for help with studying; the conversation scene when sharing a toy with another child; the conversation scene when playing with friends; the conversation scene when meeting an acquaintance on the road; the conversation scene when addressing a teacher, and so on.
Each conversation scene comprises a plurality of roles: it may contain two roles, i.e. a one-to-one conversation, or three or more roles, i.e. a multi-person conversation. The user can select any of the roles as the target role for social language learning. Of course, to better cultivate and improve the user's social language ability, it is best to select a role consistent with the user's own identity. For example, in a conversation scene with a teacher, when the user is a child, it is preferable to select the child role, so that the user learns and masters social language consistent with the user's own identity.
After the intelligent terminal obtains the target social scene selected by the user and the target role in that scene, it can select the target learning video set corresponding to the target social scene from a pre-constructed video library. Each social scene corresponds to one or more learning video sets; that is, the same social scene can be expressed in multiple ways. When the target social scene corresponds to a plurality of learning video sets: if the user has learned none of them, one of them is selected at random as the target learning video set; if the user has already learned some of them, a learning video set the user has not yet learned can be selected at random as the target learning video set.
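The selection logic described above can be sketched as follows. This is a minimal illustration only; the `select_target_set` function, the dictionary-shaped video library, and the record of already-learned sets are assumptions for the sketch, not structures specified by the disclosure.

```python
import random

def select_target_set(video_library, target_scene, learned_sets):
    """Pick a learning video set for the chosen social scene.

    video_library: dict mapping scene name -> list of video-set IDs
    learned_sets:  set of IDs the user has already studied
    """
    candidates = video_library[target_scene]
    # Prefer sets the user has not yet studied, as the description requires
    unlearned = [s for s in candidates if s not in learned_sets]
    pool = unlearned if unlearned else candidates
    return random.choice(pool)

library = {"asking_teacher": ["set_1", "set_2", "set_3"]}
chosen = select_target_set(library, "asking_teacher", learned_sets={"set_1"})
print(chosen)  # one of set_2 / set_3
```

If every set has been learned, the sketch falls back to choosing among all of them; the disclosure does not state what happens in that case, so this fallback is a design assumption.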
The learning video set includes a plurality of learning videos, each of which may be used to learn a social language of a character. After the target learning video set is selected, according to the target role selected by the user, the corresponding target learning video is selected from the plurality of learning videos in the target learning video set, and then the target learning video is played.
The target learning video does not contain the audio information of the target role; it contains only the audio information of the other roles. When the target learning video is played, the dialogue audio of the other roles plays normally, and at the points in the conversation where the target role should answer, the user can input reply information, so that the user converses with the other roles in the video as the target role. The dialogue of the other roles both delimits the social scene and guides the user's replies, prompting the user to respond and communicate.
For example, assume the target learning video includes a character A and a character B, and the target character selected by the user is character A. The video pictures of character A and character B play normally in the target learning video, and the audio information of character B is played; guided by character B's dialogue, the user inputs reply information in response to character B's audio.
When the user inputs reply information as role A, the reply information is collected, and then whether it is correct is judged against the standard audio of role A. If the reply is correct, praise information is output; if it is incorrect, prompt information is output to point out where the answer went wrong, so that the user thinks about how to reply, improving the user's social language ability.
According to the method, by playing a learning video the user can converse with the other roles in the video, practicing and learning the social language of a specific scene in a specific role. When the user replies incorrectly, the user is reminded, so that the user knows where the reply went wrong and thinks about how to reply, improving the user's social language ability. Meanwhile, it can be judged which social expressions the user has mastered and which the user has not, so that the user can further strengthen learning.
Fig. 2 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
s010 obtains conversation videos corresponding to the social scenes;
s020 deleting audio information of a current role from a conversation video corresponding to a current social scene to form a learning video corresponding to the current role;
s030, generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
s040, constructing a video library according to the learning video sets corresponding to the social scenes;
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s600, judging whether the reply message is correct or not according to the standard audio, and if not, sending a prompt message.
Specifically, the video library includes the learning video sets corresponding to the social scenes, and each social scene corresponds to one or more learning video sets. The learning video sets corresponding to each social scene are constructed as follows: first, conversation videos corresponding to each social scene are obtained; a plurality of conversation videos can be obtained for each scene. A conversation video can be acted out by real people according to standard social expressions, or produced as an animation according to standard social expressions.
The audio information of each role is extracted from the conversation video, and the audio information of one role is deleted to generate that role's learning video. For example, if the conversation video includes a role A and a role B, deleting role A's audio information from the conversation video generates role A's learning video, and deleting role B's audio information generates role B's learning video. Role A's learning video and role B's learning video form a learning video set. If the conversation video includes three roles, learning videos corresponding to the three roles are generated, and these three learning videos likewise form a learning video set.
A learning video set is generated for each social scene from its conversation videos, and then all the generated learning video sets are stored in a newly created video library, completing its construction. An association between each learning video set and its social scene is also established in the video library. After the target social scene selected by the user is obtained, the target learning video set corresponding to the target social scene can then be found in the video library according to this association.
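The construction flow above (one learning video per role, with the removed track kept as that role's standard audio) can be sketched as follows. The helper functions `extract_role_tracks` and `remove_track` are hypothetical stand-ins for real media-processing tools, and the dictionary layout is an assumption for illustration.

```python
def build_video_library(dialog_videos, extract_role_tracks, remove_track):
    """Construct a video library: for every dialogue video, produce one
    learning video per role with that role's audio track removed.

    dialog_videos:          dict mapping scene -> list of dialogue videos
    extract_role_tracks(v): returns dict of role -> audio track
    remove_track(v, role):  returns the video without that role's audio
    """
    library = {}
    for scene, videos in dialog_videos.items():
        sets = []
        for video in videos:
            roles = extract_role_tracks(video)
            # One learning video per role; the deleted audio is kept as
            # the "standard audio" used later for checking replies.
            learning_set = {
                role: {"video": remove_track(video, role),
                       "standard_audio": audio}
                for role, audio in roles.items()
            }
            sets.append(learning_set)
        library[scene] = sets
    return library
```

Keeping the standard audio alongside each stripped video mirrors the description: the same deletion step yields both the learning video and the reference against which replies are judged.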
Fig. 3 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
s010 obtains conversation videos corresponding to the social scenes;
s020 deleting audio information of a current role from a conversation video corresponding to a current social scene to form a learning video corresponding to the current role;
s030, generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
s040, constructing a video library according to the learning video sets corresponding to the social scenes;
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s600, judging whether the reply message is correct or not according to the standard audio, and if not, sending a prompt message;
s710, when the reply message is incorrect, determining the playing time of a place with a reply error in the conversation video according to the standard audio;
s720, controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Specifically, the user may reply multiple times while conversing with the other roles in the target learning video, so the reply information may include multiple sentences. Each sentence in the reply information is compared with the corresponding sentence in the standard audio to judge whether there is any incorrect place; if so, the playing time of the incorrectly answered place in the conversation video corresponding to the target social scene is determined according to the standard audio, the playing time including a playing start time and a playing end time.
The learning video set corresponding to the target social scene may also store the conversation video corresponding to that scene. The conversation video is controlled to play according to the determined playing time, so that the correct reply corresponding to the incorrect reply is played for the user to imitate; for young children, learning by imitation is an important part of the learning process.
When the user replies incorrectly, only the correct reply corresponding to the erroneous reply is played. This saves the user's time while letting the user learn by imitation, improving children's social language ability so that they can communicate better with others.
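Determining which spans of the conversation video to replay can be sketched as below. The per-line timestamps and the `segment_to_replay` helper are assumptions: the disclosure only says a start and end playing time is determined for each wrongly answered place.

```python
def segment_to_replay(standard_lines, results):
    """Given per-sentence correctness results, return the (start, end)
    playing times of the wrongly answered lines in the dialogue video.

    standard_lines: list of (start_sec, end_sec, text) for the target
                    role's lines in the full dialogue video
    results:        list of bools, one per line (True = answered correctly)
    """
    wrong = [(start, end) for (start, end, _), ok
             in zip(standard_lines, results) if not ok]
    # Only the spans of the wrong lines are returned; a real player
    # would seek to each start time and stop at the end time.
    return wrong

lines = [(3.0, 6.5, "Hello, teacher."),
         (12.0, 15.0, "Could you explain this problem?")]
print(segment_to_replay(lines, [True, False]))  # [(12.0, 15.0)]
```

Returning only the wrong spans matches the description's point that correctly answered places are not replayed, saving the user's time.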
Fig. 4 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s610, recognizing the semantics of the reply information;
s620, identifying the semantics of the standard audio;
s630, comparing the semantics of the reply message with the semantics of the standard audio;
s640, when the semantics of the reply information is the same as the semantics of the standard audio, the reply information is correct; otherwise, the reply message is incorrect;
and S650, if the reply is incorrect, sending a prompt message.
Specifically, to judge whether the reply information is correct by comparing it with the standard audio: the semantics of the reply information is recognized, the semantics of the standard audio is recognized, and then the two are compared. When the semantics of the reply information is the same as the semantics of the standard audio, the reply information is correct; if they differ, the reply information is incorrect.
When the reply information contains multiple sentences, the semantics of each sentence is compared with the semantics of the corresponding sentence in the standard audio to judge which sentences are answered correctly and which incorrectly. For an incorrect reply, prompt information can be sent to remind the user; after the reminder, the voice information input again by the user is collected. This re-input is the user's renewed reply at the place answered incorrectly, and whether it is correct is again judged against the standard audio at that place. When the user still has not answered correctly after several attempts, the correct reply in the conversation video can be played; when playing, only the part corresponding to the erroneous reply is played, and the correctly answered places are not replayed. Of course, if the user wishes to play the complete conversation video, it is played.
In this embodiment, whether the reply information is correct is judged from the semantics of the reply information and of the standard audio. This prevents a reply with the same meaning but different wording from being judged incorrect, thereby improving the accuracy of the judgment.
Fig. 5 is a flowchart of another embodiment of a method for learning a social language according to the present invention, where the method for learning a social language includes:
s100, acquiring a target social scene selected by a user and a target role in the target social scene;
s200, selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
s300, selecting a target learning video corresponding to the target role from the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
s400, playing the target learning video;
s500, collecting reply information input by a user when the target learning video is played;
s600, judging whether the reply message is correct or not according to the standard audio, and if not, sending a prompt message;
s810, scoring the reply information according to the judgment result;
s820, pushing a learning video set of the corresponding social scene to the user according to the scores.
Specifically, after judging whether the reply information is correct according to the standard audio, the reply information can be scored according to the judgment result. For example, if the reply information contains five sentences, of which three are answered correctly and two incorrectly, its score is 60.
A high score indicates that the user has mastered the social language of the social scene; a low score indicates that the user has not yet mastered it. If the user has mastered it, a learning video set of a more difficult social scene can be pushed to the user. If not, other learning video sets corresponding to the same social scene, or learning video sets of an associated social scene, can be pushed instead.
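A minimal sketch of the scoring and pushing logic just described. The 80-point mastery threshold and the set labels are assumptions for illustration; the source specifies neither a cut-off nor concrete set names.

```python
def score_reply(num_correct: int, num_total: int) -> int:
    """Score the reply as the percentage of correctly answered sentences;
    e.g. three correct out of five sentences gives 60."""
    return round(100 * num_correct / num_total)

def pick_next_push(score: int, threshold: int = 80) -> str:
    """Decide what to push next from the score (threshold is assumed)."""
    if score >= threshold:
        return "harder-scene set"        # scene mastered: raise difficulty
    return "same-or-related-scene set"   # not mastered: keep practising
```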
In this embodiment, according to the score of the reply information, the corresponding learning video set is pushed to the user, so that the pushed learning video set better conforms to the current level of the user, and the user is better helped to learn the social language.
It should be understood that in the foregoing embodiments the sequence numbers of the steps do not imply an execution order; the execution order of the steps is determined by their functions and internal logic, and does not constitute any limitation on the implementation of the embodiments of the present invention.
Fig. 6 is a schematic block diagram of a structure of an embodiment of an intelligent device provided in the present invention, the intelligent device including:
the acquiring module 10 is configured to acquire a target social scene selected by a user and a target role in the target social scene;
the searching module 20 is configured to select a target learning video set corresponding to the target social scene from a pre-constructed video library, where the target learning video set includes a plurality of learning videos for learning social languages of different roles;
the searching module 20 is further configured to select a target learning video corresponding to the target role in the target learning video set, and acquire a standard audio of the target role in the target learning video, where the target learning video does not include audio information of the target role;
the playing module 30 is used for playing the target learning video;
the acquisition module 40 is used for acquiring reply information input by a user when the target learning video is played;
the judging module 50 is configured to judge whether the reply message is correct according to the standard audio;
and the prompt module 60 is configured to send a prompt message when the reply message is incorrect.
Specifically, the intelligent device of the invention can be any of various intelligent terminals, such as a home tutoring machine, a tablet computer, or an intelligent desk lamp.
When a child needs to learn social language, the target social scene to be learned and the target role in that scene can be selected on the intelligent terminal.
A social scene refers to a conversation with a particular person in a particular setting: for example, a conversation scene when asking a teacher about a problem; a conversation scene when asking a friend for help with studying; a conversation scene when sharing a toy with another child; a conversation scene when playing with friends; a conversation scene when meeting an acquaintance on the road; a conversation scene when consulting a teacher; and so on.
Each conversation scene comprises a plurality of roles: two roles give a one-to-one conversation, while three or more roles give a multi-person conversation. The user can select any of these roles as the target role for social-language learning. Of course, to better develop and improve the user's social-language ability, it is preferable to select a role consistent with the user's own identity. For example, in a conversation scene with a teacher, when the user is a child it is best to select the child role, so that the user learns and masters a social language consistent with his or her identity.
After the intelligent terminal obtains the target social scene selected by the user and the target role within it, a target learning video set corresponding to that scene can be selected from the pre-constructed video library. Each social scene corresponds to one or more learning video sets; that is, the same social scene admits several ways of being expressed. When the target social scene corresponds to several learning video sets, one of them is selected at random as the target learning video set if the user has learned none of them; if the user has already learned some of them, an unlearned set is selected at random instead.
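The selection rule above (prefer an unlearned set, otherwise pick among all) might be sketched as follows; the `id` field and the shape of the set records are assumptions for illustration.

```python
import random

def choose_target_set(candidate_sets, learned_ids):
    """Pick the target learning video set for a scene: choose at random
    among the sets the user has not yet learned; if every set has been
    learned, fall back to choosing among all of them."""
    unlearned = [s for s in candidate_sets if s["id"] not in learned_ids]
    pool = unlearned if unlearned else candidate_sets
    return random.choice(pool)
```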
The learning video set includes a plurality of learning videos, each of which may be used to learn a social language of a character. After the target learning video set is selected, according to the target role selected by the user, the corresponding target learning video is selected from the plurality of learning videos in the target learning video set, and then the target learning video is played.
The target learning video contains no audio information for the target role, only the audio information of the other roles. When the target learning video is played, the dialogue audio of the other roles plays normally, and at each point where the target role should answer, the user inputs reply information, carrying on the conversation with the other roles as the target role. The dialogue of the other roles both delimits the social scene and guides the user's replies, leading the user through the exchange.
For example, suppose the target learning video includes role A and role B, and the user has selected role A as the target role. The video pictures of both roles play normally, role B's audio is played, and, guided by role B's dialogue, the user inputs reply information in response to role B's lines.
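The turn-taking just illustrated — the other roles' recorded audio plays, while the target role's turns are supplied live by the user — can be sketched as a loop over a transcript-like list of turns. The dict layout is an assumed representation, not a format given in the source.

```python
def run_dialogue(turns, target_role, get_user_reply):
    """Step through the dialogue: non-target turns would be played back
    as recorded audio; at each target-role turn the user speaks instead.
    Returns the collected user replies, in order."""
    replies = []
    for turn in turns:
        if turn["role"] == target_role:
            replies.append(get_user_reply(turn))  # user speaks as target role
        # else: the turn's recorded audio would be played here
    return replies
```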
When the user inputs reply information as role A, the reply information is collected, and whether it is correct is judged against role A's standard audio. If it is correct, praise information is output; if it is incorrect, prompt information is output to point out where the answer went wrong, prompting the user to think about how to reply and thereby improving the user's social-language ability.
Through playback of the learning video, the invention lets the user converse with the other roles in the video, so that the user can practise and learn the social language of a specific scene and a specific role. When the user answers wrongly, the user is reminded, and so comes to know where the answer went wrong and to think about how to reply, improving the user's social-language ability. At the same time, it can be judged which social languages the user has mastered and which not, so that learning can be further reinforced.
Preferably, the system further comprises a construction module; the construction module comprises:
the acquisition unit is used for acquiring the conversation videos corresponding to the social scenes;
the processing unit is used for deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
the generating unit is used for generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and the storage unit is used for constructing the video library according to the learning video set corresponding to each social scene.
Specifically, the video library contains the learning video sets corresponding to each social scene, with each social scene corresponding to one or more learning video sets. The learning video sets for each social scene are constructed as follows. Conversation videos corresponding to each social scene are obtained first; several conversation videos can be obtained per scene. A conversation video can be acted out by real people according to standard social expressions, or produced as an animation according to them.
The audio information of each role is extracted from the conversation video, and deleting the audio information of a given role yields the learning video for that role. For example, if the conversation video includes role A and role B, deleting role A's audio information from it generates role A's learning video, and deleting role B's audio information generates role B's learning video; the two learning videos form a learning video set. If the conversation video includes three roles, three corresponding learning videos can be generated, and these likewise form a learning video set.
A learning video set is generated for each social scene from its conversation videos, and all the generated sets are then stored in a newly created video library, completing its construction. Within the constructed library, an association is established between each learning video set and its social scene. After the target social scene selected by the user is obtained, the target learning video set can be found in the video library through this association.
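The library-construction procedure — one learning video per role, with that role's audio removed — can be sketched using a transcript-like representation of each conversation video. The dict layout is an assumption for illustration; a real system would operate on audio tracks rather than transcript entries.

```python
def build_video_library(conversation_videos):
    """For each scene's conversation video, derive one learning video per
    role by blanking that role's audio while keeping everything else,
    then group the per-role videos into the scene's learning video set."""
    library = {}
    for scene, video in conversation_videos.items():
        learning_set = {}
        for role in {turn["role"] for turn in video}:
            learning_set[role] = [
                {**turn, "audio": None} if turn["role"] == role else dict(turn)
                for turn in video
            ]
        library.setdefault(scene, []).append(learning_set)
    return library
```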
Preferably, the intelligent device further comprises:
the determining module is used for determining the playing time of a place with wrong reply in the conversation video according to the standard audio when the reply information is incorrect;
and the control module is used for controlling the conversation video corresponding to the target social scene to be played according to the playing time.
Specifically, the user may reply many times while conversing with the other roles in the target learning video, so the reply information may contain several sentences. Each sentence in the reply information is compared with the corresponding sentence in the standard audio to determine whether any place was answered incorrectly; if so, the playing time of the wrongly answered place in the conversation video corresponding to the target social scene is determined from the standard audio, the playing time comprising a start time and an end time.
The learning video set corresponding to the target social scene can also store the conversation video for that scene. Playback of this conversation video is controlled according to the determined playing time, so that the correct reply corresponding to the incorrect one is played and the user can learn by imitation.
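The segment replay described here — look up each wrongly answered sentence's start and end time in the standard audio's per-sentence timeline, then play only those spans — might look like the following sketch; the timeline representation and `play_segment` callback are assumptions.

```python
def wrong_reply_segments(wrong_indices, sentence_timeline):
    """Map wrongly answered sentence indices to (start, end) playing
    times taken from the standard audio's per-sentence timestamps."""
    return [sentence_timeline[i] for i in wrong_indices]

def replay_corrections(segments, play_segment):
    """Play only the conversation-video spans holding the correct
    replies, skipping the correctly answered parts."""
    for start, end in segments:
        play_segment(start, end)
```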
When the user answers wrongly, only the correct reply corresponding to the wrong answer is played. This saves the user's time while still allowing imitation learning, improving the child's social-language ability for better communication with others.
Preferably, the judging module 50 includes:
the identification unit is used for identifying the semantics of the reply information;
the recognition unit is used for recognizing the semantics of the standard audio;
the comparison unit is used for comparing the semantics of the reply information with the semantics of the standard audio;
the judging unit is used for judging that the reply information is correct when the semantics of the reply information is the same as the semantics of the standard audio; otherwise, judging that the reply information is incorrect.
Specifically, to judge whether the reply information is correct, it is compared with the standard audio: the semantics of the reply information and of the standard audio are first identified, and the two are then compared. When the semantics of the reply information is the same as that of the standard audio, the reply information is correct; when they differ, it is incorrect.
When the reply information contains several sentences, the semantics of each sentence is compared with that of the corresponding sentence in the standard audio, so as to determine which sentences were answered correctly and which incorrectly. For an incorrect answer, prompt information can be sent to remind the user, after which the user's voice input is collected again; this re-entered voice information is the user's renewed answer at the wrongly answered place, and its correctness is again judged against the corresponding standard audio. If after several attempts the user still has not answered correctly, the correct reply corresponding to the wrong answer can be played from the conversation video; during this playback, only the segment containing that correct reply is played, and the correctly answered parts are skipped. Of course, if the user wishes to watch the complete conversation video, it is played in full.
In this embodiment, whether the reply information is correct is judged from the semantics of the reply information and of the standard audio. This prevents a reply with the same meaning but different wording from being judged incorrect, thereby improving the accuracy of the judgment.
Preferably, the intelligent device further comprises:
the scoring module is used for scoring the reply information according to the judgment result;
and the pushing module is used for pushing a learning video set corresponding to the social scene to the user according to the scores.
Specifically, after judging whether the reply information is correct according to the standard audio, the reply information can be scored according to the judgment result. For example, if the reply information contains five sentences, of which three are answered correctly and two incorrectly, its score is 60.
A high score indicates that the user has mastered the social language of the social scene; a low score indicates that the user has not yet mastered it. If the user has mastered it, a learning video set of a more difficult social scene can be pushed to the user. If not, other learning video sets corresponding to the same social scene, or learning video sets of an associated social scene, can be pushed instead.
In this embodiment, according to the score of the reply information, the corresponding learning video set is pushed to the user, so that the pushed learning video set better conforms to the current level of the user, and the user is better helped to learn the social language.
It should be noted that the above embodiments can be freely combined as required. The foregoing is only a preferred embodiment of the present invention; those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and such modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A method for learning social language, comprising:
acquiring a target social scene selected by a user and a target role in the target social scene;
selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
selecting a target learning video corresponding to the target role in the target learning video set, and acquiring a standard audio of the target role in the target learning video, wherein the target learning video does not contain audio information of the target role;
playing the target learning video;
when the target learning video is played, acquiring reply information input by a user;
and judging whether the reply information is correct or not according to the standard audio, and if not, sending a prompt message.
2. The method for learning social language according to claim 1, wherein the video library is constructed by:
obtaining a conversation video corresponding to each social scene;
deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and constructing the video library according to the learning video set corresponding to each social scene.
3. The method for learning social language according to claim 2, further comprising:
when the reply information is incorrect, determining the playing time of a place with a reply error in the conversation video according to the standard audio;
and controlling the conversation video corresponding to the target social scene to be played according to the playing time.
4. The method for learning social language according to claim 1, wherein the determining whether the reply message is correct according to the standard audio specifically comprises:
identifying semantics of the reply information;
identifying semantics of the standard audio;
comparing the semantics of the reply information with the semantics of the standard audio;
when the semantics of the reply information is the same as the semantics of the standard audio, the reply information is correct; otherwise, the reply message is incorrect.
5. The method for learning social language according to claim 1, further comprising:
scoring the reply information according to the judgment result;
and pushing a learning video set corresponding to the social scene to the user according to the score.
6. A smart device, comprising:
the acquisition module is used for acquiring a target social scene selected by a user and a target role in the target social scene;
the searching module is used for selecting a target learning video set corresponding to the target social scene from a pre-constructed video library, wherein the target learning video set comprises a plurality of learning videos for learning social languages of different roles;
the searching module is further configured to select a target learning video corresponding to the target role in the target learning video set, and acquire a standard audio of the target role in the target learning video, where the target learning video does not include audio information of the target role;
the playing module is used for playing the target learning video;
the acquisition module is used for acquiring reply information input by a user when the target learning video is played;
the judging module is used for judging whether the reply information is correct or not according to the standard audio;
and the prompt module is used for sending out prompt information when the reply information is incorrect.
7. The intelligent device of claim 6, further comprising a building module;
the building module comprises:
the acquisition unit is used for acquiring the conversation videos corresponding to the social scenes;
the processing unit is used for deleting the audio information of the current role from the conversation video corresponding to the current social scene to form a learning video corresponding to the current role;
the storage unit is used for generating a learning video set corresponding to the current social scene according to the learning videos corresponding to the roles in the current social scene;
and the storage unit is used for constructing the video library according to the learning video set corresponding to each social scene.
8. The intelligent device according to claim 7, further comprising:
the determining module is used for determining the playing time of a place with wrong reply in the conversation video according to the standard audio when the reply information is incorrect;
and the control module is used for controlling the conversation video corresponding to the target social scene to be played according to the playing time.
9. The intelligent device according to claim 6, wherein the judging module comprises:
the identification unit is used for identifying the semantics of the reply information;
the recognition unit is used for recognizing the semantics of the standard audio;
the comparison unit is used for comparing the semantics of the reply information with the semantics of the standard audio;
the judging unit is used for judging that the reply information is correct when the semantics of the reply information is the same as the semantics of the standard audio; otherwise, judging that the reply information is incorrect.
10. The intelligent device according to claim 6, further comprising:
the scoring module is used for scoring the reply information according to the judgment result;
and the pushing module is used for pushing a learning video set corresponding to the social scene to the user according to the scores.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910596768.2A (CN112185187B) | 2019-07-02 | 2019-07-02 | Learning method and intelligent device for social language |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112185187A true CN112185187A (en) | 2021-01-05 |
CN112185187B CN112185187B (en) | 2022-05-20 |
Family
ID=73915990
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910596768.2A Active CN112185187B (en) | 2019-07-02 | 2019-07-02 | Learning method and intelligent device for social language |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112185187B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112560512A (en) * | 2021-02-24 | 2021-03-26 | 南京酷朗电子有限公司 | Intelligent voice analysis and auxiliary communication method for foreign language learning |
CN112669658A (en) * | 2021-01-25 | 2021-04-16 | 杨涵予 | English learning real scene dialogue training device |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5930757A (en) * | 1996-11-21 | 1999-07-27 | Freeman; Michael J. | Interactive two-way conversational apparatus with voice recognition |
WO2004040576A1 (en) * | 2002-11-01 | 2004-05-13 | Synchro Arts Limited | Methods and apparatus for use in sound replacement with automatic synchronization to images |
US20070015121A1 (en) * | 2005-06-02 | 2007-01-18 | University Of Southern California | Interactive Foreign Language Teaching |
CN101123043A (en) * | 2007-09-13 | 2008-02-13 | 无敌科技(西安)有限公司 | Scenario foreign language learning method |
TW200811769A (en) * | 2006-08-25 | 2008-03-01 | Inventec Besta Co Ltd | Method of scenario language learning |
US20080254425A1 (en) * | 2007-03-28 | 2008-10-16 | Cohen Martin L | Systems and methods for computerized interactive training |
US20130073964A1 (en) * | 2011-09-20 | 2013-03-21 | Brian Meaney | Outputting media presentations using roles assigned to content |
CN103117057A (en) * | 2012-12-27 | 2013-05-22 | 安徽科大讯飞信息科技股份有限公司 | Application method of special human voice synthesis technique in mobile phone cartoon dubbing |
CN103426335A (en) * | 2013-08-26 | 2013-12-04 | 苏州跨界软件科技有限公司 | Social language learning method |
US20160133154A1 (en) * | 2013-05-13 | 2016-05-12 | Mango IP Holdings, LLC | System and method for language learning through film |
CN106952515A (en) * | 2017-05-16 | 2017-07-14 | 宋宇 | The interactive learning methods and system of view-based access control model equipment |
US20180065054A1 (en) * | 2016-09-07 | 2018-03-08 | Isaac Davenport | Dialog simulation |
CN108040289A (en) * | 2017-12-12 | 2018-05-15 | 天脉聚源(北京)传媒科技有限公司 | A kind of method and device of video playing |
CN207425137U (en) * | 2017-08-30 | 2018-05-29 | 湖南安全技术职业学院 | English study training device based on the dialogue of VR real scenes |
CN208796480U (en) * | 2018-05-16 | 2019-04-26 | 侯院凤 | A kind of multi-functional early education robot |
Also Published As
Publication number | Publication date |
---|---|
CN112185187B (en) | 2022-05-20 |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |