WO2020082566A1 - Physiological sign recognition-based distance learning method, device, apparatus, and storage medium - Google Patents

Physiological sign recognition-based distance learning method, device, apparatus, and storage medium

Info

Publication number
WO2020082566A1
Authority
WO
WIPO (PCT)
Prior art keywords
teaching
learner
mixed reality
facial
facial feature
Prior art date
Application number
PCT/CN2018/123186
Other languages
French (fr)
Chinese (zh)
Inventor
万梅
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司 filed Critical 深圳壹账通智能科技有限公司
Publication of WO2020082566A1 publication Critical patent/WO2020082566A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • G06Q50/205Education administration or guidance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality

Definitions

  • the present application relates to the field of biometrics technology, in particular to a remote teaching method, device, equipment and storage medium based on biometrics.
  • distance teaching, also known as online teaching, has gradually entered daily life and become a common way for people to learn, effectively helping learners who cannot study in closed teaching environments such as schools because of various restrictions.
  • although distance teaching brings convenience and lets learners study according to their own needs at any time and in any place, the inventor realized that because distance teaching addresses a generalized, untargeted audience, learners cannot experience the realism of classroom teaching during the teaching process, nor participate in it, so the teaching effect is poor.
  • the main purpose of the present application is to provide a distance learning method, device, equipment and storage medium based on biometrics, aiming to increase the participation of learners in distance learning.
  • the present application provides a distance learning method based on biometrics, which includes the following steps:
  • according to the facial expression, determining whether the learner is in a preset low-efficiency learning state
  • the present application also proposes a remote teaching device based on biometrics, the device includes:
  • a playing module, used to receive learning instructions triggered by learners and play teaching streaming media
  • a collection module configured to collect a video containing the learner's face during the playback of the teaching streaming media
  • a determination module for determining the facial expression of the learner in the video
  • the judgment module is used to judge whether the learner is in a preset low-efficiency learning state according to the facial expression
  • An obtaining module configured to obtain the teaching picture and teaching voice currently played by the teaching streaming media when the learner is in a preset low-efficiency learning state
  • a processing module, used to extract keywords from the teaching voice, determine, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and perform mixed reality processing on the object to obtain a mixed reality teaching picture that immerses the learner, so that the learner can interact with objects in the mixed reality teaching picture.
  • the present application also proposes biometrics-based remote teaching equipment, which includes a memory, a processor, and readable instructions for biometrics-based remote teaching stored in the memory and executable on the processor, the readable instructions being configured to implement the steps of the biometrics-based remote teaching method described above.
  • the present application also proposes a storage medium storing readable instructions for biometrics-based remote teaching which, when executed by a processor, implement the steps of the biometrics-based remote teaching method described above.
  • the biometrics-based remote teaching method, device, equipment, and storage medium of this embodiment collect a video containing the learner's face in real time while the learner watches the teaching streaming media, determine the learner's facial expression, and judge, by analyzing that expression, whether the learner is in a preset low-efficiency learning state.
  • FIG. 1 is a schematic structural diagram of a remote teaching device based on biometrics in a hardware operating environment according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a first embodiment of a remote teaching method based on biometrics identification
  • FIG. 3 is a schematic flowchart of a second embodiment of a remote teaching method based on biometrics identification
  • FIG. 4 is a structural block diagram of a first embodiment of a remote teaching device based on biometrics of the present application.
  • FIG. 1 is a schematic structural diagram of a remote teaching device based on biometrics in a hardware operating environment according to an embodiment of the present application.
  • the remote teaching device based on biometrics may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to implement connection communication between these components.
  • the user interface 1003 may include a display (Display), an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wireless Fidelity (Wi-Fi) interface).
  • the memory 1005 may be high-speed random access memory (RAM) or stable non-volatile memory (NVM), such as disk storage.
  • RAM Random Access Memory
  • NVM Non-Volatile Memory
  • the memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • FIG. 1 does not constitute a limitation on the biometrics-based remote teaching device, which may include more or fewer components than illustrated, combine certain components, or arrange components differently.
  • the memory 1005 as a storage medium may include an operating system, a network communication module, a user interface module, and readable instructions for distance learning based on biometrics.
  • the network interface 1004 is mainly used for data communication with the remote teaching platform, Internet platforms, etc.; the user interface 1003 is mainly used for data interaction with the user; the processor 1001 and the memory 1005 may be provided in the biometrics-based remote teaching device, which calls, through the processor 1001, the readable instructions for biometrics-based remote teaching stored in the memory 1005 and executes the biometrics-based remote teaching method provided by the embodiments of the present application.
  • FIG. 2 is a schematic flowchart of a first embodiment of a distance learning method based on biometrics.
  • the remote teaching method based on biometrics includes the following steps:
  • Step S10: receive a learning instruction triggered by a learner, play teaching streaming media, and collect a video containing the learner's face during the playing of the teaching streaming media.
  • the execution subject in this example is a terminal device capable of playing teaching streaming media, such as a learner's personal computer, smart phone, tablet computer, etc., which will not be enumerated here, and there is no restriction on this.
  • when the terminal device playing the teaching streaming media is one of the above devices, the learner needs to wear 3D glasses or similar equipment while watching the teaching streaming media, to ensure that the subsequent mixed reality teaching picture can be presented to the learner.
  • the terminal device playing the teaching streaming media can also directly use a 3D player, so that the learner does not need to wear 3D glasses and can easily interact with objects in the subsequent mixed reality teaching picture using just an interactive pen.
  • step S10 is roughly as follows in practical applications:
  • the processor in the terminal device can receive the learning instruction triggered by the learner and then control the terminal device to play the teaching streaming media according to it; at the same time, during playback, the camera built into the terminal device or a camera in the learner's room (an external camera having established a communication connection with the terminal device in advance) collects a video containing the learner's face in real time.
  • Step S20: determine the facial expression of the learner in the video.
  • the operation of determining the facial expression of the learner in the video may be implemented by the following steps:
  • the face image of the learner may be identified from the video according to the face detection model obtained in advance. Then, according to the facial feature detection model obtained in advance, the facial feature points of the learner are extracted from the facial image, such as the feature points of the eyes, eyebrows, mouth, jaw and other parts.
  • the facial area of the learner's face is divided to obtain a facial feature area corresponding to each facial feature point.
  • each facial feature point is located in a facial feature area.
  • facial feature points of the same object are located in one facial feature area, for example, all facial feature points identifying the left eyebrow are located in the same facial feature area, and all facial feature points identifying the right eyebrow are located in the same facial feature area.
  • the velocity vector referred to here represents not only the motion speed of the corresponding facial feature point but also its motion direction.
  • as for determining the velocity vectors of the facial feature points in each facial region based on the optical flow method: each facial feature region may be traversed, the intensity of pixel change of the feature points in the currently traversed region between two adjacent image frames detected, and the velocity vectors of the feature points in that region then inferred from the intensity of pixel change.
  • when computing the velocity vectors, the spatial position coordinates of the facial features must also be determined with face key-point localization technology, the offset determined from the change in those coordinates, and the intensity of the current video determined through the corresponding sensing device.
  • suppose the position coordinate of a facial feature point is P(x, y, t), its intensity is I(x, y, t), and it moves by (Δx, Δy, Δt) between two frames, where x is the abscissa, y is the ordinate, and t is the optical (time) value; under the brightness-constancy constraint, I(x, y, t) = I(x+Δx, y+Δy, t+Δt).
  • V_x = Δx/Δt and V_y = Δy/Δt are the x and y components of the velocity, or optical flow, of I(x, y, t); therefore, over the interval Δt between two frames, the motion of the feature point is expressed as a two-dimensional velocity vector (V_x, V_y).
  • when the feature points marking the upper lip move upward and the feature points marking the lower lip follow, so that the upper lip lifts while the lips stay tightly closed, and the feature points marking the inner eyebrow corners move toward the brow center, wrinkling the inner corners together while the eyebrows rise, the learner's facial expression in the video is generally considered to be confused.
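For illustration, below is a minimal sketch of the rule-based inference described in the surrounding paragraphs, assuming 68-point facial landmarks (dlib-style numbering) and image coordinates in which y grows downward. The landmark indices, velocity thresholds, and the treatment of "sleepy" and "confused" as the preset low-efficiency expressions are illustrative assumptions, not values fixed by the application.

```python
import numpy as np

# Hypothetical 68-point landmark indices (dlib-style numbering assumed).
UPPER_LIP, LOWER_LIP = 51, 57
INNER_BROW_LEFT, INNER_BROW_RIGHT = 21, 22
LEFT_UPPER_EYELID = 37
MOUTH_CORNER_LEFT, MOUTH_CORNER_RIGHT = 48, 54

def velocity_vectors(prev_pts, curr_pts, dt=1.0):
    """Per-landmark 2-D velocity vectors (Vx, Vy) between two frames dt apart."""
    return (np.asarray(curr_pts, float) - np.asarray(prev_pts, float)) / dt

def classify_expression(v, thr=0.5):
    """Map landmark velocity vectors to a coarse expression label."""
    moves_up = lambda i: v[i, 1] < -thr              # image y grows downward
    moves_down = lambda i: v[i, 1] > thr
    brows_knit = v[INNER_BROW_LEFT, 0] > thr and v[INNER_BROW_RIGHT, 0] < -thr
    mouth_widens = v[MOUTH_CORNER_LEFT, 0] < -thr and v[MOUTH_CORNER_RIGHT, 0] > thr
    if moves_up(UPPER_LIP) and moves_up(LOWER_LIP) and brows_knit:
        return "confused"    # lips lift together while the inner brows pull in
    if moves_down(LEFT_UPPER_EYELID) and mouth_widens:
        return "sleepy"      # upper eyelid drops while the mouth opens wide
    return "neutral"

def in_low_efficiency_state(expression):
    """Per the text, sleepy or confused expressions mark low-efficiency learning."""
    return expression in {"sleepy", "confused"}
```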
  • Step S30: according to the facial expression, determine whether the learner is in a preset low-efficiency learning state.
  • after judging that the user has entered a low-efficiency learning state, the currently playing teaching streaming media may first be paused and a piece of light, pleasant music played, or a small joke told, to give the learner a brief rest; the operations of step S40 and onward are then performed.
  • Step S40: if the learner is in a preset low-efficiency learning state, acquire the teaching picture and teaching voice currently played by the teaching streaming media.
  • when it is determined that the learner is currently in the preset low-efficiency learning state, in order to mobilize the learner's enthusiasm and return them from the low-efficiency state to the high-efficiency state as soon as possible, the currently played teaching picture and teaching voice are acquired; the knowledge points confusing the learner at the current moment can thus be determined quite precisely, so that the mixed reality processing in the subsequent step S50 specifically targets the content confusing the current learner, providing different learners with the teaching treatment each needs during distance teaching.
  • Step S50: extract keywords from the teaching voice, determine, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and perform mixed reality processing on the object to obtain a mixed reality teaching picture that immerses the learner, so that the learner can interact with objects in the mixed reality teaching picture.
  • suppose the learner is studying contour-line terrain interpretation in a geography course and is confused by the extracted keywords "mountain peak", "basin (depression)", "ridge (ridge line)", "valley (valley line)", "saddle", "cliff", and so on, while the figures shown in the teaching picture are all contour-line drawings; in this state, the learner may be unable to picture the concrete three-dimensional terrain in their mind.
  • the "mountain peak” is determined as the contour map representing the mountain peak in the teaching picture.
  • the learner can then see the three-dimensional contour model and, while viewing, make preset gestures with the hand-held interactive pen, such as swiping left to rotate the contour model to the left, making it convenient for the learner to view the model from the other side and better understand the knowledge point currently being explained.
  • the mixed reality (MR) processing of the object may proceed roughly as follows: first, digitize the object to obtain an image matrix corresponding to it; next, determine the similarity between this image matrix and the image feature matrices, obtained by pre-training, that correspond to the various object categories; then, according to preset filtering rules, select the image feature matrix whose similarity satisfies those rules; then, from a preset mapping relationship table (the correspondence between each image feature matrix, its rendering model, and its introduction information), obtain the rendering model and introduction information corresponding to the selected image feature matrix; then, extract image data from the teaching streaming media in real time and determine the real-time position and size of the object in the image data; finally, according to that real-time position and size, superimpose the rendering model and the introduction information on the image data in real time to obtain the mixed reality teaching picture.
  • image data may be extracted from the teaching streaming media in real time frame by frame, so that when determining the real-time position and size of the object in the image data, feature detection can be performed on each frame according to the object's feature information.
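As a rough illustration of this pipeline, the sketch below matches the digitized object against pre-trained per-category feature matrices, looks up the rendering model and introduction information in the mapping relationship table, and superimposes them frame by frame. The cosine-similarity measure, the fixed threshold standing in for the "preset filtering rules", and the `locate`/`overlay` callables are assumptions for illustration; the application does not fix concrete APIs.

```python
import numpy as np

def cosine_similarity(a, b):
    a = np.asarray(a, float).ravel()
    b = np.asarray(b, float).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def select_category(obj_matrix, feature_db, threshold=0.8):
    """Pick the pre-trained image feature matrix most similar to the object's matrix.

    feature_db: {category: feature_matrix}, each shaped like obj_matrix.
    """
    scores = {c: cosine_similarity(obj_matrix, m) for c, m in feature_db.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

def mixed_reality_frames(frames, obj_matrix, feature_db, mapping_table, locate, overlay):
    """Yield teaching frames with the rendering model and intro text superimposed.

    mapping_table: {category: (rendering_model, introduction_info)};
    locate(frame, obj_matrix) -> (position, size); overlay(...) -> composited frame.
    """
    category = select_category(obj_matrix, feature_db)
    if category is None:
        return
    model, intro = mapping_table[category]
    for frame in frames:                       # frame-by-frame real-time extraction
        pos, size = locate(frame, obj_matrix)  # real-time position and size
        yield overlay(frame, model, intro, pos, size)
```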
  • the image feature matrix and the mapping relationship table corresponding to various types of objects used in the above operation can be constructed in advance.
  • the set of training sample images includes sample images corresponding to the various object categories and the object category corresponding to each sample image; taking each sample image and its object category as input, a deep learning model is trained for classification to obtain the image feature matrices corresponding to the various object categories; correspondences between these image feature matrices, their rendering models, and their introduction information are then established, generating the mapping relationship table.
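A hedged sketch of this offline construction step follows. The tiny CNN, the use of PyTorch, and the choice to read each category's "image feature matrix" off the classifier's final layer are illustrative assumptions; the application does not fix a model architecture or framework.

```python
import torch
import torch.nn as nn

class ObjectClassifier(nn.Module):
    """Small CNN trained on sample images labelled with object categories."""
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Linear(32 * 16, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

def train_and_build_tables(loader, num_classes, render_models, intros, epochs=5):
    """Classification training, then per-category feature matrices + mapping table."""
    model = ObjectClassifier(num_classes)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:          # sample images and object categories
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()
    feature_db = {c: model.head.weight[c].detach().reshape(32, 16)
                  for c in range(num_classes)}
    mapping_table = {c: (render_models[c], intros[c]) for c in range(num_classes)}
    return feature_db, mapping_table
```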
  • the biometrics-based remote teaching method collects a video containing the learner's face in real time while the learner watches the teaching streaming media, determines the learner's facial expression in the video with the help of biometric technology, and judges, by analyzing the facial expression, whether the learner is in the preset low-efficiency learning state.
  • a mixed reality teaching picture that immerses the learner can thus be obtained, so that the learner can interact with objects in the mixed reality teaching picture while watching the teaching streaming media, which improves learner participation and lets learners study autonomously through distance teaching more effectively.
  • FIG. 3 is a schematic flowchart of a second embodiment of a remote teaching method based on biometrics of the present application.
  • in this embodiment, the biometrics-based remote teaching method may further include, after step S50:
  • Step S60: search for corresponding learning materials according to the extracted keywords, and push the found learning materials to the learner to help the learner understand the knowledge points corresponding to the keywords.
  • the operation of searching for corresponding learning materials may search, according to the extracted keywords, for matching learning cases on the Internet or among learning cases pre-stored on the terminal device used by the learner.
  • the found learning materials are pushed to the learner, for example by sending them to the mailbox set by the user or displaying them directly on the user interface of the current terminal device, for the learner to view and study.
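A minimal sketch of this search-and-push step, assuming learning cases stored as simple records and a pluggable mail-delivery function (neither layout is fixed by the application):

```python
def find_learning_materials(keywords, case_library, top_n=5):
    """Rank pre-stored learning cases by how many extracted keywords they mention.

    case_library: iterable of dicts like {"title": ..., "text": ..., "url": ...}.
    """
    def score(case):
        return sum(kw in case["text"] for kw in keywords)
    ranked = sorted(case_library, key=score, reverse=True)
    return [c for c in ranked if score(c) > 0][:top_n]

def push_to_learner(materials, display=print, send_mail=None, mailbox=None):
    """Show materials on the current UI and, optionally, mail them to the learner."""
    for m in materials:
        display(f"{m['title']}: {m['url']}")
        if send_mail and mailbox:        # e.g. an SMTP wrapper, if one is configured
            send_mail(mailbox, subject=m["title"], body=m["url"])
```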
  • a feedback portal may be provided in the user interface of the terminal device so that the learner can enter feedback on the teaching content (for example, the teaching style of the recorded teaching streaming media or the arranged courses); after the user enters the feedback and clicks the OK button on the interface, the feedback is uploaded to the remote teaching service platform, where it can be used both to adaptively adjust the teaching content and as a reference for evaluating the teacher's teaching quality.
  • the teaching can be refined down to each individual knowledge point, so that the teaching streaming media stored on the remote teaching service platform is organized and stored by knowledge point.
  • the learner can first enter keywords for the content they want to learn; the terminal device then sends these keywords to the remote teaching service platform, which, according to the keywords provided by the user, finds the corresponding knowledge points and combines them to obtain teaching streaming media that meets the user's requirements.
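A short sketch of this knowledge-point lookup, under the assumption that the platform stores one streaming clip per knowledge point:

```python
def assemble_course(keywords, clip_index):
    """Combine the clips whose knowledge points match the learner's keywords.

    clip_index: {knowledge_point_name: clip_url}, stored per knowledge point.
    """
    playlist = [url for point, url in clip_index.items()
                if any(kw in point for kw in keywords)]
    return playlist  # played back-to-back as the customised teaching stream
```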
  • when it is determined that the learner is in the preset low-efficiency learning state, the biometrics-based distance learning method provided in this embodiment searches, according to the extracted keywords, for learning materials relevant to the knowledge blind spots the learner currently faces and pushes the found materials to the learner, prompting the learner immediately; this helps learners eliminate knowledge blind spots as early as possible and better supports learning through distance teaching.
  • the computer-readable instructions may be stored in a non-volatile computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
  • FIG. 4 is a structural block diagram of a first embodiment of a remote teaching device based on biometrics of the present application.
  • the remote teaching device based on biometrics proposed in the embodiment of the present application includes: a playback module 4001, an acquisition module 4002, a determination module 4003, a judgment module 4004, an acquisition module 4005, and a processing module 4006.
  • the playing module 4001 is used to receive the learning instruction triggered by the learner and play the teaching streaming media;
  • the collecting module 4002 is used to collect the video containing the learner's face during the playing of the teaching streaming media;
  • the processing module 4006 is used to extract keywords from the teaching voice, determine, according to the extracted keywords, the objects in the teaching picture that require mixed reality, and perform mixed reality processing on those objects to obtain a mixed reality teaching picture that immerses the learner, so that the learner can interact with objects in the mixed reality teaching picture.
  • the operation of the determination module 4003 to determine the facial expression of the learner in the video may be specifically implemented based on the face recognition technology in biometrics.
  • face sample data stored on a big data platform is trained with the facial feature detection method from face recognition technology to obtain a facial feature detection model; when determining the facial expression of the learner in the video, the learner's facial feature points can therefore be extracted from the video according to the pre-trained facial feature detection model.
  • the facial area of the learner's face may be divided according to each facial feature point to obtain a facial feature area corresponding to each facial feature point.
  • each facial feature point is located in a facial feature area.
  • facial feature points of the same object are located in one facial feature area, for example, all facial feature points identifying the left eyebrow are located in the same facial feature area, and all facial feature points identifying the right eyebrow are located in the same facial feature area.
  • the velocity vectors of the facial feature points in each facial area are determined.
  • the velocity vector referred to here represents not only the motion speed of the corresponding facial feature point but also its motion direction.
  • as for determining the velocity vectors of the facial feature points in each facial region based on the optical flow method: each facial feature region may be traversed, the intensity of pixel change of the feature points in the currently traversed region between two adjacent image frames detected, and the velocity vectors of the feature points in that region then inferred from the intensity of pixel change.
  • the facial expressions of the learner in the video may be determined based on the obtained velocity vectors of facial feature points.
  • the operation by which the processing module 4006 performs mixed reality processing on the object to obtain a mixed reality teaching picture that immerses the learner is described in detail below.
  • the mapping relationship table is the correspondence between each image feature matrix, its rendering model, and its introduction information; then, image data is extracted from the teaching streaming media in real time to determine the real-time position and size of the object in the image data; finally, according to that real-time position and size, the rendering model and the introduction information are superimposed on the image data in real time to obtain the mixed reality teaching picture.
  • image data may be extracted from the teaching streaming media in real time frame by frame, so that when determining the real-time position and size of the object in the image data, feature detection can be performed on each frame according to the object's feature information.
  • the image feature matrix and the mapping relationship table corresponding to various types of objects used in the above operation can be constructed in advance.
  • the set of training sample images includes sample images corresponding to the various object categories and the object category corresponding to each sample image; taking each sample image and its object category as input, a deep learning model is trained for classification to obtain the image feature matrices corresponding to the various object categories; correspondences between these image feature matrices, their rendering models, and their introduction information are then established, generating the mapping relationship table.
  • the biometrics-based remote teaching device collects a video containing the learner's face in real time while the learner watches the teaching streaming media, determines the learner's facial expression in the video with the help of biometric technology, and judges, by analyzing the facial expression, whether the learner is in the preset low-efficiency learning state.
  • when it is determined that the learner is in the preset low-efficiency learning state, the device acquires the teaching picture and teaching voice currently played by the teaching streaming media, extracts keywords from the teaching voice with keyword-extraction technology, determines, according to the extracted keywords, the objects in the teaching picture that require mixed reality, and finally performs mixed reality processing on the determined objects with mixed reality technology.
  • a mixed reality teaching picture that immerses the learner can thus be obtained, so that the learner can interact with objects in the mixed reality teaching picture while watching the teaching streaming media, which improves learner participation and lets learners study autonomously through distance teaching more effectively.
  • the second embodiment of the biometrics-based remote teaching device of the present application is proposed.
  • the remote teaching device based on biometrics further includes a search module and a push module.
  • the search module is used to search corresponding learning materials according to the extracted keywords;
  • the push module is used to push the found learning materials to the learner to help the learner understand the knowledge points corresponding to the keywords.
  • when it is determined that the learner is in the preset low-efficiency learning state, the biometrics-based remote teaching device searches, according to the extracted keywords, for learning materials relevant to the learner's current knowledge blind spots and pushes the found materials to the learner, prompting the learner immediately; this helps learners eliminate knowledge blind spots as early as possible and better supports learning through distance teaching.

Abstract

A physiological sign recognition-based distance learning method, a device, an apparatus, and a storage medium, pertaining to the technical field of physiological sign recognition. The method comprises: receiving a learning instruction triggered by a learner, playing a tutorial streaming media, and acquiring a video containing the face of the learner during playback of the tutorial streaming media (S10); determining a facial expression of the learner (S20); determining, according to the facial expression, whether the learner is in a pre-determined low-effectiveness learning state (S30); if so, acquiring a tutorial frame and tutorial voice data currently being played in the tutorial streaming media (S40); and extracting a keyword from the tutorial voice data, determining, from the tutorial frame, an object requiring mixed reality according to the extracted keyword, and performing mixed reality processing on the determined object to obtain a mixed reality tutorial frame enabling the learner to have immersive experience (S50).

Description

Remote teaching method, device, equipment, and storage medium based on biometric identification
This application claims priority to the Chinese patent application filed with the China Patent Office on October 25, 2018, with application number 201811247129.7 and the invention title "Biometrics-based remote teaching method, device, equipment, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of biometric identification technology, and in particular to a biometrics-based remote teaching method, device, equipment, and storage medium.
Background
With the development of Internet technology, distance teaching (also called online teaching) has gradually entered daily life and become a common means of learning, effectively helping learners who, because of various restrictions, cannot study in closed teaching environments such as schools.

Although distance teaching brings convenience and lets learners study according to their own needs at any time and in any place, the inventor realized that because distance teaching addresses a generalized, untargeted audience, learners cannot experience the realism of classroom teaching during the actual teaching process, nor can they participate in it, so the teaching effect is poor.

Therefore, there is an urgent need for a distance teaching method that can improve learner participation.
Summary of the invention
The main purpose of the present application is to provide a biometrics-based remote teaching method, device, equipment, and storage medium, aiming to increase learner participation in distance teaching.

To achieve the above purpose, the present application provides a biometrics-based remote teaching method, which includes the following steps:

receiving a learning instruction triggered by a learner, playing teaching streaming media, and collecting a video containing the learner's face while the teaching streaming media is playing;

determining the facial expression of the learner in the video;

judging, according to the facial expression, whether the learner is in a preset low-efficiency learning state;

if the learner is in the preset low-efficiency learning state, acquiring the teaching picture and teaching voice currently played by the teaching streaming media;

extracting keywords from the teaching voice, determining, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and performing mixed reality processing on the object to obtain a mixed reality teaching picture that immerses the learner, so that the learner can interact with objects in the mixed reality teaching picture.
In addition, to achieve the above purpose, the present application also proposes a biometrics-based remote teaching device, which includes:

a playing module, configured to receive the learning instruction triggered by the learner and play the teaching streaming media;

a collection module, configured to collect the video containing the learner's face while the teaching streaming media is playing;

a determination module, configured to determine the facial expression of the learner in the video;

a judgment module, configured to judge, according to the facial expression, whether the learner is in a preset low-efficiency learning state;

an acquisition module, configured to acquire the teaching picture and teaching voice currently played by the teaching streaming media when the learner is in the preset low-efficiency learning state;

a processing module, configured to extract keywords from the teaching voice, determine, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and perform mixed reality processing on the object to obtain a mixed reality teaching picture that immerses the learner, so that the learner can interact with objects in the mixed reality teaching picture.
In addition, to achieve the above purpose, the present application also proposes biometrics-based remote teaching equipment, which includes a memory, a processor, and readable instructions for biometrics-based remote teaching stored in the memory and executable on the processor, the readable instructions being configured to implement the steps of the biometrics-based remote teaching method described above.

In addition, to achieve the above purpose, the present application also proposes a storage medium storing readable instructions for biometrics-based remote teaching which, when executed by a processor, implement the steps of the biometrics-based remote teaching method described above.

The biometrics-based remote teaching method, device, equipment, and storage medium of this embodiment collect a video containing the learner's face in real time while the learner watches the teaching streaming media, determine the learner's facial expression in the video with the help of biometric technology, and judge, by analyzing the facial expression, whether the learner is in a preset low-efficiency learning state. When the learner is judged to be in that state, the teaching picture and teaching voice currently played by the teaching streaming media are acquired, keywords are extracted from the teaching voice with keyword-extraction technology, the objects in the teaching picture that require mixed reality are determined from the extracted keywords, and mixed reality processing is finally applied to those objects with mixed reality technology. A mixed reality teaching picture that immerses the learner is thus obtained, allowing the learner to interact with its objects while watching the teaching streaming media, which improves learner participation and lets learners study autonomously through distance teaching more effectively.
Brief description of the drawings
FIG. 1 is a schematic structural diagram of the biometrics-based remote teaching equipment in the hardware operating environment involved in an embodiment of the present application;

FIG. 2 is a schematic flowchart of a first embodiment of the biometrics-based remote teaching method of the present application;

FIG. 3 is a schematic flowchart of a second embodiment of the biometrics-based remote teaching method of the present application;

FIG. 4 is a structural block diagram of a first embodiment of the biometrics-based remote teaching device of the present application.

The realization of the purpose, functional characteristics, and advantages of the present application will be further described with reference to the drawings in conjunction with the embodiments.
Detailed description
It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.

Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the biometrics-based remote teaching equipment in the hardware operating environment involved in an embodiment of the present application.

As shown in FIG. 1, the biometrics-based remote teaching equipment may include a processor 1001 such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 implements connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be high-speed random access memory (RAM) or stable non-volatile memory (NVM), such as disk storage, and may optionally be a storage device independent of the processor 1001.

Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the biometrics-based remote teaching equipment, which may include more or fewer components than illustrated, combine certain components, or arrange components differently.

As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and readable instructions for biometrics-based remote teaching.

In the biometrics-based remote teaching equipment shown in FIG. 1, the network interface 1004 is mainly used for data communication with the remote teaching platform, Internet platforms, and the like; the user interface 1003 is mainly used for data interaction with the user; and the processor 1001 and the memory 1005 may be provided in the biometrics-based remote teaching equipment, which calls, through the processor 1001, the readable instructions for biometrics-based remote teaching stored in the memory 1005 and executes the biometrics-based remote teaching method provided by the embodiments of the present application.
An embodiment of the present application provides a biometrics-based remote teaching method. Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of the biometrics-based remote teaching method of the present application.

In this embodiment, the biometrics-based remote teaching method includes the following steps:

Step S10: receive a learning instruction triggered by a learner, play teaching streaming media, and collect a video containing the learner's face while the teaching streaming media is playing.

Specifically, the execution subject in this example is a terminal device capable of playing teaching streaming media, such as the learner's personal computer, smartphone, or tablet; these are not enumerated exhaustively here, and no restriction is placed on them.

It should be understood that when the terminal device playing the teaching streaming media is one of the above devices, the learner needs to wear 3D glasses or similar equipment while watching, to ensure that the subsequently generated mixed reality teaching picture can be presented to the learner.

In addition, for viewing convenience, the terminal device playing the teaching streaming media may directly use a 3D player, so that the learner does not need to wear 3D glasses and can easily interact with objects in the subsequently generated mixed reality teaching picture using just an interactive pen.

In practical applications, the operation in step S10 proceeds roughly as follows:

For example, when a learner wants to watch teaching streaming media on their own terminal device, they first press the play button. The processor in the terminal device receives the learning instruction triggered by the learner and controls the terminal device to play the teaching streaming media according to it; at the same time, during playback, it turns on the camera built into the terminal device or a camera in the learner's room (an external camera having established a communication connection with the terminal device in advance) to collect a video containing the learner's face in real time.
It should be noted that the above is only an example and does not limit the technical solution of the present application. In a specific implementation, those skilled in the art may set the above operation logic as needed, without restriction here.
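As a minimal sketch of step S10, assuming OpenCV for camera capture and treating the streaming player as a pair of injected callables (the application does not fix a player API):

```python
import cv2

def on_learning_instruction(play, is_playing, camera_index=0):
    """Play the teaching stream while capturing the learner's face in real time.

    play/is_playing: callables wrapping the actual streaming player.
    camera_index: built-in camera, or a pre-paired external room camera.
    """
    play()
    cam = cv2.VideoCapture(camera_index)
    frames = []
    while is_playing():
        ok, frame = cam.read()     # one face-video frame per iteration
        if ok:
            frames.append(frame)
    cam.release()
    return frames
```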
Step S20: determine the facial expression of the learner in the video.

Specifically, in practical applications, the operation of determining the facial expression of the learner in the video may be implemented through the following steps (a code sketch of steps (1) to (3) follows after step (4) below):

(1) Extract the learner's facial feature points from the video according to a facial feature detection model obtained by pre-training.

It should be understood that, to reduce unnecessary interference, the learner's face image may first be identified from the video according to a face detection model obtained by pre-training; the learner's facial feature points, such as feature points of the eyes, eyebrows, mouth, and jaw, are then extracted from the face image according to the pre-trained facial feature detection model.

(2) Divide the learner's face into facial regions according to the facial feature points, obtaining a facial feature region corresponding to each facial feature point.

For example, it may be stipulated that each divided facial feature region contains exactly one facial feature point, i.e., every facial feature point lies in its own facial feature region.

Alternatively, it may be stipulated that the facial feature points of one object lie in one facial feature region; for example, all facial feature points marking the left eyebrow lie in the same facial feature region, and all facial feature points marking the right eyebrow lie in the same facial feature region.

It should be noted that the above is only an example and does not limit the technical solution of the present application; in a specific implementation, those skilled in the art may divide the face into facial regions as needed, without restriction here.

(3) Determine the velocity vectors of the facial feature points in each facial region based on the optical flow method.

It should be noted that the velocity vector referred to here represents not only the motion speed of the corresponding facial feature point but also its motion direction.

As for how the velocity vectors are determined based on the optical flow method: each facial feature region may be traversed, the intensity of pixel change of the facial feature points in the currently traversed region between two adjacent image frames detected, and the velocity vectors of the facial feature points in that region then inferred from the intensity of pixel change.
In addition, it is worth mentioning that when computing the velocity vectors, the spatial position coordinates of the facial features must also be determined with face key-point localization technology, the offset then determined from the change in those coordinates, and the intensity of the current video determined through the corresponding sensing device.

For ease of understanding, a concrete explanation follows.

Suppose the position coordinate of a facial feature point is P(x, y, t), its intensity is I(x, y, t), and it moves by (Δx, Δy, Δt) between two frames, where x is the abscissa, y is the ordinate, and t is the optical (time) value. Then, under the brightness-constancy constraint:

Formula (1): I(x, y, t) = I(x+Δx, y+Δy, t+Δt);

Assuming the movement is small, the image constraint on I(x, y, t) can be obtained with a Taylor series:
Formula (2): I(x+Δx, y+Δy, t+Δt) = I(x, y, t) + (∂I/∂x)·Δx + (∂I/∂y)·Δy + (∂I/∂t)·Δt + τ;
where τ is a higher-order infinitesimal term. Thus, combining formula (1) and formula (2) gives:
Formula (3): (∂I/∂x)·Δx + (∂I/∂y)·Δy + (∂I/∂t)·Δt = 0;
Formula (4): (∂I/∂x)·(Δx/Δt) + (∂I/∂y)·(Δy/Δt) + (∂I/∂t)·(Δt/Δt) = 0;
Rearranging formula (3) and formula (4) yields:
Formula (5): (∂I/∂x)·V_x + (∂I/∂y)·V_y + ∂I/∂t = 0;
where V_x = Δx/Δt and V_y = Δy/Δt are the x and y components of the velocity, or optical flow, of I(x, y, t). Therefore, over the interval Δt between two frames, the motion of the above feature point is expressed as a two-dimensional velocity vector (V_x, V_y).
In addition, content not covered in this embodiment can be found in the relevant literature on the optical flow method and is not repeated here.
(4) Determine the facial expression of the learner in the video according to the velocity vectors of the facial feature points.

For example, when the feature points marking the upper eyelids at the inner eye corners move downward, lowering the upper eyelids, and the feature points marking the mouth move outward, widening the mouth, the learner's facial expression in the video can usually be taken as drowsy.

For another example, when the feature points marking the upper lip move upward and the feature points marking the lower lip follow, so that the upper lip lifts while the lips stay tightly closed, the mouth corners turn down, and the lips bulge slightly, and the feature points marking the inner eyebrow corners move toward the brow center, wrinkling the inner corners together while the eyebrows rise, the learner's facial expression in the video is usually taken as confused.

For yet another example, when the feature points marking the lip corners move back and up toward the cheeks, pulling the lip corners back and raising them, and the feature points marking the mouth move outward, widening the mouth, the learner's facial expression in the video can usually be taken as satisfied.

It should be noted that the above is only an example. In a specific implementation, those skilled in the art can determine the learner's facial expression in the video by combining the changing characteristics of micro-expressions with the obtained velocity vectors of the facial feature points; this is not detailed or restricted further here.
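The following sketch covers steps (1) to (3) under stated assumptions: dlib's stock frontal-face detector and 68-point shape predictor stand in for the pre-trained face and facial-feature detection models, and OpenCV's pyramidal Lucas-Kanade routine supplies the per-landmark velocity vectors. The predictor file name is the conventional dlib model, not something fixed by this application, and grayscale frames are expected.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()   # stands in for the face detection model
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def facial_feature_points(gray):
    """Step (1): locate the face, then extract its 68 facial feature points."""
    faces = detector(gray)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    return np.array([[p.x, p.y] for p in shape.parts()], dtype=np.float32)

def landmark_velocity_vectors(prev_gray, curr_gray, prev_pts, dt=1.0):
    """Step (3): track each point into the next frame; return (Vx, Vy) per point."""
    curr_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_pts.reshape(-1, 1, 2), None)
    v = (curr_pts.reshape(-1, 2) - prev_pts) / dt
    v[status.ravel() == 0] = 0.0               # zero out points the tracker lost
    return v
```

The resulting velocity vectors can then be mapped to expressions with rules of the kind given in step (4).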
Step S30: judge, according to the facial expression, whether the learner is in a preset low-efficiency learning state.

Specifically, research shows that when a learner makes drowsy or confused expressions and movements, their current learning efficiency is low, i.e., they are in the preset low-efficiency learning state. Therefore, judging whether the learner is in that state from facial expressions only requires judging whether the learner has made a drowsy or confused expression.

It should be noted that the above are only two specific facial expressions and body movements used to identify the low-efficiency learning state; in practical applications, they can be configured according to micro-expression studies and related body-language research, without restriction here.

In addition, it is worth mentioning that in actual teaching, learners can rarely maintain an optimal learning state throughout a long lecture. In general, the brain cycles from a high-efficiency period of effective learning into a low-efficiency period in which attention drifts and the learning state is poor. In the low-efficiency state, learners usually struggle to understand the teaching content being played, so "blind spots" of knowledge easily arise (knowledge points the learner cannot see through, grasp, or sort out). Therefore, to improve the quality of distance teaching, the learner's attention must be attracted as much as possible, so that the brain stays in the high-efficiency state for a long time or returns to it from the low-efficiency state as quickly as possible. After judging that the user has entered a low-efficiency learning state, the currently playing teaching streaming media may first be paused and a piece of light, pleasant music played, or a small joke told, to give the learner a brief rest; the operations of step S40 and onward are then performed.
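A small control-flow sketch of this intervention, assuming the same injected player callables as in the step S10 sketch and a hypothetical `play_audio` helper:

```python
import random

JOKES = ["(a short joke for the break)"]       # placeholder content

def short_break(pause_player, play_audio, music_path="relaxing.mp3"):
    """Pause the teaching stream and give the learner a brief rest.

    After the break, the caller proceeds with step S40 and onward.
    """
    pause_player()
    if random.random() < 0.5:
        play_audio(music_path)                 # light, pleasant music...
    else:
        print(random.choice(JOKES))            # ...or a small joke
```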
Step S40: if the learner is in the preset inefficient learning state, acquire the teaching picture and teaching voice currently played by the teaching stream.
Specifically, when it is determined that the learner is currently in the preset inefficient learning state, in order to motivate the learner and bring him or her back from the inefficient state to the efficient state as soon as possible, the present solution acquires the teaching picture and teaching voice currently played by the teaching stream. The knowledge point causing the learner's confusion at the current moment can thus be determined fairly precisely, so that the mixed reality processing performed in the subsequent step S50 targets exactly the content that confuses the current learner. In this way, distance teaching can provide different learners with the teaching style each of them needs.
Step S50: perform keyword extraction on the teaching voice, determine, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and perform mixed reality processing on the object to obtain an immersive mixed reality teaching picture, so that the learner can interact with the objects in the mixed reality teaching picture.
Specifically, when extracting keywords from the teaching voice, the teaching voice first needs to be converted into text, and keyword extraction is then performed on the resulting text content.
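As an illustration of this two-stage process, the sketch below transcribes a short audio clip and then extracts keywords from the text. The use of the SpeechRecognition and jieba libraries, the file name, and the keyword count are assumptions made for the example only, not choices fixed by the original text.

```python
import speech_recognition as sr  # assumed ASR front end
import jieba.analyse             # assumed TF-IDF keyword extractor for Chinese text

def extract_keywords(audio_path: str, top_k: int = 5) -> list[str]:
    """Convert teaching speech to text, then extract the top keywords."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_path) as source:
        audio = recognizer.record(source)                        # read the whole clip
    text = recognizer.recognize_google(audio, language="zh-CN")  # speech -> text
    return jieba.analyse.extract_tags(text, topK=top_k)          # text -> keywords

print(extract_keywords("teaching_voice.wav"))  # hypothetical file name
```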
For ease of understanding, the above determination, according to the extracted keywords, of the object in the teaching picture that requires mixed reality is illustrated below by example.
Assume that the course the learner is studying covers contour-line terrain interpretation in a geography class, the learner is confused by content corresponding to the extracted keywords "mountain peak", "basin depression", "ridge and ridge line", "valley and valley line", "saddle", and "steep cliff", and the figures shown in the teaching picture are all drawn in contour-line style. In this state, the learner may be unable to picture the concrete three-dimensional terrain in his or her mind.
At this point, the object in the teaching picture that requires mixed reality can be determined according to the extracted keywords.
For example, the object determined from "mountain peak" is the contour map in the teaching picture that represents a mountain peak. Virtual reality and augmented reality processing can then be performed on this contour map: a three-dimensional mountain model (virtual reality) is displayed over the planar contour figure, while the description of this terrain feature and its customary notation are displayed at the corresponding position. The learner can thus see a three-dimensional contour model and, while watching, make preset gestures with a handheld interactive pen, for example sliding to the left so that the contour model rotates to the left, making it easy to see the far side of the model and better understand the knowledge point currently being explained.
In addition, the above operation of performing mixed reality (Mixed Reality, MR) processing on the object may roughly proceed as follows: first, digitize the object to obtain an image matrix corresponding to the object; then, determine the similarity between the image matrix and the image feature matrices, obtained by pre-training, corresponding to various types of objects; next, according to a preset filtering rule, select the image feature matrix whose similarity satisfies the filtering rule; next, according to a preset mapping table, obtain the rendering model and the introduction information corresponding to the selected image feature matrix, the mapping table recording the correspondence between each image feature matrix, its rendering model, and its introduction information; next, extract image data from the teaching stream in real time and determine the real-time position and size of the object in the image data; finally, according to the real-time position and size of the object in the image data, superimpose the rendering model and the introduction information on the image data in real time to obtain the mixed reality teaching picture. A sketch of this pipeline is given below.
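The sketch mirrors the steps of the paragraph above for a single frame: cosine similarity between the digitized object and pre-trained feature matrices, a threshold as the filtering rule, a dictionary as the mapping table, template matching for position and size, and an OpenCV overlay. Every concrete choice here (matrix shape, threshold, template matching, blending weights) is an assumption made for illustration.

```python
import cv2
import numpy as np

def cosine_sim(a, b):
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def mixed_reality_frame(frame, obj_patch, feature_db, mapping, sim_threshold=0.8):
    """One iteration of the MR pipeline sketched above (illustrative only)."""
    # 1. Digitize the object: a grayscale patch resized to a fixed matrix shape.
    img_matrix = cv2.resize(cv2.cvtColor(obj_patch, cv2.COLOR_BGR2GRAY), (64, 64))
    # 2-3. Similarity against pre-trained feature matrices; keep the best match
    # only if it satisfies the filtering rule (here: a fixed threshold).
    scores = {cls: cosine_sim(img_matrix, m) for cls, m in feature_db.items()}
    cls, best = max(scores.items(), key=lambda kv: kv[1])
    if best < sim_threshold:
        return frame
    # 4. Mapping table: category -> (rendering model image in BGR, introduction text).
    model_img, intro = mapping[cls]
    # 5. Real-time position and size of the object in the current frame.
    res = cv2.matchTemplate(frame, obj_patch, cv2.TM_CCOEFF_NORMED)
    _, _, _, (x, y) = cv2.minMaxLoc(res)
    h, w = obj_patch.shape[:2]
    # 6. Superimpose the rendering model and the introduction information.
    overlay = cv2.resize(model_img, (w, h))
    roi = frame[y:y + h, x:x + w]
    frame[y:y + h, x:x + w] = cv2.addWeighted(roi, 0.4, overlay, 0.6, 0)
    cv2.putText(frame, intro, (x, max(15, y - 10)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2)
    return frame
```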
In addition, it is worth noting that, in practical applications, in order to guarantee the quality of the final mixed reality teaching picture, image data can be extracted from the teaching stream with frame-level precision, that is, frame by frame in real time. When determining the real-time position and size of the object in the image data, feature detection can then be performed on each frame of the image data according to the feature information of the object, so as to determine the object's real-time position and size in the image data.
This frame-accurate processing effectively guarantees the accuracy of the subsequently determined real-time position and size of the object in the image data, so that the rendering model and the introduction information can be superimposed on the image data accurately in real time, ensuring the mixed reality effect.
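A minimal sketch of the frame-by-frame extraction loop, assuming OpenCV's VideoCapture can open the teaching stream address and reusing template matching as the per-frame feature detector; the stream address and confidence threshold are hypothetical.

```python
import cv2

def track_object_per_frame(stream_url, obj_patch, min_score=0.8):
    """Pull the teaching stream frame by frame and locate the object in each frame."""
    cap = cv2.VideoCapture(stream_url)   # e.g. a hypothetical HTTP/RTMP address
    h, w = obj_patch.shape[:2]
    while cap.isOpened():
        ok, frame = cap.read()           # exactly one frame per iteration
        if not ok:
            break
        res = cv2.matchTemplate(frame, obj_patch, cv2.TM_CCOEFF_NORMED)
        _, score, _, (x, y) = cv2.minMaxLoc(res)
        if score >= min_score:           # detection confidence threshold (assumed)
            yield x, y, w, h             # real-time position and size of the object
    cap.release()
```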
Further, to ensure that the above operations proceed smoothly, the image feature matrices corresponding to the various types of objects and the mapping table used in the above operations can be constructed in advance.
For example: acquire a training sample image set, the set comprising sample images corresponding to various types of objects and the object category corresponding to each sample image; take each sample image and its object category as input and perform classification training on a deep learning model to obtain the image feature matrices corresponding to the various types of objects; and establish the correspondence between each object type's image feature matrix, its rendering model, and its introduction information to generate the mapping table.
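As one possible reading of this construction step, the sketch below trains a small PyTorch classifier on the sample images and stores the per-category mean of the penultimate-layer activations as that category's "image feature matrix". The network architecture, the use of mean activations, and the mapping-table contents are assumptions layered on top of the text, not details it specifies.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class SmallCNN(nn.Module):
    """Tiny classifier; the flattened conv output serves as the feature matrix."""
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4))
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x):
        feats = self.features(x).flatten(1)
        return self.classifier(feats), feats

def build_feature_db(dataset, num_classes, epochs=5):
    """Classification training, then one mean feature matrix per object category."""
    model = SmallCNN(num_classes)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in DataLoader(dataset, batch_size=32, shuffle=True):
            logits, _ = model(images)
            loss = loss_fn(logits, labels)
            opt.zero_grad(); loss.backward(); opt.step()
    sums = torch.zeros(num_classes, 32 * 4 * 4)
    counts = torch.zeros(num_classes)
    with torch.no_grad():
        for images, labels in DataLoader(dataset, batch_size=32):
            _, feats = model(images)
            for f, y in zip(feats, labels):
                sums[y] += f
                counts[y] += 1
    feature_db = sums / counts.unsqueeze(1)  # category -> mean feature matrix
    # Mapping table: category -> (rendering model, introduction); contents assumed.
    mapping = {0: ("models/peak.obj", "A mountain peak: contours close toward the summit.")}
    return feature_db, mapping
```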
It should be noted that the above is only one specific implementation and does not limit the technical solution of the present application; in a specific implementation, those skilled in the art can choose a suitable approach as needed to implement the mixed reality processing of the teaching picture, and no limitation is imposed here.
From the above description it is not difficult to see that the biometrics-based distance teaching method provided in this embodiment collects, in real time, a video containing the learner's face while the learner watches the teaching stream; determines the learner's facial expression in the video by means of biometric technology; and then analyzes the facial expression to judge whether the learner is in the preset inefficient learning state. When it is determined that the learner is in that state, the teaching picture and teaching voice currently played by the teaching stream are acquired, keywords are extracted from the teaching voice using keyword extraction technology, the object in the teaching picture that requires mixed reality is determined according to the extracted keywords, and mixed reality processing is finally performed on the determined object by means of mixed reality technology. An immersive mixed reality teaching picture is thereby obtained, allowing the learner to interact with the objects in the mixed reality teaching picture while watching the teaching stream. This increases the learner's engagement and enables the learner to study independently more effectively through distance teaching.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of a second embodiment of the biometrics-based distance teaching method of the present application.
Based on the first embodiment described above, the biometrics-based distance teaching method of this embodiment may further include, after step S50:
Step S60: search for corresponding learning materials according to the extracted keywords, and push the found learning materials to the learner to help the learner understand the knowledge points corresponding to the keywords.
Specifically, the above operation of searching for corresponding learning materials may be to search, according to the extracted keywords, for corresponding learning materials on the Internet or among the learning cases pre-stored in the terminal device used by the learner.
Correspondingly, after the learning materials are found, they are pushed to the learner, for example by sending them to the mailbox configured by the user or by displaying them directly on the user interface of the current terminal device, so that the learner can conveniently view and study them.
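For the mailbox option, a minimal sketch using Python's standard smtplib and email modules; the SMTP host, sender address, and credentials are placeholders, not details from the original text.

```python
import smtplib
from email.mime.text import MIMEText

def push_materials(materials: str, learner_email: str):
    """Send the found learning materials to the learner's configured mailbox."""
    msg = MIMEText(materials, "plain", "utf-8")
    msg["Subject"] = "Learning materials for your current knowledge point"
    msg["From"] = "noreply@example-teaching-platform.com"  # placeholder sender
    msg["To"] = learner_email
    with smtplib.SMTP("smtp.example.com", 587) as server:  # placeholder SMTP host
        server.starttls()
        server.login("noreply@example-teaching-platform.com", "app-password")  # placeholder
        server.send_message(msg)
```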
In addition, in order that distance teaching can better assist learners, after the teaching stream has finished playing, a feedback entry can be provided on the user interface of the terminal device so that the learner can give feedback on the teaching content (for example, the teaching style of the instructor who recorded the teaching stream, or the arrangement of the course). After the user enters the feedback and clicks the confirm button on the interface, the feedback is uploaded to the distance teaching service platform. In this way, the teaching content can be adjusted adaptively according to the learners' feedback, which can also serve as a reference for evaluating the instructor's teaching quality.
Further, in order to provide learners with teaching streams tailored to their needs, instructors can be required, when recording teaching streams, to break the material down to individual knowledge points, so that the teaching streams stored on the distance teaching service platform are stored per knowledge point. In this way, when watching teaching streams on a terminal device, a learner can first enter keywords for the content he or she wants to study; the terminal device sends these keywords to the distance teaching service platform, which looks up the corresponding knowledge points according to the keywords of the learning content provided by the user and combines them into a teaching stream that meets the user's requirements.
It should be noted that the above is only an example and does not limit the technical solution of the present application; in a specific implementation, those skilled in the art can configure this according to the actual situation, and no limitation is imposed here.
In addition, it should be understood that, in practical applications, the operation of searching for learning materials according to the keywords and pushing them to the learner can be performed in parallel with the operation in step S50 of determining, according to the keywords, the object in the teaching picture that requires mixed reality and performing mixed reality processing on that object; no limitation is imposed here.
From the above description it is not difficult to see that, in the biometrics-based distance teaching method provided in this embodiment, when it is determined that the learner is in the preset inefficient learning state, learning materials related to the knowledge blind spot the learner currently faces are looked up according to the extracted keywords and pushed to the learner, providing an immediate prompt. This helps the learner eliminate knowledge blind spots early and better supports learning through distance teaching.
It should be noted that a person of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be implemented by hardware, or by computer-readable instructions controlling the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium, and the aforementioned non-volatile readable storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
Referring to FIG. 4, FIG. 4 is a structural block diagram of a first embodiment of the biometrics-based distance teaching device of the present application.
As shown in FIG. 4, the biometrics-based distance teaching device proposed in this embodiment of the present application includes: a playback module 4001, a collection module 4002, a determination module 4003, a judgment module 4004, an acquisition module 4005, and a processing module 4006.
Among them, the playback module 4001 is configured to receive a learning instruction triggered by a learner and play a teaching stream; the collection module 4002 is configured to collect, during playback of the teaching stream, a video containing the learner's face; the determination module 4003 is configured to determine the learner's facial expression in the video; the judgment module 4004 is configured to judge, according to the facial expression, whether the learner is in a preset inefficient learning state; the acquisition module 4005 is configured to acquire, when the learner is in the preset inefficient learning state, the teaching picture and teaching voice currently played by the teaching stream; and the processing module 4006 is configured to perform keyword extraction on the teaching voice, determine, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and perform mixed reality processing on the object to obtain an immersive mixed reality teaching picture, so that the learner can interact with the objects in the mixed reality teaching picture.
It should be understood that, in practical applications, the operation in which the determination module 4003 determines the learner's facial expression in the video can be implemented on the basis of face recognition technology in biometrics.
For example, before the learner's facial expression in the video is determined, a facial feature detection model is first obtained by training on the face sample data stored on a big data platform, using a facial feature detection method from face recognition technology. When determining the learner's facial expression in the video, the learner's facial feature points can then be extracted from the video according to this pre-trained facial feature detection model.
Next, to make it easier to determine later how each facial feature point changes, the learner's face can be divided into facial regions according to the facial feature points, yielding a facial feature region corresponding to each facial feature point.
For example, it may be stipulated that each divided facial feature region contains one and only one facial feature point, that is, each facial feature point lies in its own facial feature region.
Alternatively, it may be stipulated that several facial feature points of the same facial component lie in one facial feature region; for example, all the feature points marking the left eyebrow lie in one facial feature region, and all the feature points marking the right eyebrow lie in another.
It should be noted that the above are only examples and do not limit the technical solution of the present application; in a specific implementation, those skilled in the art can divide the face into facial regions as needed, and no limitation is imposed here.
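As an illustration of the second scheme, the sketch below extracts 68 facial feature points with dlib and groups them into one region per facial component, following the widely used 68-point index layout; the predictor file is the publicly distributed dlib model and is assumed to be available locally.

```python
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed local file

# Standard 68-point layout: one region per facial component (the second scheme above).
REGIONS = {
    "jaw": range(0, 17), "right_eyebrow": range(17, 22), "left_eyebrow": range(22, 27),
    "nose": range(27, 36), "right_eye": range(36, 42), "left_eye": range(42, 48),
    "mouth": range(48, 68),
}

def facial_feature_regions(gray_frame):
    """Detect the face, extract feature points, and divide them into regions."""
    faces = detector(gray_frame)
    if not faces:
        return {}
    shape = predictor(gray_frame, faces[0])
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    return {name: [points[i] for i in idx] for name, idx in REGIONS.items()}
```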
After the face has been divided into facial regions, the velocity vectors of the facial feature points in each facial region are determined based on the optical flow method.
It should be noted that the velocity vector referred to here represents not only the movement speed of the corresponding facial feature point but also its movement direction.
In addition, the manner of determining the velocity vectors of the facial feature points in each facial region based on the optical flow method may specifically be: traverse the facial feature regions and detect the pixel change intensity, between two adjacent image frames, of the facial feature points in the facial feature region currently being traversed; then infer, from the pixel change intensity, the velocity vectors of the facial feature points in the current facial feature region.
For the specific calculation, reference can be made to the relevant formulas of the optical flow method, which are not repeated here.
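A minimal sketch of this computation using the pyramidal Lucas-Kanade optical flow implementation in OpenCV, applied to two adjacent grayscale frames; averaging the flow per region and scaling by the frame rate are illustrative choices, not requirements of the method.

```python
import cv2
import numpy as np

def region_velocity_vectors(prev_gray, next_gray, regions, fps=25.0):
    """Lucas-Kanade optical flow between two adjacent frames, averaged per region.

    `regions` maps a region name to a list of (x, y) feature points in prev_gray.
    Returns region name -> mean velocity vector in pixels per second.
    """
    velocities = {}
    for name, pts in regions.items():
        p0 = np.float32(pts).reshape(-1, 1, 2)
        p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)
        good = status.ravel() == 1
        if not good.any():
            continue
        displacement = (p1 - p0).reshape(-1, 2)[good]        # motion between the two frames
        velocities[name] = displacement.mean(axis=0) * fps   # speed and direction together
    return velocities
```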
Finally, the learner's facial expression in the video is determined according to the obtained velocity vectors of the facial feature points.
It should be noted that the above is only one specific implementation and does not limit the technical solution of the present application; in a specific implementation, those skilled in the art can choose a suitable face recognition approach as needed to determine the learner's facial expression, and no limitation is imposed here.
In addition, to make it easier to understand the operation in which the processing module 4006 performs mixed reality processing on the object to obtain an immersive mixed reality teaching picture, a detailed description follows.
For example: first, digitize the object to obtain the image matrix corresponding to the object; then, determine the similarity between the image matrix and the image feature matrices, obtained by pre-training, corresponding to various types of objects; next, according to a preset filtering rule, select the image feature matrix whose similarity satisfies the filtering rule; next, according to a preset mapping table, obtain the rendering model and the introduction information corresponding to the selected image feature matrix, the mapping table recording the correspondence between each image feature matrix, its rendering model, and its introduction information; next, extract image data from the teaching stream in real time and determine the real-time position and size of the object in the image data; finally, according to the real-time position and size of the object in the image data, superimpose the rendering model and the introduction information on the image data in real time to obtain the mixed reality teaching picture.
In addition, it is worth noting that, in practical applications, in order to guarantee the quality of the final mixed reality teaching picture, image data can be extracted from the teaching stream with frame-level precision, that is, frame by frame in real time. When determining the real-time position and size of the object in the image data, feature detection can then be performed on each frame of the image data according to the feature information of the object, so as to determine the object's real-time position and size in the image data.
This frame-accurate processing effectively guarantees the accuracy of the subsequently determined real-time position and size of the object in the image data, so that the rendering model and the introduction information can be superimposed on the image data accurately in real time, ensuring the mixed reality effect.
Further, to ensure that the above operations proceed smoothly, the image feature matrices corresponding to the various types of objects and the mapping table used in the above operations can be constructed in advance.
For example: acquire a training sample image set, the set comprising sample images corresponding to various types of objects and the object category corresponding to each sample image; take each sample image and its object category as input and perform classification training on a deep learning model to obtain the image feature matrices corresponding to the various types of objects; and establish the correspondence between each object type's image feature matrix, its rendering model, and its introduction information to generate the mapping table.
It should be noted that the above is only one specific implementation and does not limit the technical solution of the present application; in a specific implementation, those skilled in the art can choose a suitable approach as needed to implement the mixed reality processing of the teaching picture, and no limitation is imposed here.
From the above description it is not difficult to see that the biometrics-based distance teaching device provided in this embodiment collects, in real time, a video containing the learner's face while the learner watches the teaching stream; determines the learner's facial expression in the video by means of biometric technology; and then analyzes the facial expression to judge whether the learner is in the preset inefficient learning state. When it is determined that the learner is in that state, the teaching picture and teaching voice currently played by the teaching stream are acquired, keywords are extracted from the teaching voice using keyword extraction technology, the object in the teaching picture that requires mixed reality is determined according to the extracted keywords, and mixed reality processing is finally performed on the determined object by means of mixed reality technology. An immersive mixed reality teaching picture is thereby obtained, allowing the learner to interact with the objects in the mixed reality teaching picture while watching the teaching stream. This increases the learner's engagement and enables the learner to study independently more effectively through distance teaching.
It should be noted that the workflow described above is only illustrative and does not limit the protection scope of the present application; in practical applications, those skilled in the art can select part or all of it according to actual needs to achieve the purpose of the solution of this embodiment, and no limitation is imposed here.
In addition, for technical details not described exhaustively in this embodiment, reference can be made to the biometrics-based distance teaching method provided in any embodiment of the present application; details are not repeated here.
Based on the first embodiment of the biometrics-based distance teaching device described above, a second embodiment of the biometrics-based distance teaching device of the present application is proposed.
In this embodiment, the biometrics-based distance teaching device further includes a search module and a push module.
Among them, the search module is configured to search for corresponding learning materials according to the extracted keywords; the push module is configured to push the found learning materials to the learner to help the learner understand the knowledge points corresponding to the keywords.
From the above description it is not difficult to see that, in the biometrics-based distance teaching device provided in this embodiment, when it is determined that the learner is in the preset inefficient learning state, learning materials related to the knowledge blind spot the learner currently faces are looked up according to the extracted keywords and pushed to the learner, providing an immediate prompt. This helps the learner eliminate knowledge blind spots early and better supports learning through distance teaching.
It should be noted that the workflow described above is only illustrative and does not limit the protection scope of the present application; in practical applications, those skilled in the art can select part or all of it according to actual needs to achieve the purpose of the solution of this embodiment, and no limitation is imposed here.
In addition, for technical details not described exhaustively in this embodiment, reference can be made to the biometrics-based distance teaching method provided in any embodiment of the present application; details are not repeated here.

Claims (20)

  1. A biometrics-based distance teaching method, the method comprising:
    receiving a learning instruction triggered by a learner, playing a teaching stream, and collecting, during playback of the teaching stream, a video containing the learner's face;
    determining the facial expression of the learner in the video;
    judging, according to the facial expression, whether the learner is in a preset inefficient learning state;
    if the learner is in the preset inefficient learning state, acquiring the teaching picture and teaching voice currently played by the teaching stream; and
    performing keyword extraction on the teaching voice, determining, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture, so that the learner can interact with the objects in the mixed reality teaching picture.
  2. The method according to claim 1, wherein the step of determining the facial expression of the learner in the video comprises:
    extracting the learner's facial feature points from the video according to a pre-trained facial feature detection model;
    dividing the learner's face into facial regions according to the facial feature points, to obtain a facial feature region corresponding to each facial feature point;
    determining, based on an optical flow method, the velocity vector of the facial feature points in each facial region, the velocity vector representing the movement speed information and movement direction information of each facial feature point; and
    determining the facial expression of the learner in the video according to the velocity vector of each facial feature point.
  3. The method according to claim 2, wherein the step of determining, based on the optical flow method, the velocity vector of the facial feature points in each facial region comprises:
    traversing the facial feature regions, and detecting the pixel change intensity, between two adjacent image frames, of the facial feature points in the facial feature region currently being traversed; and
    inferring, according to the pixel change intensity, the velocity vector of the facial feature points in the current facial feature region.
  4. The method according to claim 1, wherein the step of performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture comprises:
    digitizing the object to obtain an image matrix corresponding to the object;
    determining the similarity between the image matrix and image feature matrices, obtained by pre-training, corresponding to various types of objects;
    selecting, according to a preset filtering rule, the image feature matrix whose similarity satisfies the filtering rule;
    obtaining, according to a preset mapping table, the rendering model and the introduction information corresponding to the selected image feature matrix, the mapping table being the correspondence between each image feature matrix and the corresponding rendering model and introduction information;
    extracting image data from the teaching stream in real time, and determining the real-time position and size of the object in the image data; and
    superimposing, according to the real-time position and size of the object in the image data, the rendering model and the introduction information on the image data in real time, to obtain the mixed reality teaching picture.
  5. The method according to claim 4, wherein the step of extracting image data from the teaching stream in real time and determining the real-time position and size of the object in the image data comprises:
    extracting image data from the teaching stream in real time frame by frame; and
    performing feature detection on each frame of the image data according to the feature information of the object, to determine the real-time position and size of the object in the image data.
  6. The method according to claim 4, wherein, before the step of performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture, the method further comprises:
    acquiring a training sample image set, the training sample image set comprising sample images corresponding to various types of objects and the object category corresponding to each sample image;
    performing classification training on a deep learning model with each sample image and its corresponding object category as input, to obtain the image feature matrices corresponding to the various types of objects; and
    establishing the correspondence between the image feature matrix corresponding to each type of object, the corresponding rendering model, and the corresponding introduction information, to generate the mapping table.
  7. The method according to claim 1, wherein, after the step of obtaining an immersive mixed reality teaching picture, the method further comprises:
    searching for corresponding learning materials according to the extracted keywords, and pushing the found learning materials to the learner to help the learner understand the knowledge points corresponding to the keywords.
  8. A biometrics-based distance teaching device, the device comprising:
    a playback module, configured to receive a learning instruction triggered by a learner and play a teaching stream;
    a collection module, configured to collect, during playback of the teaching stream, a video containing the learner's face;
    a determination module, configured to determine the facial expression of the learner in the video;
    a judgment module, configured to judge, according to the facial expression, whether the learner is in a preset inefficient learning state;
    an acquisition module, configured to acquire, when the learner is in the preset inefficient learning state, the teaching picture and teaching voice currently played by the teaching stream; and
    a processing module, configured to perform keyword extraction on the teaching voice, determine, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and perform mixed reality processing on the object to obtain an immersive mixed reality teaching picture, so that the learner can interact with the objects in the mixed reality teaching picture.
  9. A biometrics-based distance teaching apparatus, the apparatus comprising: a memory, a processor, and readable instructions for biometrics-based distance teaching that are stored on the memory and executable on the processor, the readable instructions for biometrics-based distance teaching being configured to implement the following steps:
    receiving a learning instruction triggered by a learner, playing a teaching stream, and collecting, during playback of the teaching stream, a video containing the learner's face;
    determining the facial expression of the learner in the video;
    judging, according to the facial expression, whether the learner is in a preset inefficient learning state;
    if the learner is in the preset inefficient learning state, acquiring the teaching picture and teaching voice currently played by the teaching stream; and
    performing keyword extraction on the teaching voice, determining, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture, so that the learner can interact with the objects in the mixed reality teaching picture.
  10. The biometrics-based distance teaching apparatus according to claim 9, wherein the step of determining the facial expression of the learner in the video comprises:
    extracting the learner's facial feature points from the video according to a pre-trained facial feature detection model;
    dividing the learner's face into facial regions according to the facial feature points, to obtain a facial feature region corresponding to each facial feature point;
    determining, based on an optical flow method, the velocity vector of the facial feature points in each facial region, the velocity vector representing the movement speed information and movement direction information of each facial feature point; and
    determining the facial expression of the learner in the video according to the velocity vector of each facial feature point.
  11. The biometrics-based distance teaching apparatus according to claim 10, wherein the step of determining, based on the optical flow method, the velocity vector of the facial feature points in each facial region comprises:
    traversing the facial feature regions, and detecting the pixel change intensity, between two adjacent image frames, of the facial feature points in the facial feature region currently being traversed; and
    inferring, according to the pixel change intensity, the velocity vector of the facial feature points in the current facial feature region.
  12. The biometrics-based distance teaching apparatus according to claim 9, wherein the step of performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture comprises:
    digitizing the object to obtain an image matrix corresponding to the object;
    determining the similarity between the image matrix and image feature matrices, obtained by pre-training, corresponding to various types of objects;
    selecting, according to a preset filtering rule, the image feature matrix whose similarity satisfies the filtering rule;
    obtaining, according to a preset mapping table, the rendering model and the introduction information corresponding to the selected image feature matrix, the mapping table being the correspondence between each image feature matrix and the corresponding rendering model and introduction information;
    extracting image data from the teaching stream in real time, and determining the real-time position and size of the object in the image data; and
    superimposing, according to the real-time position and size of the object in the image data, the rendering model and the introduction information on the image data in real time, to obtain the mixed reality teaching picture.
  13. The biometrics-based distance teaching apparatus according to claim 12, wherein the step of extracting image data from the teaching stream in real time and determining the real-time position and size of the object in the image data comprises:
    extracting image data from the teaching stream in real time frame by frame; and
    performing feature detection on each frame of the image data according to the feature information of the object, to determine the real-time position and size of the object in the image data.
  14. The biometrics-based distance teaching apparatus according to claim 12, wherein, before the step of performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture, the method further comprises:
    acquiring a training sample image set, the training sample image set comprising sample images corresponding to various types of objects and the object category corresponding to each sample image;
    performing classification training on a deep learning model with each sample image and its corresponding object category as input, to obtain the image feature matrices corresponding to the various types of objects; and
    establishing the correspondence between the image feature matrix corresponding to each type of object, the corresponding rendering model, and the corresponding introduction information, to generate the mapping table.
  15. The biometrics-based distance teaching apparatus according to claim 9, wherein, after the step of obtaining an immersive mixed reality teaching picture, the method further comprises:
    searching for corresponding learning materials according to the extracted keywords, and pushing the found learning materials to the learner to help the learner understand the knowledge points corresponding to the keywords.
  16. A storage medium, the storage medium storing readable instructions for biometrics-based distance teaching, the readable instructions, when executed by a processor, implementing the following steps:
    receiving a learning instruction triggered by a learner, playing a teaching stream, and collecting, during playback of the teaching stream, a video containing the learner's face;
    determining the facial expression of the learner in the video;
    judging, according to the facial expression, whether the learner is in a preset inefficient learning state;
    if the learner is in the preset inefficient learning state, acquiring the teaching picture and teaching voice currently played by the teaching stream; and
    performing keyword extraction on the teaching voice, determining, according to the extracted keywords, the object in the teaching picture that requires mixed reality, and performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture, so that the learner can interact with the objects in the mixed reality teaching picture.
  17. The storage medium according to claim 16, wherein the step of determining the facial expression of the learner in the video comprises:
    extracting the learner's facial feature points from the video according to a pre-trained facial feature detection model;
    dividing the learner's face into facial regions according to the facial feature points, to obtain a facial feature region corresponding to each facial feature point;
    determining, based on an optical flow method, the velocity vector of the facial feature points in each facial region, the velocity vector representing the movement speed information and movement direction information of each facial feature point; and
    determining the facial expression of the learner in the video according to the velocity vector of each facial feature point.
  18. The storage medium according to claim 17, wherein the step of determining, based on the optical flow method, the velocity vector of the facial feature points in each facial region comprises:
    traversing the facial feature regions, and detecting the pixel change intensity, between two adjacent image frames, of the facial feature points in the facial feature region currently being traversed; and
    inferring, according to the pixel change intensity, the velocity vector of the facial feature points in the current facial feature region.
  19. The storage medium according to claim 16, wherein the step of performing mixed reality processing on the object to obtain an immersive mixed reality teaching picture comprises:
    digitizing the object to obtain an image matrix corresponding to the object;
    determining the similarity between the image matrix and image feature matrices, obtained by pre-training, corresponding to various types of objects;
    selecting, according to a preset filtering rule, the image feature matrix whose similarity satisfies the filtering rule;
    obtaining, according to a preset mapping table, the rendering model and the introduction information corresponding to the selected image feature matrix, the mapping table being the correspondence between each image feature matrix and the corresponding rendering model and introduction information;
    extracting image data from the teaching stream in real time, and determining the real-time position and size of the object in the image data; and
    superimposing, according to the real-time position and size of the object in the image data, the rendering model and the introduction information on the image data in real time, to obtain the mixed reality teaching picture.
  20. The storage medium according to claim 16, wherein, after the step of obtaining an immersive mixed reality teaching picture, the method further comprises:
    searching for corresponding learning materials according to the extracted keywords, and pushing the found learning materials to the learner to help the learner understand the knowledge points corresponding to the keywords.
PCT/CN2018/123186 2018-10-25 2018-12-24 Physiological sign recognition-based distance learning method, device, apparatus, and storage medium WO2020082566A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811247129.7 2018-10-25
CN201811247129.7A CN109614849A (en) 2018-10-25 2018-10-25 Remote teaching method, apparatus, equipment and storage medium based on bio-identification

Publications (1)

Publication Number Publication Date
WO2020082566A1 true WO2020082566A1 (en) 2020-04-30

Family

ID=66001650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123186 WO2020082566A1 (en) 2018-10-25 2018-12-24 Physiological sign recognition-based distance learning method, device, apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN109614849A (en)
WO (1) WO2020082566A1 (en)


Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033659B (en) * 2019-04-26 2022-01-21 北京大米科技有限公司 Remote teaching interaction method, server, terminal and system
CN110390048A (en) * 2019-06-19 2019-10-29 深圳壹账通智能科技有限公司 Information-pushing method, device, equipment and storage medium based on big data analysis
CN110503582A (en) * 2019-07-16 2019-11-26 王霞 Cloud system is educated based on the group interaction of mixed reality and multidimensional reality technology
CN110458076A (en) * 2019-08-05 2019-11-15 西安瑜乐文化科技股份有限公司 A kind of teaching method based on computer vision and system
CN110992222A (en) * 2019-11-05 2020-04-10 深圳追一科技有限公司 Teaching interaction method and device, terminal equipment and storage medium
CN110992741B (en) * 2019-11-15 2022-05-17 深圳算子科技有限公司 Learning auxiliary method and system based on classroom emotion and behavior analysis
CN111710030A (en) * 2020-05-29 2020-09-25 上海红阵信息科技有限公司 AI-based system and method for resisting deep forgery portrait
CN111800646A (en) * 2020-06-24 2020-10-20 北京安博盛赢教育科技有限责任公司 Method, device, medium and electronic equipment for monitoring teaching effect
CN112652200A (en) * 2020-11-16 2021-04-13 北京家有课堂科技有限公司 Man-machine interaction system, man-machine interaction method, server, interaction control device and storage medium
CN112382151B (en) * 2020-11-16 2022-11-18 深圳市商汤科技有限公司 Online learning method and device, electronic equipment and storage medium
CN113435975B (en) * 2021-06-29 2023-09-26 平安科技(深圳)有限公司 Wheelchair leasing processing method and device and related equipment


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080020363A1 (en) * 2006-07-22 2008-01-24 Yao-Jen Chang Learning Assessment Method And Device Using A Virtual Tutor
CN106778539A (en) * 2016-11-25 2017-05-31 鲁东大学 Teaching effect information acquisition methods and device
CN106599881A (en) * 2016-12-30 2017-04-26 首都师范大学 Student state determination method, device and system
CN107292271A (en) * 2017-06-23 2017-10-24 北京易真学思教育科技有限公司 Learning-memory behavior method, device and electronic equipment
CN108009954A (en) * 2017-12-12 2018-05-08 联想(北京)有限公司 A kind of Formulating Teaching Program method, apparatus, system and electronic equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111914694A (en) * 2020-07-16 2020-11-10 Harbin Engineering University Classroom quality detection method based on face recognition
CN113283383A (en) * 2021-06-15 2021-08-20 Beijing Youzhuju Network Technology Co., Ltd. Live broadcast behavior recognition method, device, equipment and readable medium
CN116824280A (en) * 2023-08-30 2023-09-29 Anhui Aixuetang Education Technology Co., Ltd. Psychological early-warning method based on micro-expression changes
CN116824280B (en) * 2023-08-30 2023-11-24 Anhui Aixuetang Education Technology Co., Ltd. Psychological early-warning method based on micro-expression changes

Also Published As

Publication number Publication date
CN109614849A (en) 2019-04-12

Similar Documents

Publication Publication Date Title
WO2020082566A1 (en) Physiological sign recognition-based distance learning method, device, apparatus, and storage medium
CN109271945B (en) Method and system for realizing online homework correction
Gunes et al. Bodily expression for automatic affect recognition
CN109885595A (en) Course recommendation method, device, equipment and storage medium based on artificial intelligence
WO2021047185A1 (en) Monitoring method and apparatus based on facial recognition, and storage medium and computer device
Dubbaka et al. Detecting learner engagement in MOOCs using automatic facial expression recognition
Oliveira et al. Automatic sign language translation to improve communication
Sharma et al. Student concentration evaluation index in an e-learning context using facial emotion analysis
Zaletelj Estimation of students' attention in the classroom from kinect features
Wu et al. Object recognition-based second language learning educational robot system for Chinese preschool children
Geng et al. Learning deep spatiotemporal feature for engagement recognition of online courses
Pise et al. Estimation of learning affects experienced by learners: an approach using relational reasoning and adaptive mapping
Duraisamy et al. Classroom engagement evaluation using computer vision techniques
Villegas-Ch et al. Identification of emotions from facial gestures in a teaching environment with the use of machine learning techniques
CN111339878B (en) Correction-type real-time emotion recognition method and system based on eye movement data
KR102482841B1 (en) Artificial intelligence mirroring play bag
CN116524772A (en) Traditional cultural education system based on Internet
Hernández Correa et al. An application of machine learning and image processing to automatically detect teachers’ gestures
CN115239533A (en) Interactive online English teaching system and use method thereof
JP2020086075A (en) Learning support system and program
CN114779942A (en) Virtual reality immersive interaction system, equipment and method
CN115132027A (en) Intelligent programming learning system and method based on multi-mode deep learning
JP6980883B1 (en) Assist system, assist method, and assist program
CN116434253A (en) Image processing method, device, equipment, storage medium and product
Komiya et al. Head pose estimation and movement analysis for speech scene

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18938190

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: public notification in the EP bulletin as the address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 31.08.2021)

122 EP: PCT application non-entry into the European phase

Ref document number: 18938190

Country of ref document: EP

Kind code of ref document: A1