CN112464020B - Network classroom information processing method and system and computer readable storage medium - Google Patents


Info

Publication number
CN112464020B
CN112464020B (application CN202011327166.6A)
Authority
CN
China
Prior art keywords
voice data
teacher
intercepted
students
captured
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011327166.6A
Other languages
Chinese (zh)
Other versions
CN112464020A (en)
Inventor
李璐 (Li Lu)
冯文澜 (Feng Wenlan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Suirui Cloud Technology Co ltd
Suirui Technology Group Co Ltd
Original Assignee
Chengdu Suirui Cloud Technology Co ltd
Suirui Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Suirui Cloud Technology Co ltd, Suirui Technology Group Co Ltd filed Critical Chengdu Suirui Cloud Technology Co ltd
Priority to CN202011327166.6A priority Critical patent/CN112464020B/en
Publication of CN112464020A publication Critical patent/CN112464020A/en
Application granted granted Critical
Publication of CN112464020B publication Critical patent/CN112464020B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • G06V40/175Static expression
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a network classroom information processing method and system and a computer-readable storage medium. The method comprises: performing speech recognition on the teacher client in real time; acquiring the face information of a student in the video picture; determining the student's micro-expression features according to the face information; when the emotion corresponding to the student's micro-expression features is judged to be confusion, issuing alarm information and recording the timestamp at which the alarm information was issued; after the alarm information is received, intercepting the teacher's voice data within a first preset time period before the timestamp and within a second preset time period after the timestamp; and intercepting part of the voice data from the intercepted teacher voice data according to a preset rule. By processing the network classroom information in this way, the invention extracts in a targeted manner the teaching content that confused an individual student, so that the student can review that content after class in a targeted manner, improving the effect of online teaching.

Description

Network classroom information processing method and system and computer readable storage medium
Technical Field
The present invention relates to the field of video communication technologies, and in particular, to a method and a system for processing network classroom information and a computer-readable storage medium.
Background
With the development of internet technology, network teaching has become more and more popular because of its convenience, time savings and support for remote education. However, compared with traditional face-to-face teaching, network teaching offers a weaker learning atmosphere and cannot guarantee that students study carefully or master the knowledge points, so at present its teaching effect still lags behind.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Disclosure of Invention
The invention aims to provide a network classroom information processing method and system and a computer-readable storage medium that judge how well students are following the lecture by means of expression recognition, extract in a targeted manner the teaching content that confused an individual student through processing of the network classroom information, and provide that content for the student to review after class, thereby improving the effect of online teaching to a certain extent.
To achieve the above object, the present invention provides a network classroom information processing method, comprising: performing speech recognition on the teacher client in real time; acquiring the face information of a student in the video picture; determining the student's micro-expression features according to the face information; when the emotion corresponding to the student's micro-expression features is judged to be confusion, issuing alarm information and recording the timestamp at which the alarm information was issued; after the alarm information is received, intercepting the teacher's voice data within a first preset time period before the timestamp and within a second preset time period after the timestamp; and intercepting part of the voice data from the intercepted teacher voice data according to a preset rule.
In one embodiment of the present invention, determining the student's micro-expression features according to the student's face information comprises: judging whether a micro-expression feature of the student appears according to the pixel changes in one or more regions of the student's face.
In an embodiment of the present invention, intercepting part of the voice data from the intercepted teacher voice data according to a preset rule includes: judging whether the intercepted teacher voice data meets any two of three conditions, the three conditions being: a first-class keyword appears in the intercepted teacher voice data; the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold; and a second-class keyword appears in the intercepted teacher voice data and the time during which no voice data is generated after the second-class keyword is not less than a second preset threshold. If any two of the three conditions are met, part of the voice data is intercepted from the intercepted teacher voice data.
In an embodiment of the present invention, when a first-class keyword appears in the intercepted teacher voice data and the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold, part of the voice data is intercepted from the intercepted teacher voice data; the part of voice data includes the first-class keyword and the same vocabulary item whose frequency of occurrence is not less than the first preset threshold.
In an embodiment of the present invention, when a first-class keyword appears in the intercepted teacher voice data, and a second-class keyword appears in the intercepted teacher voice data and the time during which no voice data is generated after the second-class keyword is not less than a second preset threshold, part of the voice data is intercepted from the intercepted teacher voice data; the part of voice data includes the first-class keyword and the second-class keyword.
In an embodiment of the present invention, when the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold, and a second-class keyword appears in the intercepted teacher voice data and the time during which no voice data is generated after the second-class keyword is not less than a second preset threshold, part of the voice data is intercepted from the intercepted teacher voice data; the part of voice data includes the same vocabulary item whose frequency of occurrence is not less than the first preset threshold and the second-class keyword.
In an embodiment of the present invention, the network classroom information processing method further includes: converting the part of the voice data into text content; extracting the content corresponding to the text content from a first document of the teacher's current lesson; storing the extracted content in a second document; and sending the second document to the student's client.
Based on the same inventive concept, the invention also provides a network classroom information processing system, comprising: a voice recognition module for performing speech recognition on the teacher client in real time; a face information acquiring module for acquiring the face information of a student in the video picture; a micro-expression feature judging module, coupled to the face information acquiring module, for judging the student's micro-expression features according to the face information and, when the emotion corresponding to the micro-expression features is judged to be confusion, issuing alarm information and recording the timestamp at which the alarm information was issued; a first voice data intercepting module, coupled to the voice recognition module and the micro-expression feature judging module, for intercepting, after the alarm information is received, the teacher's voice data within a first preset time period before the timestamp and within a second preset time period after the timestamp; and a second voice data intercepting module, coupled to the first voice data intercepting module, for intercepting part of the voice data from the intercepted teacher voice data according to a preset rule.
In an embodiment of the present invention, the second voice data intercepting module is configured to judge whether the intercepted teacher voice data meets any two of three conditions, the three conditions being: a first-class keyword appears in the intercepted teacher voice data; the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold; and a second-class keyword appears in the intercepted teacher voice data and the time during which no voice data is generated after the second-class keyword is not less than a second preset threshold. If any two of the three conditions are met, part of the voice data is intercepted from the intercepted teacher voice data.
Based on the same inventive concept, the present invention also provides a computer-readable storage medium storing a computer program for executing the network classroom information processing method according to any one of the above embodiments.
Compared with the prior art, the network classroom information processing method and system and the computer-readable storage medium of the invention detect the teacher's speech with voice recognition, judge the students' listening state with expression recognition, and grasp the key teaching content through the keyword information in a preset lexicon, so as to extract in a targeted manner the teaching content that confused an individual student. Preferably, the extracted teaching content is further generated into a personalized document for the student to review with emphasis after class. This makes learning more convenient, saves time, meets the different learning needs of individual students, and improves the effect of online teaching to a certain extent.
Drawings
Fig. 1 is a block diagram of the steps of a network classroom information processing method according to an embodiment of the present invention;
fig. 2 is a block diagram of a network classroom information processing system according to an embodiment of the present invention.
Detailed Description
The following detailed description of the present invention is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present invention is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
In order to improve the effect of network teaching, the inventor conducted a careful analysis of the teaching field and found that after on-site teaching, a teacher assigns homework to check whether the students have mastered the knowledge points. The test papers used are identical for everyone: they cannot be differentiated for students with a good foundation or who have already mastered the material, and every student must spend time answering them, which wastes the time of some students. Moreover, one test paper cannot cover all the knowledge points and gives no different emphasis to different students; because each student's mastery cannot be judged, time is wasted. It was also found that in on-site teaching, a student who encounters something not yet mastered may write it down or otherwise record it, which delays the student's attention to the subsequent lecture.
Reflecting on the problems discovered above, the inventor realized that whether students have mastered what the teacher said can be preliminarily judged from their expressions. The system then automatically records the time period during which doubt or unease appears in a student's expression, intercepts the knowledge point the teacher was explaining by means of the timestamp, and extracts the knowledge point according to a keyword lexicon; the range covered by the knowledge point is determined by question words, the frequency of occurrence of vocabulary items, and the duration of pauses. In this way, the knowledge each student has not mastered can be analyzed in a targeted manner, corresponding questions can be drawn from a question bank and output as a test paper for checking, and the student can be guided to mastery; in effect, each student has a teacher dedicated to his or her individual learning situation.
Based on the above line of thought, the invention provides a network classroom information processing method and system and a computer-readable storage medium, which mainly comprise three parts: expression recognition, voice recognition, and conditional retrieval. Expression recognition performs facial expression recognition on all students in the classroom and judges their expression states. Voice recognition mainly records the content of the teacher's lecture; when voice detection at the teacher client is running and a student's facial expression is simultaneously detected to be doubtful or abnormal, the system raises an alarm and determines the knowledge-point content from the lecture content by means of the timestamp, so that the teaching content that confused an individual student is extracted in a targeted manner for the student to review after class, improving the effect of network teaching to a certain extent.
Fig. 1 shows a network classroom information processing method according to an embodiment of the present invention. The method comprises steps S1 to S6.
In step S1, speech recognition is performed on the teacher client in real time.
In step S2, the face information of a student in the video picture is acquired.
In step S3, the student's micro-expression features are determined according to the student's face information. Specifically, whether a micro-expression feature appears can be judged from the pixel changes in one or more regions of the student's face.
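The pixel-change criterion above can be sketched as a simple frame-differencing check. This is an illustrative assumption about the implementation, not the patent's actual algorithm: the region coordinates, grayscale frames, and change threshold below are invented for the example.

```python
import numpy as np

def region_change(prev_frame: np.ndarray, curr_frame: np.ndarray,
                  region: tuple) -> float:
    """Mean absolute grayscale change inside one face region (tuple of slices)."""
    prev = prev_frame[region].astype(np.float32)
    curr = curr_frame[region].astype(np.float32)
    return float(np.abs(curr - prev).mean())

def micro_expression_detected(prev_frame, curr_frame, regions,
                              threshold: float = 12.0) -> bool:
    """Flag a possible micro-expression when any monitored face region
    changes by more than the (assumed) threshold between frames."""
    return any(region_change(prev_frame, curr_frame, r) > threshold
               for r in regions)
```

In practice, the monitored regions (e.g. the brow area for a frown) would come from a face-landmark detector rather than fixed coordinates.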
In step S4, when the emotion corresponding to the student's micro-expression features is judged to be confusion, alarm information is issued and the timestamp at which the alarm information was issued is recorded. For example, when the micro-expression feature is a frown, it indicates that the student may be experiencing confusion.
In step S5, after the alarm information is received, the teacher's voice data within a first preset time period before the timestamp and within a second preset time period after the timestamp are intercepted. Optionally, the teacher's voice data is intercepted for the 5 minutes before and the 5 minutes after the timestamp.
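A minimal sketch of this windowed interception, assuming the teacher's audio is buffered as timestamped chunks; the chunk representation and the default 5-minute windows are illustrative, not prescribed by the patent:

```python
def intercept_window(chunks, alarm_ts, before=300.0, after=300.0):
    """Return the audio chunks whose timestamps fall in the alarm window.

    chunks: list of (timestamp_seconds, audio_chunk) pairs in ascending order.
    The window spans [alarm_ts - before, alarm_ts + after] (defaults: 5 min each).
    """
    lo, hi = alarm_ts - before, alarm_ts + after
    return [c for ts, c in chunks if lo <= ts <= hi]
```

A rolling buffer of recent audio would normally be kept at the server so the "before" window is still available when the alarm arrives.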
In step S6, part of the voice data is intercepted from the intercepted teacher voice data according to a preset rule. Specifically, this includes judging whether the intercepted teacher voice data meets any two of three conditions, the three conditions being: a first-class keyword appears in the intercepted teacher voice data; the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold; and a second-class keyword appears in the intercepted teacher voice data and the time during which no voice data is generated after the second-class keyword is not less than a second preset threshold. If any two of the three conditions are met, part of the voice data is intercepted from the intercepted teacher voice data. The first class of keywords are words expressing hierarchical relationships, such as "one", "two", "first point" and "next". The second class of keywords are spoken expressions the teacher commonly uses to transition between two knowledge points, or after finishing one, such as "how to understand", "any questions" and "understood"; in general, the teacher pauses for a longer or shorter time after saying such an expression. Optionally, the second preset threshold may be set to 5 seconds and the first preset threshold to 2 occurrences. In addition, it should be noted that all classes of keywords described herein are stored in a preset keyword lexicon.
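The "any two of three conditions" rule can be sketched as follows. The keyword sets are illustrative samples standing in for the preset lexicon, and the default thresholds follow the optional values mentioned above (2 occurrences, 5 seconds); the transcript representation is an assumption.

```python
from collections import Counter

# Illustrative stand-ins for the preset keyword lexicon.
FIRST_CLASS = {"one", "two", "first point", "next"}          # hierarchy words
SECOND_CLASS = {"how to understand", "any questions", "understood"}

def should_extract(words, pauses_after, repeat_threshold=2, pause_threshold=5.0):
    """Return True if at least two of the three conditions hold.

    words: transcript tokens from the intercepted teacher voice data.
    pauses_after: {word: seconds of silence recorded after that word}.
    """
    counts = Counter(words)
    cond1 = any(w in FIRST_CLASS for w in words)                 # first-class keyword
    cond2 = any(n >= repeat_threshold for n in counts.values())  # repeated vocabulary
    cond3 = any(w in SECOND_CLASS and pauses_after.get(w, 0.0) >= pause_threshold
                for w in words)                                  # keyword + long pause
    return (cond1 + cond2 + cond3) >= 2
```

Counting every repeated token is deliberately naive here; a real system would presumably ignore function words when checking the repetition condition.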
Specifically, when a first-class keyword appears in the intercepted teacher voice data and the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold, part of the voice data is intercepted from the intercepted teacher voice data; the part of voice data includes the first-class keyword and the same vocabulary item whose frequency of occurrence is not less than the first preset threshold.
When a first-class keyword appears in the intercepted teacher voice data, and a second-class keyword appears and the time during which no voice data is generated after it is not less than a second preset threshold, part of the voice data is intercepted from the intercepted teacher voice data; the part of voice data includes the first-class keyword and the second-class keyword.
When the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold, and a second-class keyword appears and the time during which no voice data is generated after it is not less than a second preset threshold, part of the voice data is intercepted from the intercepted teacher voice data; the part of voice data includes the same vocabulary item whose frequency of occurrence is not less than the first preset threshold and the second-class keyword.
For example, the teacher is currently saying: "Her sons are most important in her life; the comparative and superlative of important are more important and most important. Major, for example: we have already talked about major schemes; the difference between these two words is …; that is how the usage of the two words differs. Understood? Next we will talk about the next knowledge point, many and much." As it turns out, when the teacher said the word "important", a student, Wang, frowned. The system raised an alarm, started speech recognition, and intercepted the teacher's voice data. In the intercepted teacher voice data it was found by judgment that "important" appeared no less than three times, "major" appeared no less than twice, the first-class keyword "next" appeared, and the second-class keyword "understood" appeared, after which the teacher paused for 6 seconds. The whole segment of voice data containing the first-class keyword, the second-class keyword, "important" and "major" is therefore intercepted from the currently intercepted teacher voice data. Preferably, in order to simplify the voice data, the third-class keywords in the twice-intercepted voice data may be filtered out. The third class of keywords are conjunctions, such as "that is", "because" and "then".
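Filtering the third-class keywords out of the extracted transcript might look like the following sketch; the stop list is an illustrative stand-in for the preset lexicon:

```python
# Illustrative third-class keyword list (connectives to strip); the real
# lexicon would come from the preset keyword store described in the text.
THIRD_CLASS = {"that is", "because", "then", "so", "well"}

def filter_connectives(tokens):
    """Drop third-class (connective) keywords from a token sequence."""
    return [t for t in tokens if t.lower() not in THIRD_CLASS]
```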
Preferably, in order to generate a personalized learning document, the network classroom information processing method of an embodiment further includes: converting the part of the voice data into text content; extracting the content corresponding to the text content from a first document of the teacher's current lesson; and storing the extracted content in a second document. The first document may be a question bank: the question bank is matched against the intercepted part of the voice data, the questions that puzzled the student are extracted from it, and a second document or test paper is generated separately. This makes the network classroom more intelligent and gives full play to its advantages.
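The question-bank matching step could be sketched as a simple substring match between bank topics and the recognized text; the data shapes (topic/question dictionaries) are assumptions for illustration, and a real system would likely use fuzzier matching.

```python
def build_second_document(transcript_text, question_bank):
    """Collect question-bank entries whose topic appears in the recognized text.

    transcript_text: text recognized from the intercepted part of the voice data.
    question_bank: list of {"topic": str, "question": str} entries (assumed shape).
    """
    text = transcript_text.lower()
    return [q for q in question_bank if q["topic"].lower() in text]
```

The returned list is the "second document": the subset of the question bank covering the knowledge points that confused the student.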
Based on the same inventive concept, the present invention further provides an online classroom information processing system, as shown in fig. 2, an embodiment of the online classroom information processing system includes: the system comprises a voice recognition module 10, a face information acquisition module 11, a micro expression characteristic judgment module 12, a first voice data interception module 13 and a second voice data interception module 14.
The voice recognition module 10 is used for performing voice recognition on the teacher client in real time.
The face information acquiring module 11 is configured to acquire face information of a student in a video frame.
The micro-expression feature judging module 12 is coupled to the face information acquiring module 11 and is configured to judge the student's micro-expression features according to the student's face information; when the emotion corresponding to the student's micro-expression features is judged to be confusion, it issues alarm information and records the timestamp at which the alarm information was issued. Whether a micro-expression feature appears can be judged from the pixel changes in one or more regions of the student's face.
The first voice data intercepting module 13 is coupled to the voice recognition module 10 and the micro-expression characteristic judging module 12, and is configured to intercept, after receiving the alarm information, teacher voice data in a first preset time period before the timestamp and teacher voice data in a second preset time period after the timestamp.
The second voice data intercepting module 14 is coupled to the first voice data intercepting module 13, and is configured to intercept part of the voice data from the intercepted teacher voice data according to a preset rule.
Specifically, the second voice data intercepting module 14 is configured to judge whether the intercepted teacher voice data meets any two of three conditions, the three conditions being: a first-class keyword appears in the intercepted teacher voice data; the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold; and a second-class keyword appears in the intercepted teacher voice data and the time during which no voice data is generated after the second-class keyword is not less than a second preset threshold. If any two of the three conditions are met, part of the voice data is intercepted from the intercepted teacher voice data.
Specifically, the second voice data intercepting module 14 is configured to intercept part of the voice data from the intercepted teacher voice data when a first-class keyword appears in the intercepted teacher voice data and the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold; the part of voice data includes the first-class keyword and the same vocabulary item whose frequency of occurrence is not less than the first preset threshold.
The second voice data intercepting module 14 is configured to intercept part of the voice data from the intercepted teacher voice data when a first-class keyword appears in the intercepted teacher voice data, and a second-class keyword appears and the time during which no voice data is generated after it is not less than a second preset threshold; the part of voice data includes the first-class keyword and the second-class keyword.
The second voice data intercepting module 14 is configured to intercept part of the voice data from the intercepted teacher voice data when the frequency of occurrence of a same vocabulary item in the intercepted teacher voice data is not less than a first preset threshold, and a second-class keyword appears and the time during which no voice data is generated after it is not less than a second preset threshold; the part of voice data includes the same vocabulary item whose frequency of occurrence is not less than the first preset threshold and the second-class keyword.
Preferably, in order to simplify the voice data, the third-class keywords in the twice-intercepted part of the voice data may be filtered out. The third class of keywords are conjunctions, such as "that is", "because" and "then".
Preferably, in order to generate a personalized learning document, the network classroom information processing system of an embodiment further includes a second document generating module for converting the part of the voice data into text content, extracting the content corresponding to the text content from a first document of the teacher's current lesson, and storing the extracted content in a second document.
Based on the same inventive concept, the present embodiment further provides a computer-readable storage medium, where a computer program is stored, and the computer program is configured to execute the network classroom information processing method according to any one of the above embodiments.
In summary, the network classroom information processing method and system and the computer-readable storage medium of the embodiments detect the teacher's speech with voice recognition, judge the students' listening state with expression recognition, and grasp the key teaching content through the keyword information in a preset lexicon, so as to extract in a targeted manner the teaching content that confused an individual student. Preferably, the extracted teaching content is further generated into a personalized document for the student to review with emphasis after class. This makes learning more convenient, saves time, meets the different learning needs of individual students, and improves the effect of online teaching to a certain extent.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (7)

1. A network classroom information processing method is characterized by comprising the following steps:
performing voice recognition on audio from the teacher client in real time;
acquiring face information of students in a video picture;
judging micro-expression characteristics of the students according to the face information of the students;
when the emotion corresponding to the micro-expression characteristics of the student is judged to be confusion, sending alarm information and recording a timestamp of when the alarm information is sent;
after the alarm information is received, capturing teacher voice data in a first preset time period before the time stamp and teacher voice data in a second preset time period after the time stamp;
capturing part of the voice data from the captured teacher voice data according to a preset rule, which comprises: judging whether the captured teacher voice data satisfies any two of the following three conditions: the captured teacher voice data contains a first type of keyword; the frequency of occurrence of the same vocabulary in the captured teacher voice data is not less than a first preset threshold; and the captured teacher voice data contains a second type of keyword and the time during which no voice data is generated after the second type of keyword is not less than a second preset threshold;
if any two of the three conditions are satisfied, capturing part of the voice data from the captured teacher voice data;
converting the part of voice data into text content;
extracting content corresponding to the text content from a first document of the current teaching of the teacher;
storing the extracted content in a second document;
and sending the second document to the client of the student.
2. The network classroom information processing method of claim 1, wherein the judging of the micro-expression characteristics of the student according to the face information of the student comprises:
judging whether micro-expression characteristics of the student appear according to pixel changes in one or more regions of the student's face.
3. The network classroom information processing method of claim 1, wherein when a first type of keyword occurs in the captured teacher voice data and the frequency of occurrence of the same vocabulary in the captured teacher voice data is not less than a first preset threshold, a part of voice data is captured from the captured teacher voice data, wherein the part of voice data comprises the first type of keyword and the same vocabulary whose frequency of occurrence is not less than the first preset threshold.
4. The network classroom information processing method of claim 1, wherein when a first type of keyword and a second type of keyword occur in the captured teacher voice data, and the time during which no voice data is generated after the second type of keyword is not less than a second preset threshold, a part of voice data is captured from the captured teacher voice data, wherein the part of voice data comprises the first type of keyword and the second type of keyword.
5. The network classroom information processing method of claim 1, wherein when the frequency of occurrence of the same vocabulary in the captured teacher voice data is not less than a first preset threshold, and a second type of keyword occurs in the captured teacher voice data and the time during which no voice data is generated after the second type of keyword is not less than a second preset threshold, a part of voice data is captured from the captured teacher voice data, wherein the part of voice data comprises the same vocabulary whose frequency of occurrence is not less than the first preset threshold and the second type of keyword.
6. An online classroom information processing system, comprising:
the voice recognition module is used for performing voice recognition on audio from the teacher client in real time;
the face information acquisition module is used for acquiring face information of students in the video pictures;
the micro-expression characteristic judgment module is coupled with the face information acquisition module and is used for judging micro-expression characteristics of the students according to the face information of the students, sending alarm information when the emotion corresponding to a student's micro-expression characteristics is judged to be confusion, and recording a timestamp of when the alarm information is sent;
the first voice data capture module is coupled with the voice recognition module and the micro-expression characteristic judgment module and is used for capturing, after the alarm information is received, teacher voice data in a first preset time period before the timestamp and teacher voice data in a second preset time period after the timestamp;
the second voice data capture module is coupled with the first voice data capture module and is used for capturing part of the voice data from the captured teacher voice data according to a preset rule;
specifically, the second voice data capture module is used for judging whether the captured teacher voice data satisfies any two of the following three conditions: the captured teacher voice data contains a first type of keyword; the frequency of occurrence of the same vocabulary in the captured teacher voice data is not less than a first preset threshold; and the captured teacher voice data contains a second type of keyword and the time during which no voice data is generated after the second type of keyword is not less than a second preset threshold; and if any two of the three conditions are satisfied, capturing part of the voice data from the captured teacher voice data;
converting the part of voice data into text content;
extracting content corresponding to the text content from a first document of the current teaching of the teacher;
storing the extracted content in a second document;
and sending the second document to the client of the student.
7. A computer-readable storage medium storing a computer program for executing the network classroom information processing method according to any one of claims 1 through 5.
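Taken together, the modules of claim 6 form a pipeline; the sketch below outlines one possible orchestration, with every class and callable name being an illustrative assumption rather than the claimed implementation:

```python
import time

class ClassroomPipeline:
    """Wires the claimed modules together: recognize teacher speech,
    watch student expressions, and deliver a personalized document."""

    def __init__(self, recognizer, expression_judge, capturer, extractor, sender):
        self.recognizer = recognizer              # voice recognition module
        self.expression_judge = expression_judge  # micro-expression judgment
        self.capturer = capturer                  # voice data capture modules
        self.extractor = extractor                # document generation module
        self.sender = sender                      # delivery to the student client

    def on_frame(self, face_info, student_id):
        """Handle one video frame of one student."""
        if self.expression_judge(face_info) == "confused":
            timestamp = time.time()               # timestamp of the alarm
            audio = self.capturer(timestamp)      # window before/after the alarm
            document = self.extractor(self.recognizer(audio))
            if document:
                self.sender(student_id, document)
```

Each constructor argument corresponds to one claimed module, so a real system could swap in concrete speech-recognition and expression-recognition services without changing the orchestration.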
CN202011327166.6A 2020-11-24 2020-11-24 Network classroom information processing method and system and computer readable storage medium Active CN112464020B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011327166.6A CN112464020B (en) 2020-11-24 2020-11-24 Network classroom information processing method and system and computer readable storage medium


Publications (2)

Publication Number Publication Date
CN112464020A CN112464020A (en) 2021-03-09
CN112464020B true CN112464020B (en) 2022-04-29

Family

ID=74799676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011327166.6A Active CN112464020B (en) 2020-11-24 2020-11-24 Network classroom information processing method and system and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112464020B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114916759A (en) * 2022-07-01 2022-08-19 贵州京师城投智慧教育产业股份有限公司 Intelligent stationery box

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657529A (en) * 2018-07-26 2019-04-19 台州学院 Classroom teaching effect evaluation system based on human facial expression recognition
CN110930781A (en) * 2019-12-04 2020-03-27 广州云蝶科技有限公司 Recording and broadcasting system
CN111027486A (en) * 2019-12-11 2020-04-17 李思娴 Auxiliary analysis and evaluation system and method for big data of teaching effect of primary and secondary school classroom

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170358233A1 (en) * 2016-06-14 2017-12-14 International Business Machines Corporation Teaching plan optimization


Also Published As

Publication number Publication date
CN112464020A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
Jensen et al. Toward automated feedback on teacher discourse to enhance teacher learning
Lynch et al. Listening
Markham et al. The influence of religion‐specific background knowledge on the listening comprehension of adult second‐language students
CN110517689B (en) Voice data processing method, device and storage medium
CN102132341B (en) Robust media fingerprints
US8412530B2 (en) Method and apparatus for detection of sentiment in automated transcriptions
Donnelly et al. Automatic teacher modeling from live classroom audio
US11238869B2 (en) System and method for reconstructing metadata from audio outputs
WO2020214316A1 (en) Artificial intelligence-based generation of event evaluation report
CN111833853A (en) Voice processing method and device, electronic equipment and computer readable storage medium
CN110427977B (en) Detection method for classroom interaction behavior
CN108109446B (en) Teaching class condition monitoring system
EP4163881A1 (en) Video highlight extraction method and system, and storage medium
Blanchard et al. Automatic Classification of Question & Answer Discourse Segments from Teacher's Speech in Classrooms.
Escudero et al. “Mummy, keep it steady”: phonetic variation shapes word learning at 15 and 17 months
CN112464020B (en) Network classroom information processing method and system and computer readable storage medium
Siegert “Alexa in the wild”–Collecting unconstrained conversations with a modern voice assistant in a public environment
Jones Automatic language and information processing: rethinking evaluation
Shi et al. Speech emotion recognition based on data mining technology
CN116825288A (en) Autism rehabilitation course recording method and device, electronic equipment and storage medium
CN113313982B (en) Education system based on 5G network
Khazaleh et al. An investigation into the reliability of speaker recognition schemes: analysing the impact of environmental factors utilising deep learning techniques
CN111950472A (en) Teacher grinding evaluation method and system
CN113053186A (en) Interaction method, interaction device and storage medium
Macdonald Intergenerational interactions occurring within a shared reading program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant