CN113723233A - Student learning participation degree evaluation method based on layered time sequence multi-example learning - Google Patents
Student learning participation degree evaluation method based on layered time sequence multi-example learning
- Publication number
- CN113723233A (application CN202110942289.9A)
- Authority
- CN
- China
- Prior art keywords
- video
- level
- learning
- segment
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a student learning participation evaluation method based on hierarchical temporal multi-instance learning. The method trains an evaluation model using three features extracted from video (head pose, facial expression, and body posture) together with video-level learning participation labels; the trained model predicts both the video-level learning participation and the participation of every video segment. The method is convenient to implement, computationally simple and efficient, and reliably maintains evaluation accuracy.
Description
Technical Field
The invention belongs to the fields of computer vision, artificial intelligence, and education, and particularly relates to a student learning participation evaluation method based on hierarchical temporal multi-instance learning.
Background
The advent of massive open online courses (MOOCs) has aroused widespread interest and great expectations in the educational community. Despite the broad potential of this new educational approach, low course completion rates are considered one of its major problems. Dynamic assessment of each student's participation during online learning can enable timely teaching interventions that improve completion rates and support personalized learning. Since MOOC environments often involve very large numbers of students, performing such evaluations manually is prohibitively expensive. Research into automated techniques for evaluating student learning participation in real time is therefore receiving increasing attention.
Automatic assessment of learning participation faces the following problems:
1) Because segment-by-segment annotation is time-consuming and labor-intensive, most previous methods only evaluate the learning participation of the whole video and pay little attention to the more meaningful evaluation of participation at the video segment level.
2) Distance-education lessons usually last tens of minutes or even an hour, and the resulting large amount of video data makes evaluation difficult. How to obtain effective features that represent both the whole video and each short segment is therefore an urgent problem.
Disclosure of Invention
The invention aims to provide a student learning participation evaluation method based on hierarchical temporal multi-instance learning that addresses the shortcomings of the prior art.
The purpose of the invention is achieved by the following technical scheme: a student learning participation evaluation method based on hierarchical temporal multi-instance learning, comprising the following steps:
Step 1: extract image frames from each video; every l frames form a video segment, and N video segments are obtained from each video.
Step 2: using the OpenPose, FSA-NET, and PFLD networks, extract body posture features, head pose features, and facial key point features, respectively, from each frame of each video segment for the evaluation of learning participation.
Step 3: for each type of feature sequence of a video segment, use a Bi-LSTM network to obtain the hidden state at each time step; the hidden states are input into a bottom-level temporal multi-instance learning module (B-TMIL) to obtain the feature representation of the video segment. The features extracted from all segments of one video are processed by a fully connected layer and a top-level temporal multi-instance learning module (T-TMIL) to obtain the feature representation of the video. Both B-TMIL and T-TMIL are implemented with a self-attention mechanism and share the same structure.
Step 4: fuse the three segment-level features extracted by B-TMIL in step 3, and fuse the three video-level features extracted by T-TMIL in step 3.
Step 5: apply a fully connected operation to the segment-level fused features to obtain the learning participation of each video segment, and to the video-level fused features to obtain the learning participation of the video. Local and global supervision are established using the average of the segment-level participations and the video-level participation, respectively, and the whole network is trained.
Further, in step 1, the image frames are extracted by keeping one frame every few frames at equal intervals.
Further, in step 2, for frame v_{i,j} of a video segment, the OpenPose, FSA-NET, and PFLD networks are used to extract the body posture feature b_{i,j}, the head pose feature e_{i,j}, and the facial key points m_{i,j}, respectively. For video segment V_i, this yields the head pose sequence E_i = {e_{i,1}, e_{i,2}, ..., e_{i,l}}, the body posture sequence B_i = {b_{i,1}, b_{i,2}, ..., b_{i,l}}, and the facial key point sequence M_i = {m_{i,1}, m_{i,2}, ..., m_{i,l}}.
Further, in step 3, the B-TMIL module acts on the sequence of sampled video frames that constitute a short video segment, where the frames are instances and the segment is the bag. A valid representation of the bag is needed to accurately predict its label. A multi-instance learning module with an attention mechanism is applied to the Bi-LSTM hidden states at all time steps, so that the bag representation is obtained adaptively through trainable parameters.
Further, let X_i denote one of the head pose sequence, body posture sequence, or facial key point sequence. X_i is input into the Bi-LSTM to obtain the hidden state sequence H_i = {h_{i,1}, h_{i,2}, ..., h_{i,l}}. The corresponding segment-level aggregated feature \tilde{x}_i is computed as a weighted sum of the dimension-reduced hidden states:

\tilde{x}_i = \sum_{j=1}^{l} a_{i,j}\, \delta(f_r(h_{i,j})), \qquad
a_{i,j} = \frac{\exp\{w^{\top}(\tau(V h_{i,j}) \odot \sigma(U h_{i,j}))\}}{\sum_{k=1}^{l} \exp\{w^{\top}(\tau(V h_{i,k}) \odot \sigma(U h_{i,k}))\}}

where \tilde{x}_i denotes one of the segment-level head pose, body posture, or facial key point features; δ is the ReLU function; f_r is a fully connected operation for dimension reduction; w, U, and V are weight matrices; ⊙ is element-wise multiplication; σ is the Sigmoid function; and τ is the Tanh function.
Further, in step 3, T-TMIL acts across video segments: the segments are treated as instances, and the complete video composed of the segments is the bag. The MIL module is applied on top of the segment-level features. The dimensionality of the segment-level features is first reduced by a fully connected operation to generate a more efficient embedded representation, and the video is represented as a weighted combination of its segments.
Further, let \tilde{x} denote one of the video-level head pose, body posture, or facial key point features, computed from the corresponding segment-level features as:

\tilde{x} = \sum_{i=1}^{N} a_i\, \delta(f_t(\tilde{x}_i)), \qquad
a_i = \frac{\exp\{w_t^{\top}(\tau(V_t \tilde{x}_i) \odot \sigma(U_t \tilde{x}_i))\}}{\sum_{k=1}^{N} \exp\{w_t^{\top}(\tau(V_t \tilde{x}_k) \odot \sigma(U_t \tilde{x}_k))\}}

where w_t, U_t, and V_t are the weight matrices in T-TMIL, f_t is a fully connected operation for dimension reduction, and \tilde{x}_i denotes the corresponding segment-level feature of segment i.
Further, in step 4, a weighted feature fusion method is adopted to extract the fusion features of the segment level and the video level for evaluation. The weight matrix is the same size as the feature matrix and is composed of trainable parameters. And normalizing each column of the weight matrix through a Softmax function to obtain different proportions of different features on corresponding dimensions, and then weighting and summing to obtain weighted fusion features.
The invention has the following beneficial effects: a hierarchical temporal multi-instance learning model is established according to the temporal correlation between instances; the model consists of a bottom-level module (video frame to video segment) and a top-level module (video segment to video). The method trains the evaluation model using three features extracted from the video (head pose, facial expression, and body posture) together with video-level learning participation labels, and the trained model predicts not only the video-level learning participation but also the participation of every video segment. The method is convenient to implement, computationally simple and efficient, and reliably maintains evaluation accuracy.
Drawings
FIG. 1 is a schematic diagram of a learning engagement assessment framework;
FIG. 2 is a schematic diagram of a feature fusion process.
Detailed Description
As shown in fig. 1, the student learning participation degree evaluation method based on hierarchical time series multi-instance learning of the present invention includes the following steps:
Step 1: preprocessing. 3000 frames are extracted at equal intervals from each video; every 30 frames form one video segment, so 100 video segments are obtained per video.
Down-sampling: a learner's body posture, head pose, and facial key points tend to change gradually and slowly during learning. We therefore downsample each original video, keeping one frame every few frames, for more efficient computation. In our experiments, 3000 frames per video are retained for evaluation.
Segmentation: because the features at one moment have little influence on the evaluation of learning participation at distant moments, and networks usually struggle with very long feature sequences, we split the input video into short segments as the basic units of analysis. We set the segment length to l = 30 frames in all experiments, trading off computational efficiency against accuracy. The number of video segments extracted from each video is N = 100.
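The preprocessing above (retain 3000 frames per video at equal intervals, then split into N = 100 segments of l = 30 frames) can be sketched as follows; the 90000-frame stand-in video and the stride computation are illustrative assumptions, since the patent does not fix the original video length:

```python
def preprocess(frames, keep=3000, seg_len=30):
    """Downsample to `keep` frames at equal intervals, then split the
    result into non-overlapping segments of `seg_len` frames each."""
    stride = max(len(frames) // keep, 1)   # keep one frame every `stride` frames
    sampled = frames[::stride][:keep]
    n_seg = len(sampled) // seg_len        # N = keep / seg_len segments
    return [sampled[i * seg_len:(i + 1) * seg_len] for i in range(n_seg)]

video = list(range(90000))                 # stand-in for a 90000-frame video
segments = preprocess(video)               # 100 segments of 30 frames each
```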
Step 2: feature extraction. Using the OpenPose, FSA-NET, and PFLD networks, body posture features, head pose features, and facial key point features are extracted from each frame of each video segment for the evaluation of learning participation.
According to previous studies, body language and facial expression correlate strongly with a learner's engagement. We therefore use head pose features, body posture features, and facial key points as inputs to improve the accuracy and robustness of the model. For frame v_{i,j} of a video segment, we use the OpenPose, FSA-NET, and PFLD networks to extract the body posture feature b_{i,j}, the head pose feature e_{i,j}, and the facial key points m_{i,j}, respectively, for i = 1, ..., N and j = 1, ..., l. For video segment V_i we then obtain the head pose sequence E_i = {e_{i,1}, e_{i,2}, ..., e_{i,l}}, the body posture sequence B_i = {b_{i,1}, b_{i,2}, ..., b_{i,l}}, and the facial key point sequence M_i = {m_{i,1}, m_{i,2}, ..., m_{i,l}}.
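The assembly of the three per-frame feature sequences E_i, B_i, and M_i can be sketched as below. OpenPose, FSA-NET, and PFLD are external networks, so the stub extractors and their output dimensionalities (25 2-D body keypoints, 3 head-pose angles, 98 2-D facial landmarks) are assumptions used only to show the shapes involved:

```python
import numpy as np

# Hypothetical stand-ins for the real extractor networks; dimensions are assumptions.
def body_pose(frame):      return np.zeros(50)    # e.g. 25 2-D keypoints (OpenPose)
def head_pose(frame):      return np.zeros(3)     # e.g. yaw/pitch/roll (FSA-NET)
def face_landmarks(frame): return np.zeros(196)   # e.g. 98 2-D landmarks (PFLD)

def extract_sequences(segment):
    """Stack the per-frame features of one l-frame segment into (l, d) arrays."""
    E = np.stack([head_pose(f) for f in segment])       # head pose sequence E_i
    B = np.stack([body_pose(f) for f in segment])       # body posture sequence B_i
    M = np.stack([face_landmarks(f) for f in segment])  # facial key point sequence M_i
    return E, B, M

E, B, M = extract_sequences(list(range(30)))   # one 30-frame segment
```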
Step 3: hierarchical temporal multi-instance learning model (H-TMIL). Based on the frame-segment-video structure with only video-level labels, as shown in FIG. 1, we propose a hierarchical temporal multi-instance learning model (H-TMIL) consisting of a bottom-level temporal multi-instance learning module (B-TMIL) and a top-level temporal multi-instance learning module (T-TMIL), which learn the latent relationships between segments and their constituent frames and between videos and their constituent segments, respectively. Through this framework, we connect the underlying video frames to the video-level labels and can implicitly learn the intermediate representations, i.e., the segment-level features that are useful for evaluating segment-level learning participation.
Step 3.1: bottom-level temporal multi-instance learning module (B-TMIL). For each type of feature sequence of a video segment, we use a Bi-LSTM network to obtain the hidden state at each time step. The hidden states are input into the bottom-level temporal multi-instance learning module (B-TMIL) to obtain the feature representation of the video segment. As shown in the lower left corner of FIG. 1, B-TMIL is implemented with a self-attention mechanism.
The B-TMIL module operates on the sequence of sampled video frames that constitute a short video segment, where the frames are instances and the segment is the bag. We need a valid representation of the bag to accurately predict its label. However, unlike conventional multi-instance learning (MIL), the frames of a short temporal sequence are strongly correlated in time. Using a Bi-LSTM to capture the temporal correlations but representing the sequence with only the last hidden state would lose early information. To address this, we apply an attention-based multi-instance learning module to the Bi-LSTM hidden states at all time steps, obtaining the bag representation adaptively through trainable parameters. The body posture feature is used as an example below.
First, the body posture sequence B_i is input into the Bi-LSTM to obtain the hidden state sequence H_i = {h_{i,1}, h_{i,2}, ..., h_{i,l}}. The segment-level aggregated feature \tilde{b}_i is computed as a weighted sum of the dimension-reduced hidden states:

\tilde{b}_i = \sum_{j=1}^{l} a_{i,j}\, \delta(f_r(h_{i,j}))

where δ is the ReLU function and f_r is a fully connected operation for dimension reduction. The weight a_{i,j} is computed as:

a_{i,j} = \frac{\exp\{w^{\top}(\tau(V h_{i,j}) \odot \sigma(U h_{i,j}))\}}{\sum_{k=1}^{l} \exp\{w^{\top}(\tau(V h_{i,k}) \odot \sigma(U h_{i,k}))\}}

where w, U, and V are weight matrices, ⊙ is element-wise multiplication, σ is the Sigmoid function, and τ is the Tanh function. The Tanh branch captures the correlations between features, while the Sigmoid branch acts as a gating mechanism.
Analogously to the segment-level body posture feature \tilde{b}_i, the B-TMIL module is used to extract the segment-level head pose feature \tilde{e}_i and the segment-level facial key point feature \tilde{m}_i, i.e., B_i in the above process is replaced by E_i or M_i.
Step 3.2: top-level temporal multi-instance learning module (T-TMIL). The features extracted from all segments of one video are processed by a fully connected layer and the top-level multi-instance learning module (T-TMIL) to obtain the feature representation of the video. As shown in FIG. 1, T-TMIL is also implemented with a self-attention mechanism and is similar in structure to B-TMIL.
Similar to B-TMIL, T-TMIL acts across video segments: the segments can be considered instances, and the complete video composed of the segments is the bag. However, since each video is long, we assume there is no longer a strong temporal relationship between segments. Many previous works simply represent the video-level feature as the average of all segment-level features; treating all segments equally in this way harms the accuracy of the final assessment. To obtain a more robust and flexible video representation, we still apply the MIL module on top of the segment-level features, and the top-level module becomes closer to the conventional multi-instance structure. We again describe T-TMIL using the body posture feature as an example.
The dimensionality of the segment-level features is first reduced by a fully connected operation to generate a more efficient embedded representation. We then construct a weighted combination of the video segments to represent the video:

\tilde{b} = \sum_{i=1}^{N} a_i\, \delta(f_t(\tilde{b}_i))

where \tilde{b} is the video-level body posture feature and f_t is the fully connected dimension-reduction operation. The weight a_i is computed as:

a_i = \frac{\exp\{w_t^{\top}(\tau(V_t \tilde{b}_i) \odot \sigma(U_t \tilde{b}_i))\}}{\sum_{k=1}^{N} \exp\{w_t^{\top}(\tau(V_t \tilde{b}_k) \odot \sigma(U_t \tilde{b}_k))\}}

where w_t, U_t, and V_t are the weight matrices in T-TMIL. Similarly, we use the T-TMIL module to aggregate the video-level head pose feature \tilde{e} and the video-level facial key point feature \tilde{m}, i.e., \tilde{b}_i in the above formulas is replaced by \tilde{e}_i or \tilde{m}_i.
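Because T-TMIL shares the B-TMIL structure but takes the N segment-level features directly (no Bi-LSTM at this level, since segments are treated as temporally independent), the video-level aggregation can be sketched the same way; all shapes are assumptions:

```python
import numpy as np

def video_level_pool(S, Ut, Vt, wt, Wt):
    """Pool N segment-level features S of shape (N, d) into one video-level
    vector with the same gated attention used in B-TMIL."""
    sig = 1.0 / (1.0 + np.exp(-(S @ Ut.T)))      # sigma(U_t b_i)
    gate = np.tanh(S @ Vt.T) * sig               # tau(V_t b_i) ⊙ sigma(U_t b_i)
    scores = gate @ wt
    a = np.exp(scores - scores.max())
    a /= a.sum()                                 # softmax over the N segments
    reduced = np.maximum(S @ Wt.T, 0.0)          # delta(f_t(b_i)): FC + ReLU
    return a @ reduced

rng = np.random.default_rng(1)
N, d, k, dr = 100, 16, 8, 8                      # sizes are assumptions
S = rng.normal(size=(N, d))                      # stand-in segment-level features
video_feat = video_level_pool(S, rng.normal(size=(k, d)), rng.normal(size=(k, d)),
                              rng.normal(size=k), rng.normal(size=(dr, d)))
```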
Step 4: feature fusion. The three types of segment-level features extracted in step 3.1 are fused, and the three types of video-level features extracted in step 3.2 are fused. The fusion process is shown in FIG. 2.
We use three types of features to assess learning participation, and our hierarchical modules process each type separately. To let the different features complement one another and to increase the discriminative information, we propose a weighted feature fusion method that extracts segment-level and video-level fused features for evaluation. As shown in FIG. 2, the weight matrix has the same size as the feature matrix and consists of trainable parameters. Each column of the weight matrix is normalized with a Softmax function to obtain the proportion of each feature type in the corresponding dimension, and the weighted sum yields the fused feature.
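The column-wise Softmax fusion described above can be sketched as follows; it assumes the three feature vectors have already been projected to a common dimension d (an assumption, since the patent only states that the weight matrix matches the feature matrix in size):

```python
import numpy as np

def weighted_fusion(features, W):
    """Fuse K same-sized feature vectors with a trainable weight matrix W of
    shape (K, d): each column of W is Softmax-normalized across the K feature
    types, then the features are summed with those per-dimension weights."""
    F = np.stack(features)                 # (K, d)
    Wn = np.exp(W - W.max(axis=0))
    Wn /= Wn.sum(axis=0)                   # column-wise softmax over feature types
    return (Wn * F).sum(axis=0)            # weighted sum per dimension

rng = np.random.default_rng(2)
d = 16
head, body, face = (rng.normal(size=d) for _ in range(3))
W = rng.normal(size=(3, d))                # trainable fusion weights (assumption)
fused = weighted_fusion([head, body, face], W)
```

Note that with an all-zero weight matrix every normalized column equals 1/3, so the fusion degenerates to a plain average of the three inputs, which makes the behavior easy to check.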
Step 5: the segment-level fused features are passed through a fully connected operation to obtain the learning participation of each video segment, and the video-level fused features are passed through a fully connected operation to obtain the learning participation of the video. Local and global supervision are established using the average of the segment-level participations and the video-level participation, respectively, and the network is trained.
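A minimal sketch of the local plus global supervision in step 5, assuming a mean-squared-error regression loss and an equal weighting of the two terms (both assumptions, since the patent does not name the loss function): the video-level prediction is supervised directly by the video-level label, while the mean of the segment-level predictions is supervised by the same label, because only video-level labels exist.

```python
import numpy as np

def engagement_loss(seg_preds, video_pred, video_label, lam=1.0):
    """Global term: video-level prediction vs. the video-level label.
    Local term: mean of the segment-level predictions vs. the same label."""
    global_loss = (video_pred - video_label) ** 2
    local_loss = (np.mean(seg_preds) - video_label) ** 2
    return global_loss + lam * local_loss     # lam balances the two terms

loss = engagement_loss(np.array([0.5, 0.7, 0.6]), video_pred=0.65, video_label=0.6)
```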
Claims (8)
1. A student learning participation degree evaluation method based on hierarchical time sequence multi-example learning is characterized by comprising the following steps:
step 1, extracting image frames from each video, wherein every l frames form a video segment, and N video segments are obtained from each video;
step 2, using the OpenPose, FSA-NET, and PFLD networks, extracting body posture features, head pose features, and facial key point features, respectively, from each frame of each video segment for the evaluation of learning participation;
step 3, for each type of feature sequence of a video segment, using a Bi-LSTM network to obtain the hidden state at each time step; the hidden states are input into a bottom-level temporal multi-instance learning module (B-TMIL) to obtain the feature representation of the video segment; the features extracted from all segments of one video are processed by a fully connected layer and a top-level temporal multi-instance learning module (T-TMIL) to obtain the feature representation of the video; wherein B-TMIL and T-TMIL are implemented with a self-attention mechanism and share the same structure;
step 4, fusing the three segment-level features extracted by B-TMIL in step 3, and fusing the three video-level features extracted by T-TMIL in step 3;
step 5, applying a fully connected operation to the segment-level fused features to obtain the learning participation of each video segment, and to the video-level fused features to obtain the learning participation of the video; establishing local and global supervision using the average of the segment-level participations and the video-level participation, respectively, and training the whole network.
2. The student learning participation evaluation method based on hierarchical temporal multi-instance learning according to claim 1, wherein in step 1 the image frames are extracted by keeping one frame every few frames at equal intervals.
3. The student learning participation evaluation method based on hierarchical temporal multi-instance learning according to claim 1, wherein in step 2, for frame v_{i,j} of a video segment, the OpenPose, FSA-NET, and PFLD networks are used to extract the body posture feature b_{i,j}, the head pose feature e_{i,j}, and the facial key points m_{i,j}, respectively; for video segment V_i, this yields the head pose sequence E_i = {e_{i,1}, e_{i,2}, ..., e_{i,l}}, the body posture sequence B_i = {b_{i,1}, b_{i,2}, ..., b_{i,l}}, and the facial key point sequence M_i = {m_{i,1}, m_{i,2}, ..., m_{i,l}}.
4. The student learning participation evaluation method based on hierarchical temporal multi-instance learning according to claim 1, wherein in step 3 the B-TMIL module acts on the sequence of sampled video frames constituting a short video segment, wherein the frames are instances and the segment is the bag; a valid representation of the bag is obtained in order to accurately predict its label; a multi-instance learning module with an attention mechanism is applied to the Bi-LSTM hidden states at all time steps, so that the bag representation is obtained adaptively through trainable parameters.
5. The student learning participation evaluation method based on hierarchical temporal multi-instance learning according to claim 4, wherein X_i denotes one of the head pose sequence, body posture sequence, or facial key point sequence; X_i is input into the Bi-LSTM to obtain the hidden state sequence H_i = {h_{i,1}, h_{i,2}, ..., h_{i,l}}; the corresponding segment-level aggregated feature \tilde{x}_i is computed as a weighted sum of the dimension-reduced hidden states:

\tilde{x}_i = \sum_{j=1}^{l} a_{i,j}\, \delta(f_r(h_{i,j})), \qquad
a_{i,j} = \frac{\exp\{w^{\top}(\tau(V h_{i,j}) \odot \sigma(U h_{i,j}))\}}{\sum_{k=1}^{l} \exp\{w^{\top}(\tau(V h_{i,k}) \odot \sigma(U h_{i,k}))\}}

wherein \tilde{x}_i denotes one of the segment-level head pose, body posture, or facial key point features; δ is the ReLU function; f_r is a fully connected operation for dimension reduction; w, U, and V are weight matrices; ⊙ is element-wise multiplication; σ is the Sigmoid function; and τ is the Tanh function.
6. The student learning participation evaluation method based on hierarchical temporal multi-instance learning according to claim 1, wherein in step 3 T-TMIL acts across video segments, the segments are considered instances, and the complete video composed of the segments is the bag; the MIL module is applied on top of the segment-level features; the dimensionality of the segment-level features is reduced by a fully connected operation to generate a more efficient embedded representation; and a weighted combination of the video segments is constructed to represent the video.
7. The student learning participation evaluation method based on hierarchical temporal multi-instance learning according to claim 6, wherein \tilde{x} denotes one of the video-level head pose, body posture, or facial key point features, computed as:

\tilde{x} = \sum_{i=1}^{N} a_i\, \delta(f_t(\tilde{x}_i)), \qquad
a_i = \frac{\exp\{w_t^{\top}(\tau(V_t \tilde{x}_i) \odot \sigma(U_t \tilde{x}_i))\}}{\sum_{k=1}^{N} \exp\{w_t^{\top}(\tau(V_t \tilde{x}_k) \odot \sigma(U_t \tilde{x}_k))\}}

wherein w_t, U_t, and V_t are the weight matrices in T-TMIL, f_t is a fully connected operation for dimension reduction, and \tilde{x}_i denotes the corresponding segment-level feature.
8. The student learning participation evaluation method based on hierarchical temporal multi-instance learning according to claim 1, wherein in step 4 a weighted feature fusion method is adopted to extract the segment-level and video-level fused features used for evaluation; the weight matrix has the same size as the feature matrix and consists of trainable parameters; each column of the weight matrix is normalized with a Softmax function to obtain the proportion of each feature type in the corresponding dimension, and the weighted sum yields the fused features.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110942289.9A CN113723233B (en) | 2021-08-17 | 2021-08-17 | Student learning participation assessment method based on hierarchical time sequence multi-example learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110942289.9A CN113723233B (en) | 2021-08-17 | 2021-08-17 | Student learning participation assessment method based on hierarchical time sequence multi-example learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113723233A true CN113723233A (en) | 2021-11-30 |
CN113723233B CN113723233B (en) | 2024-03-26 |
Family
ID=78676052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110942289.9A Active CN113723233B (en) | 2021-08-17 | 2021-08-17 | Student learning participation assessment method based on hierarchical time sequence multi-example learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113723233B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116485792A (en) * | 2023-06-16 | 2023-07-25 | 中南大学 | Histopathological subtype prediction method and imaging method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111178142A (en) * | 2019-12-05 | 2020-05-19 | 浙江大学 | Hand posture estimation method based on space-time context learning |
CN112287891A (en) * | 2020-11-23 | 2021-01-29 | 福州大学 | Method for evaluating learning concentration through video based on expression and behavior feature extraction |
US20210042937A1 (en) * | 2019-08-08 | 2021-02-11 | Nec Laboratories America, Inc. | Self-supervised visual odometry framework using long-term modeling and incremental learning |
CN112541529A (en) * | 2020-12-04 | 2021-03-23 | 北京科技大学 | Expression and posture fusion bimodal teaching evaluation method, device and storage medium |
WO2021051579A1 (en) * | 2019-09-17 | 2021-03-25 | 平安科技(深圳)有限公司 | Body pose recognition method, system, and apparatus, and storage medium |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210042937A1 (en) * | 2019-08-08 | 2021-02-11 | Nec Laboratories America, Inc. | Self-supervised visual odometry framework using long-term modeling and incremental learning |
WO2021051579A1 (en) * | 2019-09-17 | 2021-03-25 | 平安科技(深圳)有限公司 | Body pose recognition method, system, and apparatus, and storage medium |
CN111178142A (en) * | 2019-12-05 | 2020-05-19 | 浙江大学 | Hand posture estimation method based on space-time context learning |
CN112287891A (en) * | 2020-11-23 | 2021-01-29 | 福州大学 | Method for evaluating learning concentration through video based on expression and behavior feature extraction |
CN112541529A (en) * | 2020-12-04 | 2021-03-23 | 北京科技大学 | Expression and posture fusion bimodal teaching evaluation method, device and storage medium |
Non-Patent Citations (2)
Title |
---|
- KANG Hongjing; WANG Jiasheng: "Research on fuzzy comprehensive evaluation of MOOC online learners' participation", Computer and Digital Engineering, no. 11, 20 November 2018 (2018-11-20) *
- MIAO Jia; YU Dongchuan: "Analysis of students' classroom participation based on classroom videos", Journal of Educational Biology, no. 04, 15 December 2019 (2019-12-15) *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116485792A (en) * | 2023-06-16 | 2023-07-25 | 中南大学 | Histopathological subtype prediction method and imaging method |
CN116485792B (en) * | 2023-06-16 | 2023-09-15 | 中南大学 | Histopathological subtype prediction method and imaging method |
Also Published As
Publication number | Publication date |
---|---|
CN113723233B (en) | 2024-03-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108229338B (en) | Video behavior identification method based on deep convolution characteristics | |
CN107766447A (en) | It is a kind of to solve the method for video question and answer using multilayer notice network mechanism | |
CN111652202B (en) | Method and system for solving video question-answer problem by improving video-language representation learning through self-adaptive space-time diagram model | |
CN110889672A (en) | Student card punching and class taking state detection system based on deep learning | |
Hieu et al. | Identifying learners’ behavior from videos affects teaching methods of lecturers in Universities | |
CN106897671A (en) | A kind of micro- expression recognition method encoded based on light stream and FisherVector | |
CN111611854B (en) | Classroom condition evaluation method based on pattern recognition | |
Zhou et al. | Classroom learning status assessment based on deep learning | |
CN114898460B (en) | Teacher nonverbal behavior detection method based on graph convolution neural network | |
CN111723667A (en) | Human body joint point coordinate-based intelligent lamp pole crowd behavior identification method and device | |
CN116935447A (en) | Self-adaptive teacher-student structure-based unsupervised domain pedestrian re-recognition method and system | |
CN115240259A (en) | Face detection method and face detection system based on YOLO deep network in classroom environment | |
CN113723233A (en) | Student learning participation degree evaluation method based on layered time sequence multi-example learning | |
Tang et al. | Automatic facial expression analysis of students in teaching environments | |
CN113989608A (en) | Student experiment classroom behavior identification method based on top vision | |
CN110941976A (en) | Student classroom behavior identification method based on convolutional neural network | |
CN113688789B (en) | Online learning input degree identification method and system based on deep learning | |
CN117058752A (en) | Student classroom behavior detection method based on improved YOLOv7 | |
Tang et al. | Multi-level Amplified iterative training of semi-supervision deep learning for glaucoma diagnosis | |
CN114638988A (en) | Teaching video automatic classification method and system based on different presentation modes | |
CN113469001A (en) | Student classroom behavior detection method based on deep learning | |
CN113536926A (en) | Human body action recognition method based on distance vector and multi-angle self-adaptive network | |
CN111144368A (en) | Student behavior detection method based on long-time and short-time memory neural network | |
CN111507241A (en) | Lightweight network classroom expression monitoring method | |
Zhu et al. | Emotion Recognition in Learning Scenes Supported by Smart Classroom and Its Application. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||