CN112990892A - Video information acquisition method and image processing system for teaching evaluation - Google Patents


Info

Publication number
CN112990892A
CN112990892A (application number CN202110566344.9A)
Authority
CN
China
Prior art keywords
teaching
data
representing
video
evaluation
Prior art date
Legal status
Pending
Application number
CN202110566344.9A
Other languages
Chinese (zh)
Inventor
金龙
张英俊
Current Assignee
Nanjing Bailence Intelligent Technology Co Ltd
Original Assignee
Nanjing Bailence Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Nanjing Bailence Intelligent Technology Co Ltd
Priority to CN202110566344.9A
Publication of CN112990892A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/103Workflow collaboration or project management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • G06Q50/205Education administration or guidance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/28Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Abstract

The invention provides a video information acquisition method and an image processing system for teaching evaluation. The method comprises the following steps: video information acquisition equipment acquires video data in a real-time scene, buffers the data in a temporary cache region, compresses it, and transmits it to a data storage server; a teaching evaluation index system is constructed; the corresponding video data stored in the data storage server is acquired according to the requirements of a user terminal; data features are extracted from the corresponding video data; the data features are integrated, the teaching evaluation behaviors are classified, and a final evaluation result is acquired; and matching courses are intelligently recommended according to the final evaluation result. The invention combines artificial intelligence with the theory of teaching behavior analysis, uses intelligent technology to automatically collect and encode teaching data, identifies classroom teaching behavior more comprehensively, automatically analyzes and visually presents classroom teaching, and provides powerful support for improving teaching quality.

Description

Video information acquisition method and image processing system for teaching evaluation
Technical Field
The invention relates to a video information acquisition method and an image processing system for teaching evaluation, in particular to the technical field of image data processing and analysis.
Background
With the promotion and development of information technology, artificial intelligence has entered many fields of everyday life, and with the construction and popularization of intelligent teaching environments, teaching modes and classroom behavior analysis have become a focus of attention. To better present teaching results, classroom teaching behavior analysis has become one of the detection approaches. Studying classroom teaching activities can better promote the development of students' internal learning mechanisms, help teachers acquire practical knowledge, and is conducive to improving classroom teaching quality.
In the prior art, most evaluation models and methods adopted in traditional teaching evaluation perform data acquisition and analysis through self-reporting, manual observation and manual coding. Because the traditional coding process suffers from strong subjectivity, small sample sizes and similar defects, it is not conducive to discovering general rules of the teaching process.
Disclosure of Invention
The purpose of the invention is as follows: a video information acquisition method and an image processing system for teaching evaluation are provided to solve the problems in the prior art.
The technical scheme is as follows: in a first aspect, a video information acquisition method for teaching evaluation is provided, which specifically includes the following steps:
the video information acquisition equipment acquires video data in a real-time scene during application, and the video data passes through a temporary cache region and is then compressed and transmitted to the data storage server;
constructing a teaching evaluation index system;
acquiring corresponding video data stored in the data storage server according to the requirements of a user terminal;
extracting data features in the corresponding video data;
integrating various data characteristics, classifying the teaching evaluation behaviors, and acquiring a final evaluation result;
and intelligently recommending matching courses according to the final evaluation result.
In some implementations of the first aspect, a temporary buffer is established for storing the video data in the real-time scene; the temporary buffer area is a ring-shaped storage queue.
And when the data volume in the temporary cache region reaches a preset size, the data are packaged and transmitted to the cloud.
The process of storing the numerical value in the temporary cache region is as follows: and moving a head pointer to point to the next storage space of the storage queue, and circularly updating the numerical value in the storage queue in a numerical value covering mode.
In some implementation manners of the first aspect, the teaching evaluation index system includes facial expression recognition and judgment in a user learning process, and the specific steps are as follows:
acquiring video image data stored in a cloud end, and preprocessing the video image data;
extracting the characteristics of the preprocessed image data;
constructing a classifier, and obtaining an identification result through classification and identification according to the image data characteristics;
and outputting the learning state corresponding to the current expression, and taking the learning state as one of the reference data of the teaching evaluation result.
Wherein the pretreatment process comprises the following steps:
a video frame of the video image data is extracted,
detecting a face image in the video frame,
reducing noise in the image by median filtering,
and carrying out correction alignment and image gray level conversion on the face image according to requirements.
In some implementations of the first aspect, a neural network is established for extracting features of the image data; each convolutional layer of the neural network is followed by an activation layer; and the weight values between the neural network structural layers are updated in a back propagation mode when the learning ability of the neural network is trained.
While increasing the structural depth of the neural network, the shallow layers of the neural network are connected with the deep network through residual blocks; the structural expression of the residual block is:

y = F(x) + x

where y denotes the output of the residual structure; F(x) denotes the mapping result produced by the convolutional layers; and x denotes the input identity mapping, whose derivative is always equal to 1 during back propagation.
In some implementations of the first aspect, an index system for teaching evaluation is formed by at least one evaluation index, each evaluation index corresponding to a calculation weight; and calculating and acquiring an initial result of the teaching evaluation according to the result and the weighted value corresponding to the evaluation index.
In some implementations of the first aspect, when extracting image features, because a histogram is used to describe the image texture, the histogram dimension and complexity increase as the number of neighborhood sampling points increases. Meanwhile, because the CS-LBP operator does not consider the gray value of the central pixel point, the embodiment of the invention encodes the image with a difference centrosymmetric local binary pattern, namely:

Z = Σ_{i=0}^{N/2−1} s(|g_i − g_{i+N/2}| − g_c) · 2^i

s(u) = 1 if u > T, and 0 otherwise

wherein Z represents the encoded image data; N represents the number of neighborhood pixel points; T represents a threshold value; g_i represents the gray value of a neighborhood pixel point; and g_c represents the gray value of the central pixel point.
The convolutional layers adopt up-sampling and down-sampling operations during feature extraction: up-sampling enables input pictures of any size, ensuring that the output data meets the requirements, while the down-sampling operation halves the feature map of the current convolutional layer. After entering a convolutional layer the image is split into two branches: one enters the output layer, the other enters the pooling layer. Data entering the pooling layer passes through a preset number of convolutional and pooling layers, then enters the fully connected layer for the final convolution operation, and the final image features are obtained through feature fusion.
The expression involved in the down-sampling fusion is:

x_d = f(W_i ∗ down(x_0) + b_i)

where down(·) denotes the down-sampling operation; ∗ denotes the convolution operation; f denotes the activation function of the current convolutional layer; x_d denotes the down-sampled feature map; x_0 denotes the feature map of layer 0; and W_i and b_i denote the weight value and correction term value of the i-th layer.
The expression involved in the up-sampling fusion is:

x_u = f(W_i ∗ up(x_0) + b_i)

where up(·) denotes the up-sampling operation; ∗ denotes the convolution operation; f denotes the activation function of the current convolutional layer; x_u denotes the up-sampled feature map; x_0 denotes the feature map of layer 0; and W_i and b_i denote the weight value and bias correction term value of the i-th layer.
The weight values in the network are updated by stochastic gradient computation, which avoids the situation where simple chain-rule calculation cannot achieve an ideal effect. The specific weight update expression is:

w ← w − η_s · ∂L/∂w

where η_s denotes the learning rate during the training process; w denotes a weight value; and L denotes the loss cost function. The process expression for the change of the learning rate is:

η_s = η_0 · d^s

where η_0 denotes the initial learning rate; d denotes the learning rate decay; and s denotes the number of iterations in the current training process.
The initial result serves as a comparison baseline for the final evaluation result output by the teaching evaluation.
In some implementations of the first aspect, the intelligent recommendation of matching courses further comprises: according to the teaching evaluation result, performing intelligent course recommendation for the user receiving the teaching behavior, and pushing the courses to the terminal as a recommendation page. The specific steps are as follows:
constructing a course set containing the same attributes;
generating a teaching task of the next stage according to the evaluation result of the user accepting the teaching behavior on the learned course and the course setting corresponding to the teaching difficulty;
and generating recommendation information of the course set with the same attribute on the terminal interface based on the teaching task of the next stage, and presenting the recommendation information to a terminal visual interface.
The course sets with the same attribute are determined by the similarity between courses; when the similarity between courses reaches a preset value, the courses are classified into the same attribute set. The similarity between two courses i and j is calculated as:

w_ij = ( Σ_{u ∈ N(i)∩N(j)} 1 / log(1 + |N(u)|) ) / √(|N(i)| · |N(j)|)

where |N(u)| denotes the number of courses that user u matches and likes; |N(i)| denotes the number of users matching course i; |N(j)| denotes the number of users matching course j; and N(i)∩N(j) denotes the users of both course i and course j.
In a second aspect, an image processing system for teaching assessment is provided, the system specifically including:
an information acquisition device configured to acquire video data generated during a teaching process;
a buffer area configured to temporarily buffer video image data acquired by the information acquisition device;
the processor is arranged to receive the video data in the buffer area and output the video data after encoding and compressing;
the data transmission module is set as a data interaction channel;
the cloud end is arranged to be connected with the data transmission module and used for storing video data;
a neural network arranged to process a face image in the video data;
the index system is set as an evaluation index of teaching evaluation;
and the recommending module is used for intelligently recommending the matching courses according to the evaluation result output by the neural network.
In some implementations of the second aspect, the indicator system includes at least one evaluation index.
In some implementations of the second aspect, the processor is implemented in an FPGA plus ARM architecture, and integrated on-chip.
In a third aspect, a computer-readable storage medium is provided, the computer-readable storage medium having stored thereon computer program instructions, which, when executed, implement any one of the video information collection methods for teaching assessment.
Beneficial effects: the invention provides a video information acquisition method and an image processing system for teaching evaluation, which combine artificial intelligence with the theory of teaching behavior analysis, use intelligent technology to automatically acquire and encode teaching data, identify classroom teaching behavior more comprehensively, automatically analyze and visually present classroom teaching, and provide powerful support for improving teaching quality.
Drawings
FIG. 1 is a flow chart of data processing according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described in order to avoid obscuring the invention.
In one embodiment, a video information collecting method for teaching assessment is provided, as shown in fig. 1, the method specifically includes the following steps:
the video information acquisition equipment acquires video data in a real-time scene in an application process, and transmits the video data to the data storage server after compression;
constructing a teaching evaluation index system;
acquiring corresponding video data stored in the data storage server according to the requirements of a user terminal;
extracting data features in the corresponding video data;
integrating various data characteristics, classifying the teaching evaluation behaviors, and acquiring a final evaluation result;
and intelligently recommending matching courses according to the final evaluation result.
In a further embodiment, with the development of information technology, video image data serves as an important data source: it intuitively reflects the state of the teaching process and offers advantages that other data sources cannot replace. To improve the accuracy of subsequent image data analysis, the acquisition of video image information places high requirements on definition and duration, and therefore demands large storage space and network bandwidth.
Specifically, after video data is collected, it is stored in an established temporary cache region. The video data processed by the processor is divided into two branches: one branch is transmitted through the communication module and uploaded to the cloud for storage, while the other triggers a display signal and shows notification information on the local display through the display module. In a preferred embodiment, during online teaching the attendance state of a student is recorded in real time through the camera; the video data generated in this process is divided into two branches after being processed by the processor, one branch being transmitted through the communication module and uploaded to the cloud for storage, while the other generates a completion signal that triggers the display module to show learning-completion notification information on the local display.
Compared with the traditional multi-chip architecture, the on-chip combination adopted by the application reduces the area and power consumption of the printed circuit board while preserving the signal integrity of the data during transmission.
The collected data is stored in a temporary buffer before being processed by the processor. The data are firstly stored in the temporary cache region in a mode of establishing the annular storage queue, and when the data volume in the temporary cache region reaches a preset size, the data are packaged and transmitted to the cloud. By setting the storage mode of the annular queue, the next storage space of the storage queue can be pointed in a mode of moving the head pointer, so that the size of the buffer interval is reduced and the storage cost is saved in a mode of circularly updating the numerical value. On the other hand, the establishment of the temporary cache region can alleviate the hard requirement of synchronous operation of the processor, thereby reducing the hardware development cost.
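A minimal Python sketch of such an annular storage queue, with a head pointer that overwrites the oldest value; the 256-entry capacity and the integer frame identifiers are arbitrary stand-ins for video frames, not values taken from the patent:

```python
class RingBuffer:
    """Annular storage queue: the head pointer advances to the next storage
    space and old values are cyclically updated by overwriting."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.head = 0    # next slot to write
        self.count = 0   # number of valid entries

    def push(self, item):
        self.slots[self.head] = item                   # overwrite oldest value
        self.head = (self.head + 1) % self.capacity    # move the head pointer
        self.count = min(self.count + 1, self.capacity)

    def is_full(self):
        return self.count == self.capacity

    def drain(self):
        """Return buffered items oldest-first and reset the queue,
        e.g. to package them for transmission to the cloud."""
        start = (self.head - self.count) % self.capacity
        items = [self.slots[(start + i) % self.capacity]
                 for i in range(self.count)]
        self.count = 0
        return items


buf = RingBuffer(capacity=256)
for frame_id in range(1000):      # stand-in for incoming video frames
    buf.push(frame_id)
    if buf.is_full():             # preset data volume reached
        batch = buf.drain()       # compress and upload `batch` here
```

Because the buffer never grows past its fixed capacity, the buffer interval stays small and the processor need not consume frames synchronously, matching the storage-cost and decoupling benefits described above.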
In a further embodiment, the index system for teaching evaluation is formed by at least one evaluation index, and each evaluation index corresponds to a calculation weight. And calculating and acquiring an initial result of the teaching evaluation according to the result and the weighted value corresponding to the evaluation index.
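For instance, with hypothetical index names and weights (none of which are specified in the patent), the initial result could be computed as a simple weighted sum:

```python
def initial_evaluation(index_scores, weights):
    """Combine per-index results with their calculation weights
    into the initial teaching evaluation result."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(index_scores[k] * weights[k] for k in weights)

# illustrative indices and weights only
scores  = {"learning_state": 0.82, "participation": 0.74, "attendance": 0.95}
weights = {"learning_state": 0.5,  "participation": 0.3,  "attendance": 0.2}
initial = initial_evaluation(scores, weights)   # -> 0.822
```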
In a further embodiment, when teaching evaluation is performed for a user receiving a teaching behavior, the evaluation of the learning state index in the evaluation index system proceeds as follows: during the learning process, the facial expressions of the user receiving the teaching behavior are captured, and the user's fondness for, absorption of, and adaptation to the course are judged, thereby generating a corresponding evaluation result.
Specifically, firstly, the video image data stored in the cloud is acquired; secondly, the acquired video image data is preprocessed; thirdly, features are extracted from the processed image data; fourthly, a recognition result is obtained through classification and recognition; and finally, the learning state corresponding to the current expression is output and used as one of the reference data for the teaching evaluation result.
The preprocessing process comprises extracting video frames from the video image data, detecting the face image in each frame, reducing noise in the image by median filtering, and performing correction alignment and gray level conversion on the face image as required.
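A minimal OpenCV sketch of this preprocessing chain; the Haar cascade detector and the 96×96 output size are illustrative choices, not taken from the patent:

```python
import cv2

def preprocess(video_path, size=(96, 96)):
    """Extract frames, detect faces, median-filter denoise, and convert
    to gray -- the preprocessing chain described above."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)
    faces = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # gray level conversion
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
            crop = gray[y:y + h, x:x + w]
            crop = cv2.medianBlur(crop, 5)               # median-filter denoising
            faces.append(cv2.resize(crop, size))         # crude alignment by rescaling
    cap.release()
    return faces
```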
Feature extraction from the image data is realized through the established neural network. To handle the nonlinearity present in the feature extraction process, an activation layer is added after each convolutional layer; the activation layer introduces nonlinearity and improves the generalization capability of the network. During training of the recognition capability, the weight parameters of the neural network are updated by back propagation. To increase the performance of the neural network during learning, when the depth of the network structure is increased, the shallow layers of the neural network are connected with the deep network through residual blocks, which mitigates the degradation problem of the neural network while the number of layers grows.
When image features are extracted, because a histogram is used to describe the image texture, the dimension and complexity of the histogram increase with the number of neighborhood sampling points. Meanwhile, because the CS-LBP operator does not consider the gray value of the central pixel point, the embodiment of the invention encodes the image with a difference centrosymmetric local binary pattern, namely:

Z = Σ_{i=0}^{N/2−1} s(|g_i − g_{i+N/2}| − g_c) · 2^i

s(u) = 1 if u > T, and 0 otherwise

wherein Z represents the encoded image data; N represents the number of neighborhood pixel points; T represents a threshold value; g_i represents the gray value of a neighborhood pixel point; and g_c represents the gray value of the central pixel point.
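Under the reading of the operator given above (the original formula survives only as an image placeholder, so this encoding is an assumption), a NumPy sketch for a 3×3 neighborhood (N = 8) would be:

```python
import numpy as np

def dcs_lbp(img, t=8):
    """Difference centre-symmetric LBP over a 3x3 neighbourhood (N = 8).
    Each of the N/2 = 4 centre-symmetric pixel pairs contributes one bit:
    s(|g_i - g_{i+N/2}| - g_c) with threshold t."""
    h, w = img.shape
    img = img.astype(np.int32)
    # neighbourhood offsets ordered so offs[i] and offs[i + 4] are opposite
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    gc = img[1:h - 1, 1:w - 1]                     # centre pixel gray values
    for i in range(4):
        dy1, dx1 = offs[i]
        dy2, dx2 = offs[i + 4]
        g1 = img[1 + dy1:h - 1 + dy1, 1 + dx1:w - 1 + dx1]
        g2 = img[1 + dy2:h - 1 + dy2, 1 + dx2:w - 1 + dx2]
        bit = (np.abs(g1 - g2) - gc) > t           # s(|g_i - g_{i+N/2}| - g_c)
        out |= bit.astype(np.uint8) << i
    return out

# usage on a random 8-bit gray image
img = (np.random.rand(96, 96) * 255).astype(np.uint8)
codes = dcs_lbp(img)                    # 4-bit codes in [0, 15]
hist = np.bincount(codes.ravel(), minlength=16)   # texture histogram
```

With only N/2 comparisons the code stays 4 bits wide, which keeps the histogram dimension small as the motivation above requires.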
The convolutional layers adopt up-sampling and down-sampling operations during feature extraction: up-sampling enables input pictures of any size, ensuring that the output data meets the requirements, while the down-sampling operation halves the feature map of the current convolutional layer. After entering a convolutional layer the image is split into two branches: one enters the output layer, the other enters the pooling layer. Data entering the pooling layer passes through a preset number of convolutional and pooling layers, then enters the fully connected layer for the final convolution operation, and the final image features are obtained through feature fusion.
The expression involved in the down-sampling fusion is:

x_d = f(W_i ∗ down(x_0) + b_i)

where down(·) denotes the down-sampling operation; ∗ denotes the convolution operation; f denotes the activation function of the current convolutional layer; x_d denotes the down-sampled feature map; x_0 denotes the feature map of layer 0; and W_i and b_i denote the weight value and correction term value of the i-th layer.
The expression involved in the up-sampling fusion is:

x_u = f(W_i ∗ up(x_0) + b_i)

where up(·) denotes the up-sampling operation; ∗ denotes the convolution operation; f denotes the activation function of the current convolutional layer; x_u denotes the up-sampled feature map; x_0 denotes the feature map of layer 0; and W_i and b_i denote the weight value and bias correction term value of the i-th layer.
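A PyTorch sketch of the two-branch scheme just described; the layer widths, the use of max pooling for down-sampling, bilinear interpolation for up-sampling, fusion by concatenation, and the target size are all illustrative assumptions rather than details fixed by the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionBlock(nn.Module):
    """Conv + activation, then split: one branch is kept for fusion
    (toward the output layer), the other is pooled to half size."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)   # activation layer after each conv

    def forward(self, x):
        y = self.act(self.conv(x))         # f(W_i * x + b_i)
        return y, F.max_pool2d(y, 2)       # (skip branch, halved branch)

def fuse(skips, size):
    """Up-sample every retained map to a common size and fuse them,
    so inputs of any size yield a fixed-form feature tensor."""
    ups = [F.interpolate(s, size=size, mode="bilinear", align_corners=False)
           for s in skips]
    return torch.cat(ups, dim=1)           # final feature fusion

# usage sketch
blocks = [FusionBlock(1, 16), FusionBlock(16, 32)]
x = torch.randn(1, 1, 96, 96)
skips = []
for blk in blocks:
    s, x = blk(x)
    skips.append(s)
features = fuse(skips, size=(24, 24))      # -> shape (1, 48, 24, 24)
```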
The weight values in the network are updated by stochastic gradient computation, which avoids the situation where simple chain-rule calculation cannot achieve an ideal effect. The specific weight update expression is:

w ← w − η_s · ∂L/∂w

where η_s denotes the learning rate during the training process; w denotes a weight value; and L denotes the loss cost function. The process expression for the change of the learning rate is:

η_s = η_0 · d^s

where η_0 denotes the initial learning rate; d denotes the learning rate decay; and s denotes the number of iterations in the current training process.
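A toy sketch of this update rule, assuming the exponential-decay reading of the learning rate schedule given above (the initial rate 0.1 and decay factor 0.95 are arbitrary):

```python
def lr_at(step, lr0=0.1, decay=0.95):
    """Decayed learning rate eta_s = eta_0 * d**s -- one common reading of
    the decay expression above, which the patent shows only as an image."""
    return lr0 * decay ** step

def sgd_step(weights, grads, step):
    """Stochastic-gradient update w <- w - eta_s * dL/dw."""
    lr = lr_at(step)
    return [w - lr * g for w, g in zip(weights, grads)]

# usage: a single toy parameter with loss L(w) = (w - 3)^2
w = [0.0]
for s in range(50):
    g = [2 * (w[0] - 3)]      # dL/dw
    w = sgd_step(w, g, s)     # w converges toward 3 as the rate decays
```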
The expression of the residual structure is:

y = F(x) + x

where y denotes the output of the residual structure and F(x) denotes the mapping result produced by the convolutional layers; x denotes the input identity mapping, used to mitigate the degradation problem as the network depth increases, and since the derivative of x equals 1 during back propagation, the data can be transmitted to the shallow layers without loss.
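A minimal PyTorch sketch of such a residual block; the channel count and the two-convolution depth of F are illustrative choices:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = F(x) + x: the convolutional mapping F is summed with the
    identity input x, so gradients reach shallow layers with derivative 1
    along the skip path."""
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),                  # activation after each conv
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.f(x) + x)              # residual sum F(x) + x

# usage: shape is preserved, so blocks can be stacked to deepen the network
block = ResidualBlock(16)
y = block(torch.randn(1, 16, 48, 48))               # -> (1, 16, 48, 48)
```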
In a further embodiment, according to the teaching evaluation result, intelligent course recommendation is performed for the user receiving the teaching behavior, and the courses are pushed to the terminal as a recommendation page.
Specifically, firstly, a course set containing the same attributes is constructed; secondly, generating a teaching task of the next stage according to the evaluation result of the user accepting the teaching behavior on the learned course and the course setting corresponding to the teaching difficulty; and finally, generating recommendation information of the course sets with the same attributes on the terminal interface based on the teaching task of the next stage, and presenting the recommendation information to a terminal visual interface.
The course sets with the same attribute are determined by the similarity between courses; when the similarity between courses reaches a preset value, the courses are classified into the same attribute set. The similarity between two courses i and j is calculated as:

w_ij = |N(i)∩N(j)| / √(|N(i)| · |N(j)|)

where |N(i)| denotes the number of users matching course i; |N(j)| denotes the number of users matching course j; and N(i)∩N(j) denotes the users of both course i and course j.
In order to better match courses to the user's preferences, the similarity between courses is corrected by adding an interest-degree parameter, namely:

w_ij = ( Σ_{u ∈ N(i)∩N(j)} 1 / log(1 + |N(u)|) ) / √(|N(i)| · |N(j)|)

where |N(u)| denotes the number of courses that user u matches and likes; |N(i)| denotes the number of users matching course i; |N(j)| denotes the number of users matching course j; and N(i)∩N(j) denotes the users of both course i and course j.
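A small Python sketch computing both forms of the similarity under the reconstructions above (damping a shared user's contribution by 1/log(1 + |N(u)|) is our assumed reading of the interest-degree correction, since the patent shows the formula only as an image); the toy data is invented:

```python
import math

def similarity(users_i, users_j, user_courses, use_interest=True):
    """Item-based similarity between courses i and j. users_i / users_j are
    the user sets matching each course; user_courses maps a user to the set
    of courses that user matches and likes."""
    common = users_i & users_j          # N(i) ∩ N(j)
    if not common:
        return 0.0
    if use_interest:
        # interest-corrected numerator: damp very active users
        num = sum(1.0 / math.log(1 + len(user_courses[u])) for u in common)
    else:
        num = float(len(common))        # plain co-occurrence count
    return num / math.sqrt(len(users_i) * len(users_j))

# toy data: course -> matching users, user -> liked courses
course_users = {"i": {"a", "b", "c"}, "j": {"b", "c", "d"}}
user_courses = {"a": {"i"}, "b": {"i", "j"},
                "c": {"i", "j", "k"}, "d": {"j"}}
w = similarity(course_users["i"], course_users["j"], user_courses)  # ~0.54
```

Courses whose pairwise similarity reaches the preset value would then be grouped into the same attribute set for recommendation.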
In one embodiment, an image processing system for teaching assessment is provided, which is used to implement a video information acquisition method for teaching assessment, and specifically includes:
an information acquisition device configured to acquire video data generated during a teaching process;
a buffer area configured to temporarily buffer video image data acquired by the information acquisition device;
the processor is arranged to receive the video data in the buffer area and output the video data after encoding and compressing;
the data transmission module is set as a data interaction channel;
the cloud end is arranged to be connected with the data transmission module and used for storing video data;
a neural network arranged to process a face image in the video data;
the index system is set as an evaluation index of teaching evaluation;
and the recommending module is used for intelligently recommending the matching courses according to the evaluation result output by the neural network.
In a further embodiment, the index system comprises at least one evaluation index.
The processor adopts an FPGA plus ARM architecture, combined on a single chip.
In one embodiment, a computer-readable storage medium having computer program instructions stored thereon for execution by a processor to implement any one of the methods of video information collection for teaching appraisal is provided.
As noted above, while the present invention has been shown and described with reference to certain preferred embodiments, it is not to be construed as limited thereto. Various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A video information acquisition method for teaching evaluation is characterized by comprising the following steps:
the video information acquisition equipment acquires video data in a real-time scene during application, and the video data passes through a temporary cache region and is then compressed and transmitted to the data storage server;
constructing a teaching evaluation index system;
acquiring corresponding video data stored in the data storage server according to the requirements of a user terminal;
extracting data features in the corresponding video data;
integrating various data characteristics, classifying the teaching evaluation behaviors, and acquiring a final evaluation result;
and intelligently recommending matching courses according to the final evaluation result.
2. The video information acquisition method for teaching appraisal according to claim 1, characterized in that a temporary buffer is established for storing the video data in the real-time scene; the temporary buffer area is an annular storage queue;
the video data under the real-time scene is stored in a temporary cache area, and when the data volume in the temporary cache area reaches a preset size, the data is packaged and transmitted to a cloud end;
the process of storing the numerical value in the temporary cache region is as follows: and moving a head pointer to point to the next storage space of the storage queue, and circularly updating the numerical value in the storage queue in a numerical value covering mode.
3. The video information acquisition method for teaching evaluation according to claim 1, wherein the teaching evaluation index system comprises facial expression recognition and judgment in the user learning process, and the specific steps are as follows:
acquiring video image data stored in a cloud end, and preprocessing the video image data;
extracting the characteristics of the preprocessed image data;
constructing a classifier, and obtaining an identification result through classification and identification according to the image data characteristics;
outputting the learning state corresponding to the current expression and taking the learning state as one of reference data of the teaching evaluation result;
wherein the pretreatment process comprises the following steps:
a video frame of the video image data is extracted,
detecting a face image in the video frame,
reducing noise in the image by median filtering,
and carrying out correction alignment and image gray level conversion on the face image according to requirements.
4. The video information collecting method for teaching appraisal according to claim 1,
establishing a neural network for extracting image data characteristics; each convolutional layer of the neural network is followed by an activation layer; the weight values between the neural network structural layers are updated in a back propagation mode when the learning ability of the neural network is trained;
while increasing the structural depth of the neural network, the shallow layers of the neural network are connected with the deep network through residual blocks; wherein the structural expression of the residual block is:

y = F(x) + x

where y denotes the output of the residual structure; F(x) denotes the mapping result produced by the convolutional layers; and x denotes the input identity mapping, whose derivative is always equal to 1 during back propagation.
5. The video information collecting method for teaching appraisal according to claim 1,
an index system for teaching evaluation is formed by at least one evaluation index, and each evaluation index corresponds to a calculation weight; calculating and acquiring an initial result of the teaching evaluation according to the result and the weighted value corresponding to the evaluation index;
the initial result is used as a comparison result of a final evaluation result output by the teaching evaluation;
when image features are extracted, a difference centrosymmetric local binary pattern is adopted to encode the image, namely:

Z = Σ_{i=0}^{N/2−1} s(|g_i − g_{i+N/2}| − g_c) · 2^i

s(u) = 1 if u > T, and 0 otherwise

wherein Z represents the encoded image data; N represents the number of neighborhood pixel points; T represents a threshold value; g_i represents the gray value of a neighborhood pixel point; and g_c represents the gray value of the central pixel point;
the convolutional layers adopt up-sampling and down-sampling operations during feature extraction, up-sampling enabling input pictures of any size and ensuring that the output data meets the requirements, while the down-sampling operation halves the feature map of the current convolutional layer; after entering a convolutional layer the image is split into two branches, one entering the output layer and the other entering the pooling layer; data entering the pooling layer passes through a preset number of convolutional and pooling layers, then enters the fully connected layer for the final convolution operation, and the final image features are obtained through feature fusion;
the expression involved in the down-sampling fusion is:

x_d = f(W_i ∗ down(x_0) + b_i)

where down(·) denotes the down-sampling operation; ∗ denotes the convolution operation; f denotes the activation function of the current convolutional layer; x_d denotes the down-sampled feature map; x_0 denotes the feature map of layer 0; and W_i and b_i denote the weight value and correction term value of the i-th layer;
the expression involved in the up-sampling fusion is:

x_u = f(W_i ∗ up(x_0) + b_i)

where up(·) denotes the up-sampling operation; ∗ denotes the convolution operation; f denotes the activation function of the current convolutional layer; x_u denotes the up-sampled feature map; x_0 denotes the feature map of layer 0; and W_i and b_i denote the weight value and bias correction term value of the i-th layer;
the weight values in the network are updated by stochastic gradient computation; the specific weight update expression is:

w ← w − η_s · ∂L/∂w

where η_s denotes the learning rate during the training process; w denotes a weight value; and L denotes the loss cost function; wherein the process expression for the change of the learning rate is:

η_s = η_0 · d^s

where η_0 denotes the initial learning rate; d denotes the learning rate decay; and s denotes the number of iterations in the current training process.
6. The method of claim 1, wherein the intelligent recommendation of matching courses further comprises: according to the teaching evaluation result, performing intelligent course recommendation for the user receiving the teaching behavior, and pushing the courses to the terminal as a recommendation page;
the method comprises the following specific steps:
constructing a course set containing the same attributes;
generating a teaching task of the next stage according to the evaluation result of the user accepting the teaching behavior on the learned course and the course setting corresponding to the teaching difficulty;
generating recommendation information of the course set with the same attribute on a terminal interface based on the teaching task of the next stage, and presenting the recommendation information to a terminal visual interface;
the course sets with the same attribute are determined by the similarity between courses, and when the similarity between courses reaches a preset value, the courses are classified into the same attribute set; the similarity between two courses i and j is calculated as:

w_ij = ( Σ_{u ∈ N(i)∩N(j)} 1 / log(1 + |N(u)|) ) / √(|N(i)| · |N(j)|)

where |N(u)| denotes the number of courses that user u matches and likes; |N(i)| denotes the number of users matching course i; |N(j)| denotes the number of users matching course j; and N(i)∩N(j) denotes the users of both course i and course j.
7. An image processing system for teaching assessment, for implementing the method of any one of claims 1 to 6, comprising:
an information acquisition device configured to acquire video data generated during a teaching process;
a buffer area configured to temporarily buffer video image data acquired by the information acquisition device;
the processor is arranged to receive the video data in the buffer area and output the video data after encoding and compressing;
the data transmission module is set as a data interaction channel;
the cloud end is arranged to be connected with the data transmission module and used for storing video data;
a neural network arranged to process a face image in the video data;
the index system is set as an evaluation index of teaching evaluation;
and the recommending module is used for intelligently recommending the matching courses according to the evaluation result output by the neural network.
8. An image processing system for teaching appraisal according to claim 7,
the index system comprises at least one evaluation index.
9. An image processing system for teaching appraisal according to claim 7,
the processor adopts an FPGA and ARM framework and an on-chip combination mode.
10. A computer-readable storage medium having computer program instructions stored thereon which, when executed by a processor, implement a method of video information capture according to any one of claims 1-6.
CN202110566344.9A 2021-05-24 2021-05-24 Video information acquisition method and image processing system for teaching evaluation Pending CN112990892A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110566344.9A CN112990892A (en) 2021-05-24 2021-05-24 Video information acquisition method and image processing system for teaching evaluation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110566344.9A CN112990892A (en) 2021-05-24 2021-05-24 Video information acquisition method and image processing system for teaching evaluation

Publications (1)

Publication Number Publication Date
CN112990892A true CN112990892A (en) 2021-06-18

Family

ID=76337111

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110566344.9A Pending CN112990892A (en) 2021-05-24 2021-05-24 Video information acquisition method and image processing system for teaching evaluation

Country Status (1)

Country Link
CN (1) CN112990892A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150062275A (en) * 2013-11-29 2015-06-08 레드원테크놀러지 주식회사 infant safety management system and method using cloud robot
CN109032506A (en) * 2018-06-27 2018-12-18 郑州云海信息技术有限公司 A kind of memory system data compression method, system and equipment and storage medium
CN111931598A (en) * 2020-07-20 2020-11-13 湖北美和易思教育科技有限公司 Intelligent classroom real-time analysis method and system based on face recognition

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150062275A (en) * 2013-11-29 2015-06-08 레드원테크놀러지 주식회사 infant safety management system and method using cloud robot
CN109032506A (en) * 2018-06-27 2018-12-18 郑州云海信息技术有限公司 A kind of memory system data compression method, system and equipment and storage medium
CN111931598A (en) * 2020-07-20 2020-11-13 湖北美和易思教育科技有限公司 Intelligent classroom real-time analysis method and system based on face recognition

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
余小高 (Yu Xiaogao): "Research on Personalized Learning of Micro-courses in the Big Data Environment", 《中国教育信息化》 (China Education Informatization) *
史鹏坤 (Shi Pengkun): "Research on Facial Expression Recognition Based on Deep Learning", 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117492871A (en) * 2023-12-29 2024-02-02 辽宁向日葵数字技术股份有限公司 Teaching activity construction method based on low codes and related equipment
CN117492871B (en) * 2023-12-29 2024-04-23 辽宁向日葵数字技术股份有限公司 Teaching activity construction method based on low codes and related equipment

Similar Documents

Publication Publication Date Title
CN113936339B (en) Fighting identification method and device based on double-channel cross attention mechanism
CN108537191B (en) Three-dimensional face recognition method based on structured light camera
CN111444826B (en) Video detection method, device, storage medium and computer equipment
US20230013451A1 (en) Information pushing method in vehicle driving scene and related apparatus
CN113723530B (en) Intelligent psychological assessment system based on video analysis and electronic psychological sand table
CN113191495A (en) Training method and device for hyper-resolution model and face recognition method and device, medium and electronic equipment
CN114202791A (en) Training method of facial emotion recognition model, emotion recognition method and related equipment
CN112132009A (en) Classroom behavior analysis method and system and electronic equipment
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
CN111008570B (en) Video understanding method based on compression-excitation pseudo-three-dimensional network
CN113762107A (en) Object state evaluation method and device, electronic equipment and readable storage medium
CN113688839B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN115188066A (en) Moving target detection system and method based on cooperative attention and multi-scale fusion
Zhou et al. MSAR‐DefogNet: Lightweight cloud removal network for high resolution remote sensing images based on multi scale convolution
CN112990892A (en) Video information acquisition method and image processing system for teaching evaluation
CN115292538A (en) Map line element extraction method based on deep learning
CN113705361A (en) Method and device for detecting model in living body and electronic equipment
CN112668675A (en) Image processing method and device, computer equipment and storage medium
CN116704585A (en) Face recognition method based on quality perception
CN110728316A (en) Classroom behavior detection method, system, device and storage medium
CN115719428A (en) Face image clustering method, device, equipment and medium based on classification model
CN115083006A (en) Iris recognition model training method, iris recognition method and iris recognition device
CN111898576B (en) Behavior identification method based on human skeleton space-time relationship
CN115049901A (en) Small target detection method and device based on feature map weighted attention fusion
Pang et al. PTRSegNet: A Patch-to-Region Bottom-Up Pyramid Framework for the Semantic Segmentation of Large-Format Remote Sensing Images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210618)