CN111915460B - AI vision-based intelligent scoring system for experimental examination

Info

Publication number
CN111915460B
CN111915460B (application CN202010728327.6A)
Authority
CN
China
Prior art keywords
algorithm
data
student
dimension
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010728327.6A
Other languages
Chinese (zh)
Other versions
CN111915460A (en)
Inventor
孙效华
郭炜炜
叶颖
周鑫
孟诗乔
张啸天
赵羿昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Publication of CN111915460A
Application granted
Publication of CN111915460B
Legal status: Active (current)

Classifications

    • G06Q50/205 Education administration or guidance
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)

Abstract

An AI vision-based intelligent scoring system for experimental examinations comprises a student end (1), a teacher end (2) and a server end (3), the server end (3) comprising an algorithm module (4) and a database (5). In use the system further involves a first user and a second user, where the first user is a student and the second user is a teacher. The algorithm module (4) runs on the server end (3); the database (5) is stored on the server end (3) and called by the algorithm module (4); the server end (3) comprises a cloud server. The system can be applied to teaching and examination scenarios for middle-school physics, chemistry and biology experiments, and addresses the problems that students in a teaching scenario receive uneven information and that teachers in an examination scenario grade slowly and to inconsistent standards. The invention can also detect a student's actions algorithmically while the student practises experimental operations independently, providing effective guidance and timely feedback and improving the self-study effect.

Description

AI vision-based intelligent scoring system for experimental examination
Technical Field
The invention relates to the fields of deep learning and computer vision and to their application in teaching.
Background
The current situation of middle-school physics, biology and chemistry experiment teaching and examination is as follows: one teacher is responsible for many students, so the teacher cannot give each student targeted, comprehensive guidance or supervision during teaching or examinations. The result is inefficient experimental teaching and examination and inconsistent standards.
Specifically, during experimental-operation teaching one teacher faces an entire class; with limited attention the teacher cannot tutor each student individually according to that student's progress. Students also sit in different positions in the classroom and therefore receive different information. Furthermore, when students practise on their own without follow-up guidance from a teacher, mistakes are not corrected in time, so the teaching effect is unsatisfactory.
In experimental examinations, the traditional approach is for many teachers to invigilate and score many students on site, each teacher usually covering several students. This is very inefficient, and although there are unified curricula and examination standards, teachers' individual subjective impressions differ, so scoring standards vary somewhat, which is likewise unfair to students.
Disclosure of Invention
To overcome these shortcomings, the invention provides an AI vision-based intelligent guidance and scoring system for middle-school experiments. The system can be applied to teaching and examination scenarios for middle-school physics, chemistry and biology experiments, and addresses the problems that students in a teaching scenario receive uneven information and that teachers in an examination scenario grade slowly and to inconsistent standards. The invention can also detect a student's actions algorithmically while the student practises experimental operations independently, providing effective guidance and timely feedback and improving the self-study effect.
To this end, the invention adopts the following technical scheme:
An AI vision-based intelligent guidance and scoring system for middle-school experiments, characterized in that: the system is applied to teaching, practice and examination scenarios for middle-school physics, chemistry and biology experiments, and comprises a student end (1), a teacher end (2) and a server end (3), the server end (3) comprising an algorithm module (4) and a database (5); in use the system further involves a first user and a second user, where the first user is a student and the second user is a teacher;
The algorithm module (4) runs on the server end (3), the database (5) is stored on the server end (3) and called by the algorithm module (4), and the server end (3) comprises a cloud server;
The student end (1) is an intelligent laboratory bench consisting mainly of conventional experimental equipment, a student client and two video-capture cameras; students perform the usual operations and fill in experiment reports on the client. The cameras are located at the front centre and to the right of the bench respectively, on angle-adjustable brackets with gimbal heads, so that video can be captured from multiple directions.
In an experiment practice scenario, the student end (1) can obtain video data from the database (5) and play it on the student client to guide the first user through the experimental operations.
In an experimental examination or practice scenario, the student end (1) collects video data of the first user through a video-capture device and uploads it to the server end (3); the algorithm module (4), combining the information in the database (5), identifies and classifies the video data uploaded by the student end (1), scores it automatically and issues an error report, and transmits the score and error report to the teacher end (2) and the student end (1);
The teacher end (2) is a software platform for managing student information, experiment practice and examinations, and can exchange information with the server end (3) and the database (5);
The teacher end (2) can send interaction information to the student end (1), such as commands selecting practice or examination mode and starting or ending an experiment.
The teacher end (2) can display a first user's experiment history to the second user, let the second user send interaction information to the server end (3), and manage part of the data in the database (5).
The algorithm module (4) comprises a dynamic algorithm, a static algorithm, and instrument and experimental-result recognition. Different experiments are scored with these three algorithms, and the scores they give are finally added to obtain an accurate, comprehensive experiment score.
The dynamic algorithm part is developed in-house.
The static algorithm part directly uses existing algorithms.
The instrument and experimental-result recognition part likewise directly uses existing algorithms.
The dynamic algorithm part extracts the human-body feature points of the moving first user with a skeleton-line extraction algorithm and simultaneously identifies the positions of moving objects with an object recognition algorithm; the extracted human-body feature points and object-position feature points form input feature vectors that are processed with a pre-trained, in-house CNN- and LSTM-based neural network model, each step is binary-classified as scoring or not, and the step scores are added to give the final overall score.
The skeleton-line extraction algorithm can directly adopt the existing AlphaPose algorithm, disclosed in "RMPE: Regional Multi-Person Pose Estimation"; Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, Cewu Lu; Apr 2017. The object recognition algorithm can directly adopt the existing SSD algorithm, disclosed in "SSD: Single Shot MultiBox Detector"; Wei Liu et al.; ECCV 2016.
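As an illustration of this feature-extraction step, the sketch below concatenates pose keypoints and object-box centres into a single per-frame vector. `estimate_pose` and `detect_objects` are hypothetical wrappers standing in for AlphaPose and SSD inference; the real inference APIs vary by implementation.

```python
# Per-frame feature extraction sketch. `estimate_pose` and `detect_objects`
# are hypothetical wrappers for AlphaPose and SSD inference; real APIs vary.
import numpy as np

def extract_frame_features(frame):
    """Concatenate human keypoints and object-box centres into one vector."""
    keypoints = estimate_pose(frame)      # e.g. (17, 2) array of joint (x, y)
    boxes = detect_objects(frame)         # e.g. (k, 4) array [x1, y1, x2, y2]
    centres = (boxes[:, :2] + boxes[:, 2:]) / 2.0  # object positions as points
    return np.concatenate([keypoints.ravel(), centres.ravel()])  # length n
```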
The database (5) may store the first user's personal and historical information, and may store all the sample data required by the in-house CNN- and LSTM-based neural network model. Once data uploaded by users is collected on the server system, the data set can be expanded, improving the accuracy of any of the deep-learning models.
The in-house dynamic algorithm is a dynamic time-series classification algorithm for people and objects; as shown in fig. 5, it comprises the following steps:
(1) feature extraction
From the video data provided by the student end, the feature points of the person at each moment are extracted with an existing skeleton-line extraction algorithm such as AlphaPose, and the positions of objects are extracted as feature points with an existing object recognition algorithm such as SSD; the results are passed to step (2).
(2) Pre-input pre-processing
The positions of all feature points extracted at each moment in step (1) are concatenated into a feature vector X of length n; if N frames are extracted in total, this yields N n-dimensional feature vectors as the input of the algorithm in step (3). The feature vector at each moment is taken as the first dimension of the input data and time as the second dimension, so the input data has dimensions N × n. A schematic is shown in fig. 4.
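A minimal sketch of this preprocessing, assuming the hypothetical `extract_frame_features` helper from the previous sketch and a fixed number of joints and detected objects per frame so that every vector has the same length n:

```python
# Preprocessing sketch: stack N per-frame feature vectors of length n into
# the N x n input described above.
import numpy as np

def build_input(frames):
    X = np.stack([extract_frame_features(f) for f in frames])  # (N, n)
    return X.astype(np.float32)
```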
(3) Network algorithm
The neural network is divided into two parts, a frontend and a backend. The frontend extracts multi-channel features from the feature vectors of step (2); the backend processes the multi-channel features extracted by the frontend from two angles.
The frontend structure is as follows: convolution kernels of size 1 × 3 convolve the first dimension of the input data (the feature-vector dimension), and average pooling reduces its dimensionality; the exact numbers of convolution kernels and pooling layers can be adjusted manually to the situation. Let the number of channels after convolution be channel, taken as a third dimension. The feature length is reduced from n to n', so the data now has dimensions N × n' × channel.
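A minimal PyTorch sketch of such a frontend follows; the channel count and layer depth are illustrative assumptions, not the configuration claimed here.

```python
# Frontend sketch: 1x3 convolutions along the feature dimension plus
# average pooling, as described above.
import torch
import torch.nn as nn

class Frontend(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            # input shaped (batch, 1, N, n); the (1, 3) kernel slides only
            # along the feature axis, leaving the time axis N untouched
            nn.Conv2d(1, channels, kernel_size=(1, 3), padding=(0, 1)),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=(1, 2)),  # halves n, giving n' = n // 2
        )

    def forward(self, x):          # x: (batch, 1, N, n)
        return self.net(x)         # (batch, channels, N, n')
```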
The backend structure is as follows: the backend splits into two branches processed in parallel. The first branch classifies the time series from how a single feature of the original input changes over time; the second branch classifies it from the interrelations among several features. The results of the two branches are then combined (concat) to give the final classification result.
A first branch: the data processed by the frontend is reshaped (Flatten) to dimensions N × (n'·channel) and fed into an LSTM neural network, which produces a hidden state h_t at each time step (as shown in fig. 5); using an Attention mechanism, training adjusts the weight given to each intermediate output of the sequence, i.e. α_t in fig. 5 (the weight of time step t in the first branch's final LSTM result). The hidden-layer outputs h_t are multiplied by their weights α_t, the products are summed and the result is normalized to give the output S (formula one):

S = \mathrm{norm}\left( \sum_{t=1}^{N} \alpha_t h_t \right)    (formula one)

With β_1 hidden-layer neurons in the LSTM, the attended data S has dimensions β_1 × m_1. S is fed into a single- or multi-layer fully connected neural network to obtain the classification result ŷ_1.

A second branch: the N × n' × channel data from the frontend is convolved by 3-D convolution into data of dimensions N' × n' × channel'. This is then reshaped (Flatten) to N' × (n'·channel') and fed into an LSTM neural network, which produces a hidden state h'_t at each time step (as shown in fig. 5); the Attention mechanism likewise trains the weights α'_t in fig. 5 (the weight of time step t in the second branch's final LSTM result). The outputs h'_t are multiplied by α'_t, summed and normalized to give S' (formula one again). With β_2 hidden-layer neurons in the LSTM, S' has dimensions β_2 × m_2. S' is fed into a single- or multi-layer fully connected neural network to obtain the classification result ŷ_2.

Finally, ŷ_1 and ŷ_2 are added with weights w_1, w_2 to give the final result ŷ = w_1 ŷ_1 + w_2 ŷ_2.
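A PyTorch sketch of one such backend branch follows, under the assumption of a single-layer LSTM and a linear attention scorer (choices the text leaves open); layer sizes are illustrative.

```python
# Backend-branch sketch: LSTM over the flattened frontend features, learned
# soft attention over time (formula one), then a fully connected classifier.
import torch
import torch.nn as nn

class AttentionLSTMBranch(nn.Module):
    def __init__(self, in_dim, hidden=64, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)            # one score per time step
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                           # x: (batch, T, in_dim)
        h, _ = self.lstm(x)                         # h_t for every step t
        alpha = torch.softmax(self.attn(h), dim=1)  # normalized weights alpha_t
        s = (alpha * h).sum(dim=1)                  # S = sum_t alpha_t * h_t
        return self.fc(s)                           # classification result
```

For the first branch, in_dim would be n'·channel with T = N; the second branch would be built the same way, preceded by the 3-D convolution and running over T = N' steps, and the final output is the weighted sum of the two branches' results.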
(4) Training model
When training the model, the error between the actual score y_t and the output ŷ is computed, and the parameters in the model are learned with the neural network's backpropagation algorithm.
The two branches of the backend can be pre-trained separately. The pre-training method is: while one branch is trained, the other branch is removed, and only the trained model parameters of the current branch are kept. After the two branches have been trained separately, overall training (finetune) is performed: both pre-trained branches are loaded into the training model and trained repeatedly on the same training data.
The loss function is the cross-entropy loss, given as formula two below. The model uses an Adam optimizer with detailed parameters lr = 0.001, betas = (0.9, 0.999), eps = 1e-08, weight_decay = 0, amsgrad = False. The result is a time-series classification model of high accuracy.

L = -\sum_{i=1}^{M} y_i \log(p_i)    (formula two)

M - number of categories;
y_i - indicator variable (0 or 1): 1 if the category is the same as the sample's category, otherwise 0;
p_i - predicted probability that the observed sample belongs to category i.
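A training-loop sketch matching this recipe is below; `model` and `train_loader` are placeholders, not defined here. The listed Adam parameters are also PyTorch's defaults, so they can be spelled out verbatim.

```python
# Training sketch: cross-entropy loss (formula two) and Adam with the
# hyperparameters listed above.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001,
                             betas=(0.9, 0.999), eps=1e-08,
                             weight_decay=0, amsgrad=False)

for X, y in train_loader:          # X: (batch, N, n) inputs, y: step labels
    optimizer.zero_grad()
    loss = criterion(model(X), y)  # error between prediction and actual score
    loss.backward()                # backpropagation through the network
    optimizer.step()
```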
(5) Prediction using the trained network model.
The static algorithm part uses an existing object recognition algorithm such as SSD to identify the relative positions of static objects, compares the positions obtained with those in the standard answer, and passes the result to the scoring module to give a score.
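One way such a comparison could look, assuming normalized object centres and a distance tolerance, both of which are illustrative choices not fixed by the text:

```python
# Illustrative static check: compare detected object centres with the
# standard-answer layout; coordinates are normalized to the frame.
import numpy as np

def score_static(detected, answer, tol=0.05):
    """detected/answer map object name -> (x, y) centre in [0, 1] coords."""
    points = 0
    for name, ref in answer.items():
        if name in detected and np.linalg.norm(
                np.asarray(detected[name]) - np.asarray(ref)) < tol:
            points += 1            # object sits where the standard expects
    return points
```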
the instrument and experimental result recognition part realizes character, number and form recognition by using an OCR technology, and realizes reading of instrument scales and pointer readings by using opencv and deep learning algorithm;
when the algorithm module (4) processes the experimental video in the server (3), processing the collected video data by using a human skeleton point extraction algorithm and an object detection algorithm aiming at a moving object in the video, extracting time sequences of human skeleton points and object positions, extracting video motion characteristics by using a pre-trained LSTM neural network with an attention mechanism, comparing the extracted characteristic vectors with standard experimental steps in a database, and further automatically scoring and giving error analysis; for static objects in a video, determining the relative positions of the objects by using an object detection technology based on deep learning and Opencv, and further scoring experimental results and giving error analysis; recognizing characters and scales by utilizing Opencv and OCR technologies according to characters, instrument readings and tables in a video, comparing the characters and scales with standard answers, automatically scoring the experiment of a student and giving error analysis; and integrating the scores obtained by the dynamic algorithm part, the static algorithm part, the instrument and the experimental result identification part to obtain the final score of the first user in the experimental video, and finally performing visual presentation on the score and error analysis by a software interface.
The software part of the system comprises all the functions needed by the student client in the student end (1) and by the teacher end (2), so that students and teachers can conveniently run examinations and practice, and teachers can record and upload standard experiment videos.
In addition, the system builds a crowdsourced ecosystem. Because the initial amount of data is limited, the algorithms cannot show their full effect, but crowdsourcing can gather large amounts of data to improve the scoring accuracy of the algorithm models. The platform does not merely provide algorithm services to schools and teachers one-way; it also receives data as the platform is used and uses that data to optimize the algorithms. The whole platform is open-source and shared, with mutually beneficial resources.
Compared with the prior art, the beneficial effects of the invention are: through innovation in the software algorithms and the design of the hardware, the system assists teachers in teaching experimental operations during the teaching stage, improving fairness and efficiency; during the practice stage it promptly collects and analyses students' operation data and issues error reports, maximizing learning effect; and during the examination stage it scores intelligently, improving scoring efficiency and consistency. In addition, the platform's crowdsourced ecosystem ensures continuous optimization of the algorithms and the feasibility of the business model.
Drawings
The invention is further illustrated with reference to the figures and examples.
FIG. 1 is an overall system schematic of the present invention.
Fig. 2 is a structural diagram of a student-side laboratory bench.
Fig. 3 is a configuration diagram of the teacher side.
Fig. 4 is a schematic diagram of an algorithm implementation.
FIG. 5 is a schematic diagram of a portion of an autonomic development dynamic algorithm.
Detailed Description
Usage scenarios
The AI vision-based intelligent guidance and scoring system for middle-school experiments is applied mainly to teaching, practice and examination scenarios for middle-school physics, chemistry and biology experiments; its primary users are the teachers and students who use the system.
The invention is further described below with reference to the accompanying drawings; three typical usage scenarios are listed:
1. examination scenario
(1) A teacher selects the experiment content for the examination at the teacher end and issues the command;
(2) the teacher checks seating and attendance information at the teacher end and controls the start and progress of the examination;
(3) the students begin the experimental examination operations, and the bench's camera equipment collects video data as they work;
(4) the background algorithm processes the collected video;
(5) once the background algorithm has scored each examinee's operations, the teacher can view examinees' scores, rankings and other information at the teacher end and perform the corresponding management operations.
2. Student exercise scene
(1) On arriving at the bench, a student operates the student end to select the practice content; the student-end display then shows the specific steps and content of the experimental operation, and the student can also directly access the standard experiment videos in the database to study;
(2) the student studies the experiment from this content, and the bench's camera equipment collects video data of the student's operations;
(3) the background algorithm processes the collected video;
(4) the student receives an analysis report of his or her operations on the display, with the cause of error for each step; the student can also review past practice records, including video and data analysis, to improve the next practice session.
3. Teacher co-construction scene
3.1 teacher uploads existing Standard experiments
A teacher can download standard experiments from the resource library, and can also record a standard experiment video, label it and upload it so that the algorithm is continually updated and optimized.
(1) The teacher selects the "resource library" tab in the platform and clicks the "upload" button of the chosen experiment. Two options appear: the teacher can record video with the intelligent bench's own cameras and upload it, or upload pre-recorded video via "local upload". The video records the same experimental procedure from two angles.
(2) After the experiment video is uploaded, it can be customized. First, a cutting tool divides the video into several segments, each corresponding to one piece of scoring content. Second, a labelling tool marks the experimental equipment in the video, to be used later for position-comparison scoring of the experimental operations. Third, a description and a score are added for each experimental operation. These three steps complete the customization of a standard experiment video.
(3) After the operation steps have been divided and the scoring criteria set, the customized video can be uploaded to the platform resource library, where other teachers can download it to their own accounts for use; the resulting record is sketched below.
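One plausible shape for such a customized standard-experiment record, with illustrative field names that are assumptions rather than the platform's actual format:

```python
# Illustrative record for a customized standard experiment: cut into
# segments, equipment labelled, descriptions and scores added.
standard_experiment = {
    "experiment": "example experiment",
    "videos": ["front_view.mp4", "right_view.mp4"],  # the two camera angles
    "steps": [
        {
            "clip": [0.0, 12.5],                   # segment bounds in seconds
            "description": "assemble the apparatus",
            "score": 2,
            "equipment": [
                {"name": "flask", "bbox": [0.31, 0.40, 0.55, 0.78]},
            ],
        },
    ],
}
```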
3.2 For experiments not in the resource library, the teacher customizes a standard experiment independently, using functions preset in the platform.
(1) The teacher selects the "resource library" tab in the platform and clicks the "add" button. The remaining operations are identical to 3.1.
(2) After the experiment video is uploaded, it can be customized, exactly as in 3.1.
(3) The remaining operations are identical to 3.1.

Claims (3)

1. An AI vision-based intelligent scoring system for experimental examinations, applied to teaching, practice and examination scenarios for middle-school physics, chemistry and biology experiments, characterized in that: the system comprises a student end (1), a teacher end (2) and a server end (3), the server end (3) comprising an algorithm module (4) and a database (5); in use the system further involves a first user and a second user, where the first user is a student and the second user is a teacher;
the algorithm module (4) runs on the server side (3), the database (5) is stored in the server side (3) and is called by the algorithm module (4), and the server side (3) comprises a cloud server;
the student end (1) is an intelligent experiment table and comprises conventional experiment equipment, a student client and two cameras for collecting videos, and students perform conventional operation and fill in experiment reports on the client; the cameras are respectively positioned in the center and the right side of the front of the student test bed, and are provided with angle-adjustable supports and universal heads, so that the video data in all directions can be conveniently acquired;
the teacher end (2) is a software platform capable of managing student information, experiment practice and examination, and can transmit information with the server end (3) and the database (5); the teacher end (2) can transmit interactive information to the student end (1); the teacher end (2) can display the experiment history information of the first user to a second user, enable the second user to transmit interactive information to the server end (3) and manage part of data in the database (5);
the algorithm module (4) comprises a dynamic algorithm, a static algorithm and instrument and experimental result identification; scoring different experiments by adopting the three algorithms, and finally adding the scores given by the three algorithms to obtain accurate and comprehensive experimental scores;
the dynamic algorithm part extracts human body feature points of a first user moving by using a skeleton line extraction algorithm, simultaneously identifies the position of a moving object by using an object identification algorithm, processes input feature vectors by using pre-trained neural network models based on CNN and LSTM, performs secondary classification on whether each step scores or not, and obtains the final integral score after adding;
the database (5) may store personal and historical information of the first user and may store all sample data required for CNN and LSTM based neural network models; after the data uploaded by the user are obtained on the server system, the data set can be expanded, so that the accuracy of any deep learning model is improved;
the dynamic algorithm is a dynamic time series classification algorithm aiming at people and objects, and the algorithm flow is as follows:
(1) feature extraction
Firstly, from the video data provided by the student end, the feature points of the person at each moment are extracted with the AlphaPose skeleton-line extraction algorithm, and the positions of objects are extracted as feature points with the SSD object recognition algorithm;
(2) pre-input pre-processing
The positions of all feature points extracted at each moment in step (1) are concatenated into a feature vector X of length n; if N frames are extracted in total, this yields N n-dimensional feature vectors as the input of the algorithm in step (3); the feature vector at each moment is taken as the first dimension of the input data and time as the second dimension, so the input data has dimensions N × n;
(3) network algorithm
The neural network is divided into two parts, a frontend and a backend; the frontend extracts multi-channel features from the feature vectors of step (2), and the backend processes the multi-channel features extracted by the frontend from two angles;
wherein the frontend structure is: convolution kernels of size 1 × 3 convolve the first dimension of the input data, and average pooling reduces its dimensionality, the exact numbers of convolution kernels and pooling layers being manually adjustable to the situation; the number of channels after convolution is channel, taken as a third dimension; the feature length is reduced from n to n', so the data now has dimensions N × n' × channel;
wherein the backend structure is: the backend splits into two branches processed in parallel; the first branch classifies the time series from how a single feature of the original input changes over time, and the second branch classifies it from the interrelations among several features; the results of the two branches are then combined (concat) to give the final classification result;
a first branch: recombining (Flatten) data processed by frontend into a dimension of N.channel multiplied by N', inputting the dimension into an LSTM neural network, generating a matrix h at each time step, and changing the weight of output results in the middle of the sequence of each cycle by training by utilizing an Attention mechanism; will hide the layer htOutput result of (2)
Figure FDA0003519373520000021
And alphatMultiplying, adding all multiplied results and then carrying out normalization operation to obtain output St(ii) a Setting the number of hidden layer neurons in LSTM neural network as beta1Data S obtained after attentiontDimension of beta1×m1(ii) a Will StInputting the data into a single-layer or multi-layer fully-connected neural network to obtain a classification result
Figure FDA0003519373520000031
Figure FDA0003519373520000032
A second branch: convolving the data with dimension N x N 'x channel obtained by front into the data with dimension N' x N 'x channel' by 3d convolution; then, recombining (Flatten) the data into the dimension of N '. N'. times.channel ', inputting the dimension into an LSTM neural network, generating a matrix h' at each time step, and changing the weight of the output result in the middle of the sequence of each cycle by training by utilizing an Attention mechanism; will hide the layer ht' output result
Figure FDA0003519373520000033
And alphatMultiplication, adding all the multiplication results and then carrying out normalization operation to obtain output St' same as formula one; setting the number of hidden layer neurons in LSTM neural network as beta2Data S obtained after attentiont' dimension of beta2×m2(ii) a Will St' inputting the data into a single-layer or multi-layer fully-connected neural network to finally obtain a classification result
Figure FDA0003519373520000034
Finally, will
Figure FDA0003519373520000035
And
Figure FDA0003519373520000036
obtaining final result by weighted addition
Figure FDA0003519373520000037
(4) Training model
when training the model, the error between the actual score y_t and the output ŷ is computed, and the parameters in the model are learned with the neural network's backpropagation algorithm;
wherein, two branches in the backup can be pre-trained respectively; the pre-training method comprises the following steps: when one branch is trained, the other branch is removed during training, and only the trained model parameters of the current branch are obtained; after the two branches are trained separately, the overall training (finetune) is performed: loading the two pre-trained branches into a training model during overall training, and performing repeated training by using the same training data;
the Loss function is Cross Entropy Loss (Cross Entropy Loss) which is a formula II; the model adopts an Adam optimizer, and the detailed parameters are as follows: lr is 0.001, betas is (0.9,0.999), eps is 1e-08, weight _ decay is 0, amsgrad is False; finally, a time series classification model is obtained;
Figure FDA0003519373520000041
m-number of categories;
yi-indicating a variable 0 or 1, 1 if the class is the same as the class of the sample, otherwise 0;
pi-a predicted probability for an observation sample belonging to class i;
(5) prediction using the trained network model;
the static algorithm part uses the SSD algorithm to identify the relative positions of static objects, compares the positions obtained with those in the standard answer, and passes the result to the scoring module to give a score;
the instrument and experimental-result recognition part uses OCR to recognize text, numbers and tables, and uses OpenCV and deep-learning algorithms to read instrument scales and pointer readings;
when the algorithm module (4) processes the experimental video in the server (3), processing the collected video data by using a human skeleton point extraction algorithm and an object detection algorithm aiming at a moving object in the video, extracting time sequences of human skeleton points and object positions, extracting video motion characteristics by using a pre-trained LSTM neural network with an attention mechanism, comparing the extracted characteristic vectors with standard experimental steps in a database, and further automatically scoring and giving error analysis; for static objects in a video, determining the relative positions of the objects by using an object detection technology based on deep learning and Opencv, and further scoring experimental results and giving error analysis; recognizing characters and scales by utilizing Opencv and OCR technologies according to characters, instrument readings and tables in a video, comparing the characters and scales with standard answers, automatically scoring the experiment of a student and giving error analysis; and integrating the scores obtained by the dynamic algorithm part, the static algorithm part, the instrument and the experimental result identification part to obtain the final score of the first user in the experimental video, and finally performing visual presentation on the score and error analysis by a software interface.
2. The system according to claim 1, characterized in that in an experiment practice scenario the student end (1) can obtain video data from the database (5) and play it on the student client to guide the first user through the experimental operations.
3. The system according to claim 1, characterized in that in an experimental examination or practice scenario the student end (1) collects video data of the first user through a video-capture device and uploads it to the server end (3); the algorithm module (4), combining the information in the database (5), identifies and classifies the video data uploaded by the student end (1), scores it automatically and issues an error report, and transmits the score and error report to the teacher end (2) and the student end (1).
CN202010728327.6A 2020-05-07 2020-07-23 AI vision-based intelligent scoring system for experimental examination Active CN111915460B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020103791753 2020-05-07
CN202010379175 2020-05-07

Publications (2)

Publication Number Publication Date
CN111915460A CN111915460A (en) 2020-11-10
CN111915460B (en) 2022-05-13

Family

ID=73280766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010728327.6A Active CN111915460B (en) 2020-05-07 2020-07-23 AI vision-based intelligent scoring system for experimental examination

Country Status (1)

Country Link
CN (1) CN111915460B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559801A (en) * 2020-12-07 2021-03-26 深兰科技(上海)有限公司 Experiment teaching method and device, electronic equipment and storage medium
CN112735198A (en) * 2020-12-31 2021-04-30 深兰科技(上海)有限公司 Experiment teaching system and method
CN113296607B (en) * 2021-05-27 2022-01-14 北京润尼尔网络科技有限公司 VR-based multi-user virtual experiment teaching system
CN114863734B (en) * 2022-07-06 2022-09-30 新乡学院 Evaluation method for organic polymer material synthesis experiment
CN115331156A (en) * 2022-10-17 2022-11-11 成都西交智汇大数据科技有限公司 Carbon dioxide preparation experiment scoring method, device, equipment and readable storage medium
CN115830928B (en) * 2022-11-21 2023-12-15 北京卫生职业学院 Pharmaceutical experiment interactive teaching system, multifunctional experiment integrated machine and teaching method
CN115860591B (en) * 2023-03-03 2023-06-20 济南大学 Experiment operation AI examination scoring method and system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109727172A (en) * 2019-03-18 2019-05-07 上海中科教育装备集团有限公司 A kind of artificial intelligence machine study experimental skill points-scoring system
CN110705390A (en) * 2019-09-17 2020-01-17 平安科技(深圳)有限公司 Body posture recognition method and device based on LSTM and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"A CNN-LSTM network with attention approach for learning universal sentence representation in embedded system";Qunchao Fu等;《Microprocessors and Microsystems》;20200221;全文 *
"A Text Sentiment Classification Modeling Method Based on Coordinated CNN-LSTM-Attention Model";Zhang Yangsen;《Chinese Journal of Electronics》;20190131;全文 *
"基于递归神经网络与注意力机制的动态个性化搜索算法";周雨佳;《计算机学报》;20190625;全文 *

Also Published As

Publication number Publication date
CN111915460A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN111915460B (en) AI vision-based intelligent scoring system for experimental examination
CN110991381B (en) Real-time classroom student status analysis and indication reminding system and method based on behavior and voice intelligent recognition
Whitehill et al. The faces of engagement: Automatic recognition of student engagementfrom facial expressions
WO2020010785A1 (en) Classroom teaching cognitive load measuring system
CN108229478A (en) Image, semantic segmentation and training method and device, electronic equipment, storage medium and program
WO2019028592A1 (en) Teaching assistance method and teaching assistance system using said method
CN109885595A (en) Course recommended method, device, equipment and storage medium based on artificial intelligence
CN112069970B (en) Classroom teaching event analysis method and device
AU2019101138A4 (en) Voice interaction system for race games
KR20200010672A (en) Smart merchandise searching method and system using deep learning
CN109559576A (en) A kind of children companion robot and its early teaching system self-learning method
Setialana et al. Intelligent attendance system with face recognition using the deep convolutional neural network method
CN115188074A (en) Interactive physical training evaluation method, device and system and computer equipment
CN112714174A (en) Intelligent education data acquisition system based on wireless communication
Srivastava et al. Prediction of students performance using KNN and decision tree-a machine learning approach
CN115659221A (en) Teaching quality assessment method and device and computer readable storage medium
CN112785039B (en) Prediction method and related device for answer score rate of test questions
CN114882580A (en) Measuring method for motion action consistency based on deep learning
Sharma et al. Surya Namaskar: real-time advanced yoga pose recognition and correction for smart healthcare
Zhang et al. Neural Attentive Knowledge Tracing Model for Student Performance Prediction
CN114202565A (en) Intelligent learning intervention system based on learning process emotion real-time analysis
Dong Educational behaviour analysis using convolutional neural network and particle swarm optimization algorithm
CN113919983A (en) Test question portrait method, device, electronic equipment and storage medium
Chen et al. Intelligent Recognition of Physical Education Teachers' Behaviors Using Kinect Sensors and Machine Learning.
Wang et al. Design of Sports Training Simulation System for Children Based on Improved Deep Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant