CN113962424A - Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment


Info

Publication number
CN113962424A
Authority
CN
China
Prior art keywords
data
pcanet
bigru
training
gru
Prior art date
Legal status
Pending
Application number
CN202110800902.3A
Other languages
Chinese (zh)
Inventor
薛景
孔健睿
陈铭璋
李恺玥
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110800902.3A priority Critical patent/CN113962424A/en
Publication of CN113962424A publication Critical patent/CN113962424A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/20: Education
    • G06Q50/205: Education administration or guidance


Abstract

The invention provides a PCANet-BiGRU-based score prediction method. Basic statistical data and score data of students are collected from an online learning platform; the original score data are divided into three independent parts, a training set, a validation set and a test set, and the training set is cleaned, preprocessed and assembled into a data matrix; the matrix data are input into a principal component analysis network (PCANet) to extract features of the score data; the PCANet-processed data are then input into a bidirectional gated recurrent unit (Bi-GRU) neural network layer to predict the students' ordinary scores.

Description

Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment
Technical Field
The invention relates to a score prediction method, in particular to a student score prediction method based on PCANet-BiGRU, and belongs to the technical field of education.
Background
With the wide adoption of online teaching platforms such as China University MOOC and Rain Classroom, education has become more efficient, convenient, autonomous and humanized. Strengthening the management of students' entire online learning process is the key to ensuring and improving the quality of distance education. Online learning is a long-term process, and the effort and engagement of ordinary study largely determine the quality of the learning outcome.
However, quantitative scientific study of the learning process, learning interventions and learning outcomes remains insufficient, owing to data sparsity and limitations of methodology.
The problems existing in the prior art are as follows:
At present, most score prediction approaches fall into three schools: statistical, traditional machine learning, and deep learning. Statistical methods are computationally cheap but mostly cannot achieve high accuracy, whereas machine learning and deep learning achieve high accuracy at the cost of high computational complexity.
Disclosure of Invention
The invention aims to provide a PCANet-BiGRU-based score prediction method that addresses both problems: the low computational cost but limited accuracy of statistical methods, and the high accuracy but excessive computational complexity of machine learning and deep learning.
The purpose of the invention is realized as follows: a score prediction method based on PCANet-BiGRU comprises the following steps:
step 1) using the basic statistical data and score data of students collected from an online learning platform, specifically: acquiring basic statistical data (such as student number, course name and school year) and score data (such as the number of assignments set, the number completed and the ordinary homework scores) of the corresponding students from a specific online learning platform (such as China University MOOC);
step 2) dividing the original score data collected in step 1) into three independent parts: a training set, a validation set and a test set, then cleaning and normalizing the training set (the full score of each ordinary assignment differs, so normalization is needed to eliminate the influence of dimension on data prediction) and constructing a data matrix; the data cleaning is as follows: in the training score data acquired in the previous stage, a student may have skipped an assignment, leaving its score missing, and such abnormal data would heavily bias model training, so missing data must be cleaned; if a column of data needed for training has a missing value in some row, that row of data is deleted; if a column is not used during training, the row is kept even when that column has missing values;
step 3) inputting the matrix data into the principal component analysis network PCANet to extract features of the score data, the PCANet consisting of PCA convolution layers, a nonlinear processing layer and a feature pooling layer;
step 4) inputting the PCANet-processed data into the bidirectional gated recurrent unit neural network (Bi-GRU) layer to predict the students' ordinary scores.
As a further limitation of the present invention, the training set in step 2) is used to train the model: initial parameters are found by fitting, i.e. the weight and bias parameters of the model are determined. The validation set is used to tune hyperparameters such as the network structure and to control model complexity, and the test set evaluates how well the finally selected model performs. The common split ratio of training, validation and test sets is 6:2:2; when data are scarce, 20% of the data are generally held out at random as the test set and a cross-validation algorithm is applied to the remainder. The cross-validation algorithm proceeds as follows:
a. randomly divide the training data into k equal parts;
b. in turn select k-1 parts for training and the remaining part for validation, and compute the sum of squared prediction errors;
c. finally average the k sums of squared errors as the basis for selecting the optimal model structure.
As a further limitation of the present invention, on the basis of the gated recurrent unit GRU structure, the bidirectional gated recurrent unit neural network Bi-GRU of step 4) adds both forward and backward propagation in its hidden layer; through forward and backward bidirectional operation it captures the long-term dependencies of learning scores at different times, yielding more accurate score prediction.
As a further limitation of the invention, in the PCA convolution layer, for each datum j of the input layer l, a window the size of the convolution kernel $P_j$ is sampled around it; the kernel is then slid, and all sample blocks are concatenated as a representation $X_i=[x_{i,1},x_{i,2},\ldots,x_{i,n}]$ from which the mean is removed. Performing this operation on the N data sets yields a new feature matrix X, on which principal component analysis (PCA) is then performed. PCA is a common method for data analysis and modeling: it retains the most important features of high-dimensional data, removes noise and unimportant features, and reduces dimensionality, greatly lowering the cost and time of data processing. The specific algorithm steps are as follows:
a. recording a matrix X with n rows and m columns;
b. normalizing each row of X;
c. compute the covariance matrix C of X:
$$C = \frac{1}{m}\, X X^{\mathsf{T}}$$
d. compute the eigenvalues E and eigenvectors D of C:
[E, D] = eig(C)
where eig denotes the function returning the eigenvalues and eigenvectors;
e. sort the eigenvectors in D by the magnitude of their corresponding eigenvalues and select the first k columns to form a new matrix, which contains the eigenvectors of the dimension-reduced data;
f. take the k groups of eigenvectors as PCA filters, use each as a convolution kernel K, and convolve the data set with them to complete the feature-extraction convolution.
As a further limitation of the present invention, the nonlinear processing layer is used to enhance the expressive power of the features, specifically: the data convolved by the two PCA layers are processed nonlinearly, each convolution result being binarized with the Heaviside step function
$$H(x)=\begin{cases}1, & x>0\\ 0, & x\le 0\end{cases}$$
and the binarized results are then weighted to obtain the integer map of the i-th value on the l-th layer's output features:
$$T_i^{\,l}=\sum_{l'=1}^{L_2} 2^{\,l'-1}\, H\!\left(X_i * K_{l'}\right)$$
where $*$ denotes convolution and $K_{l'}$ is the l'-th second-stage PCA kernel.
As a further limitation of the present invention, the feature pooling layer uses local histograms for PCANet's feature pooling (because the nonlinear processing layer outputs values in the range $[0, 2^{L_2}-1]$, the max pooling and mean pooling common in CNNs do not apply to PCANet): the integer map $T_i^{\,l}$ is divided into local blocks, the histogram of each block is computed and vectorized, denoted $\mathrm{Bhist}(T_i^{\,l})$, and the vectors generated by the k integer maps are concatenated into the feature vector $f_i=\left[\mathrm{Bhist}(T_i^{1}),\ldots,\mathrm{Bhist}(T_i^{k})\right]^{\mathsf{T}}$.
As a further limitation of the present invention, the GRU in the Bi-GRU neural network has two gate functions: a "reset gate" and an "update gate". The reset gate $r_t$ controls how strongly the state $h_{t-1}$ of the previous period influences the candidate state $\tilde{h}_t$, and the update gate $z_t$ determines how much of the information in $h_{t-1}$ is carried into $h_t$. The GRU neural network unit is updated as
$$\begin{aligned}
r_t &= f\left(W_{ir}\, i_t + W_{hr}\, h_{t-1} + b_r\right)\\
z_t &= f\left(W_{iz}\, i_t + W_{hz}\, h_{t-1} + b_z\right)\\
\tilde{h}_t &= \tanh\!\left(W_{ih}\, i_t + W_{hh}\left(r_t \odot h_{t-1}\right) + b_h\right)\\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t\\
y_t &= W_o\, h_t
\end{aligned}$$
where tanh is the hyperbolic tangent function, $\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$, whose output always lies in the interval (-1, 1); f is the sigmoid function, whose output always lies in the interval (0, 1), expressing the importance of information and helping decide whether data are updated or discarded; $i_t$ is the input at time t and $y_t$ the output at time t; $W_{ir}$, $W_{iz}$, $W_{ih}$ are the weight matrices from the input to the reset gate, the update gate and the candidate state, respectively; $W_{hr}$, $W_{hz}$, $W_{hh}$ are the weight matrices from the previous state to the reset gate, the update gate and the candidate state, and $W_o$ is the output weight matrix; $b_r$, $b_z$, $b_h$ are the biases of the reset gate, the update gate and the candidate state.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
A processor for running a program, wherein the program performs the above method when running.
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
the PCANet-BiGRU result prediction model provided by the invention considers the accurate prediction results of low consumption and deep learning of the statistical PCA classification model, so that the student results are predicted, students possibly failing to meet the examination are early warned in time, the students who learn later can timely check for missing and fill in the deficiency, the learning results of the students are improved, and the PCANet-BiGRU result prediction model has important teaching guidance and practice significance.
Drawings
FIG. 1 is a network structure of a PCANet and Bi-GRU based achievement early warning model.
Fig. 2 shows a network structure of PCANet.
FIG. 3 is a schematic diagram of the internal structure of the GRU.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the attached drawings:
learning is a long-term sequence behavior, in which there is a good or bad state, and different sequence performances lead to different outcome outcomes; aiming at the serial defects of overlong parameter training time, special parameter adjusting skills and the like of the structure, a mixed prediction model based on a principal component analysis network (PCANet) and a bidirectional gated cyclic unit (Bi-GRU) neural network shown in figure 1 is provided and applied to courses of C language, C + +, Python and the like on an online learning platform; on the basis of the structure of the GRU, the Bi-GRU captures the long-term dependence of the ordinary learning achievement in different periods through the forward and backward bidirectional operation in the hidden layer of the GRU, so as to obtain more accurate achievement prediction; the experimental result shows that the PCANet-BiGRU result prediction model effectively predicts the results and improves the accuracy and efficiency of the result prediction.
The PCANet-BiGRU score prediction model comprises two components: PCANet and BiGRU.
PCANet: as shown in FIG. 2, PCANet largely follows the structure and concepts of CNN, but its convolution kernels are principal component analysis (PCA) kernels, its nonlinear layer is a hashing step, and its features are generated by histogram statistics. PCANet consists of PCA convolution layers, a nonlinear processing layer and a feature pooling layer, and is used to extract the spatial features of the score data.
PCA convolutional layer
For each datum of the input layer, a window the size of the convolution kernel $P_j$ is sampled around it; the kernel is then slid, and all sample blocks are concatenated as a representation $X_i=[x_{i,1},x_{i,2},\ldots,x_{i,n}]$ from which the mean is removed. Performing this operation on the N data sets yields a new feature matrix X, on which principal component analysis (PCA) is then performed. PCA is a common method for data analysis and modeling: it retains the most important features of high-dimensional data, removes noise and unimportant features, and reduces dimensionality, greatly lowering the cost and time of data processing. The specific algorithm steps are as follows:
a. record a matrix X with n rows and m columns;
b. normalize each row of X;
c. compute the covariance matrix C of X:
$$C = \frac{1}{m}\, X X^{\mathsf{T}}$$
d. compute the eigenvalues E and eigenvectors D of C: [E, D] = eig(C), where eig denotes the function returning the eigenvalues and eigenvectors;
e. sort the eigenvectors in D by the magnitude of their corresponding eigenvalues and select the first k columns to form a new matrix, which contains the eigenvectors of the dimension-reduced data;
f. take the k groups of eigenvectors as PCA filters, use each as a convolution kernel K, and convolve the data set with them to complete the feature-extraction convolution.
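The steps a-f can be made concrete with a short sketch. The following Python code is an illustration only (the function name, the patch-matrix layout and the eigendecomposition routine are assumptions of this example, not part of the patent); it computes k PCA filters from a matrix of sample blocks:

```python
import numpy as np

def pca_filters(patches, k):
    """Steps a-e: compute k PCA filters from sliding-window sample blocks.

    patches: (num_blocks, block_len) array; each row is one sample block
             taken around a datum (the matrix X of step a).
    k:       number of leading eigenvectors to keep (step e).
    """
    X = patches - patches.mean(axis=0)   # step b: remove the mean
    C = (X.T @ X) / X.shape[0]           # step c: covariance matrix C of X
    E, D = np.linalg.eigh(C)             # step d: eigenvalues E, eigenvectors D
    order = np.argsort(E)[::-1]          # step e: sort by eigenvalue magnitude
    return D[:, order[:k]]               # the first k columns form the filter bank
```

Each returned column would then be reshaped to the window shape and used as a convolution kernel K, completing step f.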
PCANet uses two PCA convolution layers. With only one convolution layer the feature-extraction effect is unsatisfactory, while with three or more layers the added dimensionality makes the amount of computation increase dramatically.
Non-linear processing layer
To enhance the expressive power of the features, the data convolved by the two PCA layers are processed nonlinearly: each convolution result is binarized with the Heaviside step function
$$H(x)=\begin{cases}1, & x>0\\ 0, & x\le 0\end{cases}$$
that is, if a PCA-convolved value is greater than 0 the function gives 1, otherwise 0. The binarized results are then weighted to obtain the integer map of the i-th value on the l-th layer's output features:
$$T_i^{\,l}=\sum_{l'=1}^{L_2} 2^{\,l'-1}\, H\!\left(X_i * K_{l'}\right)$$
where $*$ denotes convolution and $K_{l'}$ is the l'-th second-stage PCA kernel.
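As an illustration of this binarization and weighting, a minimal numpy sketch follows (array shapes and names are assumptions of this example):

```python
import numpy as np

def integer_map(conv_outputs):
    """Nonlinear layer: binarize the L2 second-stage convolution outputs
    and combine them into one integer map with values in [0, 2^L2 - 1]."""
    L2 = conv_outputs.shape[0]                   # conv_outputs: (L2, H, W)
    H_bin = (conv_outputs > 0).astype(np.int64)  # Heaviside: 1 if > 0 else 0
    weights = 2 ** np.arange(L2)                 # 2^(l-1) for l = 1..L2
    return np.tensordot(weights, H_bin, axes=1)  # sum_l 2^(l-1) H(...)
```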
Feature pooling layer
Because the nonlinear processing layer outputs values in the range $[0, 2^{L_2}-1]$, PCANet does not use the max pooling and average pooling common in CNNs, but instead performs its feature pooling with local histograms: the integer map $T_i^{\,l}$ is divided into local blocks, the histogram of each block is computed and vectorized, denoted $\mathrm{Bhist}(T_i^{\,l})$; the vectors generated by the k integer maps are concatenated into the feature vector $f_i=\left[\mathrm{Bhist}(T_i^{1}),\ldots,\mathrm{Bhist}(T_i^{k})\right]^{\mathsf{T}}$.
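A sketch of this pooling step follows (the block shape is an illustrative assumption; the number of bins equals the number of distinct integer values, $2^{L_2}$):

```python
import numpy as np

def block_histograms(T, num_bins, block_shape):
    """Feature pooling: histogram each local block of the integer map T
    and concatenate the histograms into one feature vector Bhist(T)."""
    bh, bw = block_shape
    feats = []
    for r in range(0, T.shape[0] - bh + 1, bh):      # non-overlapping blocks
        for c in range(0, T.shape[1] - bw + 1, bw):
            block = T[r:r + bh, c:c + bw]
            hist, _ = np.histogram(block, bins=num_bins, range=(0, num_bins))
            feats.append(hist)
    return np.concatenate(feats)
```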
The GRU neural network in the Bi-GRU network is a type of recurrent neural network (RNN) and can fully capture time-series data. However, when an RNN faces long-term dependencies it struggles to carry early information forward and may miss important information; RNNs also suffer from vanishing and exploding gradients. The LSTM neural network introduces an input gate, a forget gate and an output gate to regulate the information flow, thereby resolving these problems.
The GRU is a variant of the long short-term memory network LSTM, constructed by merging the LSTM's input gate and forget gate into an update gate. The GRU retains the LSTM's ability to memorize over long and short horizons, yet has fewer parameters, a simpler structure, easier computation, and less susceptibility to problems such as overfitting.
As shown in FIG. 3, each GRU neural network unit has two gate functions: a "reset gate" and an "update gate". The reset gate $r_t$ controls how strongly the state $h_{t-1}$ of the previous period influences the candidate state $\tilde{h}_t$; the update gate $z_t$ determines how much of the information in $h_{t-1}$ is carried into $h_t$.
The GRU neural network unit is updated as follows:
$$\begin{aligned}
r_t &= f\left(W_{ir}\, i_t + W_{hr}\, h_{t-1} + b_r\right)\\
z_t &= f\left(W_{iz}\, i_t + W_{hz}\, h_{t-1} + b_z\right)\\
\tilde{h}_t &= \tanh\!\left(W_{ih}\, i_t + W_{hh}\left(r_t \odot h_{t-1}\right) + b_h\right)\\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t\\
y_t &= W_o\, h_t
\end{aligned}$$
where tanh is the hyperbolic tangent function, $\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$, whose output always lies in the interval (-1, 1); f is the sigmoid function, whose output always lies in the interval (0, 1), expressing the importance of information and helping decide whether data are updated or discarded; $i_t$ is the input at time t and $y_t$ the output at time t; $W_{ir}$, $W_{iz}$, $W_{ih}$ are the weight matrices from the input to the reset gate, the update gate and the candidate state, respectively; $W_{hr}$, $W_{hz}$, $W_{hh}$ are the weight matrices from the previous state to the reset gate, the update gate and the candidate state, and $W_o$ is the output weight matrix; $b_r$, $b_z$, $b_h$ are the biases of the reset gate, the update gate and the candidate state.
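The update equations translate directly into a short numpy sketch (the parameter names follow the matrices above; the dictionary packaging is an assumption of this example):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(i_t, h_prev, p):
    """One GRU update; p holds W_ir, W_iz, W_ih, W_hr, W_hz, W_hh, W_o
    and the biases b_r, b_z, b_h as numpy arrays."""
    r = sigmoid(p["W_ir"] @ i_t + p["W_hr"] @ h_prev + p["b_r"])  # reset gate
    z = sigmoid(p["W_iz"] @ i_t + p["W_hz"] @ h_prev + p["b_z"])  # update gate
    h_cand = np.tanh(p["W_ih"] @ i_t + p["W_hh"] @ (r * h_prev) + p["b_h"])
    h_t = (1.0 - z) * h_prev + z * h_cand                         # new state h_t
    y_t = p["W_o"] @ h_t                                          # output y_t
    return h_t, y_t
```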
In the GRU structure above, information is transmitted only forward in time. Score data, however, are influenced not only by earlier periods but also by later ones, so the bidirectional gated recurrent network Bi-GRU is used. On the basis of the GRU structure, the Bi-GRU adds forward and backward propagation in its hidden layer; operating in both directions captures the long-term dependencies of ordinary scores across different periods and yields more accurate score prediction.
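The patent does not tie the Bi-GRU to a particular framework. As one possible realization, a bidirectional GRU with the sizes later chosen on the validation set (4 layers, 30 neurons per layer) could be instantiated in PyTorch as follows, where the input feature size of 64 is an assumption of this example:

```python
import torch
import torch.nn as nn

# Bidirectional GRU: forward and backward passes over the score sequence.
bigru = nn.GRU(input_size=64, hidden_size=30, num_layers=4,
               batch_first=True, bidirectional=True)
head = nn.Linear(2 * 30, 1)          # forward + backward states -> one score

x = torch.randn(8, 12, 64)           # (students, time steps, PCANet features)
out, _ = bigru(x)                    # out: (8, 12, 60)
pred = head(out[:, -1, :])           # predicted ordinary score per student
```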
PCANet-BiGRU
PCANet has the properties of local perception and weight sharing, so it is used to extract the position-related spatial features of the score data. Combining PCANet with the BiGRU mines the temporal and spatial characteristics of the score data, through PCANet's spatial perception and the BiGRU's bidirectional memory, to realize score prediction. The PCANet-BiGRU score prediction process is as follows:
step 1, preprocessing the usual job result data to construct a matrix
Step 2, inputting emergency data into PCANet to extract data space characteristics
And 3, inputting the data processed by the PCANet into the Bi-GRU layer, wherein the model structure of the data for predicting the ordinary performance is shown in the figure.
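A high-level flow of these three steps might look like the sketch below; the stage functions are placeholders for the components sketched above and are assumptions of this example, not the patent's own code:

```python
import numpy as np

def predict_ordinary_scores(score_matrix, extract_features, predict_next):
    """Steps 1-3: preprocess, extract spatial features, predict scores."""
    # Step 1: drop rows with missing required scores, then min-max normalize.
    X = score_matrix[~np.isnan(score_matrix).any(axis=1)]
    X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    # Step 2: PCANet spatial features; step 3: Bi-GRU prediction.
    return predict_next(extract_features(X))
```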
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
A processor for running a program, wherein the program performs the above method when running. An example of a specific application of the present invention is given below:
data acquisition
The experimental data of the invention come from the per-course information stored in the MOOC online learning system. These courses were opened in the system in 2019, and student score information is stored as Excel tables. The score evaluation standard in the data set is based on each ordinary homework assignment, essentially one assignment submitted per week. According to its internal standard, the system preprocesses the evaluated ordinary scores into vectors that serve as input to the neural network.
Data set partitioning
Randomly extracting 20% of the data set as a test set, and then adopting a cross validation algorithm on the rest data:
a. randomly divide the training data into k equal parts;
b. in turn select k-1 parts for training and the remaining part for validation, and compute the sum of squared prediction errors;
c. finally average the k sums of squared errors as the basis for selecting the optimal model structure.
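A sketch of steps a-c follows; the fit/predict callbacks stand in for training and applying the model and are assumptions of this example:

```python
import numpy as np

def kfold_mean_sse(data, labels, k, fit, predict):
    """k-fold cross-validation returning the mean sum of squared errors."""
    idx = np.random.permutation(len(data))
    folds = np.array_split(idx, k)                    # a. random equal split
    sse = []
    for i in range(k):                                # b. hold out fold i in turn
        val = folds[i]
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(data[trn], labels[trn])
        err = predict(model, data[val]) - labels[val]
        sse.append(np.sum(err ** 2))                  # squared prediction error
    return float(np.mean(sse))                        # c. average over the k folds
```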
Data cleansing
Data cleaning is applied to the training set. In the training score data acquired in the previous stage, a student may occasionally skip an ordinary assignment, leaving that score missing; such abnormal data would heavily bias model training, so missing data must be cleaned. Based on the training set data, if a column needed for training has a missing value in some row, that row of data is deleted; if a column is not used during training, the row is kept even when that column has missing values.
Data normalization
The full score of each ordinary assignment differs, so the scores are normalized to eliminate the influence of dimension on data prediction.
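The cleaning rule and the normalization can be combined in one small pandas sketch (the column-selection convention is an assumption of this example):

```python
import pandas as pd

def clean_and_normalize(df, used_columns):
    """Drop rows with gaps in training-relevant columns, then min-max
    normalize each used column to remove the influence of dimension."""
    # A row is deleted only when a column needed for training is missing;
    # gaps in unused columns do not cause deletion.
    df = df.dropna(subset=used_columns).copy()
    for col in used_columns:
        lo, hi = df[col].min(), df[col].max()
        df[col] = (df[col] - lo) / (hi - lo)   # full scores differ per assignment
    return df
```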
Construction of models using training sets
A PCANet-BiGRU network structure is constructed, the network model is built using the training set, and the weight and bias parameters of the model are determined by fitting.
To prevent overfitting, the Dropout technique is used during training; its feature-map perturbation enhances the generalization ability of the deep neural network.
Determining hyper-parameters of an optimized tuning model using a validation set
Since the learning rate strongly affects model performance, and a learning rate that is too large or too small makes the network oscillate and fail to converge to the optimal solution, an optimization method that updates the learning rate using gradient information during gradient descent is needed to improve training speed. Too few PCA layers extract incomplete features, while too many bring enormous computational complexity. Based on tuning against the validation set, the number of PCA convolution layers is set to 2, the number of Bi-GRU layers to 4, the number of neurons per Bi-GRU layer to 30, the optimization algorithm to Adam, and the learning rate to 0.1.
Evaluating models using test sets
Evaluation index
To measure the performance of the prediction model, the root-mean-square error (RMSE) and the mean absolute percentage error (MAPE) are used as evaluation indices:
$$\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2}$$
$$\mathrm{MAPE}=\frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i-\hat{y}_i}{y_i}\right|$$
where N is the number of samples, $y_i$ is a student's actual score and $\hat{y}_i$ the student's predicted score. MAPE reflects the overall deviation of the predicted scores and measures the prediction accuracy of the model; RMSE reflects the error between predicted and true values and measures the precision of the predictions.
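The two indices are straightforward to compute; a minimal sketch:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between actual and predicted scores."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))
```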
Analysis of Experimental results
The experiment predicts ordinary scores on the MOOC platform's C, C++ and Python courses from the students' ordinary scores. The results are shown in Table I.

Table I: score prediction results

| Metric   | C language | C++ language | Python language |
|----------|------------|--------------|-----------------|
| RMSE     | 0.0252     | 0.0256       | 0.0263          |
| MAPE (%) | 2.6139     | 2.6732       | 2.7141          |
As Table I shows, the RMSE values of the predictions are all below 0.03 and the MAPE values all below 2.8%, indicating that the model effectively predicts scores across different MOOC courses, that the predictions are accurate, and that the model can meet the prediction needs of actual MOOC scores. Compared with conventional score prediction using GM(1,1) and PSO-SVM models, this model is markedly better in prediction accuracy.
The PCANet-BiGRU model established in the method has obvious advantages in running time.
TABLE II: training time comparison (the table itself appears only as an image in the original document)
As Table II shows, PCANet-BiGRU has a significant advantage in running efficiency over CNN-GRU, and the advantage grows with the number of training rounds.
Score prediction helps teachers better understand students' online learning status and promotes effective improvement of student performance. The invention provides a deep neural network prediction model combining PCANet and Bi-GRU: PCANet extracts the hidden features of the data and reduces its size, while the Bi-GRU extracts the internal dynamics to realize score early warning. Compared with previous models, this model integrates data promptly, warns about student scores early, markedly improves the utilization of data and time, and ultimately promotes the informatization of education, giving it high practical value.
The above description is only one embodiment of the present invention, but the scope of the invention is not limited thereto; any modification or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein falls within the scope of the invention, which is therefore defined by the protection scope of the claims.

Claims (10)

1. A score prediction method based on PCANet-BiGRU, characterized by comprising the following steps:
step 1) utilizing basic statistical data and score data of students collected from an online learning platform;
step 2) dividing the original score data collected in step 1) into three independent parts: a training set, a validation set and a test set, then cleaning and normalizing the training set and constructing a data matrix, the data cleaning being as follows: based on the training set data acquired in the previous stage, if a column of data needed for training has a missing value in some row, that row of data is deleted; if a column is not used during training, the row is kept even when that column has missing values;
step 3) inputting the matrix data into the principal component analysis network PCANet to extract features of the score data, the PCANet consisting of PCA convolution layers, a nonlinear processing layer and a feature pooling layer;
step 4) inputting the PCANet-processed data into the bidirectional gated recurrent unit neural network (Bi-GRU) layer to predict the students' ordinary scores.
2. The PCANet-BiGRU-based score prediction method of claim 1, wherein the training set of step 2) is used to train the model: initial parameters are found by fitting, i.e. the weight and bias parameters of the model are determined; the validation set is used to tune hyperparameters such as the network structure and to control model complexity; the test set evaluates how well the finally selected model performs; the common split ratio of training, validation and test sets is 6:2:2; when data are scarce, 20% of the data are generally held out at random as the test set and a cross-validation algorithm is applied to the remainder; the cross-validation algorithm proceeds as follows:
a. randomly divide the training data into k equal parts;
b. in turn select k-1 parts for training and the remaining part for validation, and compute the sum of squared prediction errors;
c. finally average the k sums of squared errors as the basis for selecting the optimal model structure.
3. The PCANet-BiGRU-based score prediction method of claim 1, wherein, on the basis of the gated recurrent unit GRU structure, the bidirectional gated recurrent unit neural network Bi-GRU of step 4) adds both forward and backward propagation in its hidden layer, capturing the long-term dependencies of learning scores at different periods through forward and backward bidirectional operation to obtain more accurate score prediction.
4. The PCANet-BiGRU-based score prediction method of claim 1, wherein, in the PCA convolution layer, for each datum j of the input layer l, a window the size of the convolution kernel $P_j$ is sampled around it; the kernel is then slid and all sample blocks are concatenated as a representation $X_i=[x_{i,1},x_{i,2},\ldots,x_{i,n}]$ from which the mean is removed; performing this operation on the N data sets yields a new feature matrix X, on which principal component analysis (PCA) is then performed; PCA is a common method for data analysis and modeling that retains the most important features of high-dimensional data, removes noise and unimportant features, and reduces dimensionality, greatly lowering the cost and time of data processing; the specific algorithm steps are as follows:
a. recording a matrix X with n rows and m columns;
b. normalizing each row of X;
c. compute the covariance matrix C of X:
$$C = \frac{1}{m}\, X X^{\mathsf{T}}$$
d. compute the eigenvalues E and eigenvectors D of C:
[E, D] = eig(C)
where eig denotes the function returning the eigenvalues and eigenvectors;
e. sort the eigenvectors in D by the magnitude of their corresponding eigenvalues and select the first k columns to form a new matrix, which contains the eigenvectors of the dimension-reduced data;
f. take the k groups of eigenvectors as PCA filters, use each as a convolution kernel K, and convolve the data set with them to complete the feature-extraction convolution.
5. The PCANet-BiGRU-based score prediction method of claim 4, wherein the nonlinear processing layer is used to enhance the expressive power of the features, specifically: the data convolved by the two PCA layers are processed nonlinearly, each convolution result being binarized with the Heaviside step function
$$H(x)=\begin{cases}1, & x>0\\ 0, & x\le 0\end{cases}$$
and the binarized results are then weighted to obtain the integer map of the i-th value on the l-th layer's output features:
$$T_i^{\,l}=\sum_{l'=1}^{L_2} 2^{\,l'-1}\, H\!\left(X_i * K_{l'}\right)$$
where $*$ denotes convolution and $K_{l'}$ is the l'-th second-stage PCA kernel.
6. The PCANet-BiGRU-based score prediction method of claim 5, wherein the feature pooling layer performs PCANet's feature pooling using local histograms: the integer map $T_i^{\,l}$ is divided into local blocks, the histogram of each block is computed and vectorized, denoted $\mathrm{Bhist}(T_i^{\,l})$, and the vectors generated by the k integer maps are concatenated into the feature vector $f_i=\left[\mathrm{Bhist}(T_i^{1}),\ldots,\mathrm{Bhist}(T_i^{k})\right]^{\mathsf{T}}$.
7. The PCANet-BiGRU-based score prediction method of claim 5, wherein the GRU in the Bi-GRU neural network has two gate functions, a "reset gate" and an "update gate"; the reset gate $r_t$ controls how strongly the state $h_{t-1}$ of the previous period influences the candidate state $\tilde{h}_t$, and the update gate $z_t$ determines how much of the information in $h_{t-1}$ is carried into $h_t$; the GRU neural network unit is updated as
$$\begin{aligned}
r_t &= f\left(W_{ir}\, i_t + W_{hr}\, h_{t-1} + b_r\right)\\
z_t &= f\left(W_{iz}\, i_t + W_{hz}\, h_{t-1} + b_z\right)\\
\tilde{h}_t &= \tanh\!\left(W_{ih}\, i_t + W_{hh}\left(r_t \odot h_{t-1}\right) + b_h\right)\\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t\\
y_t &= W_o\, h_t
\end{aligned}$$
where tanh is the hyperbolic tangent function, $\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$, whose output always lies in the interval (-1, 1); f is the sigmoid function, whose output always lies in the interval (0, 1), expressing the importance of information and helping decide whether data are updated or discarded; $i_t$ is the input at time t and $y_t$ the output at time t; $W_{ir}$, $W_{iz}$, $W_{ih}$ are the weight matrices from the input to the reset gate, the update gate and the candidate state, respectively; $W_{hr}$, $W_{hz}$, $W_{hh}$ are the weight matrices from the previous state to the reset gate, the update gate and the candidate state, and $W_o$ is the output weight matrix; $b_r$, $b_z$, $b_h$ are the biases of the reset gate, the update gate and the candidate state.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the program is executed by the processor.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
10. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of any of claims 1 to 7.
CN202110800902.3A 2021-07-15 2021-07-15 Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment Pending CN113962424A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110800902.3A CN113962424A (en) 2021-07-15 2021-07-15 Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110800902.3A CN113962424A (en) 2021-07-15 2021-07-15 Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN113962424A true CN113962424A (en) 2022-01-21

Family

ID=79460379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110800902.3A Pending CN113962424A (en) 2021-07-15 2021-07-15 Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN113962424A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115656840A (en) * 2022-12-27 2023-01-31 武汉工程大学 Method, device, system and storage medium for predicting battery charging remaining time



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination