CN113962424A - Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment - Google Patents
- Publication number
- CN113962424A (application number CN202110800902.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- pcanet
- bigru
- training
- gru
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Abstract
The invention provides a PCANet-BiGRU-based score prediction method. Students' basic statistical data and score data are collected from an online learning platform; the raw score data are divided into three independent parts, a training set, a verification set and a test set, and the training set is cleaned and preprocessed to construct a data matrix; the matrix data are input into a principal component analysis network (PCANet) to extract features of the score data; the PCANet-processed data are then input into a bidirectional gated recurrent unit (Bi-GRU) neural network layer to predict the students' coursework scores.
Description
Technical Field
The invention relates to a score prediction method, in particular to a PCANet-BiGRU-based student score prediction method, and belongs to the field of educational technology.
Background
With the wide adoption of online teaching platforms such as the MOOCs of Chinese universities and Rain Classroom, education has become more efficient, convenient, autonomous and humanized. The key to ensuring and improving the quality of distance education is to strengthen management of the whole process of students' online learning. Online learning is a long-term undertaking: the effort and engagement of everyday study largely determine the quality of the learning outcome.
However, owing to data sparsity and methodological limitations, quantitative, scientific study of the learning process, learning interventions and learning outcomes remains insufficient.
The problems in the prior art are as follows:
at present, most score prediction approaches fall into three schools: statistical, traditional machine learning, and deep learning. Statistical methods are computationally cheap but usually cannot reach high accuracy, while machine learning and deep learning achieve high accuracy at high computational cost.
Disclosure of Invention
The invention aims to provide a PCANet-BiGRU-based score prediction method that addresses both problems: the low computational cost but limited accuracy of statistical methods, and the high accuracy but excessive computational complexity of machine learning and deep learning methods.
The purpose of the invention is realized as follows: a PCANet-BiGRU-based score prediction method comprises the following steps:
step 1) collecting students' basic statistical data and score data from an online learning platform, specifically: acquiring the basic statistical data (e.g. student number, course name and academic year) and score data (e.g. number of assignments set, number completed, and coursework scores) of the corresponding students from a specific online learning platform (e.g. the MOOCs of Chinese universities);
step 2) dividing the raw score data collected in step 1) into three independent parts, a training set, a verification set and a test set, performing data cleaning and normalization on the training set (because the full mark of each piece of coursework differs, normalization is needed to eliminate the influence of scale on data prediction), and constructing a data matrix. The data cleaning is specifically: based on the training-set data acquired in the previous stage, if a missing value exists in a column that is used in training, the whole row of data is deleted; if a column is not used in training, the row is retained even if that column has missing values (a student may miss an assignment, leaving a missing coursework score; such abnormal data would bias model training heavily, so missing data must be cleaned);
step 3) inputting the matrix data into the principal component analysis network PCANet to extract features of the score data, wherein PCANet consists of a PCA convolution layer, a nonlinear processing layer and a feature pooling layer;
step 4) inputting the PCANet-processed data into a bidirectional gated recurrent unit (Bi-GRU) neural network layer to predict the students' coursework scores.
As a further limitation of the present invention, the training set in step 2) is used to train the model: initial parameters are found by fitting, i.e. the weight and bias parameters of the model are determined. The verification set is used to tune hyperparameters such as the network structure and to control model complexity, and the test set checks how well the finally selected model performs. A common split ratio of training, verification and test sets is 6:2:2; with a small data set, 20% of the data are typically drawn at random as the test set and a cross-validation algorithm is applied to the rest. The cross-validation algorithm comprises the following steps:
a. randomly divide the training data into k equal parts;
b. in turn, select k−1 parts for training, validate on the remaining part, and compute the sum of squared prediction errors;
c. finally, average the k sums of squared errors as the basis for selecting the optimal model structure.
As a further limitation of the present invention, the bidirectional gated recurrent unit network Bi-GRU in step 4) extends the structure of the gated recurrent unit GRU with simultaneous forward and backward passes in its hidden layer; this bidirectional operation captures the long-term dependence of coursework scores across different periods, yielding more accurate score prediction.
As a further limitation of the invention, in the PCA convolution layer, for each data element j of input layer l, a convolution kernel P_j samples a surrounding window; the kernel is then slid and all sampled blocks are concatenated as the sample representation X_i = [x_{i,1}, x_{i,2}, ..., x_{i,n}], from which the mean is removed. Performing this operation on the N data sets yields a new feature matrix X, on which principal component analysis (PCA) is then performed. PCA is a common method for data analysis and modeling: it retains the most important characteristics of high-dimensional data, removes noise and unimportant features, and reduces dimensionality, greatly lowering processing cost and raising speed. The specific algorithm steps are as follows:
a. record the matrix X with n rows and m columns;
b. zero-center (normalize) each row of X;
c. compute the covariance matrix C of X;
d. solve the eigenvalues E and eigenvectors D of C:
[E, D] = eig(C)
where eig denotes the eigen-decomposition returning the eigenvalues and eigenvectors;
e. order the eigenvectors D by the magnitude of their corresponding eigenvalues and take the first k columns to form a new matrix, which contains the feature vectors of the dimension-reduced data;
f. take these k feature vectors as PCA filters, use them as convolution kernels, and convolve the data set with them to complete the feature-extracting convolution.
As a further limitation of the present invention, the nonlinear part is used to enhance the expressive power of the data features, specifically by nonlinear processing of the data after the two PCA convolution layers: each convolution result is binarized with the Heaviside step function H(x), where H(x) = 1 if x > 0 and H(x) = 0 otherwise; the binarized results are then weighted by powers of two to obtain, for the i-th sample, the integer map of the l-th layer output features T_i^l = Σ_{j=1}^{L} 2^{j−1} H(O_i^j), where O_i^j denotes the j-th convolution output and L the number of filters.
As a further limitation of the present invention, the feature pooling layer uses local histograms for PCANet's feature pooling (since the nonlinear processing layer outputs values in the range [0, 2^L − 1], the max pooling and mean pooling common in CNNs do not apply to PCANet). The integer map T_i^l is divided into local blocks; each block's histogram is computed and vectorized, denoted Bhist(T_i^l), and the vectors generated by the k integer maps are concatenated into the feature vector f_i = [Bhist(T_i^1), ..., Bhist(T_i^k)]^T.
As a further limitation of the present invention, the GRU in the Bi-GRU neural network has two gate functions, a "reset gate" and an "update gate". The reset gate r_t controls how much the previous state h_{t−1} influences the candidate state h̃_t, and the update gate z_t determines how much of the information in h_{t−1} is carried into h_t. The GRU unit is updated as:
r_t = σ(W_ir i_t + W_hr h_{t−1} + b_r)
z_t = σ(W_iz i_t + W_hz h_{t−1} + b_z)
h̃_t = tanh(W_ih i_t + W_hh (r_t ⊙ h_{t−1}) + b_h)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t
y_t = W_o h_t
where tanh is the hyperbolic tangent, tanh(x) = (e^x − e^{−x})/(e^x + e^{−x}), whose output always lies in (−1, 1); σ is the sigmoid function, whose output always lies in (0, 1) and thus measures the importance of information, helping decide whether data are updated or discarded; i_t is the input and y_t the output at time t; W_ir, W_iz and W_ih are the weight matrices from the input to the reset gate, the update gate and the candidate state; W_hr, W_hz, W_hh and W_o are the weight matrices from the previous state to the reset gate, the update gate, the candidate state and the output; and b_r, b_z and b_h are the biases of the reset gate, the update gate and the candidate state.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
A processor for running a program, wherein the program performs the above method when running.
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
the PCANet-BiGRU result prediction model provided by the invention considers the accurate prediction results of low consumption and deep learning of the statistical PCA classification model, so that the student results are predicted, students possibly failing to meet the examination are early warned in time, the students who learn later can timely check for missing and fill in the deficiency, the learning results of the students are improved, and the PCANet-BiGRU result prediction model has important teaching guidance and practice significance.
Drawings
FIG. 1 is a network structure of a PCANet and Bi-GRU based achievement early warning model.
Fig. 2 shows a network structure of PCANet.
FIG. 3 is a schematic diagram of the internal structure of the GRU.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the attached drawings:
learning is a long-term sequence behavior, in which there is a good or bad state, and different sequence performances lead to different outcome outcomes; aiming at the serial defects of overlong parameter training time, special parameter adjusting skills and the like of the structure, a mixed prediction model based on a principal component analysis network (PCANet) and a bidirectional gated cyclic unit (Bi-GRU) neural network shown in figure 1 is provided and applied to courses of C language, C + +, Python and the like on an online learning platform; on the basis of the structure of the GRU, the Bi-GRU captures the long-term dependence of the ordinary learning achievement in different periods through the forward and backward bidirectional operation in the hidden layer of the GRU, so as to obtain more accurate achievement prediction; the experimental result shows that the PCANet-BiGRU result prediction model effectively predicts the results and improves the accuracy and efficiency of the result prediction.
The PCANet-BiGRU score prediction model comprises two components: PCANet and BiGRU.
As shown in fig. 2, PCANet largely follows the structure and concepts of a CNN, but its convolution kernels are principal component analysis (PCA) kernels, its nonlinear layer is a binary hashing step, and its features are generated by histogram statistics. PCANet consists of a PCA convolution layer, a nonlinear processing layer and a feature pooling layer, and is used to extract the spatial features of the score data.
PCA convolutional layer
For each data element of the input layer, a convolution kernel P_j samples a surrounding window; the kernel is then slid and all sampled blocks are concatenated as the sample representation X_i = [x_{i,1}, x_{i,2}, ..., x_{i,n}], from which the mean is removed. Performing this operation on the N data sets yields a new feature matrix X, on which principal component analysis (PCA) is then performed. PCA is a common method for data analysis and modeling: it retains the most important characteristics of high-dimensional data, removes noise and unimportant features, and reduces dimensionality, greatly lowering processing cost and raising speed. The specific algorithm steps are as follows:
a. record the matrix X with n rows and m columns;
b. zero-center (normalize) each row of X;
c. compute the covariance matrix C of X;
d. solve the eigenvalues E and eigenvectors D of C: [E, D] = eig(C), where eig denotes the eigen-decomposition returning the eigenvalues and eigenvectors;
e. order the eigenvectors D by the magnitude of their corresponding eigenvalues and take the first k columns to form a new matrix, which contains the feature vectors of the dimension-reduced data;
f. take these k feature vectors as PCA filters, use them as convolution kernels, and convolve the data set with them to complete the feature-extracting convolution.
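Steps a–f can be sketched in NumPy as follows; this is a minimal illustration rather than the patent's implementation, and the function names, the 1-D score series and the patch sizes are illustrative assumptions:

```python
import numpy as np

def pca_filters(X, k):
    """Steps a-e: given an n-by-m patch matrix X, return the top-k
    principal directions, used as the PCA convolution kernels."""
    Xc = X - X.mean(axis=1, keepdims=True)      # b. center each row
    C = Xc @ Xc.T / Xc.shape[1]                 # c. covariance matrix of X
    E, D = np.linalg.eigh(C)                    # d. eigenvalues E, eigenvectors D
    order = np.argsort(E)[::-1]                 # e. sort by eigenvalue, descending
    return D[:, order[:k]]                      # first k columns form the filter bank

def pca_convolve(series, filters, width):
    """Step f: slide a window of `width` over a 1-D score series and
    project each patch onto the k PCA filters (a 'valid' convolution)."""
    patches = np.stack([series[i:i + width]
                        for i in range(len(series) - width + 1)])
    return patches @ filters                    # (n_windows, k) feature maps

# toy data: 50 random 3-dim patches to learn 2 filters of width 3
rng = np.random.default_rng(0)
X = rng.random((3, 50))
W = pca_filters(X, k=2)
feats = pca_convolve(rng.random(8), W, width=3)
print(feats.shape)                              # (6, 2)
```

Note that `np.linalg.eigh` returns eigenvalues in ascending order for a symmetric matrix, hence the explicit descending re-sort in step e.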
PCANet uses two PCA convolution layers. With only one convolution layer the feature extraction effect is unsatisfactory, while with three or more layers the added dimensionality makes the computation grow dramatically.
Non-linear processing layer
To enhance the expressive power of the data features, the data after the two PCA convolution layers undergo nonlinear processing: each convolution result is binarized using the Heaviside step function H(x), where H(x) = 1 if the PCA-convolved value is greater than 0 and H(x) = 0 otherwise. The binarized results are then weighted by powers of two to obtain, for the i-th sample, the integer map of the l-th layer output features T_i^l = Σ_{j=1}^{L} 2^{j−1} H(O_i^j), where O_i^j is the j-th convolution output and L the number of filters.
Feature pooling layer
Because the nonlinear processing layer outputs values in the range [0, 2^L − 1], PCANet does not use the max pooling and mean pooling common in CNNs, but instead performs its feature pooling with local histograms. The integer map T_i^l is divided into local blocks; each block's histogram is computed and vectorized, denoted Bhist(T_i^l). Concatenating the vectors generated by the k integer maps gives the feature vector f_i = [Bhist(T_i^1), ..., Bhist(T_i^k)]^T.
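The binarization, power-of-two weighting and block-histogram pooling can be sketched as below; the toy feature maps and block size are illustrative, not from the patent:

```python
import numpy as np

def heaviside(x):
    # H(x) = 1 if x > 0 else 0: binarizes each PCA convolution output
    return (x > 0).astype(int)

def integer_map(feature_maps):
    """Weight the L binarized maps by powers of two, giving one
    'integer map' whose values lie in [0, 2^L - 1]."""
    L = len(feature_maps)
    return sum(2 ** (L - 1 - i) * heaviside(fm)
               for i, fm in enumerate(feature_maps))

def block_histograms(T, L, block):
    """Split the integer map into blocks, histogram each block over
    the 2^L possible values, and concatenate (the Bhist operation)."""
    hists = [np.bincount(T[s:s + block], minlength=2 ** L)
             for s in range(0, len(T), block)]
    return np.concatenate(hists)

# two toy convolution outputs (L = 2 filters) over 4 positions
maps = [np.array([0.3, -0.2, 0.7, -0.1]), np.array([-0.5, 0.4, 0.1, 0.2])]
T = integer_map(maps)                 # values in 0..3
f = block_histograms(T, L=2, block=2) # pooled feature vector
print(T, f)
```

Here `T` becomes `[2, 1, 3, 1]` (binary codes 10, 01, 11, 01 read as integers), and each block of two positions contributes a 4-bin histogram.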
The GRU neural network in the Bi-GRU network is a type of recurrent neural network (RNN) and is well suited to time-series data. However, when an RNN faces long-range dependence it struggles to carry early information forward and may miss important information; RNNs also suffer from vanishing and exploding gradients. The LSTM neural network introduces an input gate, a forget gate and an output gate to regulate the information flow and mitigate these problems.
The GRU is a variant of the long short-term memory network LSTM: the input gate and the forget gate of the LSTM are merged into an update gate. The GRU retains the LSTM's ability to memorize over long and short horizons, yet has fewer parameters and a simpler structure, is easier to compute, and is less prone to problems such as overfitting.
As shown in fig. 3, each GRU unit has two gate functions, a "reset gate" and an "update gate". The reset gate r_t controls how much the previous state h_{t−1} influences the candidate state h̃_t, and the update gate z_t determines how much of the information in h_{t−1} is carried into h_t:
r_t = σ(W_ir i_t + W_hr h_{t−1} + b_r)
z_t = σ(W_iz i_t + W_hz h_{t−1} + b_z)
h̃_t = tanh(W_ih i_t + W_hh (r_t ⊙ h_{t−1}) + b_h)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t
y_t = W_o h_t
where tanh is the hyperbolic tangent, tanh(x) = (e^x − e^{−x})/(e^x + e^{−x}), whose output always lies in (−1, 1); σ is the sigmoid function, whose output always lies in (0, 1) and thus measures the importance of information, helping decide whether data are updated or discarded; i_t is the input and y_t the output at time t; W_ir, W_iz and W_ih are the weight matrices from the input to the reset gate, the update gate and the candidate state; W_hr, W_hz, W_hh and W_o are the weight matrices from the previous state to the reset gate, the update gate, the candidate state and the output; and b_r, b_z and b_h are the biases of the reset gate, the update gate and the candidate state.
In the GRU structure above, information flows in one direction, from front to back. Score data, however, are influenced not only by earlier periods but also by later ones, so a bidirectional gated recurrent network, Bi-GRU, is used. On the basis of the GRU structure, the Bi-GRU adds forward and backward passes in its hidden layer; this bidirectional operation captures the long-term dependence of coursework scores across different periods, yielding more accurate score prediction.
PCANet-BiGRU
PCANet has the properties of local perception and weight sharing, so it is used to extract the spatial, position-related features of the score data. Combining PCANet with BiGRU mines both the temporal and the spatial characteristics of the score data, through PCANet's spatial perception and BiGRU's bidirectional memory, to realize score prediction. The PCANet-BiGRU prediction process is as follows:
Step 1, cleaning and normalizing the score data and constructing the data matrix;
Step 2, inputting the score data into PCANet to extract the spatial features of the data;
Step 3, inputting the PCANet-processed data into the Bi-GRU layer to predict the coursework scores; the model structure is shown in the figure.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
A processor for running a program, wherein the program performs the above method when running.
An example of a specific application of the present invention is given below:
data acquisition
The experimental data of the invention come from the per-course records stored in a MOOC online learning system. The courses were offered in the system in 2019, with student score information stored as Excel tables. The evaluation standard in the data set is based on each piece of coursework, essentially one homework submission per week. The system preprocesses the evaluated coursework scores into vectors according to its internal standard, to serve as the input of the neural network.
Data set partitioning
Randomly extracting 20% of the data set as a test set, and then adopting a cross validation algorithm on the rest data:
a. randomly divide the training data into k equal parts;
b. in turn, select k−1 parts for training, validate on the remaining part, and compute the sum of squared prediction errors;
c. finally, average the k sums of squared errors as the basis for selecting the optimal model structure.
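The cross-validation steps a–c above can be sketched as follows; the mean-predictor toy model, the function names and the fixed random seed are illustrative assumptions:

```python
import numpy as np

def kfold_sse(X, y, k, fit, predict):
    """Steps a-c: split the training data into k folds, train on k-1,
    validate on the held-out fold, and average the k sums of squared
    prediction errors."""
    idx = np.random.default_rng(0).permutation(len(X))   # a. random split
    folds = np.array_split(idx, k)
    sses = []
    for i in range(k):                                   # b. rotate the held-out fold
        val = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train], y[train])
        err = y[val] - predict(model, X[val])
        sses.append(np.sum(err ** 2))
    return np.mean(sses)                                 # c. average SSE

# toy check with a model that simply predicts the training mean
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2 * X[:, 0]
score = kfold_sse(X, y, k=5,
                  fit=lambda Xt, yt: yt.mean(),
                  predict=lambda m, Xv: np.full(len(Xv), m))
print(score)
```

The structure with the smallest averaged SSE would then be retained.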
Data cleansing
The training-set data are cleaned as follows. In the coursework data acquired in the previous stage, a student may have missed an assignment, leaving a missing score; such abnormal data would bias model training heavily, so missing data must be cleaned. Based on the training-set data acquired in the previous stage, if a missing value exists in a column that is used in training, the whole row of data is deleted; if a column is not used in training, the row is retained even if that column has missing values.
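This row-dropping rule can be sketched in plain Python; the column names and the use of `None` for a missing value are illustrative assumptions:

```python
def clean_rows(rows, used_cols):
    """Drop a row only when a column actually used in training has a
    missing value (None); missing values in unused columns are kept."""
    return [r for r in rows if all(r.get(c) is not None for c in used_cols)]

rows = [
    {"hw1": 80, "hw2": 75, "note": None},   # missing value in an unused column: keep
    {"hw1": None, "hw2": 90, "note": "x"},  # missing score in a used column: drop
]
cleaned = clean_rows(rows, used_cols=["hw1", "hw2"])
print(len(cleaned))   # 1
```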
Data normalization
Because the full mark of each piece of coursework differs, the scores are normalized to eliminate the influence of scale on data prediction.
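One way to realize this, sketched below under the assumption that each assignment's full mark is known, is to divide every column by its full mark so all scores land in [0, 1] (min-max normalization would be an alternative):

```python
import numpy as np

def normalize_scores(scores, full_marks):
    """Scale each assignment column by its full mark so every score
    lies in [0, 1], removing the effect of differing totals."""
    return scores / np.asarray(full_marks, dtype=float)

raw = np.array([[8.0, 45.0],    # two students, two assignments
                [10.0, 30.0]])
norm = normalize_scores(raw, full_marks=[10, 50])
print(norm)   # [[0.8 0.9] [1.  0.6]]
```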
Construction of models using training sets
Constructing a PCANet-BiGRU network structure, constructing a network model by using a training set, and determining the weight and the bias parameters of the model through fitting;
to prevent overfitting, DropOut technique is used in the training process to enhance the generalization ability of the deep neural network with feature map perturbation.
Determining hyper-parameters of an optimized tuning model using a validation set
The learning rate significantly affects model performance: too large or too small a value causes the network to oscillate and fail to converge to an optimal solution, so an optimization method that adapts the learning rate from gradient information is needed to speed up model training. Too few PCA layers extract incomplete features, while too many bring enormous computational complexity. Based on tuning against the verification set, the number of PCA convolution layers is set to 2, the number of Bi-GRU layers to 4, the number of neurons per Bi-GRU layer to 30, the optimization algorithm to Adam, and the learning rate to 0.1.
Evaluating models using test sets
Evaluation index
In order to measure the performance of the prediction model, the root mean square error (RMSE) and the mean absolute percentage error (MAPE) are used as evaluation indexes:

RMSE = sqrt( (1/N) · Σ_i (y_i − ŷ_i)² )

MAPE = (100%/N) · Σ_i |(y_i − ŷ_i) / y_i|

wherein N represents the number of samples, y_i represents the actual score of a student, and ŷ_i represents the student's predicted score. MAPE reflects the overall deviation of the predicted values and measures the prediction accuracy of the model; RMSE reflects the error between predicted and true values and measures the precision of the prediction.
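The two indexes can be computed directly from their standard definitions; a small Python sketch with toy values:

```python
import math

def rmse(actual, predicted):
    """Root mean square error over N samples."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

print(rmse([2.0, 4.0], [1.0, 5.0]))  # sqrt((1 + 1) / 2) = 1.0
print(mape([2.0, 4.0], [1.0, 5.0]))  # 100 * (0.5 + 0.25) / 2 = 37.5
```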
Analysis of Experimental results
The experiment predicts everyday scores from the everyday scores of the C language, C++ and Python courses on a MOOC platform. The results are shown in the following table.
TABLE I: Achievement prediction results

Subject of normal score | C language | C++ language | Python language |
---|---|---|---|
RMSE | 0.0252 | 0.0256 | 0.0263 |
MAPE/% | 2.6139 | 2.6732 | 2.7141 |
As the table shows, the RMSE values of the predictions are all below 0.03 and the MAPE values are all below 2.8%, which indicates that the model can effectively predict the scores of different MOOC courses, that the predictions are accurate, and that the model meets the demands of practical MOOC score prediction. Compared with conventional score-prediction models based on GM(1,1) and PSO-SVM, this model is markedly better in prediction accuracy.
The PCANet-BiGRU model established in the method has obvious advantages in running time.
TABLE II training time comparison table
As the above table shows, PCANet-BiGRU has a significant advantage in operating efficiency over CNN-GRU, and the advantage grows with the number of training runs.
Score prediction lets teachers better understand students' online learning and promotes effective improvement of their results. The invention provides a deep neural network prediction model combining PCANet and Bi-GRU: PCANet extracts the hidden features of the data and reduces its size, while Bi-GRU extracts the internal dynamics to provide early warning of scores. Compared with previous models, this model integrates data in time, gives early warning of students' scores, and markedly improves the utilization of data and time, ultimately promoting the informatization of education; it has high practical value.
The above description is only one embodiment of the present invention, but the scope of the present invention is not limited thereto; any modification or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein falls within the scope of the present invention, which should therefore be defined by the protection scope of the claims.
Claims (10)
1. A performance prediction method based on PCANet-BiGRU, characterized by comprising the following steps:
step 1) collecting basic statistical data and score data of students from an online learning platform;
step 2) dividing the original score data collected in step 1) into three independent parts, a training set, a validation set and a test set, and performing data cleaning and normalization on the training set to construct a data matrix, wherein the data cleaning specifically comprises: based on the training set data acquired in the previous stage, if a row has a missing value in a column that is used during training, deleting the entire row; if the missing value lies in a column that is not used during training, retaining the row;
step 3) inputting the matrix data into the principal component analysis network PCANet and extracting features of the score data, wherein PCANet consists of a PCA convolution layer, a nonlinear processing layer and a feature pooling layer;
step 4) inputting the data processed by PCANet into the bidirectional gated recurrent unit neural network (Bi-GRU) layer and predicting the students' everyday scores.
2. The PCANet-BiGRU-based performance prediction method of claim 1, wherein the training set in step 2) is used to train the model: the model is built by fitting to find its initial parameters, i.e. to determine the weight and bias parameters of the model; the validation set is used to tune hyperparameters such as the network structure and to control model complexity; and the test set checks how well the finally selected model performs. The usual division ratio of training, validation and test set is 6:2:2; when the data set is small, 20% of the data is generally drawn at random as the test set and a cross-validation algorithm is applied to the remaining data; the cross-validation algorithm comprises the following specific steps:
a. randomly divide the training data into k equal parts;
b. in turn select k−1 parts for training and the remaining part for validation, and compute the sum of squared prediction errors;
c. finally average the k sums of squared prediction errors as the basis for selecting the optimal model structure.
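Steps a-c can be sketched as follows (the toy model, the training mean, and the residual error function are illustrative assumptions, not part of the claim):

```python
import random

def k_fold_cv(data, k, train_fn, error_fn):
    """Steps a-c above: random k-way split, train on k-1 folds,
    validate on the held-out fold, average the k sums of squared errors."""
    data = list(data)
    random.shuffle(data)                                  # a. random partition
    folds = [data[i::k] for i in range(k)]
    sse = []
    for i in range(k):
        held_out = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        model = train_fn(train)                           # b. fit on k-1 folds
        sse.append(sum(error_fn(model, x) ** 2 for x in held_out))
    return sum(sse) / k                                   # c. mean of k SSEs

# Toy usage: the "model" is the training mean, the error is the residual.
score = k_fold_cv([1.0, 2.0, 3.0, 4.0, 5.0], k=5,
                  train_fn=lambda xs: sum(xs) / len(xs),
                  error_fn=lambda m, x: x - m)
print(score)  # -> 3.125 (each point vs. the mean of the other four)
```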
3. The PCANet-BiGRU-based performance prediction method of claim 1, wherein the bidirectional gated recurrent unit neural network Bi-GRU of step 4) adds forward and backward propagation in the hidden layer on the basis of the gated recurrent unit GRU structure, and captures the long-term dependence of learning scores at different periods through the forward-backward bidirectional operation, obtaining more accurate score prediction.
4. The PCANet-BiGRU-based performance prediction method of claim 1, wherein in the PCA convolution layer, for each datum j of the input layer l, a window of size P_j around it is sampled; the convolution kernel then slides, all sampled blocks are concatenated as the representation X_i = [x_{i,1}, x_{i,2}, ..., x_{i,n}] of the sample, and the mean is removed; performing this operation on the N data sets yields a new feature matrix X; principal component analysis (PCA) is then performed on the matrix; PCA is a common method for data analysis and modeling that retains the most important features of high-dimensional data, removes noise and unimportant features, and reduces the dimension, greatly reducing the cost and time of data processing; the specific algorithm steps are as follows:
a. recording a matrix X with n rows and m columns;
b. normalizing each row of X;
c. compute the covariance matrix C of X, C = (1/m)·X·Xᵀ;
d. solve for the eigenvalues E and the eigenvectors D corresponding to C:
[E,D]=eig(C)
wherein eig is the eigendecomposition function that returns the eigenvalues E and the eigenvectors D;
e. arrange the eigenvectors in D in descending order of their corresponding eigenvalues and select the first k columns to form a new matrix, which is the feature matrix of the dimension-reduced data;
f. take the k groups of eigenvectors as the PCA filters, use them as the convolution kernel K, and convolve the data set with K to complete the feature-extraction convolution operation.
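Steps a-f can be sketched with NumPy (np.linalg.eigh is used in place of the eig of step d because the covariance matrix is symmetric; the data shape is illustrative):

```python
import numpy as np

def pca_filters(X, k):
    """Steps a-f above: zero-mean each row of X, compute the covariance
    matrix, eigendecompose it, sort eigenvectors by eigenvalue, and keep
    the leading k as the PCA convolution filters."""
    Xc = X - X.mean(axis=1, keepdims=True)       # b. normalize each row
    C = np.cov(Xc)                               # c. covariance matrix of X
    E, D = np.linalg.eigh(C)                     # d. eigenvalues E, vectors D
    order = np.argsort(E)[::-1]                  # e. descending eigenvalues
    return D[:, order[:k]]                       # f. k filters, one per column

X = np.random.default_rng(0).normal(size=(6, 40))
W = pca_filters(X, k=2)
print(W.shape)  # -> (6, 2)
```

The returned columns are orthonormal, as expected of eigenvectors of a symmetric matrix.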
5. The PCANet-BiGRU-based performance prediction method of claim 4, wherein the nonlinear part is used to enhance the feature expressiveness of the data, specifically: the data convolved by the two PCA layers are processed nonlinearly: each convolution result is binarized using the Heaviside step function H(x), which outputs 1 for positive input and 0 otherwise; the binarized results are then weighted by powers of two and summed to obtain the integer map T_i^l of the ith datum on the output features of the lth layer:

T_i^l = Σ_{j=1..k} 2^(j−1) · H(O_i^j)

wherein O_i^j denotes the jth convolution output of the ith datum.
6. The PCANet-BiGRU-based performance prediction method of claim 5, wherein the feature pooling layer performs the PCANet feature pooling operation using local histograms: each integer map T_i^l is divided into blocks, the histogram of each block is computed and vectorized, denoted Bhist(T_i^l), and the feature vector obtained by concatenating the vectors generated by the k integer maps is expressed as: f_i = [Bhist(T_i^1), ..., Bhist(T_i^k)]^T.
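A compact sketch of the nonlinear and pooling stages of claims 5 and 6: Heaviside binarization, power-of-two weighting into an integer map, and a single-block histogram (the toy inputs are illustrative).

```python
import numpy as np

def heaviside(x):
    """Heaviside step: 1 where x > 0, else 0 (the binarization of claim 5)."""
    return (x > 0).astype(int)

def integer_map(conv_outputs):
    """Weight the k binarized convolution outputs by powers of two and sum
    them into a single integer-valued map T_i."""
    return sum((2 ** j) * heaviside(o) for j, o in enumerate(conv_outputs))

def block_histogram(T, k):
    """Histogram over the 2**k possible integer values: one block of the
    local-histogram pooling of claim 6."""
    hist, _ = np.histogram(T, bins=np.arange(2 ** k + 1))
    return hist

outs = [np.array([[1.0, -1.0], [2.0, -2.0]]),
        np.array([[-1.0, 1.0], [1.0, -1.0]])]
T = integer_map(outs)
print(T.tolist())                  # -> [[1, 2], [3, 0]]
print(block_histogram(T, k=2).tolist())  # -> [1, 1, 1, 1]
```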
7. The PCANet-BiGRU-based performance prediction method of claim 5, wherein the GRU in the Bi-GRU neural network has two gate functions, a "reset gate" and an "update gate"; the reset gate r_t controls the degree to which the previous state h_{t-1} influences the candidate state, and the update gate z_t determines how much of the information in h_{t-1} is carried into h_t; the GRU neural network unit is updated as

r_t = σ(W_ir·i_t + W_hr·h_{t-1} + b_r)
z_t = σ(W_iz·i_t + W_hz·h_{t-1} + b_z)
h~_t = tanh(W_ih·i_t + W_hh·(r_t ⊙ h_{t-1}) + b_h)
h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h~_t
y_t = W_o·h_t

wherein tanh is the hyperbolic tangent function, tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)), whose output always lies in the interval (−1, 1); σ is the sigmoid function, whose output always lies in the interval (0, 1), so it quantifies the importance of information and helps decide whether data are kept or discarded; i_t is the input at time t and y_t is the output at time t; W_ir, W_iz, W_ih are the weight matrices from the input to the reset gate, the update gate and the candidate state, respectively; W_hr, W_hz, W_hh are the weight matrices from the previous state to the reset gate, the update gate and the candidate state, and W_o is the weight matrix from the state to the output; b_r, b_z, b_h are the biases of the reset gate, the update gate and the candidate state, respectively.
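A single GRU step following the gate equations of this claim can be sketched in NumPy (the parameter dictionary and the random initialization are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(i_t, h_prev, P):
    """One GRU update per the equations above; P holds weights W_* and biases b_*."""
    r = sigmoid(P["W_ir"] @ i_t + P["W_hr"] @ h_prev + P["b_r"])   # reset gate
    z = sigmoid(P["W_iz"] @ i_t + P["W_hz"] @ h_prev + P["b_z"])   # update gate
    h_cand = np.tanh(P["W_ih"] @ i_t + P["W_hh"] @ (r * h_prev) + P["b_h"])
    return (1 - z) * h_prev + z * h_cand                           # new state h_t

rng = np.random.default_rng(1)
d_in, d_h = 3, 4
P = {name: rng.normal(size=(d_h, d_in)) for name in ("W_ir", "W_iz", "W_ih")}
P.update({name: rng.normal(size=(d_h, d_h)) for name in ("W_hr", "W_hz", "W_hh")})
P.update({name: np.zeros(d_h) for name in ("b_r", "b_z", "b_h")})
h = gru_step(rng.normal(size=d_in), np.zeros(d_h), P)  # state vector of size d_h
```

Starting from a zero state, the new state is bounded by the tanh candidate, so every component stays inside (−1, 1).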
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the program is executed by the processor.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
10. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110800902.3A CN113962424A (en) | 2021-07-15 | 2021-07-15 | Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113962424A true CN113962424A (en) | 2022-01-21 |
Family
ID=79460379
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110800902.3A Pending CN113962424A (en) | 2021-07-15 | 2021-07-15 | Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113962424A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115656840A (en) * | 2022-12-27 | 2023-01-31 | 武汉工程大学 | Method, device, system and storage medium for predicting battery charging remaining time |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109492822B (en) | Air pollutant concentration time-space domain correlation prediction method | |
CN111915059B (en) | Attention mechanism-based Seq2Seq berth occupancy prediction method | |
CN111563706A (en) | Multivariable logistics freight volume prediction method based on LSTM network | |
CN110648014B (en) | Regional wind power prediction method and system based on space-time quantile regression | |
CN111626785A (en) | CNN-LSTM network fund price prediction method based on attention combination | |
CN111310965A (en) | Aircraft track prediction method based on LSTM network | |
CN112465199A (en) | Airspace situation evaluation system | |
CN116187835A (en) | Data-driven-based method and system for estimating theoretical line loss interval of transformer area | |
CN113205698A (en) | Navigation reminding method based on IGWO-LSTM short-time traffic flow prediction | |
CN114580545A (en) | Wind turbine generator gearbox fault early warning method based on fusion model | |
CN115310782A (en) | Power consumer demand response potential evaluation method and device based on neural turing machine | |
CN114548591A (en) | Time sequence data prediction method and system based on hybrid deep learning model and Stacking | |
CN116542701A (en) | Carbon price prediction method and system based on CNN-LSTM combination model | |
CN114973665A (en) | Short-term traffic flow prediction method combining data decomposition and deep learning | |
CN115096357A (en) | Indoor environment quality prediction method based on CEEMDAN-PCA-LSTM | |
CN113962424A (en) | Performance prediction method based on PCANet-BiGRU, processor, readable storage medium and computer equipment | |
CN114580262A (en) | Lithium ion battery health state estimation method | |
CN107704944B (en) | Construction method of stock market fluctuation interval prediction model based on information theory learning | |
CN114596726A (en) | Parking position prediction method based on interpretable space-time attention mechanism | |
Xu et al. | Residual autoencoder-LSTM for city region vehicle emission pollution prediction | |
CN116579408A (en) | Model pruning method and system based on redundancy of model structure | |
CN109978138A (en) | The structural reliability methods of sampling based on deeply study | |
Siraj et al. | Data mining and neural networks: the impact of data representation | |
Zhong et al. | Handwritten digit recognition based on corner detection and convolutional neural network | |
CN112651168B (en) | Construction land area prediction method based on improved neural network algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||