CN112801362A - Academic early warning method based on artificial neural network and LSTM network - Google Patents
- Publication number: CN112801362A (application CN202110101091.8A)
- Authority: CN (China)
- Prior art keywords: information, student, weight, early warning, vector
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
- G06N3/045 — Combinations of networks
- G06N3/049 — Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06Q50/205 — Education administration or guidance
Abstract
The invention discloses an academic early warning method based on an artificial neural network and an LSTM network, comprising the following steps: 1) processing missing data based on the RBF kernel; 2) adaptive feature extraction based on the multi-dimensional normal distribution; 3) artificial neural network training based on a refined network; 4) training an LSTM network with an adaptive excitation function; 5) integration with a software platform. The method has the advantages of good universality, a low false detection rate and high prediction accuracy.
Description
Technical Field
The invention relates to the field of big data processing and machine learning, in particular to an academic early warning method based on an artificial neural network and an LSTM network.
Background
Academic early warning predicts the future performance trend of students and judges whether they can graduate, and has very good application value in colleges and universities. Some colleges and universities already use academic early warning systems to help improve student graduation rates and reduce failure rates. As college enrollment grows, there are too many students for manual tracking of each student's academic progress to be practical; therefore, training massive data with modern deep learning algorithms to form an accurate academic early warning system gives students advance warning, lets them become aware in advance of the danger of not completing their studies normally, and effectively reduces cases of students failing to meet academic requirements.
In traditional machine-learning-based academic early warning methods, algorithms such as the support vector machine (SVM) and the random forest (RF) are applied to the academic early warning system; they cannot use temporal features such as college students' scores and library reading time, cannot effectively connect earlier and later timelines, cannot properly handle the influence of an earlier course on a later one, and cannot handle missing data well. In recent years, the university of oceanic labor has proposed an "academic early warning" mechanism, introducing the concept of "academic early warning" and issuing warnings of different degrees to students with a low degree of academic completion.
Sepp Hochreiter et al. proposed the long short-term memory (LSTM) algorithm, solving the recurrent neural network's (RNN) inability to forget information and to memorize information over long periods; Tao et al. proposed a KFCM-based improved SVM algorithm, applied it to the field of academic early warning, improved the traditional machine learning algorithm, and obtained a better prediction effect; Ren et al. proposed an algorithm based on the FT_BP neural network, using deep learning to improve the conventional BP network and applying it to the field of academic early warning.
Although these studies have solved some academic early warning problems to some extent in recent years, many shortcomings remain. Firstly, academic early warning involves complex conditions and non-uniform data; existing methods have narrow application scenarios, require data with uniform standards as support, and focus on the algorithm while ignoring the importance of the data set and the temporal and spatial correlations among different features. Secondly, the KFCM-SVM machine learning algorithm works well on small data sets but not in scenarios with large and complex data; owing to the characteristic limitations of the algorithm, it suffers a high false detection rate, and its accuracy needs further improvement.
Disclosure of Invention
The invention aims to provide an academic early warning method based on an artificial neural network and an LSTM network, addressing the defects of the prior art. The method has the advantages of good universality, a low false detection rate and high prediction accuracy.
The technical scheme for realizing the purpose of the invention is as follows:
an academic early warning method based on an artificial neural network and an LSTM network comprises the following steps:
1) processing missing data based on the RBF kernel: firstly, clean the student data: normalize the student information with complete data by x' = (x - μ)/(Max - Min); then map the original student information data from the low-dimensional space to the high-dimensional space by the RBF kernel K(v1, v2) = exp(-γ||v1 - v2||²). This mapping completely retains all information of the original data, and neither missing values nor linear inseparability need to be considered; the mapping process is shown in formula (1):
where x', y' are student information data arrays, y' is the corresponding array label, and α is the RBF kernel parameter γ value: the larger the α value, the smaller the range of influence of each sample, and the smaller the α value, the larger the range of influence; each element of the resulting high-dimensional space is thereby:
a data vector K(v1, v2, v3, ...) is further obtained, which is then normalized by x = (y - μ)/(Max - Min), where y is the student information data array to be normalized and x is the normalized student information array;
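The kernel mapping and normalization of step 1) can be sketched in NumPy as below; the γ value, the reference records, and the function names are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def rbf_kernel(v1, v2, gamma=0.5):
    """RBF kernel K(v1, v2) = exp(-gamma * ||v1 - v2||^2)."""
    diff = np.asarray(v1, float) - np.asarray(v2, float)
    return np.exp(-gamma * np.dot(diff, diff))

def normalize(y):
    """x = (y - mu) / (Max - Min) scaling of a student-information array."""
    y = np.asarray(y, float)
    return (y - y.mean()) / (y.max() - y.min())

# Map one student record into kernel space against reference records,
# then normalize the resulting kernel vector.
record = [72.0, 85.0, 90.0]
references = [[70.0, 80.0, 88.0], [60.0, 65.0, 70.0]]
k_vec = np.array([rbf_kernel(record, r) for r in references])
k_norm = normalize(k_vec)
```

Because the kernel depends only on pairwise distances, records with missing entries could be compared over their shared dimensions, which is presumably why the mapping "need not consider missing values".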
2) adaptive feature extraction based on the multi-dimensional normal distribution: extract the score information vector G(x1, x2, x3, ...) from the student data cleaned and normalized in step 1) and perform multivariate normal distribution screening. The n-dimensional score information in G(x1, x2, x3, ...) is defined to satisfy: any linear combination Y = a1x1 + a2x2 + ... + anxn of x1, x2, ..., xn follows a normal distribution; there is a random vector Z = [Z1, ..., ZM]^T whose elements each follow a normal distribution; and there are a vector μ = [μ1, ..., μN]^T and an N × M matrix A satisfying X = AZ + μ. If the n-dimensional score information vector G(x1, x2, x3, ...) meets these three conditions, it is said to satisfy the multivariate normal distribution, i.e. G(x1, x2, x3, ...) obeys the density f_x. Student information whose scores satisfy the multivariate normal distribution is assigned to the standard class, and student information whose scores do not is assigned to the singular class, as shown in formula (2):
here, x is a normalized student information array, μ is an average value, and k is a constant index;
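A minimal sketch of the screening in step 2), assuming the standard/singular split is made by thresholding the fitted multivariate normal density f_x; the threshold, the regularization term, and all names are illustrative assumptions:

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density f_x of an n-dimensional normal distribution N(mu, cov)."""
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    n = x.size
    diff = x - mu
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)

def screen_scores(score_vectors, threshold=1e-4):
    """Assign each score vector to the 'standard' class when its fitted
    multivariate-normal density is at least `threshold`, otherwise to
    the 'singular' class."""
    X = np.asarray(score_vectors, float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # keep invertible
    standard, singular = [], []
    for row in X:
        (standard if mvn_pdf(row, mu, cov) >= threshold else singular).append(row)
    return standard, singular
```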
3) artificial neural network training based on a refined network: train the student score information obtained in step 2) with an artificial neural network using the resilient backpropagation (Rprop) algorithm. In the ordinary backpropagation algorithm, the change of a weight during learning is determined by the partial derivative (gradient) of the error function with respect to that weight; in the Rprop algorithm, the magnitude of the weight change Δw_ij is directly equal to the learning rate η_ij(t), so the gradient of the error function does not affect the size of the weight change. The gradient only affects the sign of the change, i.e. the direction in which the weight moves: the gradient of the error function determines the direction of the update, not its strength. If the gradient is positive, the corresponding weight needs to be reduced, so η_ij(t) is subtracted from w_ij; if the gradient is negative, the corresponding weight is increased, so that the error function approaches its minimum, as shown in formula (3):
With the weight update clear, the update of the learning rate η_ij(t) is described next. Consider how the sign of the gradient changes between any two time points (t-1) and t; there are two cases. If the signs of the gradients of the error function at (t-1) and t differ, the minimum was crossed at t and the last update step was too large, so η_ij(t) should be smaller than η_ij(t-1) to make the search for the minimum more accurate: the previous learning rate is multiplied by a value η_up with 0 < η_up < 1 to obtain the current learning rate. If the two signs are the same, the lowest point of the error function has not yet been reached, and the learning rate can be increased to speed up learning: the previous learning rate is multiplied by a value η_down > 1 to obtain the current learning rate, as shown in formula (4):
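The Rprop update above can be sketched as follows; the step-size factors and bounds (η_up = 0.5, η_down = 1.2, following the document's naming where η_up < 1 shrinks the step on a sign flip and η_down > 1 grows it) are conventional Rprop defaults assumed for illustration, not values from the patent:

```python
import numpy as np

def rprop_step(w, grad, prev_grad, eta, eta_up=0.5, eta_down=1.2,
               eta_min=1e-6, eta_max=50.0):
    """One resilient-backpropagation (Rprop) weight update.

    The gradient's sign alone sets the update direction; each weight's
    step size eta is multiplied by eta_down (> 1) while the gradient
    sign stays the same, and by eta_up (< 1) when the sign flips, i.e.
    when a minimum of the error function has just been overshot.
    """
    same_sign = grad * prev_grad
    eta = np.where(same_sign > 0, np.minimum(eta * eta_down, eta_max), eta)
    eta = np.where(same_sign < 0, np.maximum(eta * eta_up, eta_min), eta)
    return w - np.sign(grad) * eta, eta
```

Note that the gradient enters only through `np.sign(grad)`, matching the statement that the gradient determines the direction of the update but not its strength.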
4) training an LSTM network with an adaptive excitation function: an LSTM network is used to train the students' daily campus-card consumption information, and the training result is cascaded with the artificial neural network to output the final graduation probability. The prior-art LSTM network structure is as follows: the LSTM module has three inputs, c_{t-1}, h_{t-1} and x_t, and the outputs through the LSTM module are c_t, h_t and y_t, where x_t is the input of the current round, h_{t-1} is the state-quantity output of the previous round, c_{t-1} is the carrier of global information in the previous round, y_t is the output of the current round, h_t is the state-quantity output of the current round, and c_t is the global information carrier of the current round. x_t and h_{t-1} are combined into one vector, multiplied by a weight matrix W, and wrapped in a tanh function to obtain the vector z; using the activation function sigmoid, the combined vector of x_t and h_{t-1} is multiplied by the matrices W_f, W_i and W_o to obtain z_f, z_i and z_o, where W_f, W_i and W_o are the weight matrices of the forget gate, input gate and output gate, multiplied with the variable input to each gate, and z_f, z_i and z_o are the outputs of the gates (the input multiplied by a weight plus an offset). These vectors then give c_t from formula (5):
c_t = z_f · c_{t-1} + z_i · z (5),
h_t is obtained from formula (6): h_t = z_o · tanh(c_t) (6),
and the output y_t of the current round is obtained from formula (7): y_t = σ(W·h_t) (7),
The original tanh excitation function is changed into a weighted-average adaptive excitation function combining Relu and tanh, exciting the data x passed into each LSTM gate in the form u·Relu(x) + v·tanh(x) with u + v = 1; this effectively avoids the vanishing-gradient problem of tanh while keeping tanh's nonlinear characteristic. The LSTM is thus used to train the students' campus-card consumption information, perform early warning classification for each student, and generate a targeted personal analysis report for each student; a student under high-risk early warning receives the corresponding early warning information. This network is cascaded with the artificial neural network of step 3) to predict the student's graduation probability;
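A single LSTM step with the Relu+tanh weighted-average excitation can be sketched as below; the equal weighting u = v = 0.5, the omission of bias terms, and the weight shapes are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_act(x, u=0.5, v=0.5):
    """Weighted average of Relu and tanh (u + v = 1): keeps tanh's
    nonlinearity while easing its vanishing-gradient problem."""
    return u * np.maximum(x, 0.0) + v * np.tanh(x)

def lstm_step(x_t, h_prev, c_prev, W, Wf, Wi, Wo, Wy):
    """One LSTM step following formulas (5)-(7), with the candidate
    state z produced by the adaptive Relu+tanh excitation."""
    concat = np.concatenate([x_t, h_prev])
    z = adaptive_act(W @ concat)   # candidate state (adaptive excitation)
    z_f = sigmoid(Wf @ concat)     # forget gate
    z_i = sigmoid(Wi @ concat)     # input gate
    z_o = sigmoid(Wo @ concat)     # output gate
    c_t = z_f * c_prev + z_i * z   # formula (5)
    h_t = z_o * np.tanh(c_t)       # formula (6)
    y_t = sigmoid(Wy @ h_t)        # formula (7)
    return y_t, h_t, c_t
```

Iterating `lstm_step` over a student's daily consumption sequence would yield the final y_t that is cascaded with the artificial neural network's output.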
5) combining with a software platform: a software platform suited to the above steps is built on existing early warning software, with the user port divided into a teacher end and a student end. A student may view his or her personal scores and related information, the score information of his or her class, and the corresponding subject rankings within class and grade; a student under high-risk early warning receives the corresponding early warning information. A teacher may view the information of all students in the courses he or she teaches, as well as all class and grade information; if a student receives an early warning, the teacher also receives a prompt, making it convenient to follow that student's scores. Individual, class and course charts are generated at the same time, so the teacher can track each student's learning condition and state in real time.
The technical scheme combines a neural network with an LSTM network algorithm: it cleans the data, maps missing data based on the RBF (radial basis function) kernel, normalizes the data, extracts adaptive features based on the multi-dimensional normal distribution, performs refined primary and secondary artificial neural network training and LSTM network training based on an adaptive excitation function, and predicts by cascading the artificial neural network with the LSTM network. Combined with the software platform, it provides a good visual interface with intuitive early warning data graphs, prediction graphs and reports for students and teachers, automatically assigns early warning grades to students, and offers corresponding early warning suggestions.
By cleaning the various student data, mapping missing data, normalizing, extracting adaptive features based on the multi-dimensional normal distribution, and then training the artificial neural network and the LSTM network separately and predicting in cascade, the method effectively improves early warning accuracy, and can achieve high accuracy even with little feature data.
The method has the advantages of good universality, low false detection rate and high prediction accuracy.
Drawings
FIG. 1 is a schematic flow chart of an exemplary method;
FIG. 2 is a diagram illustrating the comparison between the prediction accuracy of the embodiment and the prediction accuracy of other methods;
FIG. 3 is a schematic diagram of an LSTM module in the prior art;
fig. 4 is a schematic structural diagram of an LSTM module in the embodiment.
Detailed Description
The invention will be further elucidated with reference to the drawings and examples, without however being limited thereto.
Example (b):
referring to fig. 1, an academic early warning method based on an artificial neural network and an LSTM network includes the following steps:
1) processing missing data based on the RBF kernel: firstly, clean the student data: normalize the student information with complete data by x' = (x - μ)/(Max - Min); then map the original student information data from the low-dimensional space to the high-dimensional space by the RBF kernel K(v1, v2) = exp(-γ||v1 - v2||²). This mapping completely retains all information of the original data, and neither missing values nor linear inseparability need to be considered; the mapping process is shown in formula (1):
where x', y' are student information data arrays, y' is the corresponding array label, and α is the RBF kernel parameter γ value: the larger the α value, the smaller the range of influence of each sample, and the smaller the α value, the larger the range of influence; each element of the resulting high-dimensional space is thereby:
a data vector K(v1, v2, v3, ...) is further obtained, which is then normalized by x = (y - μ)/(Max - Min), where y is the student information data array to be normalized and x is the normalized student information array;
2) adaptive feature extraction based on the multi-dimensional normal distribution: extract the score information vector G(x1, x2, x3, ...) from the student data cleaned and normalized in step 1) and perform multivariate normal distribution screening. The n-dimensional score information in G(x1, x2, x3, ...) is defined to satisfy: any linear combination Y = a1x1 + a2x2 + ... + anxn of x1, x2, ..., xn follows a normal distribution; there is a random vector Z = [Z1, ..., ZM]^T whose elements each follow a normal distribution; and there are a vector μ = [μ1, ..., μN]^T and an N × M matrix A satisfying X = AZ + μ. If the n-dimensional score information vector G(x1, x2, x3, ...) meets these three conditions, it is said to satisfy the multivariate normal distribution, i.e. G(x1, x2, x3, ...) obeys the density f_x. Student information whose scores satisfy the multivariate normal distribution is assigned to the standard class, and student information whose scores do not is assigned to the singular class, as shown in formula (2):
here, x is a normalized student information array, and μ is an average value;
3) artificial neural network training based on a refined network: train the student score information obtained in step 2) with an artificial neural network using the resilient backpropagation (Rprop) algorithm. In the ordinary backpropagation algorithm, the change of a weight during learning is determined by the partial derivative (gradient) of the error function with respect to that weight; in the Rprop algorithm, the magnitude of the weight change Δw_ij is directly equal to the learning rate η_ij(t), so the gradient of the error function does not affect the size of the weight change. The gradient only affects the sign of the change, i.e. the direction in which the weight moves: the gradient of the error function determines the direction of the update, not its strength. If the gradient is positive, the corresponding weight needs to be reduced, so η_ij(t) is subtracted from w_ij; if the gradient is negative, the corresponding weight is increased, so that the error function approaches its minimum, as shown in formula (3):
With the weight update clear, the update of the learning rate η_ij(t) is described next. Consider how the sign of the gradient changes between any two time points (t-1) and t; there are two cases. If the signs of the gradients of the error function at (t-1) and t differ, the minimum was crossed at t and the last update step was too large, so η_ij(t) should be smaller than η_ij(t-1) to make the search for the minimum more accurate: the previous learning rate is multiplied by a value η_up with 0 < η_up < 1 to obtain the current learning rate. If the two signs are the same, the lowest point of the error function has not yet been reached, and the learning rate can be increased to speed up learning: the previous learning rate is multiplied by a value η_down > 1 to obtain the current learning rate, as shown in formula (4):
4) training an LSTM network with an adaptive excitation function: an LSTM network is used to train the students' daily campus-card consumption information, and the training result is cascaded with the artificial neural network to output the final graduation probability. As shown in figure 3, the prior-art LSTM network structure is as follows: the LSTM module has three inputs, c_{t-1}, h_{t-1} and x_t, and the outputs through the LSTM module are c_t, h_t and y_t, where x_t is the input of the current round, h_{t-1} is the state-quantity output of the previous round, c_{t-1} is the carrier of global information in the previous round, y_t is the output of the current round, h_t is the state-quantity output of the current round, and c_t is the global information carrier of the current round. x_t and h_{t-1} are combined into one vector, multiplied by a weight matrix W, and wrapped in a tanh function to obtain the vector z; using the activation function sigmoid, the combined vector of x_t and h_{t-1} is multiplied by the matrices W_f, W_i and W_o to obtain z_f, z_i and z_o, where W_f, W_i and W_o are the weight matrices of the forget gate, input gate and output gate, multiplied with the variable input to each gate, and z_f, z_i and z_o are the outputs of the gates (the input multiplied by a weight plus an offset). These vectors then give c_t from formula (5):
c_t = z_f · c_{t-1} + z_i · z (5),
h_t is obtained from formula (6): h_t = z_o · tanh(c_t) (6),
and the output y_t of the current round is obtained from formula (7): y_t = σ(W·h_t) (7),
As shown in FIG. 4, the LSTM network structure in this example modifies the structure of FIG. 3: the original tanh excitation function is changed into a weighted-average adaptive excitation function combining Relu and tanh, exciting the data x passed into each LSTM gate in the form u·Relu(x) + v·tanh(x) with u + v = 1; this effectively avoids the vanishing-gradient problem of tanh while keeping tanh's nonlinear characteristic. The LSTM is thus used to train the students' campus-card consumption information, perform early warning classification for each student, and generate a targeted personal analysis report for each student; a student under high-risk early warning receives the corresponding early warning information. This network is cascaded with the artificial neural network of step 3) to predict the student's graduation probability;
5) combining with a software platform: a software platform suited to the above steps is built on existing early warning software, with the user port divided into a teacher end and a student end. A student may view his or her personal scores and related information, the score information of his or her class, and the corresponding subject rankings within class and grade; a student under high-risk early warning receives the corresponding early warning information. A teacher may view the information of all students in the courses he or she teaches, as well as all class and grade information; if a student receives an early warning, the teacher also receives a prompt, making it convenient to follow that student's scores. Individual, class and course charts are generated at the same time, so the teacher can track each student's learning condition and state in real time.
Through repeated verification and tests, the accuracy of the method stably reaches 94.21%, with a highest accuracy of 98.17% and an average false detection rate stable at 1.97%; as shown in fig. 2, compared with the existing machine learning algorithms SVM and RF, the accuracy is obviously improved.
Claims (1)
1. An academic early warning method based on an artificial neural network and an LSTM network is characterized by comprising the following steps:
1) processing missing data based on the RBF kernel: first, the student data are cleaned: student records with complete data are normalized by x' = (x - μ)/(Max - Min); then the raw student information is mapped from a low-dimensional space to a high-dimensional space with the RBF kernel K(v1, v2) = exp(-γ‖v1 - v2‖²), the mapping process being shown in formula (1):
wherein x', y' are the student information data arrays, y' is the corresponding label array, and α is the value of the RBF kernel parameter γ; each element of the resulting high-dimensional space is given by the kernel, yielding the data vector K(v1, v2, v3, ...), which is then normalized by x = (y - μ)/(Max - Min), where y is the student information array to be normalized, x is the normalized student information array, μ is the mean, and Max and Min are the maximum and minimum over all elements;
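Step 1) can be sketched as follows; only the normalization and kernel formulas above are taken from the text, while the γ value and the use of NumPy are assumptions:

```python
import numpy as np

def min_max_normalize(y):
    """x = (y - mu) / (Max - Min) from step 1); mu is the mean of y."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean()) / (y.max() - y.min())

def rbf_kernel(v1, v2, gamma=0.5):
    """K(v1, v2) = exp(-gamma * ||v1 - v2||^2); gamma = 0.5 is an assumption."""
    diff = np.asarray(v1, dtype=float) - np.asarray(v2, dtype=float)
    return float(np.exp(-gamma * diff.dot(diff)))
```

Identical vectors give K = 1, and the kernel value decays toward 0 as the records move apart, which is what makes it usable as a similarity score for imputing missing records from their nearest complete neighbours.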
2) self-adaptive feature extraction based on the multivariate normal distribution: the score information vector G(x1, x2, x3, ...) is extracted from the student data cleaned and normalized in step 1) and screened against the multivariate normal distribution. The n-dimensional score information in G(x1, x2, x3, ...) is required to satisfy: every linear combination Y = a1x1 + a2x2 + ... + anxn follows a normal distribution; there exists a random vector Z = [Z1, ..., ZM]^T whose elements each follow a normal distribution; and there exist a vector μ = [μ1, ..., μN]^T and an N × M matrix A such that X = AZ + μ. If the n-dimensional score information vector G(x1, x2, x3, ...) meets these three conditions, it is said to satisfy the multivariate normal distribution, i.e. G(x1, x2, x3, ...) obeys the density f_X. Students whose scores satisfy the multivariate normal distribution are assigned to the standard class, and those whose scores do not are assigned to the singular class, as shown in formula (2):
wherein x is a normalized student information array, mu is an average value, and k is a constant index;
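Formula (2) itself is not reproduced in the text, so the screening can only be sketched under assumptions: a common practical proxy for "fits the multivariate normal" is a Mahalanobis-distance cutoff, used here purely for illustration with the constant k playing the role of the threshold index:

```python
import numpy as np

def screen_multivariate_normal(G, k=3.0):
    """Split score vectors into a 'standard' class (plausibly multivariate
    normal) and a 'singular' class via a Mahalanobis-distance cutoff k.
    G: array of shape (n_students, n_scores). Returns a boolean mask that
    is True for rows assigned to the standard class."""
    G = np.asarray(G, dtype=float)
    mu = G.mean(axis=0)
    cov = np.atleast_2d(np.cov(G, rowvar=False))
    inv = np.linalg.pinv(cov)          # pseudo-inverse guards near-singular cov
    diff = G - mu
    d2 = np.einsum('ij,jk,ik->i', diff, inv, diff)  # squared Mahalanobis dist.
    return d2 <= k ** 2
```

Rows far from the bulk of the data (in covariance-scaled units) fail the cutoff and land in the singular class, mirroring the standard/singular split described above.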
3) artificial neural network training based on a refined network: the student score-related data obtained in step 2) are trained with an artificial neural network using resilient back propagation (the Rprop algorithm). In Rprop the weight change Δw_{i,j} is directly equal to the learning rate η_{i,j}(t): the gradient of the error function does not affect the magnitude of the weight change, only its sign, i.e. the direction in which the weight moves; the gradient therefore determines only the update direction, never the update strength. If the gradient of the error function is positive, the corresponding weight is decreased by subtracting η_{i,j}(t) from w_{i,j}; if the gradient is negative, the corresponding weight is increased, driving the error function toward its minimum, as shown in formula (3):
thus, it is clear how the weight is updated, and then the learning rate etai,j(t) update, the gradient at any two time points, t and (t-1), will change sign, and there are two cases of change in total: if the signs of the gradients of the error functions at two time points (t-1) and t are different, which indicates that the minimum value has been crossed at t, which indicates that the last update step of the weight value is too large, ηi,j(t) ratio ηi,j(t-1) smaller learning rate of the previous step and a value eta greater than 0 and smaller than 1upMultiplying to obtain current learning rate, when the signs of two times are identical, indicating that the lowest point of error function has not been reached yet, making the learning rate of previous step multiply by an eta greater than 1downObtaining the current learning rate as shown in formula (4):
4) training the adaptive excitation function LSTM network: an LSTM network is adopted to train each student's daily one-card consumption-related information, and its result is cascaded with the artificial neural network training result to output the final graduation probability. The LSTM network structure in the prior art is as follows: the LSTM module has three inputs, c_{t-1}, h_{t-1} and x_t, and three outputs, c_t, h_t and y_t, where x_t represents the input of the current round, h_{t-1} the state-quantity output of the previous round, c_{t-1} the carrier of the global information of the previous round, y_t the output of the current round, h_t the state-quantity output of the current round, and c_t the carrier of the global information of the current round. x_t and h_{t-1} are combined into one vector, multiplied by the weight matrix W and wrapped in a tanh function to obtain the vector z. With the activation function sigmoid, the prior-art LSTM likewise combines x_t and h_{t-1} into one vector and multiplies it by the matrices W_f, W_i and W_o to obtain z_f, z_i and z_o, where W_f, W_i, W_o are the weight matrices of the forget gate, the input gate and the output gate, multiplied with the variable input to each gate, and z_f, z_i, z_o are the outputs of each gate, each obtained from its input multiplied by the weight matrix plus an offset. c_t is obtained from formula (5):
c_t = z_f c_{t-1} + z_i z (5),
h_t is obtained from formula (6): h_t = z_o tanh(c_t) (6),
and the output y_t of the current round is obtained from formula (7): y_t = σ(W h_t) (7),
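Formulas (5)–(7) can be collected into one forward step of the prior-art cell; the weight shapes and the bias terms (the "offsets" mentioned above) are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, Wf, Wi, Wo, Wy, b, bf, bi, bo):
    """One forward step of the standard LSTM cell of step 4), before the
    Relu+tanh modification, following formulas (5)-(7)."""
    v = np.concatenate([x_t, h_prev])   # combine x_t and h_{t-1} into one vector
    z  = np.tanh(W @ v + b)             # candidate vector z
    zf = sigmoid(Wf @ v + bf)           # forget gate output z_f
    zi = sigmoid(Wi @ v + bi)           # input gate output z_i
    zo = sigmoid(Wo @ v + bo)           # output gate output z_o
    c_t = zf * c_prev + zi * z          # formula (5)
    h_t = zo * np.tanh(c_t)             # formula (6)
    y_t = sigmoid(Wy @ h_t)             # formula (7)
    return c_t, h_t, y_t
```

The adaptive-excitation variant described next would replace the tanh calls with the weighted Relu + tanh function while leaving formulas (5)–(7) otherwise unchanged.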
The original tanh excitation function is changed into a weighted average of the adaptive excitation functions Relu and tanh, and the data x entering each gate of the LSTM is excited in this weighted form with u + v = 1, so that the one-card consumption-related information of the students is trained through the LSTM and early-warning classification is carried out for each student, generating a targeted personal analysis report for each student; a student under high-risk early warning receives the corresponding warning information, which is cascaded with the artificial neural network of step 3) to predict the student's graduation probability;
5) in combination with a software platform: a software platform suited to the above steps is built on existing early-warning algorithm software, with the user portal divided into a teacher end and a student end. A student is entitled to view his or her personal learning scores and related personal information, the relevant score information of his or her class, and the corresponding subject rankings at class and grade level; a student under high-risk early warning receives the corresponding warning information on the student end. A teacher is entitled to view the information of all students in the courses he or she teaches, together with all information of the class and of the grade; if a student triggers an early warning, the teacher also receives the alert, making it convenient to follow that student's performance. Individual, class and course charts are generated at the same time, so that the teacher can track each student's learning situation and state in real time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110101091.8A CN112801362B (en) | 2021-01-26 | 2021-01-26 | Academic early warning method based on artificial neural network and LSTM network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112801362A true CN112801362A (en) | 2021-05-14 |
CN112801362B CN112801362B (en) | 2022-03-22 |
Family
ID=75811697
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110101091.8A Expired - Fee Related CN112801362B (en) | 2021-01-26 | 2021-01-26 | Academic early warning method based on artificial neural network and LSTM network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112801362B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110939A (en) * | 2019-05-15 | 2019-08-09 | 杭州华网信息技术有限公司 | The academic record prediction and warning method of behavior is serialized based on deep learning student |
CN111260230A (en) * | 2020-01-19 | 2020-06-09 | 西北大学 | Academic early warning method based on lifting tree model |
US20200302296A1 (en) * | 2019-03-21 | 2020-09-24 | D. Douglas Miller | Systems and method for optimizing educational outcomes using artificial intelligence |
US20200356852A1 (en) * | 2019-05-07 | 2020-11-12 | Samsung Electronics Co., Ltd. | Model training method and apparatus |
CN112257935A (en) * | 2020-10-26 | 2021-01-22 | 中国人民解放军空军工程大学 | Aviation safety prediction method based on LSTM-RBF neural network model |
Non-Patent Citations (2)
Title |
---|
Song Chuping et al.: "Application of an Improved RBF Neural Network Algorithm in College Learning Early Warning", Computer Applications and Software * |
Xiao Yifeng: "Research on Data Mining Technology for Grade-Retention Early Warning of College Students", China Masters' Theses Full-text Database (Information Science and Technology) * |
Also Published As
Publication number | Publication date |
---|---|
CN112801362B (en) | 2022-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022077587A1 (en) | Data prediction method and apparatus, and terminal device | |
CN109446430B (en) | Product recommendation method and device, computer equipment and readable storage medium | |
CN110599336B (en) | Financial product purchase prediction method and system | |
Liang et al. | Multi-scale dynamic adaptive residual network for fault diagnosis | |
US11704570B2 (en) | Learning device, learning system, and learning method | |
CN110555459A (en) | Score prediction method based on fuzzy clustering and support vector regression | |
CN109284662B (en) | Underwater sound signal classification method based on transfer learning | |
CN111340107A (en) | Fault diagnosis method and system based on convolutional neural network cost sensitive learning | |
CN112149884A (en) | Academic early warning monitoring method for large-scale students | |
Liu et al. | Stock price trend prediction model based on deep residual network and stock price graph | |
CN109063750B (en) | SAR target classification method based on CNN and SVM decision fusion | |
CN112489689B (en) | Cross-database voice emotion recognition method and device based on multi-scale difference countermeasure | |
CN111462817B (en) | Classification model construction method and device, classification model and classification method | |
CN112801362B (en) | Academic early warning method based on artificial neural network and LSTM network | |
CN112381338B (en) | Event probability prediction model training method, event probability prediction method and related device | |
CN117011219A (en) | Method, apparatus, device, storage medium and program product for detecting quality of article | |
CN112085079B (en) | Rolling bearing fault diagnosis method based on multi-scale and multi-task learning | |
CN113010687B (en) | Exercise label prediction method and device, storage medium and computer equipment | |
CN111291838B (en) | Method and device for interpreting entity object classification result | |
CN111382761B (en) | CNN-based detector, image detection method and terminal | |
US20210133556A1 (en) | Feature-separated neural network processing of tabular data | |
CN110647630A (en) | Method and device for detecting same-style commodities | |
CN116405368B (en) | Network fault diagnosis method and system under high-dimensional unbalanced data condition | |
Tomar | A critical evaluation of activation functions for autoencoder neural networks | |
CN113469450B (en) | Data classification method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20220322 |