CN112801362A - Academic early warning method based on artificial neural network and LSTM network

Academic early warning method based on artificial neural network and LSTM network

Info

Publication number
CN112801362A
CN112801362A CN202110101091.8A
Authority
CN
China
Prior art keywords
information
student
weight
early warning
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110101091.8A
Other languages
Chinese (zh)
Other versions
CN112801362B (en)
Inventor
欧阳宁
成浩
谷盛民
石将煌
梁达林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202110101091.8A priority Critical patent/CN112801362B/en
Publication of CN112801362A publication Critical patent/CN112801362A/en
Application granted granted Critical
Publication of CN112801362B publication Critical patent/CN112801362B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • G06Q50/205Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Human Resources & Organizations (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Marketing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Primary Health Care (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an academic early warning method based on an artificial neural network and an LSTM network, characterized by comprising the following steps: 1) processing missing data based on the RBF kernel; 2) adaptive feature extraction based on the multidimensional normal distribution; 3) artificial neural network training based on a refined network; 4) training an LSTM network with an adaptive excitation function; 5) integration with a software platform. The method has good universality, a low false detection rate and high prediction accuracy.

Description

Academic early warning method based on artificial neural network and LSTM network
Technical Field
The invention relates to the field of big data processing and machine learning, and in particular to an academic early warning method based on an artificial neural network and an LSTM network.
Background
Academic early warning predicts the future performance trend of students and judges whether they will be able to graduate, and therefore has considerable application value in colleges and universities. Some institutions already use academic early warning systems to help improve graduation rates and reduce failure rates. As enrolment scales grow, however, there are too many students for staff to track each student's academic progress manually. Training mass data with modern deep learning algorithms to build an accurately predicting academic early warning system can warn students in advance of the danger of failing to complete their studies normally, and can effectively reduce the number of students who fail to meet academic requirements.
In traditional machine-learning-based academic early warning, algorithms such as the Support Vector Machine (SVM) and Random Forest have been applied to early warning systems, but they cannot exploit temporal features such as college students' scores over time or library reading time, cannot effectively connect earlier and later timelines, handle the influence of earlier courses on later ones poorly, and cope badly with missing data. In recent years, the university of oceanic labor has proposed an "academic early warning" mechanism, introducing the concept of "academic early warning" and issuing warnings of different degrees to students with a low degree of academic completion.
Sepp Hochreiter et al. proposed the long short-term memory (LSTM) algorithm, solving the recurrent neural network's (RNN) inability to forget information and to memorize information over long periods; Tao et al. proposed a KFCM-based improvement of the SVM and applied it to the field of academic early warning, improving on traditional machine learning and obtaining a better prediction effect; Ren et al. proposed an algorithm based on the FT_BP neural network, using deep learning to improve the conventional BP network and applying it to academic early warning.
Although these studies have solved some problems of academic early warning to some extent in recent years, many shortcomings remain. First, academic early warning involves complex conditions and non-uniform data: existing methods have narrow application scenarios, require uniformly standardized data as support, and put their emphasis entirely on the algorithm, neglecting the importance of the data set and the temporal and spatial correlations among different features. Second, although the KFCM-SVM algorithm works well when a machine learning algorithm deals with a small number of data sets, it performs poorly in scenarios with large, complex data, and owing to the characteristic limitations of the algorithm it suffers from a high false detection rate and an accuracy that needs further improvement.
Disclosure of Invention
The invention aims to provide an academic early warning method based on an artificial neural network and an LSTM network, addressing the defects of the prior art. The method has good universality, a low false detection rate and high prediction accuracy.
The technical scheme for realizing the purpose of the invention is as follows:
an academic early warning method based on an artificial neural network and an LSTM network comprises the following steps:
1) processing missing data based on the RBF kernel: first, clean the student data: normalize the student information with complete data by x' = (x − μ)/(Max − Min); then map the raw student information from the low-dimensional space to the high-dimensional space with the RBF kernel K(v1, v2) = exp(−γ‖v1 − v2‖²). This mapping completely retains all information in the original data without having to handle missing values or problems such as linear inseparability; the mapping process is shown in formula (1):
[formula (1): equation images not reproduced]
where x' and y' are student information data arrays, y' is the corresponding label array, and α is the value of the RBF kernel parameter γ: the larger α is, the smaller the range of influence of each sample, and the smaller α is, the larger that range. Each element of the resulting high-dimensional space is therefore:
[equation image not reproduced]
giving a data vector K(v1, v2, v3, ...), which is then normalized with x = (y − μ)/(Max − Min), where y is the student information data array to be normalized and x is the normalized student information array;
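As a rough illustration of step 1), the sketch below (numpy-based, with hypothetical student rows and an assumed γ value, not the patent's implementation) normalizes complete records and computes pairwise RBF-kernel values:

```python
import numpy as np

def min_max_normalize(x):
    # x' = (x - mu) / (Max - Min), as in the cleaning step above
    return (x - x.mean()) / (x.max() - x.min())

def rbf_kernel(v1, v2, gamma=0.5):
    # K(v1, v2) = exp(-gamma * ||v1 - v2||^2)
    return float(np.exp(-gamma * np.sum((v1 - v2) ** 2)))

# hypothetical student-information rows (score, library hours, attendance)
students = np.array([[78.0, 3.5, 0.6],
                     [42.0, 0.5, 0.3],
                     [91.0, 5.0, 0.8]])
rows = np.array([min_max_normalize(r) for r in students])

# kernel matrix: pairwise similarity in the implicit high-dimensional space
K = np.array([[rbf_kernel(a, b) for b in rows] for a in rows])
```

Because K(v, v) = 1 and every kernel value lies in (0, 1], records can be compared through these similarities without explicit imputation of missing fields.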
2) adaptive feature extraction based on the multidimensional normal distribution: extract the score information vector G(x1, x2, x3, ...) from the student data cleaned and normalized in step 1) and screen it against the multivariate normal distribution. The n-dimensional score information in G(x1, x2, x3, ...) is required to satisfy three conditions: every linear combination Y = a1·x1 + a2·x2 + ... + an·xn follows a normal distribution; there is a random vector Z = [Z1, ..., ZM]^T whose elements each follow a normal distribution; and there are a vector μ = [μ1, ..., μN]^T and an N×M matrix A satisfying X = AZ + μ. If the n-dimensional score information vector G(x1, x2, x3, ...) satisfies these three conditions, it is said to satisfy the multivariate normal distribution, i.e. G(x1, x2, x3, ...) obeys the density f_x. Student information whose scores satisfy the multivariate normal distribution is classed as the standard class, and student information whose scores do not is classed as the singular class, as shown in formula (2):
f_x(x) = exp(−(x − μ)^T Σ^{−1} (x − μ)/2) / sqrt((2π)^k |Σ|)   (2)
where x is the normalized student information array, μ is the mean, Σ is the covariance matrix, and k is the dimension (a constant index);
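A minimal sketch of the screening in step 2), assuming formula (2) is the standard multivariate normal density (the μ, Σ and threshold values here are illustrative, not from the patent):

```python
import numpy as np

def mvn_density(x, mu, sigma):
    # f_x(x) = exp(-(x - mu)^T Sigma^{-1} (x - mu) / 2) / sqrt((2*pi)^k * |Sigma|)
    k = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** k * np.linalg.det(sigma))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff) / norm)

mu = np.array([0.5, 0.5])       # mean of the normalized score vector
sigma = np.eye(2) * 0.04        # assumed covariance

typical = mvn_density(np.array([0.5, 0.5]), mu, sigma)   # at the mean
outlier = mvn_density(np.array([1.0, 0.0]), mu, sigma)   # far from the mean

# "singular class" if the density is tiny relative to the peak
is_singular = outlier < 0.01 * typical
```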
3) artificial neural network training based on a refined network: the student score information obtained in step 2) is trained with an artificial neural network using the resilient backpropagation (Rprop) algorithm. In ordinary backpropagation, the change in a weight during learning is determined by the partial derivative (gradient) of the error function with respect to that weight; in the Rprop algorithm, the magnitude of the weight change Δw_{i,j} is directly equal to the learning rate η_{i,j}(t), so the gradient of the error function does not affect the magnitude of the weight change, only its sign, i.e. the direction of the change. During training the amount by which each weight changes equals the learning rate corresponding to that weight, the sign of the change depends on the sign of the gradient of the error function, and the gradient determines only the direction of the update, not its strength. If the gradient of the error function is positive, the corresponding weight should be reduced, so η_{i,j}(t) is subtracted from w_{i,j}; if the gradient is negative, the corresponding weight should be increased so that the error function approaches its minimum, as shown in formula (3):
Δw_{i,j}(t) = −η_{i,j}(t), if ∂E/∂w_{i,j}(t) > 0; +η_{i,j}(t), if ∂E/∂w_{i,j}(t) < 0; 0, otherwise   (3)
Having established how the weights are updated, the learning rate η_{i,j}(t) is updated as follows. Consider how the sign of the gradient changes between any two time points (t−1) and t; there are two cases. If the signs of the gradients of the error function at (t−1) and t differ, the minimum was crossed at t and the last update step of the weight was too large, so η_{i,j}(t) should be smaller than η_{i,j}(t−1) to make the search for the minimum more accurate: the learning rate of the previous step is multiplied by a value η_up between 0 and 1 to obtain the current learning rate. If the signs at the two time points are the same, the lowest point of the error function has not yet been reached, and the learning rate can be increased to speed up learning, so the learning rate of the previous step is multiplied by an η_down greater than 1 to obtain the current learning rate, as shown in formula (4):
η_{i,j}(t) = η_up·η_{i,j}(t−1), if ∂E/∂w_{i,j}(t)·∂E/∂w_{i,j}(t−1) < 0; η_down·η_{i,j}(t−1), if ∂E/∂w_{i,j}(t)·∂E/∂w_{i,j}(t−1) > 0; η_{i,j}(t−1), otherwise   (4)
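The update rules of formulas (3) and (4) can be sketched for a single weight as follows (the η_up, η_down values and the toy error function E(w) = w² are illustrative assumptions, not the patent's settings):

```python
import numpy as np

ETA_UP, ETA_DOWN = 0.5, 1.2   # as above: eta_up in (0, 1), eta_down > 1

def rprop_step(w, grad, prev_grad, eta):
    if grad * prev_grad < 0:      # sign flipped: the minimum was crossed,
        eta *= ETA_UP             # so shrink the step (formula (4))
    elif grad * prev_grad > 0:    # same sign: not there yet,
        eta *= ETA_DOWN           # so grow the step (formula (4))
    w -= np.sign(grad) * eta      # gradient fixes only the direction (formula (3))
    return w, eta

# toy run on E(w) = w^2, whose gradient is 2w
w, eta, prev = 1.0, 0.1, 0.0
for _ in range(60):
    g = 2.0 * w
    w, eta = rprop_step(w, g, prev, eta)
    prev = g
```

Note that the step size w moves by is always η itself; the gradient magnitude never enters the update, which is the defining property of Rprop described above.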
4) training an LSTM network with an adaptive excitation function: an LSTM network is trained on each student's daily campus-card consumption information, and its result is cascaded with the artificial neural network's training result to output the final graduation probability. The prior-art LSTM structure is as follows: the LSTM module has three inputs, c_{t−1}, h_{t−1} and x_t, and three outputs, c_t, h_t and y_t, where x_t is the input of the current round, h_{t−1} the state output of the previous round, c_{t−1} the carrier of global information from the previous round, y_t the output of the current round, h_t the state output of the current round, and c_t the carrier of global information of the current round. x_t and h_{t−1} are combined into one vector, which is multiplied by a matrix W and wrapped in a tanh function to obtain the vector z. Using the sigmoid activation function, the combined vector of x_t and h_{t−1} is likewise multiplied by the matrices W_f, W_i and W_o to obtain z_f, z_i and z_o; W_f, W_i and W_o are the weight matrices of the forget gate, input gate and output gate, multiplying the variable input to each gate, and z_f, z_i and z_o are the gate outputs, each being the gate's input multiplied by its weight matrix plus an offset. These vectors then yield c_t from formula (5):
c_t = z_f ⊙ c_{t−1} + z_i ⊙ z   (5),
h_t is obtained from formula (6): h_t = z_o ⊙ tanh(c_t)   (6),
and the output y_t of the current round is obtained from formula (7): y_t = σ(W h_t)   (7),
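Formulas (5)–(7) describe one LSTM step. The sketch below implements them directly in numpy, reading ⊙ as element-wise product; the random weights and sizes are illustrative (this is the generic prior-art cell, not the patent's trained network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, Wf, Wi, Wo, Wy):
    v = np.concatenate([x_t, h_prev])   # x_t and h_{t-1} combined into one vector
    z = np.tanh(W @ v)                  # candidate content
    z_f = sigmoid(Wf @ v)               # forget gate
    z_i = sigmoid(Wi @ v)               # input gate
    z_o = sigmoid(Wo @ v)               # output gate
    c_t = z_f * c_prev + z_i * z        # formula (5)
    h_t = z_o * np.tanh(c_t)            # formula (6)
    y_t = sigmoid(Wy @ h_t)             # formula (7)
    return y_t, h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W, Wf, Wi, Wo = (rng.normal(size=(n_hid, n_in + n_hid)) for _ in range(4))
Wy = rng.normal(size=(n_hid, n_hid))
y, h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid),
                    W, Wf, Wi, Wo, Wy)
```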
The original tanh excitation function is changed into an adaptive weighted average of Relu and tanh: the data x arriving at each LSTM gate is excited in the form u·Relu(x) + v·tanh(x) with u + v = 1, which effectively avoids tanh's vanishing-gradient problem while retaining its nonlinear characteristic. The LSTM is thus used to train the students' campus-card consumption information and to warn and classify each student, generating a targeted personal analysis report for every student; a student under high-risk early warning receives the corresponding warning information. This network is cascaded with the artificial neural network of step 3) to predict each student's graduation probability;
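The adaptive excitation just described — read here as the weighted average u·Relu(x) + v·tanh(x) with u + v = 1, where the weight u = 0.5 is an assumed value — can be sketched as:

```python
import numpy as np

def adaptive_act(x, u=0.5):
    # u * Relu(x) + v * tanh(x), with v = 1 - u so that u + v = 1
    v = 1.0 - u
    return u * np.maximum(x, 0.0) + v * np.tanh(x)

xs = np.linspace(-3.0, 3.0, 13)
ys = adaptive_act(xs)
```

For large positive inputs the ReLU term keeps the derivative near u instead of letting it vanish as tanh's does, while for negative inputs the tanh term preserves the nonlinearity — matching the stated goal of avoiding tanh's vanishing gradient without losing its nonlinear characteristic.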
5) integration with a software platform: existing early warning software is used to build a platform suited to the above steps, with the user port divided into a teacher end and a student end. A student may view his or her personal scores and related information, the related score information of his or her class, and the corresponding subject rankings within class and grade; a student under high-risk early warning receives the corresponding warning information. A teacher may view the information of all students in the courses he or she teaches, as well as all information of the class and grade; if a student is warned, the teacher also receives a prompt of that warning, making it convenient to pay attention to the student's scores. Charts of individual students, classes and courses are generated at the same time, so the teacher can track each student's learning condition and state in real time.
This technical scheme combines a neural network with an LSTM network algorithm: it cleans the data, maps missing data based on the RBF (radial basis function) kernel, normalizes the data, extracts adaptive features based on the multidimensional normal distribution, performs refined primary and secondary artificial neural network training and LSTM network training with an adaptive excitation function, and predicts by cascading the artificial neural network with the LSTM network. Combined with a software platform, it offers a good visual interface that provides students and teachers with intuitive early warning data graphs, prediction graphs and reports, automatically assigns early warning grades to students, and offers corresponding early warning suggestions.
By cleaning the various student data, mapping missing data, normalizing, extracting adaptive features based on the multidimensional normal distribution, and then training the artificial neural network and the LSTM network separately and predicting in cascade, the method effectively improves early warning accuracy and achieves high accuracy even with little feature data.
The method has the advantages of good universality, low false detection rate and high prediction accuracy.
Drawings
FIG. 1 is a schematic flow chart of an exemplary method;
FIG. 2 is a diagram illustrating the comparison between the prediction accuracy of the embodiment and the prediction accuracy of other methods;
FIG. 3 is a schematic diagram of an LSTM module in the prior art;
fig. 4 is a schematic structural diagram of an LSTM module in the embodiment.
Detailed Description
The invention will be further elucidated with reference to the drawings and examples, without however being limited thereto.
Example (b):
referring to fig. 1, an academic early warning method based on an artificial neural network and an LSTM network includes the following steps:
1) processing missing data based on the RBF kernel: first, clean the student data: normalize the student information with complete data by x' = (x − μ)/(Max − Min); then map the raw student information from the low-dimensional space to the high-dimensional space with the RBF kernel K(v1, v2) = exp(−γ‖v1 − v2‖²). This mapping completely retains all information in the original data without having to handle missing values or problems such as linear inseparability; the mapping process is shown in formula (1):
[formula (1): equation images not reproduced]
where x' and y' are student information data arrays, y' is the corresponding label array, and α is the value of the RBF kernel parameter γ: the larger α is, the smaller the range of influence of each sample, and the smaller α is, the larger that range. Each element of the resulting high-dimensional space is therefore:
[equation image not reproduced]
giving a data vector K(v1, v2, v3, ...), which is then normalized with x = (y − μ)/(Max − Min), where y is the student information data array to be normalized and x is the normalized student information array;
2) adaptive feature extraction based on the multidimensional normal distribution: extract the score information vector G(x1, x2, x3, ...) from the student data cleaned and normalized in step 1) and screen it against the multivariate normal distribution. The n-dimensional score information in G(x1, x2, x3, ...) is required to satisfy three conditions: every linear combination Y = a1·x1 + a2·x2 + ... + an·xn follows a normal distribution; there is a random vector Z = [Z1, ..., ZM]^T whose elements each follow a normal distribution; and there are a vector μ = [μ1, ..., μN]^T and an N×M matrix A satisfying X = AZ + μ. If the n-dimensional score information vector G(x1, x2, x3, ...) satisfies these three conditions, it is said to satisfy the multivariate normal distribution, i.e. G(x1, x2, x3, ...) obeys the density f_x. Student information whose scores satisfy the multivariate normal distribution is classed as the standard class, and student information whose scores do not is classed as the singular class, as shown in formula (2):
f_x(x) = exp(−(x − μ)^T Σ^{−1} (x − μ)/2) / sqrt((2π)^k |Σ|)   (2)
where x is the normalized student information array, μ is the mean, and Σ is the covariance matrix;
3) artificial neural network training based on a refined network: the student score information obtained in step 2) is trained with an artificial neural network using the resilient backpropagation (Rprop) algorithm. In ordinary backpropagation, the change in a weight during learning is determined by the partial derivative (gradient) of the error function with respect to that weight; in the Rprop algorithm, the magnitude of the weight change Δw_{i,j} is directly equal to the learning rate η_{i,j}(t), so the gradient of the error function does not affect the magnitude of the weight change, only its sign, i.e. the direction of the change. During training the amount by which each weight changes equals the learning rate corresponding to that weight, the sign of the change depends on the sign of the gradient of the error function, and the gradient determines only the direction of the update, not its strength. If the gradient of the error function is positive, the corresponding weight should be reduced, so η_{i,j}(t) is subtracted from w_{i,j}; if the gradient is negative, the corresponding weight should be increased so that the error function approaches its minimum, as shown in formula (3):
Δw_{i,j}(t) = −η_{i,j}(t), if ∂E/∂w_{i,j}(t) > 0; +η_{i,j}(t), if ∂E/∂w_{i,j}(t) < 0; 0, otherwise   (3)
Having established how the weights are updated, the learning rate η_{i,j}(t) is updated as follows. Consider how the sign of the gradient changes between any two time points (t−1) and t; there are two cases. If the signs of the gradients of the error function at (t−1) and t differ, the minimum was crossed at t and the last update step of the weight was too large, so η_{i,j}(t) should be smaller than η_{i,j}(t−1) to make the search for the minimum more accurate: the learning rate of the previous step is multiplied by a value η_up between 0 and 1 to obtain the current learning rate. If the signs at the two time points are the same, the lowest point of the error function has not yet been reached, and the learning rate can be increased to speed up learning, so the learning rate of the previous step is multiplied by an η_down greater than 1 to obtain the current learning rate, as shown in formula (4):
η_{i,j}(t) = η_up·η_{i,j}(t−1), if ∂E/∂w_{i,j}(t)·∂E/∂w_{i,j}(t−1) < 0; η_down·η_{i,j}(t−1), if ∂E/∂w_{i,j}(t)·∂E/∂w_{i,j}(t−1) > 0; η_{i,j}(t−1), otherwise   (4)
4) training an LSTM network with an adaptive excitation function: an LSTM network is trained on each student's daily campus-card consumption information, and its result is cascaded with the artificial neural network's training result to output the final graduation probability. As shown in fig. 3, the prior-art LSTM structure is as follows: the LSTM module has three inputs, c_{t−1}, h_{t−1} and x_t, and three outputs, c_t, h_t and y_t, where x_t is the input of the current round, h_{t−1} the state output of the previous round, c_{t−1} the carrier of global information from the previous round, y_t the output of the current round, h_t the state output of the current round, and c_t the carrier of global information of the current round. x_t and h_{t−1} are combined into one vector, which is multiplied by a matrix W and wrapped in a tanh function to obtain the vector z. Using the sigmoid activation function, the combined vector of x_t and h_{t−1} is likewise multiplied by the matrices W_f, W_i and W_o to obtain z_f, z_i and z_o; W_f, W_i and W_o are the weight matrices of the forget gate, input gate and output gate, multiplying the variable input to each gate, and z_f, z_i and z_o are the gate outputs, each being the gate's input multiplied by its weight matrix plus an offset. These vectors then yield c_t from formula (5):
c_t = z_f ⊙ c_{t−1} + z_i ⊙ z   (5),
h_t is obtained from formula (6): h_t = z_o ⊙ tanh(c_t)   (6),
and the output y_t of the current round is obtained from formula (7): y_t = σ(W h_t)   (7),
As shown in FIG. 4, the LSTM network structure of this example modifies the structure of fig. 3 by changing the original tanh excitation function into an adaptive weighted average of Relu and tanh: the data x arriving at each LSTM gate is excited in the form u·Relu(x) + v·tanh(x) with u + v = 1, which effectively avoids tanh's vanishing-gradient problem while retaining its nonlinear characteristic. The LSTM is thus used to train the students' campus-card consumption information and to warn and classify each student, generating a targeted personal analysis report for every student; a student under high-risk early warning receives the corresponding warning information. This network is cascaded with the artificial neural network of step 3) to predict each student's graduation probability;
5) integration with a software platform: existing early warning software is used to build a platform suited to the above steps, with the user port divided into a teacher end and a student end. A student may view his or her personal scores and related information, the related score information of his or her class, and the corresponding subject rankings within class and grade; a student under high-risk early warning receives the corresponding warning information. A teacher may view the information of all students in the courses he or she teaches, as well as all information of the class and grade; if a student is warned, the teacher also receives a prompt of that warning, making it convenient to pay attention to the student's scores. Charts of individual students, classes and courses are generated at the same time, so the teacher can track each student's learning condition and state in real time.
Through repeated verification and testing, the accuracy of the method is stable at 94.21%, reaching 98.17% at best, with an average false detection rate stable at 1.97%; as shown in fig. 2, accuracy is significantly improved compared with the existing machine learning algorithms SVM and RF.

Claims (1)

1. An academic early warning method based on an artificial neural network and an LSTM network is characterized by comprising the following steps:
1) processing missing data based on the RBF kernel: first, clean the student data: normalize the student information with complete data by x' = (x − μ)/(Max − Min); then map the raw student information from the low-dimensional space to the high-dimensional space with the RBF kernel K(v1, v2) = exp(−γ‖v1 − v2‖²), the mapping process being shown in formula (1):
K(x', y') = φ(x')·φ(y') = exp(−α‖x' − y'‖²)   (1)
wherein x' and y' are student information data arrays, y' being the corresponding array label, and α is the value of the RBF kernel parameter γ; each element of the resulting high-dimensional space is then:
[equation image: the elements of the resulting high-dimensional feature space]
thereby obtaining the data vector K(v1, v2, v3, ...), which is then normalized by x = (y − μ)/(Max − Min), where y is the student information data array to be normalized, x is the normalized student information array, μ is the mean, and Max and Min are the maximum and minimum over all elements of x;
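Step 1) can be illustrated with a short sketch (a minimal illustration, not the patented implementation; the sample score records and the γ value are invented for the example):

```python
import numpy as np

def min_max_normalize(y):
    """Normalization from step 1: x = (y - mu) / (Max - Min)."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean()) / (y.max() - y.min())

def rbf_kernel(v1, v2, gamma=0.5):
    """RBF kernel K(v1, v2) = exp(-gamma * ||v1 - v2||^2)."""
    v1, v2 = np.asarray(v1, dtype=float), np.asarray(v2, dtype=float)
    return float(np.exp(-gamma * np.sum((v1 - v2) ** 2)))

# Two (invented) complete score records; the kernel measures their
# similarity in the implicit high-dimensional space of formula (1).
rec_a = min_max_normalize([78, 85, 90, 64])
rec_b = min_max_normalize([75, 88, 92, 60])
similarity = rbf_kernel(rec_a, rec_b)  # a value in (0, 1]
```

Records with missing fields can then be completed from their nearest complete neighbors under this kernel similarity.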
2) adaptive feature extraction based on the multivariate normal distribution: the score information vector G(x1, x2, x3, ...) is extracted from the student data cleaned and normalized in step 1) and screened against the multivariate normal distribution. The n-dimensional score information G(x1, x2, x3, ..., xn) must satisfy: every linear combination Y = a1·x1 + a2·x2 + ... + an·xn follows a normal distribution, and there exist a random vector Z = [Z1, ..., ZM]ᵀ whose elements each follow a normal distribution, a vector μ = [μ1, ..., μN]ᵀ, and an N×M matrix A such that X = AZ + μ. If the n-dimensional score information vector G(x1, x2, x3, ...) satisfies these conditions, it is said to follow the multivariate normal distribution, i.e. G obeys the density f_X. Students whose score information satisfies the multivariate normal distribution are assigned to the standard class, and those whose score information does not are assigned to the singular class, as given by formula (2):
f_X(x) = (2π)^(−k/2) |Σ|^(−1/2) exp(−(1/2)(x − μ)ᵀ Σ⁻¹ (x − μ))   (2)
wherein x is the normalized student information array, μ is the mean, k is the dimension index, and Σ is the covariance matrix;
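Formula (2) and the standard/singular split of step 2) can be sketched as follows (a minimal illustration; the density threshold and the covariance estimate are assumptions, since the claim does not fix them):

```python
import numpy as np

def multivariate_normal_pdf(x, mu, sigma):
    """Density of a k-dimensional normal distribution, as in formula (2)."""
    x, mu = np.asarray(x, dtype=float), np.asarray(mu, dtype=float)
    k = x.size
    diff = x - mu
    norm = (2.0 * np.pi) ** (-k / 2.0) * np.linalg.det(sigma) ** (-0.5)
    return float(norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff))

def classify_record(x, mu, sigma, threshold):
    """Standard class if the record is dense enough under the fitted
    normal, singular class otherwise (the threshold is an assumption)."""
    if multivariate_normal_pdf(x, mu, sigma) >= threshold:
        return "standard"
    return "singular"
```

In practice μ and Σ would be estimated from the normalized score vectors of the whole cohort before each record is classified.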
3) artificial neural network training based on a refined network: the data relating to the student score information obtained in step 2) are trained with an artificial neural network using resilient backpropagation (the Rprop algorithm). In Rprop the magnitude of the weight change Δw_{i,j} is directly equal to the learning rate η_{i,j}(t); the gradient of the error function does not influence the magnitude of the weight change, only its sign, i.e. the direction in which the weight changes. During training, the amount by which each weight changes equals its own learning rate, the sign of the change is taken from the sign of the error-function gradient, and the gradient therefore determines only the update direction, never the update strength. If the gradient of the error function is positive, the corresponding weight is decreased, w_{i,j} being reduced by η_{i,j}(t); if the gradient is negative, the corresponding weight is increased, driving the error function toward its minimum, as shown in formula (3):
Δw_{i,j}(t) = −sign(∂E/∂w_{i,j}(t)) · η_{i,j}(t)   (3)
with the weight update thus defined, the learning rate η_{i,j}(t) is updated next. Between any two time points t−1 and t the gradient either changes sign or does not, giving two cases: if the error-function gradients at t−1 and t have different signs, the minimum was crossed at t and the previous weight-update step was too large, so η_{i,j}(t) is made smaller than η_{i,j}(t−1) by multiplying the previous learning rate by a value η_up with 0 < η_up < 1; if the two signs are the same, the lowest point of the error function has not yet been reached, and the previous learning rate is multiplied by a value η_down greater than 1 to obtain the current learning rate, as shown in formula (4):
η_{i,j}(t) = η_up · η_{i,j}(t−1), if ∂E/∂w_{i,j}(t) · ∂E/∂w_{i,j}(t−1) < 0; η_down · η_{i,j}(t−1), if ∂E/∂w_{i,j}(t) · ∂E/∂w_{i,j}(t−1) > 0; η_{i,j}(t−1), otherwise   (4)
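Formulas (3) and (4) combine into the following per-weight update (a minimal NumPy sketch; the η_up and η_down values are illustrative, and the naming follows the claim, where η_up shrinks the step and η_down grows it):

```python
import numpy as np

# Per the claim's naming: eta_up (0 < eta_up < 1) shrinks the step after
# the gradient flips sign; eta_down (> 1) grows it while the sign holds.
ETA_UP, ETA_DOWN = 0.5, 1.2  # illustrative values, not fixed by the claim

def rprop_step(w, grad, prev_grad, lr):
    """One Rprop update per formulas (3)-(4): the gradient sign sets the
    direction, the per-weight learning rate alone sets the step size."""
    same = grad * prev_grad
    lr = np.where(same > 0, lr * ETA_DOWN,
                  np.where(same < 0, lr * ETA_UP, lr))  # formula (4)
    w = w - np.sign(grad) * lr                          # formula (3)
    return w, lr
```

Applied once per epoch to every weight, this makes the step size adapt per weight while remaining independent of the gradient magnitude.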
4) training an LSTM network with an adaptive excitation function: an LSTM network is used to train the daily campus-card consumption information of each student, and its result is cascaded with the artificial neural network training result to output the final graduation probability. The prior-art LSTM structure is as follows: the LSTM module has three inputs, c_{t−1}, h_{t−1} and x_t, and three outputs, c_t, h_t and y_t, where x_t is the input of the current round, h_{t−1} is the state output of the previous round, c_{t−1} is the global information carrier of the previous round, y_t is the output of the current round, h_t is the state output of the current round, and c_t is the global information carrier of the current round. x_t and h_{t−1} are concatenated into one vector, multiplied by the matrix W, and wrapped in a tanh function to obtain the candidate vector z. Using the sigmoid activation function σ, the prior-art LSTM also multiplies the concatenation of x_t and h_{t−1} by the matrices W_f, W_i and W_o to obtain z_f, z_i and z_o; W_f, W_i and W_o are the weight matrices of the forget gate, the input gate and the output gate, multiplied by the variable input of each gate, and z_f, z_i, z_o are the respective gate outputs, each being the gate input multiplied by its weights plus an offset. c_t is obtained from formula (5):
c_t = z_f ⊙ c_{t−1} + z_i ⊙ z (5), where ⊙ denotes element-wise multiplication,
h_t is obtained from formula (6): h_t = z_o ⊙ tanh(c_t) (6),
and the output of the current round is obtained from formula (7): y_t = σ(W h_t) (7).
The original tanh excitation function is changed into a weighted-average adaptive excitation function combining ReLU and tanh: for the data x passed through each gate of the LSTM,
f(x) = u·ReLU(x) + v·tanh(x),
under the constraint u + v = 1. The campus-card consumption information of the students is thus trained through the LSTM, each student is classified for early warning, and a personal analysis report is generated for each student in a targeted manner; a student flagged as high-risk receives the corresponding early-warning message. This result is cascaded with the artificial neural network of step 3) to predict the student's graduation probability;
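The cell equations (5)-(7) and the adaptive excitation above can be sketched as one step (a minimal NumPy illustration; the weight shapes, the omission of biases, and the u, v split are assumptions, not part of the claim):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_act(x, u=0.5, v=0.5):
    """Weighted-average activation u*ReLU(x) + v*tanh(x) with u + v = 1
    (the u, v values here are assumed example weights)."""
    return u * np.maximum(x, 0.0) + v * np.tanh(x)

def lstm_step(x_t, h_prev, c_prev, Wf, Wi, Wo, W, Wy):
    """One cell step per formulas (5)-(7), with the adaptive activation in
    place of tanh on the candidate vector z (biases omitted for brevity)."""
    concat = np.concatenate([x_t, h_prev])
    z_f = sigmoid(Wf @ concat)      # forget gate
    z_i = sigmoid(Wi @ concat)      # input gate
    z_o = sigmoid(Wo @ concat)      # output gate
    z = adaptive_act(W @ concat)    # candidate vector
    c_t = z_f * c_prev + z_i * z    # formula (5)
    h_t = z_o * np.tanh(c_t)        # formula (6)
    y_t = sigmoid(Wy @ h_t)         # formula (7)
    return c_t, h_t, y_t
```

Iterating this step over a student's daily consumption sequence yields the final y_t that is cascaded with the artificial-neural-network output.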
5) integration with a software platform: a software platform implementing the above steps is built on existing early-warning software, and the user portal is divided into a teacher side and a student side. A student may view his or her personal scores and related information, the score information of his or her class, and the corresponding subject rankings at class and grade level; a student flagged as high-risk receives the corresponding early-warning message. A teacher may view the information of all students in the courses he or she teaches, together with all class-level and grade-level information; when a student triggers an early warning, the teacher also receives a prompt, making it easy to follow that student's performance. The platform additionally generates per-student, per-class, and per-course charts, so that teachers can track each student's learning progress and state in real time.
CN202110101091.8A 2021-01-26 2021-01-26 Academic early warning method based on artificial neural network and LSTM network Expired - Fee Related CN112801362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110101091.8A CN112801362B (en) 2021-01-26 2021-01-26 Academic early warning method based on artificial neural network and LSTM network


Publications (2)

Publication Number Publication Date
CN112801362A true CN112801362A (en) 2021-05-14
CN112801362B CN112801362B (en) 2022-03-22

Family

ID=75811697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110101091.8A Expired - Fee Related CN112801362B (en) 2021-01-26 2021-01-26 Academic early warning method based on artificial neural network and LSTM network

Country Status (1)

Country Link
CN (1) CN112801362B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110939A (en) * 2019-05-15 2019-08-09 杭州华网信息技术有限公司 The academic record prediction and warning method of behavior is serialized based on deep learning student
CN111260230A (en) * 2020-01-19 2020-06-09 西北大学 Academic early warning method based on lifting tree model
US20200302296A1 (en) * 2019-03-21 2020-09-24 D. Douglas Miller Systems and method for optimizing educational outcomes using artificial intelligence
US20200356852A1 (en) * 2019-05-07 2020-11-12 Samsung Electronics Co., Ltd. Model training method and apparatus
CN112257935A (en) * 2020-10-26 2021-01-22 中国人民解放军空军工程大学 Aviation safety prediction method based on LSTM-RBF neural network model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SONG, Chuping et al.: "Application of an Improved RBF Neural Network Algorithm to Academic Early Warning in Universities", Computer Applications and Software (《计算机应用与软件》) *
XIAO, Yifeng: "Research on Data Mining Technology for Grade-Retention Early Warning of University Students", China Masters' Theses Full-text Database, Information Science and Technology Series (《中国优秀硕士学位论文全文数据库 (信息科技辑)》) *

Also Published As

Publication number Publication date
CN112801362B (en) 2022-03-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220322