CN105894119A - Student ranking prediction method based on campus data - Google Patents

Student ranking prediction method based on campus data

Info

Publication number
CN105894119A
Authority
CN
China
Prior art keywords
student
data
characteristic vector
ranking
students
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610207978.4A
Other languages
Chinese (zh)
Inventor
杨磊
聂敏
夏虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Xundao Technology Co Ltd
Original Assignee
Chengdu Xundao Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Xundao Technology Co Ltd
Priority to CN201610207978.4A
Publication of CN105894119A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a student ranking prediction method based on campus data, comprising the following steps: collecting the data of all students, including performance data and behavior data; cleaning the student data and normalizing the non-time data items; extracting the behavior feature vector of each student from the processed data, wherein the behavior features include a performance feature, an effort-level feature and a life regularity feature; reducing the dimension of each behavior feature vector; for each student, subtracting the behavior feature vector of each other student from the student's dimension-reduced behavior feature vector to obtain difference feature vectors, inputting the difference feature vectors into a classifier to obtain the corresponding label values, and summing the label values to obtain the student's score; and sorting the scores of all students to obtain the predicted ranking of each student. The invention analyzes students' campus data, describes their study habits and behavioral features with data, and predicts each student's ranking as a reference for student education.

Description

Student ranking prediction method based on campus data
Technical field
The invention belongs to the technical field of big data analysis and mining and, more specifically, relates to a student ranking prediction method based on campus data.
Background art
Understanding students' psychology, detecting abnormal student behavior, predicting students' academic status and providing personalized guidance have become problems and challenges faced by many colleges and universities. In recent years, with the scientific and technological revolution driven by data and computing, big data has become an important driver of the Internet information technology industry. How to introduce big data into education, as a strong force for promoting education reform and leading educational innovation, has become a new research direction. At present, however, because of problems such as the difficulty of quantifying student behavior, the application of big data in education is still at the conceptual stage, and no effective mode of application has yet emerged.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art and provide a student ranking prediction method based on campus data. By analyzing students' campus data, the method describes students' study habits and behavioral features with data and predicts each student's ranking, which serves as a reference for student education.
To achieve the above object, the student ranking prediction method based on campus data of the invention comprises the following steps:
S1: collect the data of all students, including performance data and behavior data, where the performance data includes the course type, number of credits and grade of every course taken by the student, and the behavior data includes the records of the student using the campus card at each place on campus;
S2: perform data cleaning on the collected student data;
S3: normalize the non-time data items in the cleaned student data using the following method:
Denote the j-th non-time data item of the i-th student by $x_{ij}$, where $i = 1, 2, \dots, N$, N is the number of students, $j = 1, 2, \dots, M$, and M is the number of non-time data items. For each value $x_{ij}$, compute the linear transformation value $x'_{ij}$ as:
$$x'_{ij} = \frac{x_{ij} - \min_j}{\max_j - \min_j}\,(T_{j\_max} - T_{j\_min}) + T_{j\_min}$$
where $\max_j$ is the maximum of the j-th data sequence, $\min_j$ is the minimum of the j-th data sequence, $T_{j\_max}$ is the upper bound of the target interval for the j-th data sequence, and $T_{j\_min}$ is the lower bound of that interval;
For the linearly transformed data $x'_{ij}$, compute the normalized value $y_{ij}$ as:
$$y_{ij} = \frac{x'_{ij} - \bar{x}_j}{s_j}$$
where $\bar{x}_j$ is the mean of the j-th data sequence and $s_j$ is the standard deviation of the j-th data sequence;
S4: extract the behavior feature vector of each student from the student data. The behavior features include a performance feature, an effort-level feature and a life regularity feature, where the performance feature includes the course type, number of credits and grade of every course taken by the student, the effort-level feature is the frequency with which the student enters study-related places, and the life regularity feature is the student's life regularity metric; these items together form the student's behavior feature vector;
S5: reduce the dimension of the behavior feature vectors extracted in step S4 to obtain the reduced behavior feature vector of each student;
S6: for the i-th student, subtract the behavior feature vector of each other student from that student's behavior feature vector to obtain N-1 difference feature vectors. Input the difference feature vectors into a pre-trained classifier to obtain the corresponding N-1 labels, each with value 1 or -1. Sum all the label values of the student to obtain the student's score, and sort the scores of all students to obtain each student's predicted ranking;
where the classifier is trained as follows: for students with historical rankings, collect their data, obtain their behavior feature vectors by the method of steps S1 to S5, and then compute the difference feature vector between each pair of students. For a difference feature vector, if the student corresponding to the minuend feature vector is ranked higher, the label of the difference feature vector is 1; otherwise it is -1. The difference feature vectors are used as classifier input and the corresponding labels as output to train the classifier.
The student ranking prediction method based on campus data of the invention collects the data of all students, including performance data and behavior data, cleans the student data and normalizes the non-time data items. The behavior feature vector of each student, covering the performance feature, the effort-level feature and the life regularity feature, is extracted from the processed data and then reduced in dimension. For each student, the behavior feature vector of each other student is subtracted from the student's reduced behavior feature vector to obtain difference feature vectors, which are input into the classifier to obtain the corresponding label values. Summing the label values gives the student's score, and sorting the scores of all students gives the predicted ranking of each student.
The invention performs in-depth analysis of students' learning behavior data on campus, gives an accurate quantitative description of each student's basic information, study and living conditions, and predicts each student's ranking. It thereby provides a quantified decision-making basis for the teaching management and daily guidance work of the relevant functional departments and effectively releases the value of student data.
Brief description of the drawings
Fig. 1 is a flow chart of the student ranking prediction method based on campus data of the invention;
Fig. 2 is a flow chart of the dimensionality reduction of the behavior feature data.
Detailed description of the invention
Specific embodiments of the invention are described below with reference to the drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the main content of the invention.
Embodiment
Fig. 1 is a flow chart of the student ranking prediction method based on campus data of the invention. As shown in Fig. 1, the method comprises the following steps:
S101: student data collection:
The data of all students is collected first. Student data originates from the various functional departments of the school and is heterogeneous, ranging from structured basic student information to time-serialized campus life trajectories. Student data includes performance data and behavior data. The performance data includes the course type, number of credits and grade of every course taken by the student, together with the components of each grade (such as the regular coursework grade and the midterm grade). The behavior data includes the records of the student using the campus card at each place on campus, for example: consumption records at the supermarket, at the canteen and for fetching water in classroom buildings, including the time and amount of each transaction; entry and exit records from the library and dormitory access control; and book borrowing records, including the book information and borrowing time. Table 1 gives an example of student data sources and content.
Table 1
S102: data cleaning:
After all student data has been collected, the raw data must be cleaned. Because the student data in the invention comes from multiple business systems and contains a large amount of historical data, duplicate values, missing values and similar defects are common, so data cleaning is required. The task of data cleaning is to filter out data that does not meet requirements and to write the data to the data warehouse after correction. The objects of cleaning mainly include duplicate values, missing values and inconsistent data. Data cleaning is a conventional technique in the big data field, and its detailed procedure is not repeated here.
S103: data normalization:
In the cleaned student data, each data item has different attributes and therefore usually a different dimension and order of magnitude. In general, expressing an attribute in smaller units gives it a larger value range, which tends to give that attribute a greater influence or higher "weight". To avoid dependence on the choice of measurement unit and to ensure reliable results, all data items in the raw data other than time data must be normalized.
Data normalization scales the data proportionally so that it falls into a small, specific interval. This approach is often used when processing indicators for comparison and evaluation: it removes the unit of the data and converts it into a dimensionless pure value, so that indicators of different units or magnitudes can be compared and weighted. In the invention, data normalization comprises the following two steps:
● Linear transformation:
Denote the j-th non-time data item of the i-th student by $x_{ij}$, where $i = 1, 2, \dots, N$, N is the number of students, $j = 1, 2, \dots, M$, and M is the number of non-time data items. For each value, compute the linear transformation value $x'_{ij}$ as:
$$x'_{ij} = \frac{x_{ij} - \min_j}{\max_j - \min_j}\,(T_{j\_max} - T_{j\_min}) + T_{j\_min}$$
where $\max_j$ is the maximum of the j-th data sequence, $\min_j$ is the minimum of the j-th data sequence, $T_{j\_max}$ is the upper bound of the target interval for the j-th data sequence, and $T_{j\_min}$ is the lower bound of that interval. The j-th data sequence is the sequence formed by the j-th data item of all students. The formula thus maps values originally in the interval $[\min_j, \max_j]$ uniformly onto $[T_{j\_min}, T_{j\_max}]$.
For example, suppose the j-th data sequence is [1, 2, 1, 4, 3, 2, 5, 6, 2, 7], whose value range is [1, 7], and the target interval is [0, 1]. The sequence after linear transformation is then [0, 0.16, 0, 0.5, 0.33, 0.16, 0.66, 0.83, 0.16, 1] (values truncated to two decimal places).
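The linear transformation can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function name `linear_transform` and the handling of a constant sequence are assumptions made here.

```python
def linear_transform(values, t_min=0.0, t_max=1.0):
    """Map a data sequence from [min, max] onto the target interval [t_min, t_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant sequence is mapped to the lower bound,
        # since the formula is undefined when max equals min.
        return [t_min for _ in values]
    scale = (t_max - t_min) / (hi - lo)
    return [(x - lo) * scale + t_min for x in values]


# The example sequence from the text, mapped onto [0, 1].
sequence = [1, 2, 1, 4, 3, 2, 5, 6, 2, 7]
scaled = linear_transform(sequence)
print([round(x, 4) for x in scaled])
```

The printed values are rounded, so entries such as 1/6 appear as 0.1667 where the text shows the truncated value 0.16.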
● Numerical normalization:
Numerical standardization is performed on the linearly transformed data using its mean and standard deviation. For the linearly transformed data $x'_{ij}$, the normalized value $y_{ij}$ is computed as:
$$y_{ij} = \frac{x'_{ij} - \bar{x}_j}{s_j}$$
where $\bar{x}_j$ is the mean of the j-th data sequence, $\bar{x}_j = \frac{1}{N}\sum_{i=1}^{N} x'_{ij}$, and $s_j$ is the standard deviation of the j-th data sequence.
After numerical standardization, each data sequence has mean 0 and variance 1 and is dimensionless. The field values in the sequence fluctuate around 0: a value greater than 0 is above the average level, and a value less than 0 is below it.
These two steps not only map the data onto a unified interval but also effectively eliminate the influence of outliers beyond the value range on the overall data distribution.
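The second step can be sketched as follows. The function name `standardize` is hypothetical, and dividing by N (the population standard deviation) is an assumption, since the text does not say whether N or N-1 is intended.

```python
import math


def standardize(values):
    """Center a sequence on its mean and divide by its standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    if std == 0:
        # Assumption: a constant sequence has no spread, so every value maps to 0.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]


# Input: the linearly transformed example sequence from the previous step.
transformed = [0, 1 / 6, 0, 0.5, 1 / 3, 1 / 6, 2 / 3, 5 / 6, 1 / 6, 1]
normalized = standardize(transformed)
print([round(y, 3) for y in normalized])
```

With this choice of divisor the output sequence has mean 0 and variance 1, matching the property stated above.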
S104: extract behavior feature vectors:
After data normalization, learning behavior features are extracted from the data. The behavior features required for each student are divided into three parts: the performance feature, the effort-level feature and the life regularity feature. The performance feature includes the course type, number of credits and grade of every course taken by the student. The effort-level feature counts how often the student enters study-related places, including the number of library entries, classroom check-ins, printing jobs and book loans, and thereby describes the student's study effort and willingness to learn actively. The life regularity feature is the student's life regularity metric, which characterizes the regularity of the student's daily routine by analyzing the times at which the student uses the campus card at different places.
In this embodiment, the life regularity metric is computed as follows: first, from each student's data on visits to several preset places (typically the canteen, dormitory and classroom), compute the student's access probability for each of these places within a predetermined time period; then compute the Shannon entropy from the access probabilities. This Shannon entropy is the student's life regularity metric.
Shannon entropy expresses the average amount of information carried by a discrete variable and can be used to characterize life regularity. It is computed as:
$$H_i(z) = -\sum_{f=1}^{F} P_{if}(z) \log_2 P_{if}(z)$$
where $H_i(z)$ is the Shannon entropy of the i-th student, $P_{if}(z)$ is the probability that the i-th student visits the f-th place, $f = 1, 2, \dots, F$, and F is the number of places.
For example, if one student's access probabilities for the canteen, dormitory and classroom are 0.3, 0.3 and 0.4 respectively, the Shannon entropy is $H_1(z) \approx 1.571$. If another student's access probabilities for the three places are 0.1, 0.6 and 0.2 respectively, then $H_2(z) \approx 1.24$. The latter's Shannon entropy is smaller, reflecting more regular behavior (the probability of entering and leaving the dormitory is higher). For a probability distribution, when the probability is concentrated on a few values (in most cases the variable takes one of a small number of values), the Shannon entropy is low; conversely, when the probability is spread evenly across the values (it is almost impossible to tell which value the variable will take), the Shannon entropy is high. Hence, the more concentrated a student's visits to the places are, the smaller the entropy and the stronger the life regularity.
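The entropy calculation in this example can be checked with a short Python sketch. The function name `shannon_entropy` is an assumption; the two probability lists are taken from the text as given.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Access probabilities for canteen, dormitory and classroom from the text.
h1 = shannon_entropy([0.3, 0.3, 0.4])
h2 = shannon_entropy([0.1, 0.6, 0.2])
print(round(h1, 3), round(h2, 3))
```

The second value is the smaller of the two, which corresponds to the more regular routine described above.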
The access probability for each place can be obtained by direct counting from the student data or by density estimation; the specific method can be chosen as needed. Given the large volume of student data in the invention, an access probability calculation method is proposed, whose procedure is as follows:
Divide the predetermined time period into finer time intervals. Extract from the student data the times at which the student visits each class of place and project them onto the subdivided time intervals. Count the number of visits to each class of place in each subdivided time interval, use a density estimation function to estimate the access probability for that class of place within each subdivided interval, and then integrate to obtain the access probability for that class of place over the predetermined time period. The density estimation function can be selected according to actual needs; the one used in this embodiment is:
$$p_{ifv}(z) = \frac{1}{\sqrt{2\pi}\,G_{if}\,h_{if}} \sum_{v=1}^{V} e^{-\frac{(z - z_{ifv})^2}{2 h_{if}^2}}$$
where $p_{ifv}(z)$ is the access probability of the i-th student for the f-th place in the v-th subdivided time interval, $v = 1, 2, \dots, V$, and V is the number of subdivided time intervals. $z_{ifv}$ is the number of visits by the i-th student to the f-th place in the v-th subdivided time interval. $G_{if}$ is the total number of visits by the i-th student to the f-th place within the predetermined time period, i.e. $G_{if} = \sum_{v=1}^{V} z_{ifv}$. $h_{if}$ is the density estimation bandwidth for the i-th student's visits to the f-th place, given by the empirical formula:
$$h_{if} = 1.06\,\sigma_{if}\,G_{if}^{-\frac{1}{5}}$$
where $\sigma_{if}$ is the standard deviation of the V visit counts $z_{ifv}$.
Integrating the V values $p_{ifv}(z)$ then gives the access probability $P_{if}(z)$ of the i-th student for the f-th place within the predetermined time period.
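The two formulas above can be transcribed literally into Python as a rough sketch. The text does not spell out the range of z or how the final integration is carried out, so this code only evaluates the density expression and the bandwidth rule. The function names, the sample visit counts and the use of the population standard deviation are all assumptions made for illustration.

```python
import math
import statistics


def bandwidth(counts):
    """Empirical bandwidth h = 1.06 * sigma * G^(-1/5) from per-interval visit counts."""
    total = sum(counts)                 # G: total visits in the period
    sigma = statistics.pstdev(counts)   # assumption: population standard deviation
    h = 1.06 * sigma * total ** (-1 / 5)
    if h == 0:
        raise ValueError("bandwidth is zero: the visit counts do not vary")
    return h


def density(z, counts):
    """Gaussian kernel density expression evaluated at z, as written in the text."""
    h = bandwidth(counts)
    total = sum(counts)
    norm = 1.0 / (math.sqrt(2 * math.pi) * total * h)
    return norm * sum(math.exp(-((z - c) ** 2) / (2 * h * h)) for c in counts)


# Hypothetical visit counts for one student and one place over V = 5 intervals.
visit_counts = [3, 5, 2, 6, 4]
print(round(bandwidth(visit_counts), 4), round(density(4.0, visit_counts), 4))
```

Because the normalizing constant uses the total visit count G rather than the number of kernels V, this expression integrates to V/G over z rather than to 1, so a further normalization across places may be needed before the values are used as probabilities in the entropy formula.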
S105: dimensionality reduction of behavior feature data:
After the student features are extracted, there are many feature items, so the data must be reduced in dimension. Dimensionality reduction lowers computational complexity and reduces the information loss caused by correlation between features, which is significant for feature extraction from massive data. There are many dimensionality reduction methods, and one can be selected according to actual needs. In this embodiment, a dimensionality reduction method is designed for the application scenario of the invention: it converts many indicators into a few composite indicators so that the reduced feature data carries more comprehensive information.
Fig. 2 is a flow chart of the dimensionality reduction of the behavior feature data. As shown in Fig. 2, the dimensionality reduction comprises the following steps:
S201: construct the behavior feature matrix:
Denote the behavior feature vector of the i-th student by $B_i = \{b_{i1}, b_{i2}, \dots, b_{iD}\}^T$, where D is the number of feature items and the superscript T denotes transposition. The behavior feature data of all students forms the behavior feature matrix U of size D × N, whose i-th column is $B_i$.
S202: compute the covariance matrix:
Compute the covariance matrix C of the behavior feature matrix U.
S203: compute the eigenvector matrix of the covariance matrix:
Compute the eigenvalues of the covariance matrix C and the corresponding eigenvectors, arrange the eigenvectors as rows of a matrix in descending order of their eigenvalues, and take the first K rows to form the eigenvector matrix P. The value of K is set according to actual needs.
S204: compute the reduced behavior feature matrix:
Compute the reduced behavior feature matrix Q = PU; the i-th column of Q is the reduced behavior feature vector $B'_i$ of the i-th student.
The matrix Q has K rows. The larger K is in step S203, the better Q reflects the behavior features, but the higher the complexity of subsequent computation. The value of K is generally restricted to a set range.
Suppose the behavior feature matrix H constructed from the behavior feature vectors of 10 students is as follows:
$$H = \begin{bmatrix} 2.5 & 0.5 & 2.2 & 1.9 & 3.1 & 2.3 & 2 & 1 & 1.5 & 1.1 \\ 2.4 & 0.7 & 2.9 & 2.2 & 3 & 2.7 & 1.6 & 1.1 & 1.6 & 0.9 \end{bmatrix}$$
The behavior feature vector of each student thus contains two feature items.
The covariance matrix C is obtained as follows:
$$C = \begin{bmatrix} 0.616555556 & 0.615444444 \\ 0.615444444 & 0.716555556 \end{bmatrix}$$
The eigenvalues λ of the covariance matrix C and the corresponding eigenvectors α are:
$\lambda_1 = 0.0490833989$, $\alpha_1 = [-0.735178656, 0.677873399]$
$\lambda_2 = 1.28402771$, $\alpha_2 = [-0.677873399, -0.735178656]$
The eigenvector corresponding to the single largest eigenvalue $\lambda_2$ is then selected as the row of the eigenvector matrix, giving $P = [-0.677873399, -0.735178656]$. The reduced behavior feature matrix Q = PU is computed with U = H, each row of which is first centered by subtracting its mean, that is:
Q = [-0.8280, 1.7776, -0.9922, -0.2742, -1.6758, -0.9129, 0.0991, 1.1446, 0.4380, 1.2238]
Each value in matrix Q is given to four decimal places.
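The worked example can be reproduced with a dependency-free Python sketch. Two points are assumptions of this sketch rather than part of the described method: the dominant eigenvector is found by power iteration, a stand-in for a general eigensolver that only covers the K = 1 case, and the helper names `covariance_matrix` and `principal_axis` are hypothetical. The covariance uses the N-1 divisor, which is what reproduces the matrix C shown above.

```python
def covariance_matrix(rows):
    """Sample covariance (divide by n - 1) of a D x N matrix given as D rows."""
    n = len(rows[0])
    means = [sum(r) / n for r in rows]
    centered = [[x - m for x in r] for r, m in zip(rows, means)]
    cov = [[sum(a * b for a, b in zip(ri, rj)) / (n - 1) for rj in centered]
           for ri in centered]
    return cov, centered


def principal_axis(cov, iterations=200):
    """Dominant eigenvalue and eigenvector by power iteration (K = 1 only)."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue.
    eigenvalue = sum(vi * sum(c * x for c, x in zip(row, v))
                     for vi, row in zip(v, cov))
    return eigenvalue, v


H = [[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
     [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]]
C, centered = covariance_matrix(H)
lam, axis = principal_axis(C)
# The sign of an eigenvector is arbitrary; flip it to match the sign used in the text.
P = [-x for x in axis]
# Project the mean-centered data onto P (one value per student).
Q = [sum(p * row[i] for p, row in zip(P, centered)) for i in range(len(H[0]))]
print(round(lam, 6), [round(q, 4) for q in Q])
```

For K greater than 1, a full symmetric eigendecomposition would be needed in place of `principal_axis`.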
S106: student ranking prediction:
Steps S101 to S105 extract the behavior feature vector of each student from the massive student data, and ranking prediction can then be carried out using these vectors. The ranking prediction method of the invention is as follows:
For the i-th student, subtract the behavior feature vector of each other student from that student's behavior feature vector to obtain N-1 difference feature vectors. Input the difference feature vectors into a pre-trained classifier to obtain the corresponding N-1 labels, each with value 1 or -1. Sum all the label values of the student to obtain the student's score, and sort the scores of all students to obtain each student's predicted ranking.
The classifier is trained on data of students with historical rankings. The training method is: for students with historical rankings, collect their data, obtain their behavior feature vectors by the method of steps S101 to S105, and then compute the difference feature vector between each pair of students. For a difference feature vector, if the student corresponding to the minuend feature vector is ranked higher, the label of the difference feature vector is 1; otherwise it is -1. The difference feature vectors are used as classifier input and the corresponding labels as output to train the classifier.
As described above, the invention uses pairwise comparison to characterize the difference between two people: the behavior feature vectors of any two people are subtracted to form a new feature vector. For example, if student A is ranked 5th with behavior feature vector $A = (3, 2, 5, 7, 9, 6, 8, 1, 4, 7)^T$ and student B is ranked 12th with behavior feature vector $B = (5, 9, 8, 6, 7, 1, 3, 4, 7, 6)^T$, then the difference feature vector is $A - B = (-2, -7, -3, 1, 2, 5, 5, -3, -3, 1)^T$.
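Constructing the training samples from historical rankings can be sketched as below. The function name `build_training_pairs` is hypothetical, and a numerically smaller rank is assumed to mean a better ranking.

```python
from itertools import combinations


def build_training_pairs(features, ranks):
    """Return (difference_vector, label) for every unordered pair of students."""
    samples = []
    for a, b in combinations(range(len(features)), 2):
        diff = [x - y for x, y in zip(features[a], features[b])]
        # Label 1 when the minuend (student a) holds the better, i.e. smaller, rank.
        label = 1 if ranks[a] < ranks[b] else -1
        samples.append((diff, label))
    return samples


features = [
    [3, 2, 5, 7, 9, 6, 8, 1, 4, 7],   # student A, ranked 5th
    [5, 9, 8, 6, 7, 1, 3, 4, 7, 6],   # student B, ranked 12th
]
samples = build_training_pairs(features, ranks=[5, 12])
print(samples)
```

The single sample produced here is the difference vector A - B from the example, labeled 1 because A is ranked ahead of B.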
Suppose the training sample contains W students. One difference feature vector is computed for each pair of students, giving W(W-1)/2 difference feature vectors, so the classifier has W(W-1)/2 training samples. Since there are only two label classes (1 and -1), the quantity to predict is this label. In other words, the invention converts ranking prediction into first predicting the relative order of each pair of students and then converting these relative orders into an actual ranking. The ranking prediction problem is thereby turned into a learning-to-rank problem, which effectively solves student ranking prediction. The higher student A is ranked, the more often 1 and the less often -1 appears among the labels produced by comparing A with others. A score is obtained by summing the labels produced by comparing student A with the other students, and sorting the scores of all students gives the predicted ranking of student A. For example, if the label set obtained by comparing student A with the other students is (1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1) and that obtained for student B is (-1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1), then student A's score is 1 and student B's score is -1, so student A is ranked ahead of student B.
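The scoring and sorting step can be sketched as follows. The trained classifier is replaced here by a toy rule, `toy_classifier`, which is purely a hypothetical stand-in; `predict_ranking` and the three sample feature vectors are likewise illustrative assumptions.

```python
def predict_ranking(features, classify):
    """Score each student by summing pairwise labels, then rank by score."""
    n = len(features)
    scores = []
    for i in range(n):
        total = 0
        for k in range(n):
            if k == i:
                continue
            diff = [x - y for x, y in zip(features[i], features[k])]
            total += classify(diff)   # each label is 1 or -1
        scores.append(total)
    # A higher score means a better (numerically smaller) predicted rank.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    ranking = [0] * n
    for position, student in enumerate(order, start=1):
        ranking[student] = position
    return scores, ranking


def toy_classifier(diff):
    """Hypothetical stand-in for the trained classifier: favors the larger feature sum."""
    return 1 if sum(diff) > 0 else -1


scores, ranking = predict_ranking([[0.9, 0.8], [0.2, 0.1], [0.5, 0.6]], toy_classifier)
print(scores, ranking)

# The label sets from the text sum to the scores stated there.
score_a = sum([1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1])
score_b = sum([-1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1])
```

In practice `classify` would be the prediction function of whichever binary classifier was trained on the pairwise samples.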
Although illustrative embodiments of the invention have been described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, variations are apparent as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all inventions and creations that make use of the inventive concept are protected.

Claims (5)

1. A student ranking prediction method based on campus data, characterized by comprising the following steps:
S1: collect the data of all students, including performance data and behavior data, where the performance data includes the course type, number of credits and grade of every course taken by the student, and the behavior data includes the records of the student using the campus card at each place on campus;
S2: perform data cleaning on the collected student data;
S3: normalize the non-time data items in the cleaned student data using the following method:
Denote the j-th non-time data item of the i-th student by $x_{ij}$, where $i = 1, 2, \dots, N$, N is the number of students, $j = 1, 2, \dots, M$, and M is the number of data items. For each value $x_{ij}$, compute the linear transformation value $x'_{ij}$ as:
$$x'_{ij} = \frac{x_{ij} - \min_j}{\max_j - \min_j}\,(T_{j\_max} - T_{j\_min}) + T_{j\_min}$$
where $\max_j$ is the maximum of the j-th data sequence, $\min_j$ is the minimum of the j-th data sequence, $T_{j\_max}$ is the upper bound of the target interval for the j-th data sequence, and $T_{j\_min}$ is the lower bound of that interval;
For the linearly transformed data $x'_{ij}$, compute the normalized value $y_{ij}$ as:
$$y_{ij} = \frac{x'_{ij} - \bar{x}_j}{s_j}$$
Wherein,Represent the mean value of jth item data sequence, sjRepresent the variance of jth item data sequence;
S4: extract from student data each student behavioural characteristic vector, behavioural characteristic include achievement feature, Level of effort feature and rule of life feature, wherein achievement feature include all courses of student course types, Credit number, achievement, level of effort feature is the frequency that student enters the relevant place of study, rule of life feature It is the rule of life metric of student, is made up of the behavioural characteristic vector of student data above item;
S5: the behavioural characteristic vector extracting step S4 carries out dimensionality reduction, the row of each student after obtaining dimensionality reduction It is characterized vector;
S6: to i-th student, uses the behavioural characteristic vector after its dimensionality reduction to deduct the behavior of other each students Characteristic vector, obtains N-1 difference characteristic vector, by classification good for difference characteristic vector input training in advance Device, obtains N-1 label of correspondence, and label value is 1 or-1, is sued for peace by all label values of this student, Obtain the score of this student, the score of all students is ranked up, thus obtain the ranking predicted value of student;
The classifier is trained as follows: for students who have historical rankings, collect their student data and obtain their behavioral feature vectors by the method of steps S1 to S5, then compute the difference feature vector between every pair of students. For a given difference feature vector, if the student corresponding to the feature vector being subtracted from (the minuend) ranks higher, the label of that difference feature vector is 1; otherwise it is -1. The difference feature vectors are used as the classifier's input and the corresponding labels as its output to train the classifier.
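The normalization described above (linear transformation onto a limited interval, followed by centering and division by $s_j$) can be sketched as follows. This is a minimal illustration, not the applicant's implementation. The function names, the default interval [0, 1], the sample data, and the choice to take the mean and variance over the transformed sequence are all assumptions. The text names $s_j$ as the variance; the `use_std` flag is a hypothetical option that divides by the standard deviation instead.

```python
from statistics import mean, pvariance


def min_max_scale(column, t_min=0.0, t_max=1.0):
    """Linearly map one data item (a column over all students) onto [t_min, t_max]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column is placed at the lower bound.
        return [t_min for _ in column]
    return [(x - lo) / (hi - lo) * (t_max - t_min) + t_min for x in column]


def standardize(column, use_std=False):
    """Subtract the column mean and divide by s_j.

    Following the text, s_j is the (population) variance by default.
    use_std=True is an assumed alternative that yields the usual z-score.
    """
    m = mean(column)
    s = pvariance(column)
    if use_std:
        s = s ** 0.5
    if s == 0:
        return [0.0 for _ in column]
    return [(x - m) / s for x in column]


# Hypothetical data item j: monthly library visits of five students.
visits = [2.0, 4.0, 6.0, 8.0, 10.0]
scaled = min_max_scale(visits)      # values now lie in [0, 1]
normalized = standardize(scaled)    # centered on zero
print(scaled)
print(normalized)
```

Applying the two functions column by column to an N × M table of non-temporal data gives the values $y_{ij}$ used in the later steps.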
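Step S6 and the classifier training can be sketched as a pairwise ranking procedure. The text does not specify the type of classifier, so the sketch below assumes a simple linear perceptron without a bias term. All function names and the sample feature vectors are hypothetical. The label is 1 when the minuend student ranks higher (a smaller rank number), as described above.

```python
def difference(a, b):
    """Element-wise difference a - b of two feature vectors."""
    return [x - y for x, y in zip(a, b)]


def pairwise_training_set(features, ranks):
    """Build (difference vector, label) samples from students with known ranks."""
    samples = []
    for i, fi in enumerate(features):
        for j, fj in enumerate(features):
            if i != j:
                label = 1 if ranks[i] < ranks[j] else -1
                samples.append((difference(fi, fj), label))
    return samples


def predict(weights, diff):
    """Return 1 or -1 for one difference vector (assumed linear classifier)."""
    return 1 if sum(w * d for w, d in zip(weights, diff)) > 0 else -1


def train_perceptron(samples, epochs=100):
    """Perceptron training; stops early once every sample is classified correctly."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for diff, label in samples:
            if predict(weights, diff) != label:
                weights = [w + label * d for w, d in zip(weights, diff)]
                mistakes += 1
        if mistakes == 0:
            break
    return weights


def predict_ranking(weights, features):
    """Score each student by summing N-1 labels, then rank by descending score."""
    scores = [
        sum(predict(weights, difference(fi, fj))
            for j, fj in enumerate(features) if j != i)
        for i, fi in enumerate(features)
    ]
    order = sorted(range(len(features)), key=lambda i: -scores[i])
    ranks = [0] * len(features)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return scores, ranks


# Hypothetical reduced feature vectors of students with historical rankings.
train_features = [[0.9, 0.8], [0.7, 0.6], [0.4, 0.5], [0.1, 0.2]]
train_ranks = [1, 2, 3, 4]
weights = train_perceptron(pairwise_training_set(train_features, train_ranks))

# Hypothetical students whose ranking is to be predicted.
new_features = [[0.2, 0.3], [0.8, 0.9], [0.5, 0.5]]
scores, ranks = predict_ranking(weights, new_features)
print(scores, ranks)
```

Because each score is a sum of N-1 labels in {1, -1}, it lies between -(N-1) and N-1; a student predicted to outrank everyone else receives N-1.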
The student ranking prediction method according to claim 1, characterized in that the life-regularity metric in step S4 is calculated as follows: from each student's data, determine the student's visits to several preset places; calculate the student's access probability for these places within a predetermined time period; then calculate the Shannon entropy from the access probabilities. This Shannon entropy is the student's life-regularity metric.
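The entropy calculation above can be illustrated with a short sketch. It assumes that the access probabilities are estimated as simple relative visit frequencies and that the logarithm is taken to base 2; neither detail is stated in the text. The function names and place names are hypothetical.

```python
import math
from collections import Counter


def access_probabilities(visits, places):
    """Relative frequency of visits to each preset place (assumed estimator)."""
    counts = Counter(visits)
    total = sum(counts[p] for p in places)
    if total == 0:
        return {p: 0.0 for p in places}
    return {p: counts[p] / total for p in places}


def shannon_entropy(probabilities):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


places = ["canteen", "library", "dormitory", "gym"]

# Hypothetical visit logs for two students over the same period.
spread_out = ["canteen", "library", "dormitory", "gym"] * 5
concentrated = ["library"] * 20

h_spread = shannon_entropy(access_probabilities(spread_out, places).values())
h_concentrated = shannon_entropy(access_probabilities(concentrated, places).values())
print(h_spread, h_concentrated)
```

A student whose visits are concentrated on few places yields a low entropy, while visits spread evenly over all preset places yield the maximum, $\log_2$ of the number of places.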
The student ranking prediction method according to claim 2, characterized in that the access probability is calculated as follows:
Divide the predetermined time period into time intervals; extract from the student data the times at which the student visited each class of place and map them onto the time intervals; count the number of visits to each class of place in each time interval; use a density estimation function to estimate the access probability for that class of place within each time interval; then integrate to obtain the access probability for that class of place over the predetermined time period.
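One possible reading of this procedure is sketched below. The text does not name the density estimation function, so the sketch assumes a Gaussian kernel density estimate applied directly to the access times, integrated analytically over each time interval with the normal cumulative distribution function. The bandwidth, the hourly intervals, the function names, and the sample access times are all assumptions.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def slot_probabilities(access_hours, slot_edges, bandwidth=0.5):
    """Probability mass of a Gaussian kernel density estimate in each interval.

    access_hours: times (hour of day) at which one class of place was visited.
    slot_edges:   boundaries of the time intervals, in increasing order.
    """
    n = len(access_hours)
    probs = []
    for left, right in zip(slot_edges[:-1], slot_edges[1:]):
        mass = sum(
            normal_cdf(right, t, bandwidth) - normal_cdf(left, t, bandwidth)
            for t in access_hours
        )
        probs.append(mass / n if n else 0.0)
    return probs


def period_probability(slot_probs, slot_edges, start, end):
    """Integrate (sum) the interval masses lying fully inside [start, end]."""
    return sum(
        p
        for p, left, right in zip(slot_probs, slot_edges[:-1], slot_edges[1:])
        if left >= start and right <= end
    )


# Hypothetical canteen access times of one student, as hours of the day.
canteen_hours = [7.5, 7.6, 12.1, 12.0, 18.2]
edges = list(range(0, 25))  # hourly intervals covering one day

probs = slot_probabilities(canteen_hours, edges)
lunch = period_probability(probs, edges, 11, 13)
print(lunch)
```

Repeating the calculation for each class of place gives the access probabilities from which the Shannon entropy of claim 2 is computed.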
The student ranking prediction method according to claim 1, characterized in that the dimensionality of the behavioral feature vectors is reduced in step S5 as follows:
S5.1: Denote the behavioral feature vector of the i-th student as $B_i = \{b_{i1}, b_{i2}, \ldots, b_{iD}\}^T$, where D is the number of features; assemble the behavioral feature data of all students into a behavioral feature matrix U of size D × N;
S5.2: Compute the covariance matrix C of the behavioral feature matrix U;
S5.3: Compute the eigenvalues and corresponding eigenvectors of the covariance matrix C, arrange the eigenvectors as the rows of a matrix in descending order of their eigenvalues, and take the first K rows to form the eigenvector matrix P; the value of K is set according to actual needs;
S5.4: Compute the reduced behavioral feature matrix Q = PU; the i-th column of Q is the reduced behavioral feature vector $B'_i$ of the i-th student.
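Steps S5.1 to S5.4 correspond to a principal component projection and can be sketched with NumPy as follows. The function name and the sample matrix are hypothetical. The sketch follows the text in projecting U itself (Q = PU); subtracting each feature's mean from U before projection is common practice but is not stated in the text, so it is not applied here.

```python
import numpy as np


def reduce_dimensions(U, K):
    """Project a D x N feature matrix onto its K leading eigenvectors.

    Returns P (K x D, eigenvectors as rows) and Q = P @ U (K x N).
    """
    C = np.cov(U)                          # D x D covariance; rows of U are features
    eigvals, eigvecs = np.linalg.eigh(C)   # ascending eigenvalues, eigenvectors as columns
    order = np.argsort(eigvals)[::-1]      # indices sorted by descending eigenvalue
    P = eigvecs[:, order[:K]].T            # first K eigenvectors arranged as rows
    Q = P @ U                              # column i is the reduced vector of student i
    return P, Q


# Hypothetical feature matrix: D = 3 features for N = 5 students.
U = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.2, 9.8],
    [0.5, 0.4, 0.6, 0.5, 0.5],
])

P, Q = reduce_dimensions(U, K=2)
print(Q.shape)
print(Q[:, 0])  # reduced behavioral feature vector of the first student
```

`np.linalg.eigh` is used because a covariance matrix is symmetric; it returns eigenvalues in ascending order, hence the explicit reordering.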
The student ranking prediction method according to claim 4, characterized in that the value range of the parameter K in said step is
CN201610207978.4A 2016-04-05 2016-04-05 Student ranking prediction method based on campus data Pending CN105894119A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610207978.4A CN105894119A (en) 2016-04-05 2016-04-05 Student ranking prediction method based on campus data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610207978.4A CN105894119A (en) 2016-04-05 2016-04-05 Student ranking prediction method based on campus data

Publications (1)

Publication Number Publication Date
CN105894119A true CN105894119A (en) 2016-08-24

Family

ID=57012173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610207978.4A Pending CN105894119A (en) 2016-04-05 2016-04-05 Student ranking prediction method based on campus data

Country Status (1)

Country Link
CN (1) CN105894119A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991187A (en) * 2017-04-10 2017-07-28 武汉朱雀闻天科技有限公司 The analysis method and device of a kind of campus data
CN107423563A (en) * 2017-07-25 2017-12-01 深信服科技股份有限公司 A kind of students psychology analysis method, equipment and its storage medium
CN108280531A (en) * 2017-07-28 2018-07-13 淮阴工学院 A kind of student class marks sequencing prediction technique returned based on Lasso
CN108280531B (en) * 2017-07-28 2021-07-09 淮阴工学院 Student class score ranking prediction method based on Lasso regression
CN108320045A (en) * 2017-12-20 2018-07-24 卓智网络科技有限公司 Student performance prediction technique and device
CN108305195A (en) * 2018-01-07 2018-07-20 深圳前海易维教育科技有限公司 A kind of comprehensive index system towards students in middle and primary schools' evaluation and theme attribute analysis
CN108875800A (en) * 2018-05-29 2018-11-23 重庆大学 A kind of behavioural characteristic extracting method based on RFID card
CN108876123A (en) * 2018-06-01 2018-11-23 首都师范大学 A kind of teaching interference method and device
CN108985522A (en) * 2018-08-02 2018-12-11 杭州华网信息技术有限公司 A kind of Intelligent campus extension section's method for early warning and system
CN110245867A (en) * 2019-06-18 2019-09-17 青海大学 A kind of grassland degeneration stage division based on bp neural network
CN110852390A (en) * 2019-11-13 2020-02-28 山东师范大学 Student score classification prediction method and system based on campus behavior sequence
CN112465260A (en) * 2020-12-10 2021-03-09 成都寻道科技有限公司 Student teaching management system based on campus data
CN113705985A (en) * 2021-08-12 2021-11-26 河南工业职业技术学院 Student think of political affairs condition analysis and early warning method, system, terminal and medium
CN113705985B (en) * 2021-08-12 2023-09-29 河南工业职业技术学院 Student status analysis early warning method, system, terminal and medium
CN115185996A (en) * 2022-07-19 2022-10-14 广州凯园软件科技有限公司 Education training comprehensive evaluation system based on point system

Similar Documents

Publication Publication Date Title
CN105894119A (en) Student ranking prediction method based on campus data
Fouss et al. Algorithms and models for network data and link analysis
Subasi Practical machine learning for data analysis using python
Wikle et al. Spatio-temporal statistics with R
Wauthier et al. Bayesian bias mitigation for crowdsourcing
Gangwar et al. Partitions based computational method for high-order fuzzy time series forecasting
Prado et al. Time series: modeling, computation, and inference
Getis Spatial interaction and spatial autocorrelation: a cross-product approach
Shahriari et al. An entropy search portfolio for Bayesian optimization
Carrijo et al. Modified Moran's I for small samples
Moschen et al. A ground motion record selection approach based on multiobjective optimization
Chachi et al. A hybrid fuzzy regression model and its application in hydrology engineering
CN104063429A (en) Predicting method for user behavior in e-commerce
Widiputra et al. Multiple time-series prediction through multiple time-series relationships profiling and clustered recurring trends
Jebaseel et al. M-learning sentiment analysis with data mining techniques
Islam et al. Incorporating spatial information in machine learning: The Moran eigenvector spatial filter approach
Gupta et al. K-Means clustering based high order weighted probabilistic fuzzy time series forecasting method
Pourzeynali et al. Robust multi-objective optimization design of active tuned mass damper system to mitigate the vibrations of a high-rise building
Amirteimoori et al. Increasing the discrimination power of data envelopment analysis
CN109241275A (en) A kind of text subject clustering algorithm based on natural language processing
CN112465260A (en) Student teaching management system based on campus data
Widiputra et al. Dynamic interaction networks versus local trend models for multiple time-series prediction
Mishra et al. Optimization of fuzzified economic order quantity model allowing shortage and deterioration with full backlogging
Proietti et al. Stochastic trends and seasonality in economic time series: new evidence from Bayesian stochastic model specification search
Christou et al. Nonlinear dimension reduction for conditional quantiles

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160824