CN105894119A - Student ranking prediction method based on campus data - Google Patents
- Publication number: CN105894119A
- Application number: CN201610207978.4A
- Authority
- CN
- China
- Prior art keywords
- student
- data
- characteristic vector
- ranking
- students
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
Abstract
The invention discloses a student ranking prediction method based on campus data, comprising the following steps: collecting the data of all students, including achievement data and behavioral data; cleaning the student data and normalizing the non-temporal data items; extracting the behavioral feature vector of each student from the processed data, where the behavioral features include achievement features, effort level features, and life regularity features; reducing the dimension of each behavioral feature vector; subtracting the behavioral feature vector of each of the other students from the dimension-reduced behavioral feature vector of each student to obtain difference feature vectors, inputting the difference feature vectors into a classifier to obtain the corresponding label values, and summing the label values to obtain the student's score; and sorting the scores of all students to obtain the predicted ranking of each student. The invention analyzes students' campus data, describes their study habits and behavioral features with data, and predicts each student's ranking as a reference for student education.
Description
Technical field
The invention belongs to the technical field of big data analysis and mining, and more specifically relates to a student ranking prediction method based on campus data.
Background art
Understanding student psychology, detecting abnormal student behavior, predicting students' academic performance, and providing personalized teaching have become problems and challenges faced by many colleges and universities. In recent years, with the scientific and technological revolution driven by "data and computing", big data has become an important driving factor of the Internet information technology industry. How to introduce big data into education, as a powerful aid for promoting educational reform and leading educational innovation, has become a new research direction. At present, however, because student behavior is difficult to quantify, among other problems, the application of big data in education is still at the conceptual stage, and no effective mode of application has yet emerged.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art and provide a student ranking prediction method based on campus data. By analyzing students' campus data, the method describes their study habits and behavioral features with data and predicts each student's ranking as a reference for student education.
To achieve the above object, the student ranking prediction method based on campus data of the present invention comprises the following steps:
S1: Collect the data of all students, including achievement data and behavioral data, where the achievement data includes the course type, number of credits, and grade of every course taken by the student, and the behavioral data includes the records of the student using the all-in-one campus card at each location on campus;
S2: Perform data cleaning on the collected student data;
S3: Normalize the non-temporal data items in the cleaned student data using the following method:
Let the $j$-th non-temporal data item of the $i$-th student be $x_{ij}$, $i = 1, 2, \dots, N$, where $N$ is the number of students, and $j = 1, 2, \dots, M$, where $M$ is the number of non-temporal data items. For each value $x_{ij}$, compute the linear transformation value $x'_{ij}$ as:
$$x'_{ij} = \frac{x_{ij} - \mathrm{min}_j}{\mathrm{max}_j - \mathrm{min}_j}\left(T_{j\_max} - T_{j\_min}\right) + T_{j\_min}$$
where $\mathrm{max}_j$ is the maximum of the $j$-th data sequence, $\mathrm{min}_j$ is the minimum of the $j$-th data sequence, $T_{j\_max}$ is the upper limit of the limiting interval of the $j$-th data sequence, and $T_{j\_min}$ is the lower limit of that interval;
For the linearly transformed data $x'_{ij}$, compute the normalized value $y_{ij}$ as:
$$y_{ij} = \frac{x'_{ij} - \bar{x}'_j}{s_j}$$
where $\bar{x}'_j$ is the mean of the $j$-th data sequence after linear transformation and $s_j$ is the standard deviation of that sequence;
S4: Extract the behavioral feature vector of each student from the student data. The behavioral features include achievement features, effort level features, and life regularity features, where the achievement features include the course type, number of credits, and grade of every course taken by the student, the effort level features are the frequencies with which the student enters study-related locations, and the life regularity feature is the student's life regularity metric; the above items together form the student's behavioral feature vector;
S5: Reduce the dimensionality of the behavioral feature vectors extracted in step S4 to obtain the dimension-reduced behavioral feature vector of each student;
S6: For the $i$-th student, subtract the behavioral feature vector of every other student from that student's behavioral feature vector to obtain N-1 difference feature vectors. Input the difference feature vectors into a pre-trained classifier to obtain the corresponding N-1 labels, each with value 1 or -1. Sum all the label values of the student to obtain the student's score, and sort the scores of all students to obtain each student's predicted ranking;
The classifier is trained as follows: for students with historical rankings, collect their data and obtain their behavioral feature vectors by the method of steps S1 to S5, then compute the difference feature vectors between students pair by pair. For a difference feature vector, if the student corresponding to the minuend feature vector ranks higher, the label of that difference feature vector is 1; otherwise it is -1. The classifier is trained with these difference feature vectors as input and the corresponding labels as output.
In the student ranking prediction method based on campus data of the present invention, the data of all students, including achievement data and behavioral data, is collected; the student data is cleaned and the non-temporal data items are normalized; the behavioral feature vector of each student, covering achievement features, effort level features, and life regularity features, is extracted from the processed data; the behavioral feature vectors are then reduced in dimension; for each student, the dimension-reduced behavioral feature vectors of all other students are subtracted from that student's own to obtain difference feature vectors, which are input into the classifier to obtain the corresponding label values; the label values are summed to obtain the student's score; and the scores of all students are sorted to obtain the predicted ranking of each student.
The present invention performs in-depth analysis of students' on-campus learning behavior data, accurately and quantitatively describes each student's basic information, study, and living conditions, and predicts each student's ranking. It provides a quantitative basis for decision-making in the teaching management and daily guidance work of the relevant functional departments, thereby effectively releasing the value of student data.
Brief description of the drawings
Fig. 1 is a flow chart of the student ranking prediction method based on campus data of the present invention;
Fig. 2 is a flow chart of the dimensionality reduction of the behavioral feature data.
Detailed description of the invention
Specific embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the main content of the invention.
Embodiment
Fig. 1 is a flow chart of the student ranking prediction method based on campus data of the present invention. As shown in Fig. 1, the method comprises the following steps:
S101: Student data collection:
First, the data of all students is collected. Student data originates from the various functional departments of the school and is heterogeneous in structure, ranging from structured basic student information to time-serialized campus life trajectories. Student data includes achievement data and behavioral data. Achievement data includes the course type, number of credits, and grade of every course taken by the student, as well as the components of each grade (such as regular coursework grades and midterm grades). Behavioral data includes the records of the student using the all-in-one campus card at each location on campus, for example: consumption records at supermarkets and canteens and for fetching water in classroom buildings, including the time and amount of each transaction; access-control records for entering and leaving the library and dormitory; and book borrowing records, including book information and borrowing time. Table 1 gives examples of student data sources and content.
Table 1
S102: Data cleaning:
After all the student data has been collected, the raw data must be cleaned. In the present invention, student data comes from multiple business systems and contains a large amount of historical data, so duplicate values, missing values, and the like often occur and data cleaning is required. The task of data cleaning is to filter out data that does not meet the requirements and write the corrected data back into the data warehouse. The objects of cleaning mainly include duplicate values, missing values, and inconsistent data. Data cleaning is a common technique in the big data field, and its detailed process is not repeated here.
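As a minimal sketch of the kind of cleaning described above, the following Python drops records with missing fields and exact duplicates. The record layout (student ID, place, timestamp, amount) and the sample values are assumptions made for illustration only; the text does not specify the schema or whether missing values are dropped or corrected.

```python
def clean_records(records):
    """Drop records with missing fields and exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(field is None for field in rec):
            continue  # missing value: filtered out here rather than imputed
        if rec in seen:
            continue  # repeated value, e.g. from overlapping business systems
        seen.add(rec)
        cleaned.append(rec)
    return cleaned


# Hypothetical card records: (student_id, place, timestamp, amount)
raw = [
    ("s001", "canteen", "2016-03-01 07:30", 6.5),
    ("s001", "canteen", "2016-03-01 07:30", 6.5),    # duplicate
    ("s002", "library", None, 0.0),                  # missing timestamp
    ("s002", "dormitory", "2016-03-01 22:10", 0.0),
]
cleaned = clean_records(raw)
```

A real pipeline would likely also reconcile inconsistent values across source systems, which this sketch does not attempt.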
S103: Data normalization:
In the cleaned student data, each data item has different attributes and therefore usually a different unit and order of magnitude. In general, expressing an attribute in smaller units gives it a larger value range, which tends to give that attribute a larger effect or higher "weight". To avoid dependence on the choice of measurement units and ensure the reliability of the results, all data items in the raw data other than time data must be normalized.
Data normalization means scaling data proportionally so that it falls into a small specific interval. This approach is often used when processing indicators for comparison and evaluation: it removes the unit restrictions of the data and converts them into dimensionless pure values, so that indicators of different units or magnitudes can be compared and weighted. In the present invention, data normalization comprises the following two steps:
● Linear transformation:
Let the $j$-th non-temporal data item of the $i$-th student be $x_{ij}$, $i = 1, 2, \dots, N$, where $N$ is the number of students, and $j = 1, 2, \dots, M$, where $M$ is the number of non-temporal data items. For each value, compute the linear transformation value $x'_{ij}$ as:
$$x'_{ij} = \frac{x_{ij} - \mathrm{min}_j}{\mathrm{max}_j - \mathrm{min}_j}\left(T_{j\_max} - T_{j\_min}\right) + T_{j\_min}$$
where $\mathrm{max}_j$ is the maximum of the $j$-th data sequence, $\mathrm{min}_j$ is the minimum of the $j$-th data sequence, $T_{j\_max}$ is the upper limit of the limiting interval of the $j$-th data sequence, and $T_{j\_min}$ is the lower limit of that interval. The $j$-th data sequence is the sequence formed by the $j$-th data item of all students. The formula thus maps the values of the $j$-th data sequence, originally in the interval $[\mathrm{min}_j, \mathrm{max}_j]$, uniformly onto $[T_{j\_min}, T_{j\_max}]$.
Suppose the $j$-th data sequence is [1,2,1,4,3,2,5,6,2,7], whose value interval is [1,7], and the limiting interval is [0,1]; then the data sequence after linear transformation is [0,0.16,0,0.5,0.33,0.16,0.66,0.83,0.16,1] (values truncated to two decimal places).
● Numerical normalization:
The data after linear transformation is normalized numerically based on its mean and standard deviation. For the linearly transformed data $x'_{ij}$, the normalized value $y_{ij}$ is computed as:
$$y_{ij} = \frac{x'_{ij} - \bar{x}'_j}{s_j}$$
where $\bar{x}'_j = \frac{1}{N}\sum_{i=1}^{N} x'_{ij}$ is the mean of the $j$-th data sequence after linear transformation and $s_j$ is the standard deviation of that sequence.
After numerical normalization, each data sequence has mean 0 and variance 1 and is dimensionless. The values in the sequence fluctuate around 0: a value greater than 0 indicates an above-average level, and a value less than 0 a below-average level.
These two steps not only map the data into a unified interval but also effectively eliminate the influence on the overall data distribution of outliers that fall beyond the value range.
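The two normalization steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patent's implementation: the function names are invented here, the population standard deviation (`statistics.pstdev`) is assumed because the text does not say whether the divisor is N or N-1, and a constant sequence (which would divide by zero) is not handled.

```python
from statistics import mean, pstdev


def linear_transform(seq, t_min=0.0, t_max=1.0):
    """Map values from [min(seq), max(seq)] onto the limiting interval [t_min, t_max]."""
    lo, hi = min(seq), max(seq)
    return [(x - lo) / (hi - lo) * (t_max - t_min) + t_min for x in seq]


def standardize(seq):
    """Subtract the mean and divide by the standard deviation."""
    mu = mean(seq)
    s = pstdev(seq)  # assumption: population standard deviation
    return [(x - mu) / s for x in seq]


x = [1, 2, 1, 4, 3, 2, 5, 6, 2, 7]   # the example sequence from the text
x_lin = linear_transform(x)          # starts 0, 1/6, 0, 0.5, ...
y = standardize(x_lin)               # zero mean, unit variance
```

Because the second step is itself invariant to linear rescaling, `standardize(x)` and `standardize(x_lin)` give the same result up to floating-point error; the first step matters mainly when the intermediate values in the limiting interval are used directly.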
S104: Behavioral feature vector extraction:
After data normalization, learning behavior features need to be extracted from the data. In the present invention, the behavioral features required for each student are divided into three parts: achievement features, effort level features, and life regularity features. Achievement features include the course type, number of credits, and grade of every course taken by the student. Effort level features count how often the student enters study-related locations, including the number of library entries, classroom check-ins, printing jobs, and book loans, and thereby describe the student's study effort and willingness to learn actively. The life regularity feature is the student's life regularity metric, which characterizes the regularity of the student's daily routine by analyzing the times at which the student swipes the campus card at different locations.
In this embodiment, the life regularity metric is computed as follows: first, from each student's data on visits to several preset locations (typically the canteen, dormitory, and classroom), compute the student's access probability for each of these locations within a predetermined time period; then compute the Shannon entropy from the access probabilities. This Shannon entropy is the student's life regularity metric.
Shannon entropy expresses the average amount of information carried by a discrete variable and can be used to characterize life regularity. It is computed as:
$$H_i(z) = -\sum_{f=1}^{F} P_{if}(z)\,\log_2 P_{if}(z)$$
where $H_i(z)$ is the Shannon entropy of the $i$-th student, $P_{if}(z)$ is the access probability of the $i$-th student for the $f$-th location, $f = 1, 2, \dots, F$, and $F$ is the number of locations.
For example, when a student's access probabilities for the canteen, dormitory, and classroom are computed to be 0.3, 0.3, and 0.4 respectively, the Shannon entropy is $H_1(z) = 1.571$. When another student's access probabilities for the three locations are 0.1, 0.6, and 0.2 respectively, $H_2(z) = 1.24$. The latter's Shannon entropy is smaller, reflecting more regular behavior (a higher probability of entering and leaving the dormitory). For a probability distribution, when the probability is concentrated on a few values (that is, the variable takes one of a small number of values in most cases), the Shannon entropy is low; conversely, if the probability is spread fairly evenly over the possible values (so that it is almost impossible to tell which value the variable will take), the Shannon entropy is high. It follows that the more concentrated a student's visits are, the smaller the entropy and the stronger the life regularity.
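The entropy computation can be checked with a short Python sketch. Base-2 logarithms are assumed because they reproduce the example values in the text; the second probability vector below is an invented example of concentrated visits, not one taken from the patent.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits; places with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


h_spread = shannon_entropy([0.3, 0.3, 0.4])        # visits spread over three places
h_concentrated = shannon_entropy([0.8, 0.1, 0.1])  # hypothetical: mostly one place
```

The more concentrated distribution yields the lower entropy, which under this scheme corresponds to a more regular routine.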
The access probability for each location can be obtained by counting over the student data or by density estimation; the specific method can be chosen as needed. In view of the large volume of student data in the present invention, an access probability computation method is proposed, whose detailed process is as follows:
The predetermined time period is divided into fine time intervals. The student's access times for each class of location are extracted from the student data and projected onto the fine time intervals, and the number of visits to each class of location in each fine time interval is counted. A density estimation function is then used to estimate the access probability for that class of location in each fine time interval, and these probabilities are integrated to obtain the access probability for that class of location over the predetermined time period. The density estimation function can be chosen according to actual needs; the density estimation function used in this embodiment is:
[formula not preserved in this text]
where $p_{ifv}(z)$ is the access probability of the $i$-th student for the $f$-th location in the $v$-th fine time interval, $v = 1, 2, \dots, V$, and $V$ is the number of fine time intervals. $z_{ifv}$ is the number of times the $i$-th student visits the $f$-th location in the $v$-th fine time interval. $G_{if}$ is the total number of times the $i$-th student visits the $f$-th location within the predetermined time period, that is, $G_{if} = \sum_{v=1}^{V} z_{ifv}$. $h_{if}$ is the density estimation bandwidth corresponding to the $i$-th student's visits to the $f$-th location; its empirical formula is:
[formula not preserved in this text]
where $\sigma_{if}$ is the standard deviation of the $V$ access counts $z_{ifv}$.
The $V$ values $p_{ifv}(z)$ are then integrated to obtain the access probability $P_{if}(z)$ of the $i$-th student for the $f$-th location within the predetermined time period.
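The sketch below illustrates only the counting parts of this procedure: projecting visit times onto fine time intervals to obtain the counts $z_{ifv}$ and their total $G_{if}$, and the plain frequency-based alternative for the access probabilities that the text mentions. It does not implement the density estimation function or the bandwidth rule, since their formulas are not available here. The function names, the use of hours as the time unit, and the sample visits are all assumptions.

```python
def interval_counts(visit_hours, num_intervals=24, period_hours=24.0):
    """Count visits per fine time interval (z_ifv for one student and one place)."""
    width = period_hours / num_intervals
    counts = [0] * num_intervals
    for h in visit_hours:
        v = min(int(h / width), num_intervals - 1)  # clamp a visit at the period end
        counts[v] += 1
    return counts


def access_probabilities(visits):
    """Frequency-based P_if: share of a student's visits that go to each place."""
    totals = {place: len(hours) for place, hours in visits.items()}
    grand_total = sum(totals.values())
    return {place: n / grand_total for place, n in totals.items()}


# Hypothetical visit times (hour of day) for one student
visits = {
    "canteen": [7.5, 12.0, 18.2],
    "dormitory": [13.1, 22.5, 23.0],
    "classroom": [8.0, 10.0, 14.0, 16.0],
}
z = interval_counts(visits["canteen"], num_intervals=4)  # four 6-hour intervals
g = sum(z)                                               # G_if for the canteen
probs = access_probabilities(visits)
```

The resulting probabilities (0.3, 0.3, 0.4 for this sample) can be fed directly into an entropy computation to obtain the life regularity metric.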
S105: Dimensionality reduction of behavioral feature data:
After the student features are extracted, the data must be reduced in dimension because there are many feature items. Dimensionality reduction lowers computational complexity and reduces the loss of information caused by correlation between features, and it is of great significance for feature extraction from massive data. There are many dimensionality reduction methods, and one can be selected according to actual needs. In this embodiment, a dimensionality reduction method is designed for the characteristics of the application scenario of the invention; it converts many indicators into a few composite indicators so that the dimension-reduced feature data contains more comprehensive information.
Fig. 2 is a flow chart of the dimensionality reduction of the behavioral feature data. As shown in Fig. 2, the dimensionality reduction of the feature data comprises the following steps:
S201: Construct the behavioral feature matrix:
Let the behavioral feature vector of the $i$-th student be $B_i = \{b_{i1}, b_{i2}, \dots, b_{iD}\}^T$, where $D$ is the number of feature items and the superscript $T$ denotes transposition. The behavioral feature data of all students forms a behavioral feature matrix $U$ of size $D \times N$, whose $i$-th column is $B_i$.
S202: Compute the covariance matrix:
Compute the covariance matrix $C$ of the behavioral feature matrix $U$.
S203: Compute the eigenvector matrix of the covariance matrix:
Compute the eigenvalues of the covariance matrix $C$ and the corresponding eigenvectors, arrange the eigenvectors as the rows of a matrix from top to bottom in descending order of their eigenvalues, and take the first $K$ rows to form the eigenvector matrix $P$. The value of $K$ is set according to actual needs.
S204: Compute the behavioral feature matrix after dimensionality reduction:
Compute the dimension-reduced behavioral feature matrix of the students, $Q = PU$; the $i$-th column of $Q$ is the dimension-reduced behavioral feature vector $B'_i$ of the $i$-th student.
Clearly $Q$ has $K$ rows. The larger $K$ is in step S203, the better the resulting matrix $Q$ reflects the behavioral features, but the complexity of subsequent computation also increases. The value range of $K$ is typically set to [range not preserved in this text].
Suppose the behavioral feature matrix H constructed from the behavioral feature vectors of 10 students is as follows:
[matrix not preserved in this text]
As can be seen, each student's behavioral feature vector contains two feature items.
The covariance matrix C is obtained as follows:
[matrix not preserved in this text]
The eigenvalues λ and corresponding eigenvectors α of the covariance matrix C are:
λ1=0.0490833989, α1=[-0.735178656,0.677873399]
λ2=1.28402771, α2=[-0.677873399,-0.735178656]
The eigenvector corresponding to the largest eigenvalue, λ2, is then selected as the single row of the eigenvector matrix, so that P=[-0.677873399,-0.735178656]. The dimension-reduced behavioral feature matrix Q=PU is then computed, that is:
Q=[-0.8280,1.7776,-0.9922,-0.2742,-1.6758,-0.9129,0.0991,1.1446,0.4380,1.2238]
Each value in Q is rounded to four decimal places.
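Steps S201 to S204 can be sketched in plain Python as follows. Several things here are assumptions rather than the patent's stated method: the input data is a widely used ten-point, two-feature tutorial data set, chosen as a stand-in for H because it reproduces the largest eigenvalue and the projected values quoted above to the printed precision; the rows are mean-centered before projection; the sample covariance uses the divisor N-1; and the leading eigenvectors are found by power iteration, where a library routine such as `numpy.linalg.eigh` would normally be used.

```python
import math


def covariance_matrix(U):
    """Sample covariance (divisor N-1) of a D x N matrix whose columns are students."""
    n = len(U[0])
    centered = []
    for row in U:
        mu = sum(row) / n
        centered.append([x - mu for x in row])
    cov = [[sum(a * b for a, b in zip(r1, r2)) / (n - 1) for r2 in centered]
           for r1 in centered]
    return cov, centered


def top_eigenpairs(C, k, iters=1000):
    """Leading k eigenpairs of a symmetric PSD matrix by power iteration with deflation."""
    d = len(C)
    C = [row[:] for row in C]
    pairs = []
    for _ in range(k):
        vec = [1.0] * d
        for _ in range(iters):
            w = [sum(C[r][c] * vec[c] for c in range(d)) for r in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            vec = [x / norm for x in w]
        lam = sum(vec[r] * sum(C[r][c] * vec[c] for c in range(d)) for r in range(d))
        pairs.append((lam, vec))
        for r in range(d):                      # deflate the component just found
            for c in range(d):
                C[r][c] -= lam * vec[r] * vec[c]
    return pairs


# Assumed stand-in for H: 2 features (rows) x 10 students (columns)
U = [[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
     [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]]
C, Uc = covariance_matrix(U)
lam, p = top_eigenpairs(C, 1)[0]                # K = 1
Q = [sum(p[d] * Uc[d][i] for d in range(len(p))) for i in range(len(Uc[0]))]
```

The sign of an eigenvector is arbitrary, so `p` and `Q` may come out with the opposite sign to the values in the text; the ordering of students along the reduced feature is the same up to that reversal.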
S106: Student ranking prediction:
Through steps S101 to S105, the behavioral feature vector of each student has been extracted from the massive student data, and ranking prediction can be carried out with these vectors. The specific ranking prediction method of the present invention is:
For the $i$-th student, subtract the behavioral feature vector of every other student from that student's behavioral feature vector to obtain N-1 difference feature vectors. Input the difference feature vectors into a pre-trained classifier to obtain the corresponding N-1 labels, each with value 1 or -1. Sum all the label values of the student to obtain the student's score, and sort the scores of all students to obtain each student's predicted ranking.
The classifier is obtained by training on the data of students with historical rankings. The training method is as follows: for students with historical rankings, collect their data and obtain their behavioral feature vectors by the method of steps S101 to S105, then compute the difference feature vectors between students pair by pair. For a difference feature vector, if the student corresponding to the minuend feature vector ranks higher, the label of that difference feature vector is 1; otherwise it is -1. The classifier is trained with these difference feature vectors as input and the corresponding labels as output.
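The text does not say which kind of classifier is used. As one possible illustration, the sketch below trains a simple perceptron without a bias term on (difference vector, label) pairs; the perceptron, the function names, and the toy samples are all assumptions introduced here, and any binary classifier that outputs 1 or -1 could take its place.

```python
def train_perceptron(samples, epochs=100):
    """Fit weights w so that sign(w . diff) matches each label (no bias term)."""
    dim = len(samples[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        mistakes = 0
        for diff, label in samples:
            activation = sum(wi * xi for wi, xi in zip(w, diff))
            if label * activation <= 0:            # misclassified: nudge w toward the label
                w = [wi + label * xi for wi, xi in zip(w, diff)]
                mistakes += 1
        if mistakes == 0:
            break                                  # every training sample is classified correctly
    return w


def classify(w, diff):
    """Return the label 1 or -1 for a difference feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, diff)) > 0 else -1


# Hypothetical training samples: (difference feature vector, label)
samples = [([0.5, -0.2], 1), ([-0.3, 0.4], -1), ([0.2, 0.1], 1), ([-0.6, -0.1], -1)]
w = train_perceptron(samples)
```

Omitting the bias term keeps the classifier antisymmetric, so swapping the two students in a pair flips the predicted label, which matches the way the labels are defined.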
As described above, the present invention uses pairwise comparison to characterize the difference between two people: the behavioral feature vectors of any two people are subtracted element by element to form a new feature vector. For example, student A has rank 5 and behavioral feature vector A=(3,2,5,7,9,6,8,1,4,7)^T, and student B has rank 12 and behavioral feature vector B=(5,9,8,6,7,1,3,4,7,6)^T; the difference feature vector is then A-B=(-2,-7,-3,1,2,5,5,-3,-3,1)^T.
Suppose there are W students in the training sample. One difference feature vector is computed for every pair of students, giving W(W-1)/2 difference feature vectors, so the classifier has W(W-1)/2 training samples.
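Building the training set from ranked students can be sketched as follows. The function names are invented for illustration, a numerically smaller rank is taken to mean a better rank, and tied ranks (which the text does not discuss) are labeled -1 here by assumption.

```python
from itertools import combinations


def difference_vector(a, b):
    """Element-wise difference a - b of two behavioral feature vectors."""
    return [x - y for x, y in zip(a, b)]


def pairwise_samples(features, ranks):
    """One (difference vector, label) sample per unordered pair of students.

    The label is 1 when the minuend student has the better (smaller) rank, else -1.
    """
    samples = []
    for i, j in combinations(range(len(features)), 2):
        diff = difference_vector(features[i], features[j])
        label = 1 if ranks[i] < ranks[j] else -1
        samples.append((diff, label))
    return samples


A = [3, 2, 5, 7, 9, 6, 8, 1, 4, 7]   # student A, rank 5
B = [5, 9, 8, 6, 7, 1, 3, 4, 7, 6]   # student B, rank 12
samples = pairwise_samples([A, B], [5, 12])
```

For W students this produces W(W-1)/2 samples, one per pair, in line with the count given in the text.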
Because the label has only two classes (1 and -1), what the classifier predicts is this label. In other words, the present invention converts ranking prediction among students into first predicting the relative order of every pair of students and then converting these relative orders into an actual ranking. The ranking prediction problem is thereby turned into a learning-to-rank problem, which effectively solves the student ranking prediction problem. The higher student A's ranking, the more often 1 and the less often -1 appears among the labels produced by comparing A with the others. A score is therefore obtained by summing the labels produced by comparing student A with the other students, and sorting the scores of all students gives the predicted ranking of student A. For example, if the label set obtained by comparing student A with other students is (1,-1,-1,1,1,1,-1,1,-1,-1,1) and the label set for student B is (-1,1,-1,-1,1,1,-1,-1,-1,1,1), then student A's score is 1 and student B's score is -1, so student A ranks ahead of student B.
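The scoring and sorting step can be sketched in Python as follows. The `toy_classifier` is a hypothetical stand-in for the trained classifier (it simply looks at the sign of the first feature difference), and the feature values are invented; ties in score are broken by input order, which is an assumption since the text does not cover ties.

```python
def predict_ranking(features, classify):
    """Score each student by summing pairwise labels, then rank by score (highest first)."""
    n = len(features)
    scores = []
    for i in range(n):
        score = 0
        for j in range(n):
            if i == j:
                continue
            diff = [a - b for a, b in zip(features[i], features[j])]
            score += classify(diff)               # each label is 1 or -1
        scores.append(score)
    order = sorted(range(n), key=lambda s: scores[s], reverse=True)
    ranking = [0] * n
    for position, student in enumerate(order, start=1):
        ranking[student] = position
    return scores, ranking


def toy_classifier(diff):
    """Hypothetical stand-in for the trained classifier."""
    return 1 if diff[0] > 0 else -1


features = [[0.2, 1.0], [0.9, 0.3], [0.5, 0.5], [0.7, 0.1]]
scores, ranking = predict_ranking(features, toy_classifier)

# The label sets from the example in the text sum to the stated scores.
labels_a = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1]
labels_b = [-1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1]
score_a, score_b = sum(labels_a), sum(labels_b)
```

Each student is compared with all N-1 others, so prediction costs on the order of N squared classifier calls.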
Although illustrative embodiments of the present invention have been described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of the specific embodiments. To those of ordinary skill in the art, various changes are obvious as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all inventions and creations that make use of the inventive concept are protected.
Claims (5)
1. student's ranking Forecasting Methodology based on campus data, it is characterised in that comprise the following steps:
S1: gather the data of all students, including achievement data and behavioral data, wherein achievement data
Including the course types of all courses of student, credit number, achievement, behavioral data includes that student is in campus
Each place uses the record of all-in-one campus card;
S2: the student data collected is carried out data cleansing;
S3: normalize the non-temporal data items in the cleaned student data as follows:
Denote the jth non-temporal data item of the ith student as $x_{ij}$, where $i=1,2,\dots,N$, N is
the number of students, $j=1,2,\dots,M$, and M is the number of data items; compute the linear
transformation value $x'_{ij}$ of each $x_{ij}$ by:
$$x'_{ij} = \frac{x_{ij} - \min_j}{\max_j - \min_j}\,\left(T_{j\_max} - T_{j\_min}\right) + T_{j\_min}$$
where $\max_j$ is the maximum of the jth data sequence, $\min_j$ is the minimum of the jth data
sequence, $T_{j\_max}$ is the upper limit of the interval to which the jth data sequence is
restricted, and $T_{j\_min}$ is the lower limit of that interval;
From the linearly transformed data $x'_{ij}$, compute the normalized data value $y_{ij}$ by:
$$y_{ij} = \frac{x'_{ij} - \bar{x}'_j}{s_j}$$
where $\bar{x}'_j$ is the mean of the jth data sequence and $s_j$ is the variance of the jth data sequence;
S4: extract a behavioral feature vector for each student from the student data, the behavioral
features including performance features, effort features, and life-regularity features, wherein the
performance features include the course type, number of credits, and grade of every course taken by
the student, the effort feature is the frequency with which the student enters study-related
places, and the life-regularity feature is the student's life-regularity metric; the above items
together constitute the student's behavioral feature vector;
S5: reduce the dimensionality of the behavioral feature vectors extracted in step S4 to obtain the
reduced behavioral feature vector of each student;
S6: for the ith student, subtract the behavioral feature vector of each other student from the
student's own reduced behavioral feature vector to obtain N-1 difference feature vectors; input the
difference feature vectors into a pre-trained classifier to obtain N-1 corresponding labels, each
with value 1 or -1; sum all labels of the student to obtain the student's score; and sort the
scores of all students to obtain the predicted ranking of each student;
wherein the classifier is trained as follows: for students with a historical ranking, collect their
data and obtain their behavioral feature vectors by the method of steps S1 to S5, then compute the
difference feature vectors between students pairwise; for a given difference feature vector, if the
student corresponding to the minuend feature vector is ranked higher, the label of the difference
feature vector is 1, and otherwise -1; the classifier is trained with these difference feature
vectors as input and the corresponding labels as output.
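The pairwise training and scoring described in step S6 can be sketched as follows. The claim does not name a classifier type, so the perceptron used here is only a stand-in assumption, and the two-dimensional feature vectors, student names, and function names are hypothetical toy values rather than data from the patent.

```python
from itertools import permutations

def diff(a, b):
    """Difference feature vector a - b (a is the minuend)."""
    return [x - y for x, y in zip(a, b)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def build_pairs(features, ranks):
    """Pairwise training set: label 1 if the minuend student ranks higher (rank 1 = best)."""
    pairs = []
    for a, b in permutations(features, 2):
        label = 1 if ranks[a] < ranks[b] else -1
        pairs.append((diff(features[a], features[b]), label))
    return pairs

def train_perceptron(pairs, epochs=100):
    """Stand-in classifier (assumption): a plain perceptron on difference vectors."""
    w = [0.0] * len(pairs[0][0])
    for _ in range(epochs):
        mistakes = 0
        for d, y in pairs:
            if y * dot(w, d) <= 0:  # misclassified or on the boundary
                w = [wi + y * di for wi, di in zip(w, d)]
                mistakes += 1
        if mistakes == 0:
            break
    return w

def classify(w, d):
    return 1 if dot(w, d) > 0 else -1

def predict_ranking(w, features):
    """Score each student by summing its N-1 labels, then sort by score (highest first)."""
    scores = {}
    for a in features:
        scores[a] = sum(
            classify(w, diff(features[a], features[b]))
            for b in features if b != a
        )
    return sorted(scores, key=scores.get, reverse=True), scores

# Hypothetical students with a known historical ranking.
history = {"s1": [0.9, 0.2], "s2": [0.6, 0.5], "s3": [0.3, 0.8]}
history_rank = {"s1": 1, "s2": 2, "s3": 3}
w = train_perceptron(build_pairs(history, history_rank))

# Hypothetical students whose ranking is to be predicted.
new_students = {"A": [0.8, 0.3], "B": [0.5, 0.5], "C": [0.2, 0.9]}
order, scores = predict_ranking(w, new_students)
print(order, scores)
```

Because every student is compared with all N-1 others, prediction takes on the order of N² classifier calls, which is worth keeping in mind for large cohorts.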
2. The student ranking prediction method according to claim 1, characterized in that the
life-regularity metric in step S4 is computed as follows: from each student's data on visits to
several preset places, compute the student's access probability for each of these places within a
predetermined time period, then compute the Shannon entropy from the access probabilities; this
Shannon entropy is the student's life-regularity metric.
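A minimal sketch of the entropy computation in claim 2, assuming the access probabilities are already available and sum to 1. The logarithm base is not stated in the claim, so base 2 is an assumption, and the function name is illustrative.

```python
import math

def life_regularity(access_probs):
    """Shannon entropy H = -sum(p * log2(p)) over a student's access probabilities."""
    h = 0.0
    for p in access_probs:
        if p < 0:
            raise ValueError("probabilities must be non-negative")
        if p > 0:  # a zero-probability place contributes nothing
            h -= p * math.log2(p)
    return h

print(life_regularity([1.0, 0.0]))   # always the same place
print(life_regularity([0.5, 0.5]))   # two places, equally likely
print(life_regularity([0.25] * 4))   # four places, equally likely
```

A student who always visits the same place has entropy 0, and visits spread evenly over more places give a larger value.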
3. The student ranking prediction method according to claim 2, characterized in that the access
probability is computed as follows:
divide the predetermined time period into time segments; extract from the student data the times
at which the student visited each class of place and map them onto the time segments; count the
number of visits to each class of place in each time segment; use a density estimation function to
estimate the access probability for that class of place within each time segment; and then
integrate to obtain the access probability for that class of place over the predetermined time
period.
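One way to read the procedure in claim 3 is sketched below. The claim does not specify the density estimation function, so the Gaussian kernel, the bandwidth, the rectangle-rule integration, and the final normalization across place classes are all assumptions; the visit times and place names are hypothetical.

```python
import math
from collections import defaultdict

def gauss(u):
    """Standard normal density, used here as the smoothing kernel (assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def access_probabilities(visits, start, end, n_segments, bandwidth):
    """visits: iterable of (time, place_class). Returns {place_class: probability}."""
    width = (end - start) / n_segments
    centers = [start + (k + 0.5) * width for k in range(n_segments)]

    # Count visits to each class of place in each time segment.
    counts = defaultdict(lambda: [0] * n_segments)
    for t, place in visits:
        if start <= t < end:
            k = min(int((t - start) / width), n_segments - 1)
            counts[place][k] += 1

    # Smooth the segment counts with the kernel, then integrate over the
    # period with a rectangle rule (density at segment center * width).
    mass = {}
    for place, seg_counts in counts.items():
        total = 0.0
        for c in centers:
            density = sum(
                n * gauss((c - ck) / bandwidth)
                for n, ck in zip(seg_counts, centers)
            ) / bandwidth
            total += density * width
        mass[place] = total

    z = sum(mass.values())
    if z == 0:
        return {}
    # Normalize so the probabilities over all place classes sum to 1.
    return {place: m / z for place, m in mass.items()}

# Hypothetical visit log: hour of day and class of place.
visits = [(8.5, "library"), (9.2, "library"), (14.0, "library"), (12.1, "canteen")]
probs = access_probabilities(visits, start=0.0, end=24.0, n_segments=24, bandwidth=1.0)
print(probs)
```

The resulting probabilities can be passed directly to a Shannon entropy computation to obtain the life-regularity metric.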
4. The student ranking prediction method according to claim 1, characterized in that the
behavioral feature vectors are reduced in dimension in step S5 as follows:
S5.1: denote the behavioral feature vector of the ith student as $B_i = \{b_{i1}, b_{i2}, \dots, b_{iD}\}^T$,
where D is the number of feature items, and assemble the behavioral feature data of all students
into a behavioral feature matrix U of size D × N;
S5.2: compute the covariance matrix C of the behavioral feature matrix U;
S5.3: compute the eigenvalues of the covariance matrix C and the corresponding eigenvectors,
arrange the eigenvectors as the rows of a matrix in descending order of their eigenvalues, and take
the first K rows to form the eigenvector matrix P, the value of K being set according to actual
needs;
S5.4: compute the reduced behavioral feature matrix Q = PU, in which the ith column is the reduced
behavioral feature vector $B'_i$ of the ith student.
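Steps S5.1 to S5.4 can be sketched with NumPy as below. The claim does not say whether U is mean-centered before the projection, so this sketch follows Q = PU literally; the matrix values and the function name are hypothetical.

```python
import numpy as np

def reduce_dimension(U, K):
    """U: D x N behavioral feature matrix (one column per student).
    Returns (Q, P, top_eigenvalues) with Q = P @ U of size K x N."""
    U = np.asarray(U, dtype=float)
    C = np.cov(U)                          # S5.2: D x D covariance, rows are features
    eigvals, eigvecs = np.linalg.eigh(C)   # S5.3: eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:K]  # indices of the K largest eigenvalues
    P = eigvecs[:, order].T                # eigenvectors as rows, largest first (K x D)
    Q = P @ U                              # S5.4: column i is student i's reduced vector
    return Q, P, eigvals[order]

# Hypothetical data: D = 3 features, N = 4 students.
U = np.array([
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 3.0, 2.0, 5.0],
    [0.5, 0.4, 0.6, 0.5],
])
Q, P, top = reduce_dimension(U, K=2)
print(Q.shape)
```

`np.linalg.eigh` is used because a covariance matrix is symmetric, which keeps the eigenvalues real and the eigenvectors orthonormal.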
5. The student ranking prediction method according to claim 4, characterized in that the value
range of the parameter K in said step is
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610207978.4A CN105894119A (en) | 2016-04-05 | 2016-04-05 | Student ranking prediction method based on campus data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105894119A true CN105894119A (en) | 2016-08-24 |
Family
ID=57012173
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610207978.4A Pending CN105894119A (en) | 2016-04-05 | 2016-04-05 | Student ranking prediction method based on campus data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105894119A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106991187A (en) * | 2017-04-10 | 2017-07-28 | 武汉朱雀闻天科技有限公司 | The analysis method and device of a kind of campus data |
CN107423563A (en) * | 2017-07-25 | 2017-12-01 | 深信服科技股份有限公司 | A kind of students psychology analysis method, equipment and its storage medium |
CN108280531A (en) * | 2017-07-28 | 2018-07-13 | 淮阴工学院 | A kind of student class marks sequencing prediction technique returned based on Lasso |
CN108280531B (en) * | 2017-07-28 | 2021-07-09 | 淮阴工学院 | Student class score ranking prediction method based on Lasso regression |
CN108320045A (en) * | 2017-12-20 | 2018-07-24 | 卓智网络科技有限公司 | Student performance prediction technique and device |
CN108305195A (en) * | 2018-01-07 | 2018-07-20 | 深圳前海易维教育科技有限公司 | A kind of comprehensive index system towards students in middle and primary schools' evaluation and theme attribute analysis |
CN108875800A (en) * | 2018-05-29 | 2018-11-23 | 重庆大学 | A kind of behavioural characteristic extracting method based on RFID card |
CN108876123A (en) * | 2018-06-01 | 2018-11-23 | 首都师范大学 | A kind of teaching interference method and device |
CN108985522A (en) * | 2018-08-02 | 2018-12-11 | 杭州华网信息技术有限公司 | A kind of Intelligent campus extension section's method for early warning and system |
CN110245867A (en) * | 2019-06-18 | 2019-09-17 | 青海大学 | A kind of grassland degeneration stage division based on bp neural network |
CN110852390A (en) * | 2019-11-13 | 2020-02-28 | 山东师范大学 | Student score classification prediction method and system based on campus behavior sequence |
CN112465260A (en) * | 2020-12-10 | 2021-03-09 | 成都寻道科技有限公司 | Student teaching management system based on campus data |
CN113705985A (en) * | 2021-08-12 | 2021-11-26 | 河南工业职业技术学院 | Student think of political affairs condition analysis and early warning method, system, terminal and medium |
CN113705985B (en) * | 2021-08-12 | 2023-09-29 | 河南工业职业技术学院 | Student status analysis early warning method, system, terminal and medium |
CN115185996A (en) * | 2022-07-19 | 2022-10-14 | 广州凯园软件科技有限公司 | Education training comprehensive evaluation system based on point system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20160824 |