CN105631536A - Massive open online course (MOOC) quitting prediction algorithm based on semi-supervised learning

Info

Publication number: CN105631536A
Application number: CN201510967503.0A
Authority: CN
Inventors: 江峰; 李文涛
Original assignee: Chongqing Technology and Business Institute
Current assignee: Chongqing Technology and Business Institute
Priority date: 2015-12-21
Filing date: 2015-12-21
Publication date: 2016-06-01

Abstract

The invention relates to a massive open online course (MOOC) dropout prediction algorithm based on semi-supervised learning. First, learning log files of users are acquired from a MOOC website; some of the acquired users form a test sample set and the rest form a training sample set. Second, based on the learning log files, the behavior features of all samples in the training sample set are counted to obtain the n behavior features that best express the common characteristics of those samples. Third, based on the n behavior features, a semi-supervised learning method is used to obtain R classifiers. Fourth, the test sample set is used to measure the labeling accuracy of the R classifiers, and the classifier with the highest labeling accuracy is selected. Finally, the behavior features of any unlabeled user are input to that classifier, and the user is labeled. The algorithm needs only a small number of labeled samples, which reduces the labor and material cost of labeling samples, lowers the prediction cost, and improves prediction accuracy.

Description

MOOC dropout prediction algorithm based on semi-supervised learning

Technical field

The present invention relates to computer and information technology, and in particular to a massive open online course (MOOC) dropout prediction algorithm based on semi-supervised learning.

Background technology

The maturation of technologies such as Web 2.0 and cloud computing has created new opportunities for the informatization of education, and the massive open online course (MOOC) is a product of innovation in Internet applications. With the rise of MOOC websites such as edX, Coursera, and Udacity, and with universities such as MIT and Stanford successively offering courses on MOOC platforms, MOOCs have attracted increasing attention and recognition. A MOOC relies on the Internet to provide large numbers of students with educational experiences such as answering questions, taking examinations, and watching videos, and it lets students learn collaboratively through online forums. The openness of MOOCs also gives students from different backgrounds an opportunity to learn.

Although MOOCs have unique advantages over traditional education, the MOOC learner population is more diverse. This diversity is mainly reflected in educational background and learning motivation; for example, some students register for a course only to learn a few specific knowledge points. Because the cost of quitting a MOOC course is low, the dropout rate of learners is high. Many educators point out that a high dropout rate is a common phenomenon in MOOCs and that, if countermeasures are not taken in time, it will restrict the development of MOOC platforms.

Analyzing the factors that lead students to drop out not only helps improve the construction of MOOC platforms but can also raise student retention, thereby ensuring that courses proceed in an orderly manner. Building a model to predict students' dropout behavior therefore helps MOOCs achieve better teaching results. The value of MOOC dropout prediction is twofold. Short-term value: by judging whether a user will drop out, teachers or the system can intervene with users who are likely to drop out and reduce that likelihood. Long-term value: by analyzing the relationship between course characteristics and dropout rate, courses with lower dropout rates can be designed, improving the quality of MOOC courses.

There are two main kinds of existing prediction algorithms. The first tracks certain student behaviors, such as homework behavior, video watching, and access to other resources, and counts how many times these behaviors occur in order to predict whether a student drops out or is retained. This kind of algorithm has the following defects. First, it uses supervised learning, training a model on a large labeled sample set, but obtaining sample labels is very costly: the number of samples is large, labeling them takes a great deal of labor and time, and the labeling must be done by professionals. Second, the features it uses are generic and cannot accurately characterize students who drop out, so prediction accuracy is low. The second kind of method calculates the final dropout rate of a course from its weekly dropout rates. Although this method can predict the dropout rate of a course, it cannot make a judgment about a specific student or user; that is, it cannot determine which students or users will drop out.

Summary of the invention

In view of the above problems in the prior art, the object of the present invention is to provide a MOOC dropout prediction algorithm that can accurately judge whether a given user will drop out or be retained.

To achieve the above object, the present invention adopts the following technical solution: a MOOC dropout prediction algorithm based on semi-supervised learning, comprising the following steps:

S1: Obtain the learning log files of users from a MOOC website. Some of the obtained users constitute a test sample set and the rest constitute a training sample set. All test samples in the test sample set are labeled samples; the training sample set contains both unlabeled samples and labeled samples. All unlabeled samples constitute the unlabeled sample set, and all labeled samples constitute the labeled sample set;

S2: Based on the learning log files of the users, count the behavior features of all samples in the training sample set to obtain the n kinds of behavior features that express the common characteristics of all samples in the training sample set;

Let the duration of a given course be K weeks;

Let U_i = \{U(i, 1), \ldots, U(i, j), \ldots, U(i, n)\}, where U_i denotes the i-th sample in the training sample set; U(i, j) = (U(i, j)_1, \ldots, U(i, j)_k, \ldots, U(i, j)_K) is the vector of the j-th behavior feature of the i-th sample in the training sample set, and U(i, j)_k is the number of times the j-th behavior of the i-th user occurs in the k-th week of the course;
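The weekly count vectors U(i, j) defined above can be built from raw log events roughly as follows. This is a minimal sketch: the event layout `(user_id, behavior, week)`, the function name `weekly_counts`, and the sample events are assumptions made for illustration, not the log format of any particular MOOC website.

```python
from collections import defaultdict

def weekly_counts(events, n_weeks):
    """Build U(i, j) = (count in week 1, ..., count in week K) per user and behavior.

    `events` is an iterable of (user_id, behavior, week) tuples with week in
    1..n_weeks; this tuple layout is an assumption made for the sketch.
    """
    vectors = defaultdict(lambda: [0] * n_weeks)
    for user_id, behavior, week in events:
        if 1 <= week <= n_weeks:          # ignore events outside the course duration
            vectors[(user_id, behavior)][week - 1] += 1
    return {key: tuple(counts) for key, counts in vectors.items()}

# Hypothetical events for a K = 3 week course
events = [("u1", "video", 1), ("u1", "video", 1), ("u1", "problem", 2), ("u2", "video", 3)]
features = weekly_counts(events, n_weeks=3)
```

Here `features[("u1", "video")]` plays the role of U(i, j) for user u1 and the video-watching behavior.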

S3: Randomly select m of the n kinds of behavior features, and obtain R classifiers in the following manner, where m ≤ n and

R = C_{n}^{m} = \frac{n!}{m! (n - m)!}, \quad r = 1, 2, 3, \ldots, R;
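As a quick check of the count R, the feature subsets can be enumerated directly. The sketch below assumes n = 4 candidate behaviors with hypothetical names and m = 2, which matches the pairwise combinations used in the comparison test later in this document.

```python
from itertools import combinations
from math import comb

# Hypothetical behavior names; n = 4 candidate features, m = 2 per classifier
behaviors = ["problem", "video", "resource", "page"]
m = 2

subsets = list(combinations(behaviors, m))   # one feature subset per classifier r
R = comb(len(behaviors), m)                  # R = n! / (m! (n - m)!)
```

Each element of `subsets` corresponds to one value of r, so the loop over r = 1, ..., R is a loop over this list.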

The R classifiers are obtained as follows:

S301: Set r = 1;

S302: Set j = 1;

S303: Set v = 1;

S304: Let P_{rj}(C | U(i, j)) be the probability that the i-th sample in the training sample set is labeled C under the j-th behavior feature, where a sample labeled C = 0 is a retained user and a sample labeled C = 1 is a dropout user;

S305: Select from the unlabeled sample set all unlabeled samples under the j-th behavior feature; these form the set U_j. For each unlabeled sample in U_j, compute P_{rj}(C = 0 | U(v, j)) and P_{rj}(C = 1 | U(v, j)), where v = 1, 2, \ldots, |U_j| and |U_j| is the number of samples in U_j;

P_{rj}(C = 0 | U(v, j)) = \frac{P_{rj}(U(v, j) | C = 0) \cdot P_{rj}(C = 0)}{P_{rj}(U(v, j))} \qquad (1);

P_{rj}(C = 0) = \frac{|L_{j, C=0}|}{|U_j| + |L_j|} \qquad (1a);

where |L_{j, C=0}| is the number of samples labeled C = 0 in the labeled sample set under the j-th behavior feature, L_j is the set of all labeled samples under the j-th behavior feature, |L_j| is the number of samples in L_j, and |U_j| + |L_j| is the total number of samples in the training sample set under the j-th behavior feature;

P_{rj}(U(v, j) | C = 0) = P_{rj}(U(v, j)_1 | C = 0) \cdot P_{rj}(U(v, j)_2 | C = 0) \cdots P_{rj}(U(v, j)_k | C = 0) \cdots P_{rj}(U(v, j)_K | C = 0) \qquad (1b);

P_{rj}(U(v, j)_k | C = 0) = \frac{|L_{j, C=0}(U(v, j)_k)|}{|L_{j, C=0}|} \qquad (1b-1);

where |L_{j, C=0}| is the number of samples labeled C = 0 in the labeled sample set under the j-th behavior feature, and |L_{j, C=0}(U(v, j)_k)| is the number of samples, among those labeled C = 0 in the labeled sample set under the j-th behavior feature, for which the number of occurrences of the j-th behavior in the k-th week of the course equals U(v, j)_k;

P_{rj}(C = 1 | U(v, j)) = \frac{P_{rj}(U(v, j) | C = 1) \cdot P_{rj}(C = 1)}{P_{rj}(U(v, j))} \qquad (2);

P_{rj}(C = 1) = \frac{|L_{j, C=1}|}{|U_j| + |L_j|} \qquad (2a);

where |L_{j, C=1}| is the number of samples labeled C = 1 in the labeled sample set under the j-th behavior feature;

P_{rj}(U(v, j) | C = 1) = P_{rj}(U(v, j)_1 | C = 1) \cdot P_{rj}(U(v, j)_2 | C = 1) \cdots P_{rj}(U(v, j)_k | C = 1) \cdots P_{rj}(U(v, j)_K | C = 1) \qquad (2b);

P_{rj}(U(v, j)_k | C = 1) = \frac{|L_{j, C=1}(U(v, j)_k)|}{|L_{j, C=1}|} \qquad (2b-1);

where |L_{j, C=1}| is the number of samples labeled C = 1 in the labeled sample set under the j-th behavior feature, and |L_{j, C=1}(U(v, j)_k)| is the number of samples, among those labeled C = 1 in the labeled sample set under the j-th behavior feature, for which the number of occurrences of the j-th behavior in the k-th week of the course equals U(v, j)_k;

P_{rj}(U(v, j)) = P_{rj}(U(v, j) | C = 0) \cdot P_{rj}(C = 0) + P_{rj}(U(v, j) | C = 1) \cdot P_{rj}(C = 1) \qquad (3);

Output P_{rj}(C = 0 | U(v, j)) and P_{rj}(C = 1 | U(v, j));

S306: Set v = v + 1;

S307: When v > |U_j|, perform the next step; otherwise return to step S304;

S308: Let \max\{P_{rj}(C = 0 | U(v, j))\} = \max\{P_{rj}(C = 0 | U(v, j)),\ v = 1, 2, 3, \ldots, |U_j|\}. Remove the unlabeled sample corresponding to this maximum from the set U_j, move it into the set L_j, and label it C = 0;

Let \max\{P_{rj}(C = 1 | U(v, j))\} = \max\{P_{rj}(C = 1 | U(v, j)),\ v = 1, 2, 3, \ldots, |U_j|\}. Remove the unlabeled sample corresponding to this maximum from the set U_j, move it into the set L_j, and label it C = 1;

S309: Update the set U_j of all unlabeled samples under the j-th behavior feature and the set L_j of all labeled samples under the j-th behavior feature, setting |U_j| = |U_j| - 2 and |L_j| = |L_j| + 2;

S310: When |U_j| ≥ 2, return to step S303; otherwise perform the next step;

S311: Set j = j + 1;

S312: When j > m, output the current labeled sample set and perform the next step; otherwise return to step S303;

S313: Set r = r + 1;

S314: When r > R, perform the next step; otherwise return to step S302;
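For a single behavior feature j, the inner loop of steps S305 to S310 can be sketched as below. This is an illustrative reading rather than a definitive implementation: the function names `posterior` and `self_train` are hypothetical, the two samples picked in S308 are assumed to be distinct (which the update |U_j| = |U_j| - 2 suggests), and a uniform posterior is returned when no labeled sample matches the observed counts, a case the text does not address.

```python
def posterior(x, labeled, n_unlabeled):
    """Naive Bayes posterior (P(C=0 | x), P(C=1 | x)) as in formulas (1)-(3)."""
    joint = []
    for c in (0, 1):
        members = [vec for vec, label in labeled if label == c]
        prior = len(members) / (n_unlabeled + len(labeled))
        likelihood = 1.0
        for k, count in enumerate(x):
            hits = sum(1 for vec in members if vec[k] == count)
            likelihood *= hits / len(members) if members else 0.0
        joint.append(prior * likelihood)
    evidence = sum(joint)
    # Assumption: fall back to a uniform posterior when the evidence is zero
    return (joint[0] / evidence, joint[1] / evidence) if evidence else (0.5, 0.5)

def self_train(labeled, unlabeled):
    """Move the most confident C=0 and C=1 samples into the labeled set each round."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    while len(unlabeled) >= 2:                                         # S310
        scores = [posterior(x, labeled, len(unlabeled)) for x in unlabeled]  # S305-S307
        best0 = max(range(len(unlabeled)), key=lambda v: scores[v][0])  # S308, C = 0
        rest = [v for v in range(len(unlabeled)) if v != best0]
        best1 = max(rest, key=lambda v: scores[v][1])                   # S308, C = 1
        labeled.append((unlabeled[best0], 0))
        labeled.append((unlabeled[best1], 1))
        unlabeled = [x for v, x in enumerate(unlabeled)
                     if v not in (best0, best1)]                        # S309
    return labeled, unlabeled

# Hypothetical two-week count vectors
seed_labeled = [((3, 2), 0), ((0, 0), 1)]
seed_unlabeled = [(3, 2), (0, 0), (3, 0)]
new_labeled, remaining = self_train(seed_labeled, seed_unlabeled)
```

Each round shrinks the unlabeled set by two, so the loop stops with one or zero samples left, as S310 describes.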

S4: Select the optimal classifier

S401: Take the test sample set obtained in step S1; it contains H test samples, h = 1, 2, \ldots, H;

S402: Set r = 1;

S403: Set h = 1;

S404: Calculate P_{h}(C = 0 | U(v, j)) according to formula (4):

P_{h}(C = 0 | U(v, j)) = \sum_{j=1}^{m} P_{rj}(C = 0 | U(v, j)) \qquad (4);

Calculate P_{h}(C = 1 | U(v, j)) according to formula (5):

P_{h}(C = 1 | U(v, j)) = \sum_{j=1}^{m} P_{rj}(C = 1 | U(v, j)) \qquad (5);

S405: If P_{h}(C = 0 | U(v, j)) ≥ P_{h}(C = 1 | U(v, j)), label the h-th test sample C = 0; otherwise label it C = 1. Output the labeled h-th test sample;

S406: Set h = h + 1;

S407: If h > H, perform the next step; otherwise return to step S404;

S408: Calculate the accuracy of the r-th classifier as the ratio S'/S, where S = H is the number of labelings made with the r-th classifier and S' is the number of those labelings that are correct;

S409: Set r = r + 1;

S410: If r > R, perform the next step; otherwise return to step S403;

S411: Among the classifiers r = 1, 2, 3, \ldots, R, the one with the maximum accuracy is the classifier with the highest labeling accuracy. Output this classifier, which is denoted r_{\max};
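The selection in step S4 can be sketched as follows, assuming the per-feature posteriors of each classifier on each test sample have already been computed. The function names `vote`, `accuracy`, and `select_best`, the nested-list layout of `classifier_outputs`, and the numbers in the example are all hypothetical.

```python
def vote(posteriors):
    """Label one sample from per-feature posteriors [(p0, p1), ...]: formulas (4)-(5), S405."""
    p0 = sum(p[0] for p in posteriors)
    p1 = sum(p[1] for p in posteriors)
    return 0 if p0 >= p1 else 1

def accuracy(predicted, truth):
    """Fraction of correct labelings, S' / S in step S408."""
    correct = sum(1 for p, t in zip(predicted, truth) if p == t)
    return correct / len(truth)

def select_best(classifier_outputs, truth):
    """classifier_outputs[r][h] holds the per-feature posteriors of classifier r on test sample h."""
    scores = [accuracy([vote(sample) for sample in outputs], truth)
              for outputs in classifier_outputs]
    best = max(range(len(scores)), key=scores.__getitem__)   # S411
    return best, scores

# Hypothetical posteriors: 2 classifiers, 3 test samples, m = 2 features each
truth = [0, 1, 1]
classifier_outputs = [
    [[(0.9, 0.1), (0.6, 0.4)], [(0.7, 0.3), (0.6, 0.4)], [(0.2, 0.8), (0.1, 0.9)]],
    [[(0.8, 0.2), (0.5, 0.5)], [(0.3, 0.7), (0.4, 0.6)], [(0.1, 0.9), (0.5, 0.5)]],
]
best, scores = select_best(classifier_outputs, truth)
```

The same `vote` rule is what step S5 applies to a new user once the best classifier is fixed.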

S5: For any unlabeled user U_x, obtain the n kinds of behavior features of this user from the user's learning log file, take the classifier output by step S411, and calculate P_{U_x}(C = 0 | U(1, j)) according to formula (6):

P_{U_x}(C = 0 | U(1, j)) = \sum_{j=1}^{m} P_{r_{\max} j}(C = 0 | U(1, j)) \qquad (6);

Calculate P_{U_x}(C = 1 | U(1, j)) according to formula (7):

P_{U_x}(C = 1 | U(1, j)) = \sum_{j=1}^{m} P_{r_{\max} j}(C = 1 | U(1, j)) \qquad (7);

If

P_{U_x}(C = 0 | U(1, j)) \geq P_{U_x}(C = 1 | U(1, j))

then label user U_x as C = 0; otherwise label the user as C = 1.

Compared with the prior art, the present invention has the following advantages:

1. The present invention is a prediction algorithm based on semi-supervised learning, which is mainly reflected in how the R classifiers are obtained. Semi-supervised learning needs only a small number of labeled samples, which reduces the labor and material cost of labeling samples; it not only lowers the prediction cost but also improves prediction accuracy.

2. Behavior features obtained by counting activity per week characterize MOOC users well.

3. Using multiple behavior features embodies the idea of ensemble learning and improves prediction accuracy.

4. Semi-supervised learning uses the labeled and unlabeled sample sets at the same time, which makes it better suited to practical deployment.

5. Performing semi-supervised learning on multiple features reduces the cumulative labeling error, so unlabeled samples can be predicted better and prediction accuracy improves. That is, R classifiers are trained, each on m behavior features, and the best one is selected.

Brief description of the drawings

Fig. 1 is a comparison (F-measure) of BSP and other methods.

Detailed description of the embodiments

Step numbers are to be read as follows: S1, S2, S3, and S4 denote step S1, step S2, step S3, and step S4 respectively; S301 denotes sub-step 01 of step S3, S302 denotes sub-step 02 of step S3, and so on; S401 denotes sub-step 01 of step S4, S402 denotes sub-step 02 of step S4, and so on.

The present invention is described in further detail below.

A MOOC dropout prediction algorithm based on semi-supervised learning comprises the following steps:

Let the duration of a given course be K weeks;

Let U_i = \{U(i, 1), \ldots, U(i, j), \ldots, U(i, n)\}, where U_i denotes the i-th sample in the training sample set; U(i, j) = (U(i, j)_1, \ldots, U(i, j)_k, \ldots, U(i, j)_K) is the vector of the j-th behavior feature of the i-th sample in the training sample set, and U(i, j)_k is the number of times the j-th behavior of the i-th user occurs in the k-th week of the course;

R = C_{n}^{m} = \frac{n!}{m! (n - m)!}, \quad r = 1, 2, 3, \ldots, R;

The R classifiers are obtained with the semi-supervised learning method (BSP) as follows:

S301: Set r = 1;

S302: Set j = 1;

S303: Set v = 1;

S305: Select from the unlabeled sample set all unlabeled samples under the j-th behavior feature; these form the set U_j. For each unlabeled sample in U_j, compute P_{rj}(C = 0 | U(v, j)) and P_{rj}(C = 1 | U(v, j)), where v = 1, 2, \ldots, |U_j| and |U_j| is the number of samples in U_j;

P_{rj}(C = 0 | U(v, j)) = \frac{P_{rj}(U(v, j) | C = 0) \cdot P_{rj}(C = 0)}{P_{rj}(U(v, j))} \qquad (1);

P_{rj}(C = 0) = \frac{|L_{j, C=0}|}{|U_j| + |L_j|} \qquad (1a);

P_{rj}(U(v, j) | C = 0) = P_{rj}(U(v, j)_1 | C = 0) \cdot P_{rj}(U(v, j)_2 | C = 0) \cdots P_{rj}(U(v, j)_k | C = 0) \cdots P_{rj}(U(v, j)_K | C = 0) \qquad (1b);

P_{rj}(U(v, j)_k | C = 0) = \frac{|L_{j, C=0}(U(v, j)_k)|}{|L_{j, C=0}|} \qquad (1b-1);

P_{rj}(C = 1 | U(v, j)) = \frac{P_{rj}(U(v, j) | C = 1) \cdot P_{rj}(C = 1)}{P_{rj}(U(v, j))} \qquad (2);

P_{rj}(C = 1) = \frac{|L_{j, C=1}|}{|U_j| + |L_j|} \qquad (2a);

P_{rj}(U(v, j) | C = 1) = P_{rj}(U(v, j)_1 | C = 1) \cdot P_{rj}(U(v, j)_2 | C = 1) \cdots P_{rj}(U(v, j)_k | C = 1) \cdots P_{rj}(U(v, j)_K | C = 1) \qquad (2b);

P_{rj}(U(v, j)_k | C = 1) = \frac{|L_{j, C=1}(U(v, j)_k)|}{|L_{j, C=1}|} \qquad (2b-1);

P_{rj}(U(v, j)) = P_{rj}(U(v, j) | C = 0) \cdot P_{rj}(C = 0) + P_{rj}(U(v, j) | C = 1) \cdot P_{rj}(C = 1) \qquad (3);

Output P_{rj}(C = 0 | U(v, j)) and P_{rj}(C = 1 | U(v, j));

S306: Set v = v + 1;

S307: When v > |U_j|, perform the next step; otherwise return to step S304;

S310: When |U_j| ≥ 2, return to step S303; otherwise perform the next step. That is, when |U_j| = 1 or 0, the labeling of the unlabeled samples in the set U_j under the j-th behavior feature is complete;

S311: Set j = j + 1;

S313: Set r = r + 1;

S314: When r > R, perform the next step; otherwise return to step S302;
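Formulas (1) to (3) amount to a naive Bayes posterior over the K weekly counts of one behavior feature. The sketch below is one possible reading: the function name `posterior` and the example vectors are hypothetical, no smoothing is applied because the formulas do not mention any, and the uniform fallback for a zero evidence term is an assumption.

```python
def posterior(x, labeled, n_unlabeled):
    """Return (P(C=0 | x), P(C=1 | x)) following formulas (1)-(3).

    x           -- weekly count vector of one unlabeled sample, length K
    labeled     -- list of (vector, label) pairs; label 0 = retained, 1 = dropout
    n_unlabeled -- |U_j|, the current number of unlabeled samples
    """
    total = n_unlabeled + len(labeled)                 # |U_j| + |L_j|
    joint = []
    for c in (0, 1):
        members = [vec for vec, label in labeled if label == c]
        prior = len(members) / total                   # formulas (1a), (2a)
        likelihood = 1.0
        for k, count in enumerate(x):                  # formulas (1b), (2b)
            matches = sum(1 for vec in members if vec[k] == count)
            likelihood *= matches / len(members) if members else 0.0   # (1b-1), (2b-1)
        joint.append(likelihood * prior)
    evidence = sum(joint)                              # formula (3)
    if evidence == 0.0:
        return 0.5, 0.5   # assumption: unseen counts give an uninformative posterior
    return joint[0] / evidence, joint[1] / evidence

# Hypothetical labeled set with K = 2 weeks
labeled = [((3, 2), 0), ((3, 1), 0), ((0, 0), 1), ((1, 0), 1)]
p0, p1 = posterior((3, 2), labeled, n_unlabeled=4)
```

Because the same evidence term divides both numerators, the two posteriors always sum to 1 even though the priors in (1a) and (2a) do not.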

S4: Select the optimal classifier

S402: Set r = 1;

S403: Set h = 1;

S404: Calculate P_{h}(C = 0 | U(v, j)) according to formula (4):

P_{h}(C = 0 | U(v, j)) = \sum_{j=1}^{m} P_{rj}(C = 0 | U(v, j)) \qquad (4);

Calculate P_{h}(C = 1 | U(v, j)) according to formula (5):

P_{h}(C = 1 | U(v, j)) = \sum_{j=1}^{m} P_{rj}(C = 1 | U(v, j)) \qquad (5);

S406: Set h = h + 1;

S407: If h > H, perform the next step; otherwise return to step S404;

S408: Calculate the accuracy of the r-th classifier as the ratio S'/S, where S = H is the number of labelings made with the r-th classifier and S' is the number of those labelings that are correct. Because the labels of the test samples are known, the number of correct labelings is also known;

S409: Set r = r + 1;

S410: If r > R, perform the next step; otherwise return to step S403;

Calculate P_{U_x}(C = 0 | U(1, j)) according to formula (6):

P_{U_x}(C = 0 | U(1, j)) = \sum_{j=1}^{m} P_{r_{\max} j}(C = 0 | U(1, j)) \qquad (6);

Calculate P_{U_x}(C = 1 | U(1, j)) according to formula (7):

P_{U_x}(C = 1 | U(1, j)) = \sum_{j=1}^{m} P_{r_{\max} j}(C = 1 | U(1, j)) \qquad (7);

If

P_{U_x}(C = 0 | U(1, j)) \geq P_{U_x}(C = 1 | U(1, j))

then label user U_x as C = 0; otherwise label the user as C = 1.

Comparison test:

Data set

The data set used in the experiment is the public KDD Cup 2015 data set provided by XueTangX. It contains the behavior records of students in 39 courses and information on whether each student dropped out. We divide the data set into 39 parts by course, each subset containing the learning records of the students of that course. One course, C, is drawn at random; it contains the learning records of 2392 users, of whom 546 are retained students and 1846 are dropout students. The purpose of the experiment is to train a classifier that judges whether a student will quit the course.

Evaluation criteria

By course, we have 39 data subsets, and the mean prediction performance over these 39 subsets is taken as the final performance of the algorithm. For each data subset, we use ten-fold cross-validation to estimate the performance of the algorithm, then average the estimates over all 39 subsets to obtain the final performance. To evaluate the predictive ability of the algorithm, precision, recall, and F-measure are used as evaluation criteria. Based on the prediction and on whether a student actually drops out, students fall into four categories:

True positive (TP): predicted to drop out, and actually dropped out

True negative (TN): predicted to be retained, and actually did not drop out

False positive (FP): predicted to drop out, but did not drop out

False negative (FN): predicted to be retained, but actually dropped out

Precision, recall, and F-measure are then computed as follows:

\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}

\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}

\text{F-measure} = \frac{2 (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}}
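These three measures can be computed from the confusion counts as in the following sketch. The function name `precision_recall_f` and the counts in the example are hypothetical and are not taken from the experiment.

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F-measure from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical counts: 80 dropouts caught, 20 false alarms, 10 dropouts missed
precision, recall, f_measure = precision_recall_f(tp=80, fp=20, fn=10)
```

True negatives do not enter any of the three formulas, so a classifier is judged here only by how well it identifies dropout students.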

Behavior feature extraction:

Based on the learning log files of the users, the behavior features of all samples in the training sample set are counted, yielding 6 kinds of behavior features that express the common characteristics of all samples in the training sample set. Each behavior feature is the number of times a user performs a behavior in each week of the course; the behaviors are described in Table 1 below:

Table 1

Performance comparison of the behavior features

We treat predicting student dropout as a classification problem: based on the learning behavior records of users, users are divided into two classes, retained users and dropout users. To analyze the effect of each kind of learning behavior on prediction, we extract six kinds of features. Each behavior feature is combined with a basic classification method, such as naive Bayes, decision tree, or support vector machine, to obtain a dropout prediction based on that single behavior feature. As references, we also combine feature aggregation and feature fusion with the same conventional classification methods. Feature aggregation considers only the kind of behavior and ignores time; feature fusion considers only time and ignores the kind of behavior. The prediction performance of each kind of feature is shown in the following tables:

Table 2: Prediction performance of the features under naive Bayes

Table 3: Prediction performance of the features under decision tree

Table 4: Prediction performance of the features under support vector machine

Tables 2 to 4 show that the features perform best under the decision tree algorithm and that, except under the support vector machine, single-kind features outperform the features obtained by traditional feature extraction methods. This shows that the behavior features obtained by the feature extraction method proposed in the present invention each have some ability to predict whether a user drops out, whereas traditional feature extraction methods ignore either the kind of behavior or the time and in most cases predict worse than a single feature.

In practice, labeling whether a student has dropped out is costly, so determining labels is expensive. The present invention therefore proposes a student dropout prediction algorithm based on behavior features and semi-supervised learning.

Of the six kinds of behavior features obtained by the method of the invention, the features based on problems (feature #1), video watching (feature #2), access to other resources (feature #3), and page views (feature #4) perform well, while the features based on wiki views (feature #5) and discussion (feature #6) classify less accurately than the others. The comparison tests below therefore use only the first four kinds of behavior features.

We combine the above four features in pairs and then obtain BSP with a two-view co-training method. BSP takes 10% of the samples as the labeled sample set, 70% as the unlabeled sample set, and the remaining 20% as the test sample set. Supervised learning methods serve as reference methods; since the experiments above show that the decision tree has good predictive performance, the decision tree is used as the baseline classifier. The prediction method using aggregated features with the decision tree is denoted Aggregation, the method using fused features with the decision tree is denoted Merging, and the method using feature #4 with the decision tree is denoted SVF. The resulting prediction performance is given in Table 5 and Fig. 1.
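The 10% / 70% / 20% partition described above can be sketched as below for the 2392 users of course C. The function name `split_samples`, the fixed seed, and the use of rounding to size the first two parts are assumptions; the text does not say how the partition was drawn.

```python
import random

def split_samples(user_ids, seed=0):
    """Shuffle users and split them 10% labeled / 70% unlabeled / 20% test."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)        # seeded only so the sketch is repeatable
    n_labeled = round(0.10 * len(ids))
    n_unlabeled = round(0.70 * len(ids))
    labeled = ids[:n_labeled]
    unlabeled = ids[n_labeled:n_labeled + n_unlabeled]
    test = ids[n_labeled + n_unlabeled:]    # whatever remains, about 20%
    return labeled, unlabeled, test

labeled_ids, unlabeled_ids, test_ids = split_samples(range(2392))
```

With 2392 users this gives 239 labeled, 1674 unlabeled, and 479 test samples.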

Table 5: Comparison of BSP and other methods (precision and recall)

Method	Precision	Recall
Features #1+#2	0.9571	0.8739
Features #1+#3	0.9927	0.8747
Features #1+#4	0.954	0.8878
Features #2+#3	0.9958	0.8758
Features #2+#4	0.9579	0.8767
Features #3+#4	0.9519	0.8767
Aggregation	0.837	0.843
Merging	0.831	0.837
SVF	0.843	0.851

Table 5 shows that the single-view supervised methods predict far worse than the dual-view semi-supervised method. Although the method of the invention uses only 10% of the samples as the labeled sample set, its precision is above 95% and its recall above 85% in every case, which shows good predictive performance. Fig. 1 likewise shows that prediction algorithms trained by semi-supervised learning on two behavior features are, overall, far better than the prediction algorithm that uses a single behavior feature.

Student dropout is a factor hindering the long-term development of MOOC platforms. A prediction method not only makes it easy to evaluate a course accurately but can also be used to analyze the factors that lead students to drop out, so that early warning of dropout can be given. The method of the invention starts from students' learning behavior and can therefore characterize the individual.

Finally, the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention may be modified or equivalently replaced without departing from its purpose and scope, and all such modifications and replacements shall fall within the scope of the claims of the present invention.

Claims

1. A MOOC dropout prediction algorithm based on semi-supervised learning, characterized by comprising the following steps:

Let the duration of a given course be K weeks;

R = C_{n}^{m} = \frac{n!}{m! (n - m)!}, \quad r = 1, 2, 3, \ldots, R;

The R classifiers are obtained as follows:

S301: Set r = 1;

S302: Set j = 1;

S303: Set v = 1;

P_{rj}(C = 0 | U(v, j)) = \frac{P_{rj}(U(v, j) | C = 0) \cdot P_{rj}(C = 0)}{P_{rj}(U(v, j))} \qquad (1);

P_{rj}(C = 0) = \frac{|L_{j, C=0}|}{|U_j| + |L_j|} \qquad (1a);

P_{rj}(U(v, j) | C = 0) = P_{rj}(U(v, j)_1 | C = 0) \cdot P_{rj}(U(v, j)_2 | C = 0) \cdots P_{rj}(U(v, j)_k | C = 0) \cdots P_{rj}(U(v, j)_K | C = 0) \qquad (1b);

P_{rj}(U(v, j)_k | C = 0) = \frac{|L_{j, C=0}(U(v, j)_k)|}{|L_{j, C=0}|} \qquad (1b-1);

P_{rj}(C = 1 | U(v, j)) = \frac{P_{rj}(U(v, j) | C = 1) \cdot P_{rj}(C = 1)}{P_{rj}(U(v, j))} \qquad (2);

P_{rj}(C = 1) = \frac{|L_{j, C=1}|}{|U_j| + |L_j|} \qquad (2a);

P_{rj}(U(v, j) | C = 1) = P_{rj}(U(v, j)_1 | C = 1) \cdot P_{rj}(U(v, j)_2 | C = 1) \cdots P_{rj}(U(v, j)_k | C = 1) \cdots P_{rj}(U(v, j)_K | C = 1) \qquad (2b);

P_{rj}(U(v, j)_k | C = 1) = \frac{|L_{j, C=1}(U(v, j)_k)|}{|L_{j, C=1}|} \qquad (2b-1);

P_{rj}(U(v, j)) = P_{rj}(U(v, j) | C = 0) \cdot P_{rj}(C = 0) + P_{rj}(U(v, j) | C = 1) \cdot P_{rj}(C = 1) \qquad (3);

Output P_{rj}(C = 0 | U(v, j)) and P_{rj}(C = 1 | U(v, j));

S306: Set v = v + 1;

S307: When v > |U_j|, perform the next step; otherwise return to step S304;

S310: When |U_j| ≥ 2, return to step S303; otherwise perform the next step;

S311: Set j = j + 1;

S313: Set r = r + 1;

S314: When r > R, perform the next step; otherwise return to step S302;

S4: Select the optimal classifier

S402: Set r = 1;

S403: Set h = 1;

S404: Calculate P_{h}(C = 0 | U(v, j)) according to formula (4):

P_{h}(C = 0 | U(v, j)) = \sum_{j=1}^{m} P_{rj}(C = 0 | U(v, j)) \qquad (4);

Calculate P_{h}(C = 1 | U(v, j)) according to formula (5):

P_{h}(C = 1 | U(v, j)) = \sum_{j=1}^{m} P_{rj}(C = 1 | U(v, j)) \qquad (5);

S406: Set h = h + 1;

S407: If h > H, perform the next step; otherwise return to step S404;

S409: Set r = r + 1;

S410: If r > R, perform the next step; otherwise return to step S403;

Calculate P_{U_x}(C = 0 | U(1, j)) according to formula (6):

P_{U_x}(C = 0 | U(1, j)) = \sum_{j=1}^{m} P_{r_{\max} j}(C = 0 | U(1, j)) \qquad (6);

Calculate P_{U_x}(C = 1 | U(1, j)) according to formula (7):

P_{U_x}(C = 1 | U(1, j)) = \sum_{j=1}^{m} P_{r_{\max} j}(C = 1 | U(1, j)) \qquad (7);

If

P_{U_x}(C = 0 | U(1, j)) \geq P_{U_x}(C = 1 | U(1, j))

then label user U_x as C = 0; otherwise label the user as C = 1.