CN110276018A - Personalized recommendation method, terminal, and storage medium for an online education system
- Publication number
- CN110276018A (application CN201910455421.6A)
- Authority
- CN
- China
- Prior art keywords
- user
- resource
- preference
- log file
- action log
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Tourism & Hospitality (AREA)
- Strategic Management (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Educational Administration (AREA)
- Educational Technology (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a personalized recommendation method for an online education system, a terminal, and a storage medium, and relates to the technical field of intelligent recommendation algorithms. For online education, the invention extracts user behavior logs and stores them on Hadoop, uses Mahout to analyze the user behavior data, and processes the data with Hadoop's HDFS and MapReduce to generate recommendation results, thereby realizing user-based personalized recommendation.
Description
Technical field
The present invention relates to the technical field of personalized recommendation, and in particular to a personalized recommendation method for an online education system, a terminal, and a storage medium.
Background
Since the "Internet+" concept was proposed in 2015, "Internet + education" has become a new service model in the education sector, and online education, as one of its products, has brought great changes to educational relationships and the education system. Although online education has broken away from traditional fixed-classroom teaching and the "cramming" model, and the types of online education platform keep multiplying, some problems persist. Most online education platforms are merely a means for educational institutions to pursue their own profit, and their approach is rigid: users watch the online courses they like and pay when payment is required, while the platform seldom communicates effectively with users or offers them a personalized learning recommendation scheme. Meanwhile, educational resources are growing explosively in quantity and scale, so ordinary learners may find it hard to choose among learning resources, and the results returned by traditional search engines are usually cluttered and inaccurate and fail to satisfy students.
Recommender systems have been applied in many Internet fields, including social networking, e-commerce, music, video, film, and news. In those fields recommender systems offer diverse personalized recommendations and are increasingly mature, but in education most recommender systems still rely on content-based and association-rule-based recommendation. The resulting recommendations are of poor quality, students cannot obtain the best learning resources, and research on personalized recommendation for online education still lags somewhat.
Domestic education cloud platforms currently use only a small amount of cloud computing technology, and the clouds are small in scale. The big-data processing capability of cloud platforms is also little used: in many cases teaching resources are simply stored on the cloud platform to centralize information management, the information is poorly utilized, and personalized education applications on cloud platforms are rarer still.
Foreign online education platforms started earlier and are more mature, with many high-quality courses, which gives them certain advantages. However, domestic educational conditions differ from those abroad: foreign users take more initiative and are clearer about their own interests and talents, whereas many domestic users do not know what they like or find it hard to describe it precisely. Such users have a greater need for the system to analyze their behavior accurately in order to stimulate their initiative to learn.
Therefore, a personalized online education recommender system suited to domestic learners is needed to meet learners' needs and let them better experience the "Internet + education" mode of learning.
Summary of the invention
The technical problem to be solved by the present invention is how to provide a personalized online education recommender system suited to domestic learners that meets learners' needs, fits their preferences more closely, and lets them better experience the "Internet + education" mode of learning.
To solve the above problem, the present invention proposes the following technical solutions:
In a first aspect, an embodiment of the present invention provides a personalized recommendation method for an online education system, comprising the following steps:
receiving a user behavior log file uploaded by a user terminal;
dumping the user behavior log file to a Hadoop platform, and storing and backing it up in a distributed manner according to the characteristics of the platform's HDFS;
preprocessing the user behavior log file offline with the distributed computing framework of the Hadoop platform to obtain filtered data;
extracting the filtered data with Mahout, computing on the filtered data with Mahout to obtain a calculation result, and storing the calculation result in a database as the recommendation result;
if a trigger signal indicating that the user terminal requests a recommendation is received, retrieving the recommendation result from the database and sending it to the user terminal.
In a further technical solution, extracting the filtered data with Mahout and computing on the filtered data with Mahout to obtain a calculation result comprises:
using formula (1), which fuses the content-based recommendation algorithm with the hybrid collaborative filtering recommendation algorithm, to calculate the initial preference $P_1(U, d_i)$ of user $U$ for resource $d_i$:
where:
$\alpha = |P_{Cb}(U, d_i) - P_{Hcf}(U, d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U, d_i) + P_{Hcf}(U, d_i)|$, $\beta \ge 0$,
$P_{Cb}(U, d_i)$ denotes the preference of user $U$ for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U, d_i)$ denotes the preference of user $U$ for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the larger of the two algorithms' preference values of user $U$ for resource $d_i$;
$\min\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the smaller of the two algorithms' preference values of user $U$ for resource $d_i$;
$\alpha$ represents the deviation between the preferences of user $U$ for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$\beta$ represents the total preference value of user $U$ for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$P_1(U, d_i)$ denotes the initial preference of user $U$ for resource $d_i$ under formula (1).
In a further technical solution, the method further comprises:
calculating the final preference $P(U, d_i)$ of user $U$ for resource $d_i$ using formula (2), and taking the resource $d_i$ with the highest final preference as the calculation result:
$P(U, d_i) = e^{-w} \times P_u(U, d_i) + (1 - e^{-w}) \times P_1(U, d_i)$
Formula (2)
where: $w \propto t$, and $t$ denotes the number of the user's historical behavior records;
$P_u(U, d_i)$ denotes the initial preference of user $U$ for resource $d_i$ under the recommendation algorithm based on user information similarity;
$P(U, d_i)$ denotes the final preference of user $U$ for resource $d_i$ under formula (2).
In a further technical solution, the method further comprises:
the user terminal storing the user behavior log file in a database based on distributed file storage.
In a further technical solution, preprocessing the user behavior log file offline with the distributed computing framework of the Hadoop platform comprises:
identifying and splitting the fields in the user behavior log file, removing invalid records from the file, and extracting feature information according to statistical requirements.
In a further technical solution, the feature information includes:
personal characteristics of the user: education, major, occupation, age, gender, personality, interests, and future study plans;
explicit user behavior features: rating feedback, resource downloads, exercise records, course resource searches, number of interactions with courses, duration of each interaction, and time online in the system;
implicit user behavior features: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
In a second aspect, an embodiment of the present invention provides a terminal comprising units for executing the method of the first aspect.
In a third aspect, an embodiment of the present invention provides a terminal comprising a processor, an input device, an output device, and a memory, which are connected to one another; the memory stores application code that supports the terminal in executing the method of the first aspect, and the processor is configured to execute the method of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program; the computer program includes program instructions that, when executed by a processor, cause the processor to execute the method of the first aspect.
Compared with the prior art, the technical effects achievable by the present invention include:
For online education, user behavior logs are extracted and stored on Hadoop; Mahout is used to analyze the user behavior data, and Hadoop's HDFS and MapReduce are used to process the data and generate recommendation results, thereby realizing user-based personalized recommendation.
By building a Hadoop data processing platform and using Apache Mahout, an open-source algorithm library for data mining, user behavior data is analyzed and processed offline. The whole system is built on the MapReduce computing model and makes full use of the cloud platform's data processing capability to compute user recommendation results offline. Parallel and distributed processing improves the system's efficiency and scalability and solves the problems of insufficient computing capability and excessively long real-time recommendation time in conventional personalized recommendation models.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the personalized recommendation method for an online education system provided by an embodiment of the invention;
Fig. 2 is a flow diagram of the Hadoop platform processing in the personalized recommendation method for an online education system provided by an embodiment of the invention;
Fig. 3 is a schematic block diagram of a terminal 300 provided by another embodiment of the invention;
Fig. 4 is a schematic structural diagram of the recommendation algorithm provided by another embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings, in which like reference numerals denote like components. Obviously, the embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
It should be understood that, when used in this specification and the appended claims, the terms "comprise" and "include" indicate the presence of the described features, wholes, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or sets thereof.
It should also be understood that the terms used in the description of the embodiments of the present invention are for the purpose of describing particular embodiments only and are not intended to limit the embodiments. As used in the specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
Embodiment 1
Referring to Figs. 1-2, in a first aspect, an embodiment of the present invention provides a personalized recommendation method for an online education system, comprising the following steps:
S101: receive the user behavior log file uploaded by the user terminal.
In a specific implementation, the user terminal collects the user's behavior information in real time, generates a user behavior log file, and sends it to the system; the system receives the user behavior log file uploaded by the user terminal.
In a specific implementation, the user's behavior information includes the personal characteristics of the user, explicit user behavior features, and implicit user behavior features, where:
the personal characteristics of the user include: education, major, occupation, age, gender, personality, interests, and future study plans;
the explicit user behavior features include: rating feedback, resource downloads, exercise records, course resource searches, number of interactions with courses, duration of each interaction, and time online in the system;
the implicit user behavior features include: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
In an embodiment, the method further comprises:
S1011: the user terminal stores the user behavior log file in a database based on distributed file storage.
In a specific implementation, the user behavior log file is collected mainly by JavaScript scripts on the user terminal, and the user terminal stores the file in MongoDB (a database based on distributed file storage).
S102: dump the user behavior log file to the Hadoop platform, and store and back it up in a distributed manner according to the characteristics of the platform's HDFS (Hadoop Distributed File System).
In a specific implementation, the HDFS architecture is built on a specific set of nodes, which is determined by its own characteristics. These nodes comprise one master node, the NameNode, and multiple slave nodes, the DataNodes. The NameNode provides the metadata service inside HDFS, and the DataNodes provide storage blocks for HDFS. A file stored in HDFS is divided into blocks, and the blocks are replicated to multiple computers (DataNodes) so that multiple working copies of the data are maintained; when a node fails, processing can be redistributed, which improves system reliability.
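The block-and-replica behavior described above can be illustrated with a toy model. This is a minimal sketch, not HDFS code: the block size, replication factor, node names, and the round-robin placement are illustrative assumptions, and the real placement policy is decided by the NameNode.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, datanodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct DataNodes.

    Assumption: simple round-robin placement stands in for the NameNode's policy.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    placement = {}
    for block_id in range(num_blocks):
        start = block_id % len(datanodes)
        placement[block_id] = [
            datanodes[(start + k) % len(datanodes)] for k in range(replication)
        ]
    return placement


def surviving_blocks(placement: dict[int, list[str]], failed_node: str) -> set[int]:
    """Blocks that still have at least one replica after one node fails."""
    return {b for b, nodes in placement.items() if any(n != failed_node for n in nodes)}


# Hypothetical 1000-byte log file, 256-byte blocks, four DataNodes, three replicas.
log_bytes = b"x" * 1000
blocks = split_into_blocks(log_bytes, block_size=256)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"], replication=3)
```

With three replicas per block, losing any single DataNode in this model leaves every block readable, which is the reliability property the paragraph describes.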
S103: preprocess the user behavior log file offline with the distributed computing framework of the Hadoop platform to obtain filtered data.
In a specific implementation, the distributed computing framework of the Hadoop platform is MapReduce. On top of the MapReduce framework, Hive is used to analyze and preprocess the user behavior log file offline and filter out clean data.
In an embodiment, step S103 specifically comprises: on top of the MapReduce computing framework, using Hive to identify and split the fields in the user behavior log file, remove invalid records from the file, and extract feature information according to statistical requirements.
It should be noted that the fields to be identified are set by the technician according to the actual statistical requirements, and are not described further here.
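The three preprocessing operations (field splitting, removal of invalid records, feature extraction) can be sketched in plain Python. This is only an illustration of the logic, not the Hive job itself: the tab-separated layout, the field names, and the set of valid actions are hypothetical, since the source leaves the fields to the implementer.

```python
from collections import defaultdict

FIELDS = ("user_id", "resource_id", "action", "dwell_seconds")  # hypothetical layout
VALID_ACTIONS = {"view", "download", "rate", "search"}          # hypothetical action set


def parse_line(line: str):
    """Split one tab-separated log line into fields; return None if the record is invalid."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(FIELDS):
        return None
    record = dict(zip(FIELDS, parts))
    if not record["user_id"] or record["action"] not in VALID_ACTIONS:
        return None
    try:
        record["dwell_seconds"] = float(record["dwell_seconds"])
    except ValueError:
        return None
    if record["dwell_seconds"] < 0:
        return None
    return record


def extract_features(lines):
    """Aggregate clean records into per-(user, resource) visit counts and total dwell time."""
    features = defaultdict(lambda: {"visits": 0, "dwell_seconds": 0.0})
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # drop invalid records
        key = (record["user_id"], record["resource_id"])
        features[key]["visits"] += 1
        features[key]["dwell_seconds"] += record["dwell_seconds"]
    return dict(features)


raw = [
    "u1\tc101\tview\t120",
    "u1\tc101\tview\t30.5",
    "u2\tc102\tdownload\t0",
    "\tc103\tview\t10",      # missing user id
    "u3\tc104\tview\tabc",   # non-numeric dwell time
    "u3\tc104",              # truncated record
]
features = extract_features(raw)
```

Here only the first three lines survive filtering; the aggregated counts correspond to implicit features such as page visits and dwell time.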
In a specific implementation, analyzing the user behavior in the user behavior log file makes it possible to pay closer attention to the user's development, needs, and growth, to provide the user with reasonable recommendations, and to ensure that recommendations are accurate and varied, which in turn stimulates the user's initiative to learn and improves user stickiness. The feature information includes:
personal characteristics of the user: education, major, occupation, age, gender, personality, interests, and future study plans;
explicit user behavior features: rating feedback, resource downloads, exercise records, course resource searches, number of interactions with courses, duration of each interaction, and time online in the system;
implicit user behavior features: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
By collecting the feature information of user behavior, the user's preference for each resource is determined and a user-resource preference set is generated, which serves as the data set for the subsequent recommendation algorithms.
S104: extract the filtered data with Mahout, compute on the filtered data with Mahout to obtain a calculation result, and store the calculation result in a database as the recommendation result.
Referring to Fig. 4, in a specific implementation, the embodiment uses the following recommendation algorithms, chosen for the characteristics of online education:
1) A recommendation algorithm based on hybrid collaborative filtering, comprising the following steps:
a. From the user behavior information, calculate the similarity between users using the Pearson correlation coefficient.
b. Find the set of neighbor users most similar to the target user, and use the neighbor users' feedback on courses to predict the target user's preference for those courses.
c. From the target user's behavior records, calculate the similarity between courses using the Euclidean distance formula.
d. Find the set of neighbor courses most similar to the courses the target user has watched, and predict the target user's preference for the neighbor courses from their popularity.
e. Weight the resulting target learning resource sets (courses and neighbor courses), rank the target learning resources by preference, and recommend the learning resources with the highest preference to the user.
It should be noted that the recommendation algorithm based on hybrid collaborative filtering is a hybrid that fuses the user-based collaborative filtering algorithm with the item-based collaborative filtering algorithm. Steps a and b are the calculation process of user-based collaborative filtering, steps c and d are the calculation process of item-based collaborative filtering, and step e integrates the results of the two algorithms to generate the recommendation result of the hybrid algorithm, so that the result better matches the user's preferences.
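Steps a to c can be sketched as follows. This is a minimal pure-Python illustration, not the Mahout implementation: the rating data is made up, only positively correlated neighbors are used, and mapping Euclidean distance to a similarity with 1 / (1 + d) is one common convention assumed here.

```python
from math import sqrt

ratings = {  # hypothetical user -> {course: preference}
    "alice": {"c1": 5.0, "c2": 3.0, "c3": 4.0},
    "bob":   {"c1": 4.0, "c2": 2.0, "c3": 3.0, "c4": 5.0},
    "carol": {"c1": 1.0, "c2": 5.0, "c4": 2.0},
}


def pearson(a: dict, b: dict) -> float:
    """Pearson correlation over the courses both users have rated (step a)."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[c] for c in common) / n
    mean_b = sum(b[c] for c in common) / n
    cov = sum((a[c] - mean_a) * (b[c] - mean_b) for c in common)
    var_a = sum((a[c] - mean_a) ** 2 for c in common)
    var_b = sum((b[c] - mean_b) ** 2 for c in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict_user_based(target: str, course: str, k: int = 2):
    """Similarity-weighted average of the top-k neighbors' preferences (step b)."""
    sims = [
        (pearson(ratings[target], r), user)
        for user, r in ratings.items()
        if user != target and course in r
    ]
    neighbors = sorted([s for s in sims if s[0] > 0], reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None  # no usable neighbor
    return sum(sim * ratings[user][course] for sim, user in neighbors) / total


def course_similarity(c1: str, c2: str) -> float:
    """Euclidean-distance-based similarity between two courses (step c), in (0, 1]."""
    users = [u for u, r in ratings.items() if c1 in r and c2 in r]
    if not users:
        return 0.0
    dist = sqrt(sum((ratings[u][c1] - ratings[u][c2]) ** 2 for u in users))
    return 1.0 / (1.0 + dist)


predicted_c4 = predict_user_based("alice", "c4")
```

In this toy data alice and bob rate in lockstep, so bob is alice's only positive neighbor and his preference for c4 becomes her prediction. Steps d and e (popularity-based prediction and weighting) are omitted because the source does not give their formulas.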
2) A user-based recommendation algorithm based on user information similarity, which mainly comprises:
obtaining the "personal characteristics of the user" from the target user's registration information; finding the set of similar users by applying the idea of the k-means clustering algorithm, which groups similar users together; using cosine distance as the measure to find the most similar user in that set, i.e. the user with the smallest cosine distance; and making recommendations to the target user according to that user's preferences for each learning resource.
It should be noted that this user-based recommendation algorithm based on user information similarity is mainly used to solve the user cold-start problem.
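The nearest-user lookup can be sketched as follows, assuming the clustering step has already produced a cluster of similar users. The numeric encoding of the personal characteristics and all data values are hypothetical; cosine distance is taken as 1 minus cosine similarity.

```python
from math import sqrt


def cosine_distance(x: list[float], y: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity; smaller means more similar."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm


# Hypothetical encoded registration features for users already in one cluster.
cluster = {
    "u1": [0.20, 1.0, 0.0, 1.0],
    "u2": [0.50, 0.0, 1.0, 0.0],
    "u3": [0.25, 1.0, 0.0, 0.9],
}
# Hypothetical known preferences of those users for learning resources.
preferences = {
    "u1": {"c1": 4.5, "c2": 2.0},
    "u2": {"c3": 5.0},
    "u3": {"c2": 4.0, "c4": 3.5},
}


def most_similar_user(new_vec: list[float]) -> str:
    """User in the cluster with the smallest cosine distance to the new user."""
    return min(cluster, key=lambda u: cosine_distance(new_vec, cluster[u]))


def recommend_for_new_user(new_vec: list[float], top_n: int = 1):
    """Recommend the nearest user's most-preferred resources to the new user."""
    nearest = most_similar_user(new_vec)
    ranked = sorted(preferences[nearest].items(), key=lambda kv: kv[1], reverse=True)
    return nearest, [resource for resource, _ in ranked[:top_n]]


nearest, recs = recommend_for_new_user([0.21, 1.0, 0.0, 1.0])
```

Because the lookup needs only registration data, it works for a user with no behavior history, which is why the source uses it for cold start.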
3) A content-based recommendation algorithm based on user behavior, comprising:
based on the user's past behavior information, including the courses or other learning resources the user has viewed, recommending learning resources whose content is similar to what the user has viewed, such as other courses taught by the same teacher.
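A minimal sketch of this idea scores each unseen course by how many metadata attributes it shares with a course the user has viewed. The catalog, the attribute names, and the equal weighting of attributes are assumptions made for illustration.

```python
catalog = {  # hypothetical course metadata
    "c1": {"teacher": "Li", "topic": "algebra"},
    "c2": {"teacher": "Li", "topic": "geometry"},
    "c3": {"teacher": "Wang", "topic": "algebra"},
    "c4": {"teacher": "Zhao", "topic": "history"},
}


def content_score(seen_id: str, candidate_id: str) -> float:
    """Fraction of metadata attributes that the two courses share."""
    seen, cand = catalog[seen_id], catalog[candidate_id]
    shared = sum(1 for key in seen if cand.get(key) == seen[key])
    return shared / len(seen)


def recommend_similar(history: list[str], top_n: int = 2) -> list[str]:
    """Rank unseen courses by their best content match against the viewing history."""
    scores = {}
    for candidate in catalog:
        if candidate in history:
            continue
        scores[candidate] = max(content_score(seen, candidate) for seen in history)
    # Sort by score descending, then by id so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [course for course, score in ranked if score > 0][:top_n]


recs = recommend_similar(["c1"])
```

For a user who has viewed c1, the sketch returns the other course by the same teacher and the other course on the same topic, and skips the unrelated one.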
However, relying on a single recommendation algorithm always has shortcomings. A few platforms combine several recommendation methods but seldom take user behavior into account; the algorithms are combined rigidly and cannot transition smoothly from one to another, so the recommendation results are unsatisfactory.
When a user performs a search, the user has a strong, immediate need for particular content. At that point recommendations should be mainly content-based, driven by the content and topics of the courses the user searched for, clicked on, and watched. As the number of searches grows, the weight of content-based recommendation can be increased accordingly, so that recommendations are reasonable, accurate, and varied. For example, in a specific implementation, step S104 specifically comprises:
Step S1041: use formula (1), which fuses the content-based recommendation algorithm with the hybrid collaborative filtering recommendation algorithm, to calculate the initial preference $P_1(U, d_i)$ of user $U$ for resource $d_i$:
where:
$\alpha = |P_{Cb}(U, d_i) - P_{Hcf}(U, d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U, d_i) + P_{Hcf}(U, d_i)|$, $\beta \ge 0$,
$P_{Cb}(U, d_i)$ denotes the preference of user $U$ for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U, d_i)$ denotes the preference of user $U$ for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the larger of the two algorithms' preference values of user $U$ for resource $d_i$;
$\min\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the smaller of the two algorithms' preference values of user $U$ for resource $d_i$;
$\alpha$ represents the deviation between the preferences of user $U$ for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the smaller the value of $\alpha$, the more similar the two algorithms' preferences of user $U$ for resource $d_i$, and the more accurate the recommended preference.
$\beta$ represents the total preference value of user $U$ for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the larger the value of $\beta$, the larger the total preference of user $U$ for resource $d_i$ under the two algorithms, and the more resource $d_i$ deserves to be recommended.
$P_1(U, d_i)$ denotes the initial preference of user $U$ for resource $d_i$ under formula (1).
It should be noted that the smaller the value of $\alpha$, the closer the preferences of user $U$ for resource $d_i$ obtained under the two algorithms. When $P_{Hcf}(U, d_i) = P_{Cb}(U, d_i)$, $\alpha = 0$, meaning that the preferences of user $U$ for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms are identical; the preference of user $U$ for resource $d_i$ is then simply the preference given by the content-based recommendation algorithm (or, equivalently, by the hybrid collaborative filtering recommendation algorithm). The larger the value of $\alpha$, the less similar the two preferences, and the two algorithms should then be reconciled with different weights. Formula (1) therefore fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm smoothly, so that the recommendation results are closer to the user's needs.
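The quantities that enter formula (1) can be computed as in the sketch below. The body of formula (1) is not reproduced in this text, so the sketch stops at $\alpha$, $\beta$, the maximum, and the minimum and does not attempt the fusion itself; the preference values are made up.

```python
def deviation_and_total(p_cb: float, p_hcf: float) -> tuple[float, float]:
    """Return (alpha, beta) as defined in the text for one user-resource pair."""
    alpha = abs(p_cb - p_hcf)  # deviation between the two algorithms' preferences
    beta = abs(p_cb + p_hcf)   # total preference under the two algorithms
    return alpha, beta


# Hypothetical (P_Cb, P_Hcf) pairs for three resources.
candidates = {"c1": (0.80, 0.80), "c2": (0.90, 0.30), "c3": (0.20, 0.25)}

summary = {}
for resource, (p_cb, p_hcf) in candidates.items():
    alpha, beta = deviation_and_total(p_cb, p_hcf)
    summary[resource] = {
        "alpha": alpha,
        "beta": beta,
        "max": max(p_cb, p_hcf),
        "min": min(p_cb, p_hcf),
    }
```

For c1 the two algorithms agree, so $\alpha = 0$ and the maximum equals the minimum, which is the case the note above describes; c2 shows a large deviation where the two algorithms would need to be reconciled.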
Collaborative filtering computes from the user's historical behavior data. A new user has no historical behavior records, which creates the cold-start problem. Most recommendation algorithms handle cold start by recommending random items, the newest or most popular items, or items based on the user's registration information, and switch to personalized recommendation only after enough user data has been collected; users are easily lost while that data is being collected. To solve the user cold-start problem, the embodiment of the present invention further includes step S1042 on the basis of step S1041:
Step S1042 calculates user U to resource d using formula (2)iFinal preference P (U, di), by U pairs of user
Resource diThe highest resource d of final preferenceiAs calculated result:
P(U,di)=e-w×Pu(U,di)+(1-e-w)*P1(U,di)
Formula (2)
Wherein: w ∝ t, t indicate user's history behavior record item number;
Pu(U,di) indicate that user U is to resource d in the proposed algorithm based on user information similarityiInitial preference;
P(U,di) indicate that user U is to resource d under the algorithm of formula (2)iFinal preference.
Using formula (2), the final preference $P(U, d_i)$ of user U for each resource $d_i$ can be calculated. The resources $d_i$ are sorted from high to low by final preference $P(U, d_i)$, the resource with the highest final preference is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
In another embodiment, at least one resource $d_i$ whose final preference is greater than a preset threshold is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
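The two selection rules just described (highest final preference, or every resource above a preset threshold) can be sketched as follows. This is only an illustrative sketch: the function names, the resource identifiers, and the sample preference values are hypothetical and do not come from the embodiment.

```python
from typing import Dict, List


def top_resource(final_pref: Dict[str, float]) -> str:
    """Return the resource id with the highest final preference."""
    return max(final_pref, key=final_pref.get)


def resources_above(final_pref: Dict[str, float], threshold: float) -> List[str]:
    """Return resource ids whose final preference exceeds the threshold, best first."""
    ranked = sorted(final_pref.items(), key=lambda kv: kv[1], reverse=True)
    return [rid for rid, score in ranked if score > threshold]


# Hypothetical final preferences P(U, d_i) for one user.
prefs = {"course_a": 0.42, "course_b": 0.91, "course_c": 0.67}
best = top_resource(prefs)
shortlist = resources_above(prefs, threshold=0.5)
```

Either `best` or `shortlist` would then be written to the database as the recommendation result, depending on which embodiment is used.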
It should be noted that a newly registered user initially has no historical behavior records, so $w = 0$ and $P(U, d_i) = P_u(U, d_i)$; that is, recommendations for a new user are made essentially by the user-based recommendation algorithm based on user-information similarity (the user-based recommendation algorithm based on user characteristics in Fig. 4). As the number of historical behavior records $t$ grows, $w$ increases and the weight of $P_1(U, d_i)$ increases, so the calculation gradually shifts to recommendation based on the user's historical behavior records. The cold-start problem of new users is thereby solved smoothly: a new user transitions smoothly into an established user, the loss of new users is avoided, and user stickiness is improved.
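The behavior of formula (2) at the two extremes can be checked with a short sketch. The text only states that w is proportional to t; the proportionality constant `k` below is an assumption introduced for illustration, as is the function name.

```python
import math


def final_preference(p_u: float, p_1: float, t: int, k: float = 0.1) -> float:
    """Formula (2): blend the cold-start preference p_u with the fused preference p_1.

    t is the number of historical behavior records; w = k * t, where k is a
    hypothetical proportionality constant (the source only says w is proportional to t).
    """
    w = k * t
    decay = math.exp(-w)
    return decay * p_u + (1.0 - decay) * p_1


new_user = final_preference(p_u=0.8, p_1=0.3, t=0)        # no history: equals p_u
veteran = final_preference(p_u=0.8, p_1=0.3, t=1000)      # long history: approaches p_1
```

With no history the result equals the user-information-similarity preference, and as the record count grows the result moves toward the fused preference from formula (1), which is the smooth hand-over the paragraph describes.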
S105: if a trigger signal requesting recommendation is received from the user terminal, retrieve the recommendation results from the database and send them to the user terminal.
In a specific implementation, a trigger signal is generated when the user logs in to the user terminal of the online education website; on receiving the trigger signal requesting recommendation from the user terminal, the recommendation system retrieves the recommendation results from the database and sends them to the user terminal.
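Because the recommendation results are computed offline, serving a request in step S105 reduces to a database lookup. The sketch below uses an in-memory SQLite database purely as a stand-in for the recommendation database; the table name, column names, and function names are assumptions and are not specified by the embodiment.

```python
import json
import sqlite3


def store_recommendations(conn: sqlite3.Connection, user_id: str, resource_ids: list) -> None:
    """Offline side: persist the precomputed recommendation result for a user."""
    conn.execute(
        "INSERT OR REPLACE INTO recommendations (user_id, resources) VALUES (?, ?)",
        (user_id, json.dumps(resource_ids)),
    )


def on_recommend_request(conn: sqlite3.Connection, user_id: str) -> list:
    """Online side: on a trigger signal, read the stored result and return it."""
    row = conn.execute(
        "SELECT resources FROM recommendations WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


# Hypothetical schema standing in for the recommendation database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recommendations (user_id TEXT PRIMARY KEY, resources TEXT)")
store_recommendations(conn, "u1", ["course_b", "course_c"])
```

No recommendation algorithm runs on the request path in this sketch, which matches the offline-calculation design described above.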
The embodiment of the present invention builds a Hadoop data processing platform and uses Apache Mahout, an open-source data-mining algorithm library, to analyze and process user behavior data offline. The whole system is built on the MapReduce computation model and makes full use of the powerful data-processing capability of the cloud platform. Computing user recommendation results offline, in a parallel and distributed manner, improves the efficiency and scalability of the system and addresses the insufficient computing capability of conventional personalized recommendation models and their overly long real-time recommendation times.
In actual use, the basic performance required of the recommendation system includes: a response time to client requests within 2 seconds; support for millions of users accessing the system online simultaneously; and an average server CPU load of ≤ 50%;
High reliability: the system has 7 × 24 × 365-hour high availability, with reliability of 99.9999% or more; data access services are accurate and no data is lost;
Good scalability: the system can meet users' expansion needs over the next three years and supports the gradual integration of subsequent application system resources; an increase in system users or data volume does not affect existing system functions and structure, and subsequent system expansion is straightforward.
The online education system focuses on recommending personalized learning plans and suitable learning resources to users who need them. The design of user behavior analysis and personalized recommendation based on Hadoop and Mahout uses big-data analysis to help users meet their learning requirements and improve themselves, while generating substantial social benefit and promoting the rapid development of the online education industry.
Embodiment 2
The embodiment of the present invention provides a terminal. The terminal in this embodiment may include units for executing the method described in Embodiment 1:
Receiving unit, configured to receive the user action log file uploaded by the user terminal;
In a specific implementation, the user terminal collects the user's behavior information in real time, generates a user action log file, and sends it to the system; the system receives the user action log file uploaded by the user terminal.
In a specific implementation, the user's behavior information includes the user's personal characteristics, explicit user behavior characteristics, and implicit user behavior characteristics, wherein:
The user's personal characteristics include: educational background, major, occupation, age, gender, personality, interests, and future study plan;
Explicit user behavior characteristics include: user rating feedback, resource downloads, exercise records, course resource searches, number of interactions with a course, duration of each interaction, and time spent online in the system;
Implicit user behavior characteristics include: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
In one embodiment, the terminal further includes:
Storage unit, configured to store the user action log file, via the user terminal, in a database based on distributed file storage;
In a specific implementation, the user action log file is collected mainly by a JavaScript script on the user terminal, and the user terminal stores the user action log file in MongoDB (a database based on distributed file storage).
Distributed storage unit, configured to dump the user action log file to the Hadoop platform and to make distributed storage backups of the user action log file according to the characteristics of the HDFS (Hadoop Distributed File System) of the Hadoop platform;
In a specific implementation, the architecture of HDFS is built on a specific set of nodes, which is determined by its own characteristics. These nodes include a single master node, the NameNode, which provides the metadata service within HDFS, and multiple slave nodes, the DataNodes, which provide storage blocks for HDFS. Files stored in HDFS are divided into blocks, and these blocks are then replicated to multiple computers (DataNodes). Maintaining multiple working copies of the data ensures that processing can be redistributed when a node fails, which improves system reliability.
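The block-and-replica idea described above can be illustrated with a toy model. This is not HDFS code and does not use any Hadoop API: the round-robin placement, the node names, and the tiny block size are all simplifying assumptions (real HDFS placement also takes rack topology into account).

```python
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut a file's bytes into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, datanodes: List[str], replication: int) -> Dict[int, List[str]]:
    """Toy round-robin placement: each block is copied to `replication` distinct nodes."""
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds the number of DataNodes")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


# Hypothetical 10-byte log file, 4-byte blocks, four DataNodes, three replicas.
blocks = split_into_blocks(b"x" * 10, block_size=4)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"], replication=3)
```

In this model the loss of any single node still leaves at least two copies of every block, which is the property the paragraph relies on for redistributing work after a failure.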
Preprocessing unit, configured to preprocess the user action log file offline according to the distributed computing framework of the Hadoop platform to obtain filtered data;
In a specific implementation, the distributed computing framework of the Hadoop platform is MapReduce. On top of the MapReduce computing framework, Hive is used to analyze and preprocess the user action log file offline and to filter out clean data.
In one embodiment, the preprocessing unit is specifically configured to: use Hive on top of the MapReduce computing framework to identify and split the fields in the user action log file, remove invalid records from the user action log file, and extract characteristic information according to statistical requirements.
It should be noted that the fields to be identified are set by the technician according to actual statistical needs, and are not described further here.
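The three preprocessing operations (field identification and splitting, removal of invalid records, feature extraction) can be sketched in plain Python. The embodiment performs this work in Hive on MapReduce; the sketch below is only a stand-in for the logic, and the tab-separated layout and the four field names are a hypothetical schema, since the source leaves the fields to the implementer.

```python
from typing import List, Optional

# Hypothetical log schema; the source says the fields are chosen by the implementer.
FIELDS = ["user_id", "resource_id", "event", "dwell_seconds"]


def parse_line(line: str) -> Optional[dict]:
    """Split one log line into fields; return None for an invalid record."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(FIELDS):
        return None                      # wrong number of fields
    record = dict(zip(FIELDS, parts))
    if not record["user_id"] or not record["resource_id"]:
        return None                      # missing identifier
    try:
        record["dwell_seconds"] = float(record["dwell_seconds"])
    except ValueError:
        return None                      # non-numeric dwell time
    if record["dwell_seconds"] < 0:
        return None                      # negative dwell time
    return record


def preprocess(lines: List[str]) -> List[dict]:
    """Keep only the records that parse cleanly."""
    return [r for r in map(parse_line, lines) if r is not None]


raw = [
    "u1\tc1\tview\t12.5",
    "u1\tc2\tsearch\t3",
    "\tc3\tview\t4",        # missing user id
    "u2\tc1\tview\tabc",    # non-numeric dwell time
    "u2\tc1\tview",         # too few fields
]
clean = preprocess(raw)
```

Only the first two sample lines survive filtering; the remaining three illustrate the kinds of invalid records that would be dropped.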
In a specific implementation, analyzing the user behavior in the user action log file makes it possible to pay closer attention to the user's development, needs, and growth, to provide the user with a reasonable recommendation service, to ensure the accuracy and richness of recommendations, and in turn to stimulate the user's initiative in learning and improve user stickiness. The characteristic information includes:
The user's personal characteristics: educational background, major, occupation, age, gender, personality, interests, and future study plan;
Explicit user behavior characteristics: user rating feedback, resource downloads, exercise records, course resource searches, number of interactions with a course, duration of each interaction, and time spent online in the system;
Implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
By collecting the characteristic information of user behavior, the user's preference for resources is determined and a user-resource preference set is generated, which provides the data set for the subsequent recommendation algorithm calculations.
Computing unit, configured to extract the filtered data through Mahout, calculate on the filtered data using Mahout to obtain a calculation result, and store the calculation result in the database as the recommendation result;
Referring to Fig. 4, in a specific implementation, the recommendation algorithms used in the embodiment of the present invention are as follows:
1) The recommendation algorithm based on hybrid collaborative filtering, comprising the following steps:
a. According to the user behavior information, calculate the similarity between users using the Pearson correlation coefficient;
b. Find the set of neighbor users with high similarity to the target user, and use the neighbor users' feedback on courses to predict the target user's preference for those courses;
c. According to the target user's behavior records, calculate the similarity between courses using the Euclidean distance formula;
d. Find the set of neighbor courses with high similarity to the courses the target user has watched, and predict the target user's preference for the neighbor courses from their popularity;
e. Perform a weighted calculation over the resulting set of target learning resources (courses and neighbor courses), sort the target learning resources by preference, and recommend the learning resource with the highest preference to the user.
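The two similarity measures named in steps a and c can be sketched as follows. The embodiment computes them with Mahout; this standalone version is only meant to show the arithmetic. The input vectors are hypothetical, and the mapping from Euclidean distance to a similarity in (0, 1] via 1 / (1 + distance) is an assumption, since the source names the distance but not how it is converted to a similarity.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two users' ratings of the same courses."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant vector; treated as no correlation here
    return cov / math.sqrt(var_x * var_y)


def euclidean_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Similarity derived from Euclidean distance (assumed mapping: 1 / (1 + d))."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return 1.0 / (1.0 + dist)


# Hypothetical rating vectors over the same three courses.
same_taste = pearson([1, 2, 3], [2, 4, 6])
opposite_taste = pearson([1, 2, 3], [3, 2, 1])
```

Users with a high Pearson value would form the neighbor set of step b, and courses with a high Euclidean-derived similarity would form the neighbor-course set of step d.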
It should be noted that the recommendation algorithm based on hybrid collaborative filtering is a hybrid recommendation algorithm that fuses the user-based collaborative filtering algorithm and the item-based collaborative filtering algorithm. Steps a and b are the calculation process of user-based collaborative filtering, steps c and d are the calculation process of item-based collaborative filtering, and step e integrates the results of the two algorithms to generate the recommendation results of the hybrid collaborative filtering recommendation algorithm, so that the recommendation results better match the user's preferences.
2) The user-based recommendation algorithm based on user-information similarity, which mainly includes:
According to the target user's registration information, obtain the "personal characteristics of the user" and look for a set of similar users following the idea of the k-means clustering algorithm, clustering similar users together. Using the cosine distance as the measure, find the most similar user within the similar-user set, i.e. the user with the minimum cosine distance, and make recommendations to the target user according to that most similar user's preference for each learning resource.
It should be noted that this user-based recommendation algorithm based on user-information similarity is mainly used to solve the user cold-start problem.
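The final selection step of this algorithm, picking the user with the minimum cosine distance, can be sketched as below. The k-means clustering stage is deliberately omitted: the `candidates` dictionary is assumed to already hold the members of the target user's cluster, and the numeric encoding of personal characteristics is hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance = 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def most_similar_user(target: Sequence[float], candidates: Dict[str, Sequence[float]]) -> str:
    """Return the candidate whose feature vector has the smallest cosine distance."""
    return min(candidates, key=lambda uid: cosine_distance(target, candidates[uid]))


# Hypothetical encoded personal characteristics of users in the same cluster.
cluster = {"user_a": [1, 0, 1], "user_b": [0, 1, 0], "user_c": [1, 1, 1]}
nearest = most_similar_user([1, 0, 1], cluster)
```

The preferences of `nearest` for each learning resource would then serve as $P_u(U, d_i)$ for the new user.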
3) The content-based recommendation algorithm based on user behavior, including:
According to the user's previous historical behavior information, including the courses or other learning resources the user has viewed, recommend to the user learning resources whose content is similar to the resources already viewed, for example other courses taught by the same teacher.
However, relying on any single recommendation algorithm always has shortcomings. A few platforms combine several recommendation methods, but they seldom take user behavior into account; the combination of algorithms is rigid, cannot switch smoothly, and yields unsatisfactory recommendation results.
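The "same teacher" example can be sketched as a simple attribute match. This is a minimal illustration only: the catalog structure, the `teacher` attribute name, and the course identifiers are hypothetical, and a real content-based recommender would compare richer content features.

```python
from typing import Dict, List


def recommend_same_teacher(viewed: List[str], catalog: Dict[str, dict]) -> List[str]:
    """Recommend unseen courses taught by a teacher of any course already viewed."""
    teachers = {catalog[course]["teacher"] for course in viewed}
    return [
        course
        for course, meta in catalog.items()
        if course not in viewed and meta["teacher"] in teachers
    ]


# Hypothetical course catalog.
catalog = {
    "c1": {"teacher": "T1"},
    "c2": {"teacher": "T1"},
    "c3": {"teacher": "T2"},
}
suggestions = recommend_same_teacher(["c1"], catalog)
```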
When the user performs a search, the user has a strong intent toward certain content at that moment, i.e. an immediate and strong demand for that content. Recommendation should then be content-based, driven mainly by the content and topics of the courses the user searches for, clicks on, and watches. As the number of searches keeps increasing, the proportion of content-based recommendation can be increased accordingly, so as to give reasonable recommendations and ensure their accuracy and richness. For example, in a specific implementation, the computing unit specifically includes:
Fusion calculation unit, configured to calculate the initial preference $P_1(U, d_i)$ of user U for resource $d_i$ using formula (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm:
where:
$\alpha = |P_{Cb}(U, d_i) - P_{Hcf}(U, d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U, d_i) + P_{Hcf}(U, d_i)|$, $\beta \ge 0$,
$P_{Cb}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the larger of the preferences of user U for resource $d_i$ under the two algorithms;
$\min\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the smaller of the preferences of user U for resource $d_i$ under the two algorithms;
$\alpha$ represents the deviation between the preferences of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the smaller $\alpha$ is, the more similar the preferences of user U for resource $d_i$ under the two algorithms are, and the more accurate the recommended preference is.
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the larger $\beta$ is, the larger the total preference of user U for resource $d_i$ under the two algorithms, and the more resource $d_i$ deserves to be recommended.
$P_1(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under formula (1).
It should be noted that the smaller $\alpha$ is, the closer the preferences of user U for resource $d_i$ obtained under the two algorithms are. When $P_{Hcf}(U, d_i) = P_{Cb}(U, d_i)$, $\alpha = 0$, meaning that the preferences of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms are identical; the preference of user U for resource $d_i$ is then simply the preference under the content-based recommendation algorithm (or, equivalently, under the hybrid collaborative filtering recommendation algorithm). When $\alpha$ is larger, the preferences of user U for resource $d_i$ under the two algorithms are less similar, and the two algorithms should then be reconciled using different weight ratios. Formula (1) can therefore smoothly fuse the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm, so that the recommendation results come closer to the user's needs.
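The quantities that formula (1) is defined over can be computed directly from the two input preferences. The sketch below stops at those inputs and does not attempt formula (1) itself, because the formula is not reproduced in this text; the function name and the sample values are hypothetical.

```python
from typing import Dict


def fusion_terms(p_cb: float, p_hcf: float) -> Dict[str, float]:
    """Compute the terms defined for formula (1) from the two input preferences.

    p_cb  : preference under the content-based recommendation algorithm
    p_hcf : preference under the hybrid collaborative filtering algorithm
    """
    return {
        "alpha": abs(p_cb - p_hcf),   # deviation between the two preferences
        "beta": abs(p_cb + p_hcf),    # total preference value
        "max": max(p_cb, p_hcf),
        "min": min(p_cb, p_hcf),
    }


agree = fusion_terms(0.5, 0.5)        # both algorithms agree: alpha is 0
disagree = fusion_terms(0.75, 0.25)   # larger alpha: the algorithms diverge
```

When `alpha` is 0 the two preferences coincide and either one can be used as the initial preference, which is the special case the note above describes.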
Collaborative filtering takes the user's historical behavior data as its basis of calculation. A new user, however, has no historical behavior records, which gives rise to the cold-start problem. Most recommendation algorithms handle cold start by recommending to the user at random, recommending the newest or most popular items, or recommending from the user's registration information, and switch to personalized recommendation only once a certain amount of user data has been collected. Users are easily lost during this data-collection period. To solve the user cold-start problem, the embodiment of the present invention further includes a final calculation unit on the basis of the fusion calculation unit:
Final calculation unit, configured to calculate the final preference $P(U, d_i)$ of user U for resource $d_i$ using formula (2), and to take the resource $d_i$ for which user U has the highest final preference as the calculation result:
$$P(U, d_i) = e^{-w} \times P_u(U, d_i) + (1 - e^{-w}) \times P_1(U, d_i) \qquad \text{Formula (2)}$$
where: $w \propto t$, and $t$ denotes the number of historical behavior records of the user;
$P_u(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U, d_i)$ denotes the final preference of user U for resource $d_i$ under formula (2).
Using formula (2), the final preference $P(U, d_i)$ of user U for each resource $d_i$ can be calculated. The resources $d_i$ are sorted from high to low by final preference $P(U, d_i)$, the resource with the highest final preference is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
In another embodiment, at least one resource $d_i$ whose final preference is greater than a preset threshold is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
It should be noted that a newly registered user initially has no historical behavior records, so $w = 0$ and $P(U, d_i) = P_u(U, d_i)$; that is, recommendations for a new user are made essentially by the user-based recommendation algorithm based on user-information similarity (the user-based recommendation algorithm based on user characteristics in Fig. 4). As the number of historical behavior records $t$ grows, $w$ increases and the weight of $P_1(U, d_i)$ increases, so the calculation gradually shifts to recommendation based on the user's historical behavior records. The cold-start problem of new users is thereby solved smoothly: a new user transitions smoothly into an established user, the loss of new users is avoided, and user stickiness is improved.
Sending unit, configured to retrieve the recommendation results from the database and send them to the user terminal if a trigger signal requesting recommendation is received from the user terminal.
Embodiment 3
Referring to Fig. 3, another embodiment of the present invention provides a schematic block diagram of a terminal 300. As shown in the figure, the terminal 300 in this embodiment may include: one or more processors 301, one or more input devices 302, one or more output devices 303, and a memory 304. The processor 301, input device 302, output device 303, and memory 304 are connected by a bus 305. The memory 304 is used to store instructions, and the processor 301 is used to execute the instructions stored in the memory 304. The processor 301 is configured to execute:
Receiving the user action log file uploaded by the user terminal; dumping the user action log file to the Hadoop platform, and making distributed storage backups of the user action log file according to the HDFS characteristics of the Hadoop platform; preprocessing the user action log file offline according to the distributed computing framework of the Hadoop platform to obtain filtered data; extracting the filtered data through Mahout, calculating on the filtered data using Mahout to obtain a calculation result, and storing the calculation result in the database as the recommendation result; and, if a trigger signal requesting recommendation is received from the user terminal, retrieving the recommendation result from the database and sending it to the user terminal.
Further, the processor is also configured to execute: the extracting of the filtered data through Mahout and the calculating on the filtered data using Mahout to obtain a calculation result, comprising: calculating the initial preference $P_1(U, d_i)$ of user U for resource $d_i$ using formula (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm:
where:
$\alpha = |P_{Cb}(U, d_i) - P_{Hcf}(U, d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U, d_i) + P_{Hcf}(U, d_i)|$, $\beta \ge 0$,
$P_{Cb}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the larger of the preferences of user U for resource $d_i$ under the two algorithms;
$\min\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the smaller of the preferences of user U for resource $d_i$ under the two algorithms;
$\alpha$ represents the deviation between the preferences of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$P_1(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under formula (1).
Further, the processor is also configured to execute: calculating the final preference $P(U, d_i)$ of user U for resource $d_i$ using formula (2), and taking the resource $d_i$ for which user U has the highest final preference as the calculation result:
$$P(U, d_i) = e^{-w} \times P_u(U, d_i) + (1 - e^{-w}) \times P_1(U, d_i) \qquad \text{Formula (2)}$$
where: $w \propto t$, and $t$ denotes the number of historical behavior records of the user;
$P_u(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U, d_i)$ denotes the final preference of user U for resource $d_i$ under formula (2).
Further, the processor is also configured to execute: storing the user action log file, via the user terminal, in a database based on distributed file storage.
Further, the processor is also configured to execute: the offline preprocessing of the user action log file according to the distributed computing framework of the Hadoop platform, comprising: identifying and splitting the fields in the user action log file, removing invalid records from the user action log file, and extracting characteristic information according to statistical requirements.
Wherein the characteristic information includes: the user's personal characteristics: educational background, major, occupation, age, gender, personality, interests, and future study plan; explicit user behavior characteristics: user rating feedback, resource downloads, exercise records, course resource searches, number of interactions with a course, duration of each interaction, and time spent online in the system; implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
It should be understood that, in the embodiment of the present invention, the processor 301 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The input device 302 may include a touchpad, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint direction information), a microphone, and the like; the output device 303 may include a display (an LCD or the like), a loudspeaker, and the like.
The memory 304 may include read-only memory and random access memory, and provides instructions and data to the processor 301. A part of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store information on the device type.
In a specific implementation, the processor 301, input device 302, and output device 303 described in the embodiment of the present invention can carry out the implementation described in the embodiments of the method provided by the embodiments of the present invention, and can also carry out the implementation of the terminal 300 described in the embodiment of the present invention; details are not repeated here.
Another embodiment of the present invention provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements:
Receiving the user action log file uploaded by the user terminal; dumping the user action log file to the Hadoop platform, and making distributed storage backups of the user action log file according to the HDFS characteristics of the Hadoop platform; preprocessing the user action log file offline according to the distributed computing framework of the Hadoop platform to obtain filtered data; extracting the filtered data through Mahout, calculating on the filtered data using Mahout to obtain a calculation result, and storing the calculation result in the database as the recommendation result; and, if a trigger signal requesting recommendation is received from the user terminal, retrieving the recommendation result from the database and sending it to the user terminal.
The extracting of the filtered data through Mahout and the calculating on the filtered data using Mahout to obtain a calculation result comprises: calculating the initial preference $P_1(U, d_i)$ of user U for resource $d_i$ using formula (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm:
where:
$\alpha = |P_{Cb}(U, d_i) - P_{Hcf}(U, d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U, d_i) + P_{Hcf}(U, d_i)|$, $\beta \ge 0$,
$P_{Cb}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the larger of the preferences of user U for resource $d_i$ under the two algorithms;
$\min\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the smaller of the preferences of user U for resource $d_i$ under the two algorithms;
$\alpha$ represents the deviation between the preferences of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$P_1(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under formula (1).
Calculating the final preference $P(U, d_i)$ of user U for resource $d_i$ using formula (2), and taking the resource $d_i$ for which user U has the highest final preference as the calculation result:
$$P(U, d_i) = e^{-w} \times P_u(U, d_i) + (1 - e^{-w}) \times P_1(U, d_i) \qquad \text{Formula (2)}$$
where: $w \propto t$, and $t$ denotes the number of historical behavior records of the user;
$P_u(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U, d_i)$ denotes the final preference of user U for resource $d_i$ under formula (2).
The method further includes: storing the user action log file, via the user terminal, in a database based on distributed file storage.
The offline preprocessing of the user action log file according to the distributed computing framework of the Hadoop platform comprises: identifying and splitting the fields in the user action log file, removing invalid records from the user action log file, and extracting characteristic information according to statistical requirements.
Wherein the characteristic information includes: the user's personal characteristics: educational background, major, occupation, age, gender, personality, interests, and future study plan; explicit user behavior characteristics: user rating feedback, resource downloads, exercise records, course resource searches, number of interactions with a course, duration of each interaction, and time spent online in the system; implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
The computer-readable storage medium may be an internal storage unit of the terminal described in any of the foregoing embodiments, such as a hard disk or memory of the terminal. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the terminal. Further, the computer-readable storage medium may include both an internal storage unit of the terminal and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the terminal. The computer-readable storage medium may also be used to temporarily store data that has been output or is about to be output.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementation should not be considered beyond the scope of the present invention.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the terminal and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may also be an electrical, mechanical, or other form of connection.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and these modifications or substitutions shall be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (9)
1. A personalized recommendation method for an online education system, characterized by comprising the following steps:
receiving a user behavior log file uploaded by a user terminal;
dumping the user behavior log file to a Hadoop platform, and performing distributed storage backup of the user behavior log file according to the HDFS characteristics of the Hadoop platform;
preprocessing the user behavior log file offline according to the distributed computing framework of the Hadoop platform to obtain filtered data;
extracting the filtered data through Mahout, performing calculation on the filtered data using Mahout to obtain a calculation result, and storing the calculation result in a database as a recommendation result;
if a trigger signal indicating that the user terminal requests a recommendation is received, retrieving the recommendation result from the database and sending it to the user terminal.
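The last two steps of claim 1 (store the offline result, then serve it when a request arrives) can be sketched as follows. This is a minimal single-process illustration, not the claimed Hadoop/Mahout implementation: the in-memory `RecommendationStore`, the function names, and the count-based ranking that stands in for the Mahout calculation are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class RecommendationStore:
    """Hypothetical in-memory stand-in for the database of recommendation results."""

    def __init__(self):
        self._results = {}

    def save(self, user_id, resource_ids):
        self._results[user_id] = list(resource_ids)

    def fetch(self, user_id):
        # Unknown users simply get an empty recommendation list.
        return list(self._results.get(user_id, []))


def run_offline_job(filtered_records, store, top_n=3):
    """Offline step: compute a result per user and store it.

    filtered_records is an iterable of (user_id, resource_id) pairs.
    Ranking by interaction count is only a placeholder for the Mahout step.
    """
    per_user = defaultdict(Counter)
    for user_id, resource_id in filtered_records:
        per_user[user_id][resource_id] += 1
    for user_id, counts in per_user.items():
        store.save(user_id, [rid for rid, _ in counts.most_common(top_n)])


def handle_recommend_request(user_id, store):
    """Online step: on a recommendation request, only read the stored result."""
    return store.fetch(user_id)
```

Under these assumptions the request handler does no computation of its own; it only looks up what the offline job has already written.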
2. The personalized recommendation method for an online education system according to claim 1, characterized in that extracting the filtered data through Mahout and performing calculation on the filtered data using Mahout to obtain a calculation result comprises:
using formula (1), which fuses a content-based recommendation algorithm and a hybrid collaborative filtering recommendation algorithm, to calculate the initial preference $P_1(U, d_i)$ of user U for resource $d_i$:
where:
$\alpha = |P_{Cb}(U, d_i) - P_{Hcf}(U, d_i)|$, $\alpha \geq 0$,
$\beta = |P_{Cb}(U, d_i) + P_{Hcf}(U, d_i)|$, $\beta \geq 0$,
$P_{Cb}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U, d_i)$ denotes the preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the larger of the two preference values of user U for resource $d_i$ under the two algorithms;
$\min\{P_{Cb}(U, d_i), P_{Hcf}(U, d_i)\}$ denotes the smaller of the two preference values of user U for resource $d_i$ under the two algorithms;
$\alpha$ represents the deviation between the preferences of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$P_1(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under the algorithm of formula (1).
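The auxiliary quantities defined in claim 2 can be computed directly from the two preference values. The sketch below covers only those defined terms ($\alpha$, $\beta$, the maximum and the minimum); how formula (1) combines them is not shown in the text here, so the sketch does not attempt it. The names `PreferenceTerms` and `preference_terms` are hypothetical.

```python
from typing import NamedTuple


class PreferenceTerms(NamedTuple):
    alpha: float  # deviation between the two preference values
    beta: float   # total of the two preference values
    p_max: float  # larger of the two preference values
    p_min: float  # smaller of the two preference values


def preference_terms(p_cb: float, p_hcf: float) -> PreferenceTerms:
    """Compute the terms that claim 2 defines for one user/resource pair.

    p_cb  is the content-based preference P_Cb(U, d_i).
    p_hcf is the hybrid collaborative filtering preference P_Hcf(U, d_i).
    """
    return PreferenceTerms(
        alpha=abs(p_cb - p_hcf),
        beta=abs(p_cb + p_hcf),
        p_max=max(p_cb, p_hcf),
        p_min=min(p_cb, p_hcf),
    )
```

A small $\alpha$ relative to $\beta$ means the two algorithms largely agree on the user's preference for the resource.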
3. The personalized recommendation method for an online education system according to claim 2, characterized by further comprising:
calculating the final preference $P(U, d_i)$ of user U for resource $d_i$ using formula (2), and taking the resource $d_i$ for which user U has the highest final preference as the calculation result:
$P(U, d_i) = e^{-w} \times P_u(U, d_i) + (1 - e^{-w}) \times P_1(U, d_i)$
Formula (2)
where: $w \propto t$, and $t$ denotes the number of historical behavior records of the user;
$P_u(U, d_i)$ denotes the initial preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U, d_i)$ denotes the final preference of user U for resource $d_i$ under the algorithm of formula (2).
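Formula (2) can be sketched in a few lines. The claim only states that $w$ is proportional to $t$, so the proportionality constant `k` below is an assumption chosen for illustration, as are the function names. With no history ($t = 0$) the weight $e^{-w}$ equals 1 and the result is the user-similarity preference; as the history grows, the result approaches the initial preference from formula (1).

```python
import math


def final_preference(p_user_sim: float, p_initial: float,
                     history_count: int, k: float = 0.1) -> float:
    """Formula (2): P = e^{-w} * P_u + (1 - e^{-w}) * P_1, with w = k * t.

    k is a hypothetical proportionality constant; the source only says w is
    proportional to t, the number of historical behavior records.
    """
    if history_count < 0:
        raise ValueError("history_count must be non-negative")
    decay = math.exp(-k * history_count)
    return decay * p_user_sim + (1.0 - decay) * p_initial


def best_resource(p_user_sim: dict, p_initial: dict,
                  history_count: int, k: float = 0.1) -> str:
    """Return the resource with the highest final preference.

    Both dicts map resource id to preference and must share the same keys.
    """
    scores = {
        rid: final_preference(p_user_sim[rid], p_initial[rid], history_count, k)
        for rid in p_initial
    }
    return max(scores, key=scores.get)
```

The exponential weight gives a smooth hand-over for new users: the profile-based estimate dominates at first and fades as behavior records accumulate.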
4. The personalized recommendation method for an online education system according to claim 3, characterized in that the method further comprises:
storing, by the user terminal, the user behavior log file in a database based on distributed file storage.
5. The personalized recommendation method for an online education system according to claim 1, characterized in that preprocessing the user behavior log file offline according to the distributed computing framework of the Hadoop platform comprises:
identifying and splitting the fields in the user behavior log file, removing invalid records from the user behavior log file, and extracting feature information according to statistical requirements.
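The three preprocessing operations of claim 5 (split fields, drop invalid records, extract features) can be sketched as below. This is a single-process illustration rather than a job on the Hadoop distributed computing framework, and the log layout is an assumption: each line is taken to hold four tab-separated fields (user id, resource id, action, integer timestamp). The extracted feature, a per-user, per-resource interaction count, is likewise only an example of a statistical requirement.

```python
from collections import Counter

EXPECTED_FIELDS = 4  # hypothetical layout: user_id, resource_id, action, timestamp


def parse_record(line):
    """Split one log line into fields; return None if the record is invalid."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) != EXPECTED_FIELDS:
        return None
    user_id, resource_id, action, timestamp = fields
    if not user_id or not resource_id or not action:
        return None
    try:
        ts = int(timestamp)
    except ValueError:
        return None
    return {"user_id": user_id, "resource_id": resource_id,
            "action": action, "timestamp": ts}


def preprocess(lines):
    """Return the valid records and an interaction count per (user, resource)."""
    records = [r for r in map(parse_record, lines) if r is not None]
    counts = Counter((r["user_id"], r["resource_id"]) for r in records)
    return records, counts
```

Because each line is parsed independently, the same per-line logic could be placed in the map phase of a distributed job, with the counting done in the reduce phase.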
6. The personalized recommendation method for an online education system according to claim 5, characterized in that the feature information comprises:
personal characteristics of the user: education level, major, occupation, age, gender, personality, interests, and future study plan;
explicit user behavior features: user rating feedback, resource downloads, exercise records, course-resource searches, number of interactions with a course, duration of each interaction, and time online in the system;
implicit user behavior features: page dwell time, number of page visits, number of mouse movements, and number of scroll-bar scrolls.
7. A terminal, characterized by comprising units for executing the method according to any one of claims 1 to 6.
8. A terminal, comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being connected to one another, characterized in that the memory is configured to store application code that supports the terminal in executing the method according to any one of claims 1 to 6, and the processor is configured to execute the method according to any one of claims 1 to 6.
9. A computer-readable storage medium, the computer storage medium storing a computer program, the computer program comprising program instructions, wherein the program instructions, when executed by a processor, cause the processor to execute the method according to any one of claims 1 to 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910455421.6A CN110276018A (en) | 2019-05-29 | 2019-05-29 | Personalized recommendation method, terminal and the storage medium of on-line education system |
PCT/CN2019/104888 WO2020237898A1 (en) | 2019-05-29 | 2019-09-09 | Personalized recommendation method for online education system, terminal and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910455421.6A CN110276018A (en) | 2019-05-29 | 2019-05-29 | Personalized recommendation method, terminal and the storage medium of on-line education system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110276018A true CN110276018A (en) | 2019-09-24 |
Family
ID=67960151
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910455421.6A Pending CN110276018A (en) | 2019-05-29 | 2019-05-29 | Personalized recommendation method, terminal and the storage medium of on-line education system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110276018A (en) |
WO (1) | WO2020237898A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111292212A (en) * | 2020-03-04 | 2020-06-16 | 湖北文理学院 | Personalized thinking political affairs education system |
CN112559873A (en) * | 2020-12-21 | 2021-03-26 | 周欢 | User recommendation system based on intelligent education |
CN113065060A (en) * | 2021-02-18 | 2021-07-02 | 山东师范大学 | Deep learning-based education platform course recommendation method and system |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113177181B (en) * | 2021-06-29 | 2021-08-31 | 长沙豆芽文化科技有限公司 | Online teaching information pushing method and system based on interactive customization plan |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001006398A2 (en) * | 1999-07-16 | 2001-01-25 | Agentarts, Inc. | Methods and system for generating automated alternative content recommendations |
CN106874522A (en) * | 2017-03-29 | 2017-06-20 | 珠海习悦信息技术有限公司 | Information recommendation method, device, storage medium and processor |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103886487B (en) * | 2014-03-28 | 2016-01-27 | 焦点科技股份有限公司 | Based on personalized recommendation method and the system of distributed B2B platform |
CN104021483B (en) * | 2014-06-26 | 2017-08-25 | 陈思恩 | Passenger demand recommends method |
US10163061B2 (en) * | 2015-06-18 | 2018-12-25 | International Business Machines Corporation | Quality-directed adaptive analytic retraining |
CN107169572B (en) * | 2016-12-23 | 2018-09-18 | 福州大学 | A kind of machine learning Service Assembly method based on Mahout |
CN106982150B (en) * | 2017-03-27 | 2020-05-26 | 重庆邮电大学 | Hadoop-based mobile internet user behavior analysis method |
CN109670116A (en) * | 2018-11-30 | 2019-04-23 | 内江亿橙网络科技有限公司 | A kind of intelligent recommendation system based on big data |
Non-Patent Citations (2)
Title |
---|
刘顺文: "《中国优秀硕士学位论文全文数据库 信息科技辑》", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
李龙飞: "基于Hadoop+Mahout的智能终端云应用推荐引擎的研究与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111292212A (en) * | 2020-03-04 | 2020-06-16 | 湖北文理学院 | Personalized thinking political affairs education system |
CN112559873A (en) * | 2020-12-21 | 2021-03-26 | 周欢 | User recommendation system based on intelligent education |
CN113065060A (en) * | 2021-02-18 | 2021-07-02 | 山东师范大学 | Deep learning-based education platform course recommendation method and system |
CN113065060B (en) * | 2021-02-18 | 2022-11-29 | 山东师范大学 | Deep learning-based education platform course recommendation method and system |
Also Published As
Publication number | Publication date |
---|---|
WO2020237898A1 (en) | 2020-12-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190924 |