WO2020237898A1 - Personalized recommendation method, terminal and storage medium for an online education system - Google Patents

Personalized recommendation method, terminal and storage medium for an online education system

Info

Publication number
WO2020237898A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
preference
resources
recommendation
degree
Prior art date
Application number
PCT/CN2019/104888
Other languages
English (en)
French (fr)
Inventor
梁立新
何欢
Original Assignee
深圳技术大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳技术大学
Publication of WO2020237898A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Definitions

  • The present invention relates to the technical field of personalized recommendation, and in particular to a personalized recommendation method, terminal and storage medium for an online education system.
  • Recommendation systems have been applied in many Internet fields, including social networking, e-commerce, music, video, movies and news.
  • In those fields recommendation systems offer a variety of personalized recommendations, and their development is increasingly mature.
  • Most recommendation systems in the education field, by contrast, use content-based and association-rule-based recommendation.
  • The recommendation quality is poor, so students cannot obtain the best learning resources, and research on personalized recommendation in online education still lags behind.
  • Domestic education cloud platforms use only a small amount of cloud computing technology, and the clouds are relatively small in scale.
  • The big-data processing capabilities of the cloud platform are also rarely used.
  • Teaching resources are simply stored in the cloud.
  • The platform achieves only centralized management of information; the utilization rate of that information is relatively low, and there are few personalized education applications built on the cloud platform.
  • The technical problem to be solved by the present invention is how to provide a personalized online education recommendation system suited to domestic learners, one that meets learners' needs, fits their preferences more closely, and gives them a better experience of the "Internet + education" learning mode.
  • The present invention proposes the following technical solutions:
  • In a first aspect, an embodiment of the present invention proposes a personalized recommendation method for an online education system, including the following steps:
  • Extracting the filtered data through Mahout, calculating the filtered data using Mahout to obtain a calculation result, and storing the calculation result in a database as a recommendation result;
  • Retrieving the recommendation result from the database and sending it to the user terminal.
  • A further technical solution is that extracting the filtered data through Mahout and calculating the filtered data using Mahout to obtain a calculation result includes:
  • $P_{Cb}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
  • $P_{Hcf}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
  • represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
  • $P_1(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ given by equation (1).
  • Equation (2) is used to calculate the final degree of preference $P(U, d_i)$ of user U for resource $d_i$, and the resource $d_i$ with the highest final degree of preference for user U is taken as the calculation result:
  • w ⁇ t, where t represents the number of the user's historical behavior records;
  • $P_u(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user information similarity;
  • $P(U, d_i)$ represents the final degree of preference of user U for resource $d_i$ given by equation (2).
  • the user behavior log file is stored by the user terminal in a database based on distributed file storage.
  • Explicit user behavior characteristics: user rating feedback, resource downloads, question records, course resource searches, number of interactions with courses, duration of each interaction, and system online duration;
  • In a second aspect, an embodiment of the present invention provides a terminal including a unit for executing the method described in the first aspect.
  • In a further aspect, an embodiment of the present invention provides a terminal.
  • The terminal includes a processor, an input device, an output device, and a memory.
  • The processor, input device, output device and memory are connected to each other; the memory stores the application code that supports the terminal in executing the method according to the first aspect, and the processor is configured to execute the method according to the first aspect.
  • In a further aspect, an embodiment of the present invention provides a computer-readable storage medium that stores a computer program, and the computer program includes program instructions that, when executed by a processor, cause the processor to perform the method described in the first aspect.
  • FIG. 1 is a flowchart of a personalized recommendation method for an online education system provided by an embodiment of the present invention;
  • FIG. 2 is a processing flowchart of the Hadoop platform in the personalized recommendation method of the online education system provided by an embodiment of the present invention;
  • FIG. 3 is a schematic block diagram of a terminal 300 according to another embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of a recommendation algorithm provided by another embodiment of the present invention.
  • an embodiment of the present invention provides a personalized recommendation method for an online education system, including the following steps:
  • the user terminal collects user behavior information in real time, generates a user behavior log file and sends it to the system, and the system receives the user behavior log file uploaded by the user terminal.
  • The user's behavior information includes the user's personal characteristics, explicit user behavior characteristics, and implicit user behavior characteristics, where:
  • The user's personal characteristics include: education, major, occupation, age, gender, personality, interests, and future learning plans;
  • Explicit user behavior characteristics include: user rating feedback, resource downloads, question records, course resource searches, number of interactions with courses, duration of each interaction, and system online duration;
  • Implicit user behavior characteristics include: page dwell time, page visits, mouse movement count, and scroll bar scroll count.
  • the method further includes:
  • S1011: The user terminal stores the user behavior log file in a database based on distributed file storage.
  • The user behavior log file is collected mainly on the user side by JavaScript scripts, and the user side saves the user behavior log file in MongoDB (a database based on distributed file storage).
  • The architecture of HDFS is built on a specific set of nodes, as determined by its own characteristics. These nodes include one master node, the NameNode, and multiple slave nodes, the DataNodes. The NameNode provides metadata services inside HDFS; the DataNodes provide storage blocks for HDFS. Files stored in HDFS are divided into blocks, and these blocks are then copied to multiple computers (DataNodes), thereby maintaining multiple copies of the working data, ensuring that processing can be redistributed away from failed nodes, and improving system reliability.
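  • The block-and-replica idea described above can be illustrated with a toy sketch. It is not HDFS code: the function names, the round-robin placement and the tiny block size are assumptions made for illustration, and real HDFS placement also takes rack topology into account.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, datanodes: list[str],
                   replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct DataNodes, round-robin.

    Round-robin placement is an illustrative assumption, not the HDFS policy.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


blocks = split_into_blocks(b"x" * 10, block_size=4)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3"], replication=2)

# If one DataNode fails, every block still has a copy on another node.
survives_dn1_failure = all(
    any(node != "dn1" for node in nodes) for nodes in placement.values()
)
```

  • With a replication factor of 2 and three nodes, losing any single node leaves at least one copy of each block, which is the reliability property the paragraph describes.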
  • The distributed computing framework of the Hadoop platform is MapReduce. On top of the MapReduce computing framework, Hive is used to analyze the user behavior log files offline, preprocess them, and filter out clean data.
  • The specific operations of step S103 are: on top of the MapReduce computing framework, use Hive to identify and segment the fields in the user behavior log file, remove illegal records from the user behavior log file, and extract feature information according to the statistical requirements.
  • The characteristic information includes:
  • Explicit user behavior characteristics: user rating feedback, resource downloads, question records, course resource searches, number of interactions with courses, duration of each interaction, and system online duration;
  • Implicit user behavior characteristics: page dwell time, page visits, mouse movement count, and scroll bar scroll count.
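  • The preprocessing step (segment fields, drop illegal records, extract features) can be sketched in plain Python. The source performs this with Hive on MapReduce; the tab-separated layout, the field names and the validity rules below are hypothetical and chosen only to make the sketch runnable.

```python
from collections import defaultdict

EXPECTED_FIELDS = 4  # hypothetical layout: user_id, resource_id, event, value


def parse_line(line: str):
    """Split one log line into fields; return None for an illegal record."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != EXPECTED_FIELDS or not all(parts[:3]):
        return None
    user_id, resource_id, event, raw_value = parts
    try:
        value = float(raw_value)
    except ValueError:
        return None
    if value < 0:  # assumed rule: negative measurements are invalid
        return None
    return user_id, resource_id, event, value


def extract_features(lines):
    """Sum each event type per (user, resource) pair, skipping illegal records."""
    features = defaultdict(lambda: defaultdict(float))
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        user_id, resource_id, event, value = record
        features[(user_id, resource_id)][event] += value
    return features


log = [
    "u1\tr1\tpage_dwell_seconds\t30",
    "u1\tr1\tpage_dwell_seconds\t12.5",
    "u1\tr1\tscroll_count\t4",
    "broken line",
    "u2\tr1\trating\t-1",
]
features = extract_features(log)
```

  • The aggregated per-user, per-resource totals play the role of the feature information from which a user resource preference set could then be derived.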
  • the user's preference for resources is judged, and the user resource preference set is generated, which provides a data set for the calculation of the following recommendation algorithm.
  • the recommendation algorithm based on hybrid collaborative filtering includes the following steps:
  • The recommendation algorithm based on hybrid collaborative filtering is a hybrid recommendation algorithm that combines a user-based collaborative filtering algorithm with an item-based collaborative filtering algorithm.
  • Steps a and b are the calculation process of the user-based collaborative filtering algorithm;
  • Steps c and d are the calculation process of the item-based collaborative filtering algorithm;
  • Step e integrates the results of the two algorithms to generate the recommendation result of the recommendation algorithm based on hybrid collaborative filtering, making the recommendation result more consistent with the user's preference.
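  • A minimal sketch of the combination described above: one user-based prediction, one item-based prediction, and a blend of the two. The cosine similarity measure, the similarity-weighted mean and the equal blending weight are assumptions for illustration; the source implements the algorithms with Mahout, and this sketch does not reproduce its steps a to e.

```python
import math

# Hypothetical preference data: user -> {resource: preference value}
ratings = {
    "u1": {"r1": 5.0, "r2": 3.0},
    "u2": {"r1": 4.0, "r2": 2.0, "r3": 5.0},
    "u3": {"r1": 1.0, "r3": 2.0},
}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted_mean(pairs) -> float:
    """Mean of values weighted by similarity; 0.0 when no evidence exists."""
    total = sum(sim for sim, _ in pairs)
    return sum(sim * value for sim, value in pairs) / total if total else 0.0


def user_based(user: str, item: str) -> float:
    """Predict from other users' preferences for `item`, weighted by user similarity."""
    return weighted_mean([
        (cosine(ratings[user], other), other[item])
        for name, other in ratings.items()
        if name != user and item in other
    ])


def item_vector(item: str) -> dict:
    """Column of the preference matrix: user -> preference for `item`."""
    return {name: prefs[item] for name, prefs in ratings.items() if item in prefs}


def item_based(user: str, item: str) -> float:
    """Predict from the user's own preferences, weighted by item similarity."""
    target = item_vector(item)
    return weighted_mean([
        (cosine(target, item_vector(other_item)), value)
        for other_item, value in ratings[user].items()
        if other_item != item
    ])


def hybrid(user: str, item: str, weight: float = 0.5) -> float:
    """Blend both predictions; the 0.5 default weight is an assumption."""
    return weight * user_based(user, item) + (1 - weight) * item_based(user, item)


score = hybrid("u1", "r3")
```

  • Here `score` estimates how much user `u1` would prefer the unseen resource `r3`, using evidence from both similar users and similar resources.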
  • From the registration information of the target user, obtain the user's personal characteristics; use the idea of the k-means clustering algorithm to find sets of similar users and cluster similar users together; use the cosine distance to find, among the similar users, the most similar user, that is, the one with the smallest cosine distance; and make recommendations to the target user according to that most similar user's preference for each learning resource.
  • this user-based recommendation algorithm based on user information similarity is mainly used to solve the user's cold start problem.
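  • The cold-start idea can be sketched as a nearest-neighbour lookup by cosine distance. The clustering step is omitted here: the sketch assumes the candidate users already belong to the new user's cluster. The numeric encoding of personal characteristics and all names are hypothetical.

```python
import math


def cosine_distance(a, b) -> float:
    """1 minus cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


# Hypothetical numeric encoding of (education, major, age band, interest)
cluster = {
    "u1": [3, 1, 2, 5],
    "u2": [1, 4, 4, 1],
}
# Hypothetical known preferences of the existing users
preferences = {
    "u1": {"r1": 0.9, "r2": 0.4},
    "u2": {"r3": 0.8},
}


def cold_start_recommend(new_user_vector, top_n: int = 1):
    """Find the closest existing user and return that user's top resources."""
    nearest = min(cluster, key=lambda u: cosine_distance(new_user_vector, cluster[u]))
    ranked = sorted(preferences[nearest].items(), key=lambda kv: kv[1], reverse=True)
    return nearest, [resource for resource, _ in ranked[:top_n]]


nearest, recommended = cold_start_recommend([3, 1, 2, 4])
```

  • Because the lookup needs only registration data, it can produce a recommendation before the new user has any behavior history.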
  • In step S104, when the user performs a search, it can be inferred that the user has a clear purpose and an immediate, strong demand for certain content. At this point, content-based recommendation should be made according to the content and theme of the courses the user searched for and clicked.
  • As the number of search behaviors continues to increase, the proportion of content-based recommendations can be increased appropriately, so as to make reasonable recommendations and ensure their accuracy and richness.
  • the specific operations of step S104 include:
  • Step S1041: using equation (1), which fuses the content-based recommendation algorithm and the recommendation algorithm based on hybrid collaborative filtering, calculate the initial degree of preference $P_1(U, d_i)$ of user U for resource $d_i$:
  • $P_{Cb}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
  • $P_{Hcf}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
  • represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the smaller the value of ⁇, the more similar the preferences of user U for resource $d_i$ under the two algorithms, and the more accurate the recommended preference.
  • represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the larger the value of ⁇, the greater the total preference value of user U for resource $d_i$ under the two algorithms, which means that resource $d_i$ is more worth recommending.
  • $P_1(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ given by equation (1).
  • When the preferences of user U for resource $d_i$ under the two algorithms are less similar, the two algorithms should be reconciled using different weight ratios. Therefore, according to equation (1), the content-based recommendation algorithm and the recommendation algorithm based on hybrid collaborative filtering can be integrated smoothly, so that the recommendation result is closer to the user's needs.
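  • Equation (1) itself is not reproduced in this text, so the following is only a generic sketch of the two quantities the description names (a deviation and a total preference value) together with a simple weighted blend. Reading the deviation as an absolute difference and the total as a sum is an assumption, and `blend` is a stand-in convex combination, not the patent's formula.

```python
def deviation_and_total(p_cb: float, p_hcf: float) -> tuple[float, float]:
    """Assumed readings of the two indicators: gap between, and sum of, both scores."""
    return abs(p_cb - p_hcf), p_cb + p_hcf


def blend(p_cb: float, p_hcf: float, content_weight: float) -> float:
    """Generic convex combination of the two scores; NOT the patent's equation (1)."""
    if not 0.0 <= content_weight <= 1.0:
        raise ValueError("content_weight must lie in [0, 1]")
    return content_weight * p_cb + (1.0 - content_weight) * p_hcf


# Hypothetical scores from the content-based and hybrid collaborative filtering algorithms
deviation, total = deviation_and_total(0.75, 0.25)
initial = blend(0.75, 0.25, content_weight=0.5)
```

  • A small deviation indicates that the two algorithms agree on the resource, and a large total indicates that both rate it highly; how the source turns these into weights is defined by its equation (1).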
  • The embodiment of the present invention further includes step S1042:
  • w ⁇ t, where t represents the number of the user's historical behavior records;
  • $P_u(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user information similarity;
  • $P(U, d_i)$ represents the final degree of preference of user U for resource $d_i$ given by equation (2).
  • Using equation (2), the final degree of preference $P(U, d_i)$ of user U for resource $d_i$ can be calculated; the resources $d_i$ are sorted by final degree of preference $P(U, d_i)$ from high to low, the resource $d_i$ with the highest final degree of preference is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
  • At least one resource $d_i$ whose final degree of preference exceeds a preset threshold may also be taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
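  • The two selection rules just described (take the single highest-ranked resource, or take every resource above a preset threshold) can be sketched as follows. The preference values and the threshold of 0.5 are made-up inputs; the final degrees of preference would come from equation (2).

```python
def top_resource(final_pref: dict[str, float]) -> str:
    """Return the resource with the highest final degree of preference."""
    return max(final_pref, key=final_pref.get)


def above_threshold(final_pref: dict[str, float], threshold: float) -> list[str]:
    """Return resources whose final preference exceeds the threshold, best first."""
    ranked = sorted(final_pref.items(), key=lambda kv: kv[1], reverse=True)
    return [resource for resource, pref in ranked if pref > threshold]


# Hypothetical final degrees of preference for one user
final_pref = {"d1": 0.42, "d2": 0.91, "d3": 0.67}

best = top_resource(final_pref)
shortlist = above_threshold(final_pref, threshold=0.5)
```

  • Either `best` or `shortlist` would then be written to the database as the recommendation result for that user.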
  • The user-based recommendation algorithm here is the user-based recommendation algorithm according to user characteristics shown in FIG. 4.
  • a trigger signal is generated when a user logs in to the user terminal of the online education website, and the recommendation system receives the trigger signal for the user terminal requesting recommendation, and then retrieves the recommendation result from the database and sends it to the user terminal.
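  • The offline-compute, online-lookup pattern described above can be sketched with an in-memory SQLite database standing in for the recommendation store. The table layout and the function names are assumptions; the source does not specify which database holds the recommendation results.

```python
import json
import sqlite3

# In-memory database as a stand-in for the recommendation result store
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recommendations (user_id TEXT PRIMARY KEY, resources TEXT)"
)


def store_recommendation(user_id: str, resources: list[str]) -> None:
    """Offline side: persist the precomputed recommendation result."""
    conn.execute(
        "INSERT OR REPLACE INTO recommendations VALUES (?, ?)",
        (user_id, json.dumps(resources)),
    )


def on_login(user_id: str) -> list[str]:
    """Online side: the login trigger only reads the stored result."""
    row = conn.execute(
        "SELECT resources FROM recommendations WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


store_recommendation("u1", ["d2", "d3"])
served = on_login("u1")
```

  • Because the expensive calculation happens offline, the login path is a single keyed lookup, which is what keeps the response time short.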
  • The embodiment of the present invention performs offline analysis and processing of user behavior data by building a Hadoop data processing platform and using the open source algorithm library Apache Mahout for data mining.
  • The entire system is built on the MapReduce computing model and makes full use of the cloud platform's powerful data processing capability. User recommendation results are calculated offline, and parallel, distributed processing improves the efficiency and scalability of the system, solving the problems of the traditional stand-alone recommendation model: insufficient computing power and overly long real-time recommendation times.
  • the basic performance of the recommendation system includes: the response time of customer requests is within 2 seconds; supports simultaneous online access by millions of users; server CPU average load rate ⁇ 50%;
  • The system has 7 × 24 × 365 high availability, with a reliability of more than 99.9999%, ensuring accurate data access services and no data loss;
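  • As a worked illustration of what the 99.9999% figure implies, assume it is read as availability measured over one 365-day year (this reading is an assumption; the source does not define the measurement window). The permitted downtime is then:

```latex
\[
(1 - 0.999999) \times 365 \times 24 \times 3600\ \text{s} = 31.536\ \text{s per year}
\]
```

  • Under that reading, the system could be unavailable for only about half a minute in a full year.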
  • the online education system focuses on recommending personalized learning programs and appropriate learning resources for users in need.
  • the design of user behavior analysis and personalized recommendation based on Hadoop and Mahout allows users to achieve learning requirements and improve themselves through big data analysis. At the same time, it produces huge social benefits and promotes the rapid development of the online education industry.
  • the embodiment of the present invention provides a terminal.
  • the terminal in this embodiment may include: a unit for executing the method described in Embodiment 1.
  • the receiving unit is used to receive the user behavior log file uploaded by the user terminal;
  • the user terminal collects user behavior information in real time, generates a user behavior log file and sends it to the system, and the system receives the user behavior log file uploaded by the user terminal.
  • The user's behavior information includes the user's personal characteristics, explicit user behavior characteristics, and implicit user behavior characteristics, where:
  • The user's personal characteristics include: education, major, occupation, age, gender, personality, interests, and future learning plans;
  • Explicit user behavior characteristics include: user rating feedback, resource downloads, question records, course resource searches, number of interactions with courses, duration of each interaction, and system online duration;
  • Implicit user behavior characteristics include: page dwell time, page visits, mouse movement count, and scroll bar scroll count.
  • it further includes:
  • The storage unit is used to store the user behavior log file from the user side in a database based on distributed file storage;
  • The user behavior log file is collected mainly on the user side by JavaScript scripts, and the user side saves the user behavior log file in MongoDB (a database based on distributed file storage).
  • The distributed storage unit is used to dump the user behavior log file to the Hadoop platform and to store and back up the user behavior log file in a distributed manner using the HDFS (Hadoop Distributed File System) feature of the Hadoop platform;
  • The architecture of HDFS is built on a specific set of nodes, as determined by its own characteristics. These nodes include one master node, the NameNode, and multiple slave nodes, the DataNodes. The NameNode provides metadata services inside HDFS; the DataNodes provide storage blocks for HDFS. Files stored in HDFS are divided into blocks, and these blocks are then copied to multiple computers (DataNodes), thereby maintaining multiple copies of the working data, ensuring that processing can be redistributed away from failed nodes, and improving system reliability.
  • a preprocessing unit configured to perform offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform to obtain filtered data
  • the distributed computing framework of the Hadoop platform is MapReduce.
  • On top of the MapReduce computing framework, Hive is used to analyze the user behavior log file offline, preprocess it, and filter out clean data.
  • The preprocessing unit is specifically configured to: use Hive, on top of the MapReduce computing framework, to identify and segment the fields in the user behavior log file, remove illegal records from the user behavior log file, and extract characteristic information according to the statistical requirements.
  • the characteristic information includes:
  • Explicit user behavior characteristics: user rating feedback, resource downloads, question records, course resource searches, number of interactions with courses, duration of each interaction, and system online duration;
  • a calculation unit configured to extract filtered data through Mahout, calculate the filtered data by using the Mahout to obtain a calculation result, and store the calculation result in a database as a recommendation result;
  • the recommendation algorithm based on hybrid collaborative filtering includes the following steps:
  • The recommendation algorithm based on hybrid collaborative filtering is a hybrid recommendation algorithm that combines a user-based collaborative filtering algorithm with an item-based collaborative filtering algorithm.
  • Steps a and b are the calculation process of the user-based collaborative filtering algorithm;
  • Steps c and d are the calculation process of the item-based collaborative filtering algorithm;
  • Step e integrates the results of the two algorithms to generate the recommendation result of the recommendation algorithm based on hybrid collaborative filtering, making the recommendation result more consistent with the user's preference.
  • The user-based recommendation algorithm based on user information similarity includes:
  • From the registration information of the target user, obtain the user's personal characteristics; use the idea of the k-means clustering algorithm to find sets of similar users and cluster similar users together; use the cosine distance to find, among the similar users, the most similar user, that is, the one with the smallest cosine distance; and make recommendations to the target user according to that most similar user's preference for each learning resource.
  • this user-based recommendation algorithm based on user information similarity is mainly used to solve the user's cold start problem.
  • the calculation unit specifically includes:
  • A fusion calculation unit, configured to use equation (1), which fuses the content-based recommendation algorithm and the recommendation algorithm based on hybrid collaborative filtering, to calculate the initial degree of preference $P_1(U, d_i)$ of user U for resource $d_i$:
  • $P_{Cb}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
  • $P_{Hcf}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
  • represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the smaller the value of ⁇, the more similar the preferences of user U for resource $d_i$ under the two algorithms, and the more accurate the recommended preference.
  • represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the larger the value of ⁇, the greater the total preference value of user U for resource $d_i$ under the two algorithms, which means that resource $d_i$ is more worth recommending.
  • $P_1(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ given by equation (1).
  • When the preferences of user U for resource $d_i$ under the two algorithms are less similar, the two algorithms should be reconciled using different weight ratios. Therefore, according to equation (1), the content-based recommendation algorithm and the recommendation algorithm based on hybrid collaborative filtering can be integrated smoothly, so that the recommendation result is closer to the user's needs.
  • The collaborative filtering algorithm relies on the user's historical behavior data.
  • A new user has no historical behavior record, which creates a cold-start problem.
  • Most recommendation algorithms handle the cold-start problem by recommending randomly, by recommending the latest and most popular items, or by using the user's registration information.
  • Only once user data has been collected does the system switch to personalized recommendation, and during this period it is easy to lose users.
  • A final calculation unit is further included:
  • The final calculation unit is configured to use equation (2) to calculate the final degree of preference $P(U, d_i)$ of user U for resource $d_i$, and to take the resource $d_i$ with the highest final degree of preference for user U as the calculation result:
  • w ⁇ t, where t represents the number of the user's historical behavior records;
  • $P_u(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user information similarity;
  • $P(U, d_i)$ represents the final degree of preference of user U for resource $d_i$ given by equation (2).
  • Using equation (2), the final degree of preference $P(U, d_i)$ of user U for resource $d_i$ can be calculated; the resources $d_i$ are sorted by final degree of preference $P(U, d_i)$ from high to low, the resource $d_i$ with the highest final degree of preference is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
  • At least one resource $d_i$ whose final degree of preference exceeds a preset threshold may also be taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
  • The user-based recommendation algorithm here is the user-based recommendation algorithm according to user characteristics shown in FIG. 4.
  • the sending unit is configured to retrieve the recommendation result from the database and send it to the user terminal if the trigger signal of the user terminal requesting recommendation is received.
  • the terminal 300 in this embodiment may include: one or more processors 301; one or more input devices 302, one or more output devices 303, and a memory 304.
  • the aforementioned processor 301, input device 302, output device 303, and memory 304 are connected via a bus 305.
  • The memory 304 is used to store instructions.
  • The processor 301 is used to execute the instructions stored in the memory 304. Specifically, the processor 301 is used to execute:
  • Performing offline preprocessing on the user behavior log file according to the distributed computing framework to obtain filtered data; extracting the filtered data through Mahout and calculating it using Mahout to obtain the calculation result;
  • Storing the calculation result in the database as the recommendation result; and, if the trigger signal of the user terminal requesting a recommendation is received, retrieving the recommendation result from the database and sending it to the user terminal.
  • filtering algorithm calculates the initial degree of preference $P_1(U, d_i)$ of user U for resource $d_i$:
  • $P_{Cb}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
  • $P_{Hcf}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
  • represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
  • $P_1(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ given by equation (1).
  • w ⁇ t, where t represents the number of the user's historical behavior records;
  • $P_u(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user information similarity;
  • $P(U, d_i)$ represents the final degree of preference of user U for resource $d_i$ given by equation (2).
  • the user behavior log file is stored by the user terminal into a database based on distributed file storage.
  • The offline preprocessing of the user behavior log file according to the distributed computing framework of the Hadoop platform includes: identifying and segmenting the fields in the user behavior log file, removing illegal records from the user behavior log file, and extracting feature information according to the statistical requirements.
  • The characteristic information includes: the user's personal characteristics: education, major, occupation, age, gender, personality, interests, future learning plans; explicit user behavior characteristics: user rating feedback, resource downloads, question records, course resource searches, number of interactions with courses, duration of each interaction, system online duration; implicit user behavior characteristics: page dwell time, page visits, mouse movement count, scroll bar scroll count.
  • The processor 301 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the input device 302 may include a touch panel, a fingerprint sensor (used to collect user fingerprint information and fingerprint orientation information), a microphone, etc.
  • the output device 303 may include a display (LCD, etc.), a speaker, etc.
  • the memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A part of the memory 304 may also include a non-volatile random access memory. For example, the memory 304 may also store device type information.
  • The processor 301, input device 302 and output device 303 described in the embodiment of the present invention can execute the implementations described in the embodiments of a parameter adjustment method provided by the embodiment of the present invention, and can also execute the implementation of the terminal 300 described in the embodiment of the present invention, which will not be repeated here.
  • a computer-readable storage medium stores a computer program, and the computer program is executed by a processor to realize:
  • Performing offline preprocessing on the user behavior log file according to the distributed computing framework to obtain filtered data; extracting the filtered data through Mahout and calculating it using Mahout to obtain the calculation result;
  • Storing the calculation result in the database as the recommendation result; and, if the trigger signal of the user terminal requesting a recommendation is received, retrieving the recommendation result from the database and sending it to the user terminal.
  • $P_{Cb}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
  • $P_{Hcf}(U, d_i)$ represents the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
  • represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
  • $P_1(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ given by equation (1).
  • Equation (2) is used to calculate the final degree of preference $P(U, d_i)$ of user U for resource $d_i$, and the resource $d_i$ with the highest final degree of preference for user U is taken as the calculation result:
  • w ⁇ t, where t represents the number of the user's historical behavior records;
  • $P_u(U, d_i)$ represents the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user information similarity;
  • $P(U, d_i)$ represents the final degree of preference of user U for resource $d_i$ given by equation (2).
  • the method further includes: the user behavior log file is stored by the user terminal in a database based on distributed file storage.
  • the offline preprocessing of the user behavior log file according to the distributed computing framework of the Hadoop platform includes: identifying and splitting the fields in the user behavior log file, removing invalid records from the user behavior log file, and extracting characteristic information according to statistical requirements.
  • the characteristic information includes: the user's personal characteristics: education level, major, occupation, age, gender, personality, interests, future learning plans; explicit user behavior characteristics: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system; implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements, number of scroll-bar scrolls.
  • the computer-readable storage medium may be the internal storage unit of the terminal described in any of the foregoing embodiments, such as the hard disk or memory of the terminal.
  • the computer-readable storage medium may also be an external storage device of the terminal, for example, a plug-in hard disk equipped on the terminal, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, etc.
  • the computer-readable storage medium may also include both an internal storage unit of the terminal and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the terminal.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
  • the disclosed terminal and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present invention.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention.
  • the aforementioned storage media include: USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

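The present invention discloses a personalized recommendation method, terminal and storage medium for an online education system, and relates to the technical field of intelligent recommendation algorithms. User behavior logs are extracted and stored on Hadoop, the user behavior data is analyzed and computed with Mahout, and the data is processed with Hadoop's HDFS and MapReduce to produce recommendation results, thereby achieving user-based personalized recommendation.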

Description

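Personalized recommendation method, terminal and storage medium for an online education system
Technical Field
The present invention relates to the technical field of personalized recommendation, and in particular to a personalized recommendation method, terminal and storage medium for an online education system.
Background
Since the "Internet Plus" concept was put forward in 2015, "Internet Plus education" has become a new service model in the education industry. Online education, one of its products, has also brought great changes to educational relationships and the education system. Although online education has broken away from the traditional model of fixed classroom teaching and endless drill exercises, and the variety of online education platforms keeps growing, some problems persist. Most online education platforms are merely a means for educational institutions to pursue their own interests, and their approach is rigid: users watch the courses they like and pay where payment is required, and the platforms rarely communicate effectively with users or offer them a personalized learning recommendation plan. Meanwhile, the explosive growth of educational resources in number and scale means ordinary learners may struggle to choose among learning resources, while the results obtained through traditional search engines are usually voluminous, jumbled and inaccurate, and fail to satisfy students.
Recommender systems are now used in many Internet domains, including social networking, e-commerce, music, video, film and news. In those domains personalized recommendation takes many forms and is increasingly mature. In education, however, most recommender systems rely on content-based and association-rule-based recommendation, whose quality is poor, so students cannot obtain the best learning resources; research on personalized recommendation in online education still lags behind.
At present, domestic education cloud platforms use only a small amount of cloud computing technology and the clouds are small in scale. They make little use of the big-data capabilities a cloud platform can offer; in many cases teaching resources are simply stored on the cloud platform for centralized information management, information utilization is low, and personalized education applications built on cloud platforms are rarer still.
Online education platforms abroad started earlier and are mature, with many high-quality courses, which gives them certain advantages. But the educational situation in China differs from that abroad: users abroad take more initiative and have a clearer idea of their own interests and talents. Many domestic users do not know what they like, or find it hard to describe it precisely in words, so they need the system to analyze their behavior accurately in order to stimulate their initiative in learning.
Therefore, there is an urgent need for a personalized online education recommendation system suited to domestic learners that meets their needs and lets them better experience the "Internet Plus education" learning model.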
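Technical Problem
The technical problem to be solved by the present invention is how to provide a personalized online education recommendation system that suits the situation of domestic learners, meets their needs, fits their preferences more closely, and lets them better experience the "Internet Plus education" learning model.
Technical Solution
To solve the above problem, the present invention proposes the following technical solutions: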
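In a first aspect, an embodiment of the present invention proposes a personalized recommendation method for an online education system, comprising the following steps:
receiving a user behavior log file uploaded by a client;
transferring the user behavior log file to a Hadoop platform, and performing distributed storage and backup of the user behavior log file according to the HDFS features of the Hadoop platform;
performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform to obtain filtered data;
extracting the filtered data through Mahout, computing on the filtered data with Mahout to obtain a calculation result, and storing the calculation result in a database as a recommendation result;
if a trigger signal indicating that the client requests a recommendation is received, retrieving the recommendation result from the database and sending it to the client.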
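In a further technical solution, extracting the filtered data through Mahout and computing on the filtered data with Mahout to obtain a calculation result comprises:
using Equation (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm, to calculate the initial degree of preference $P_1(U,d_i)$ of user U for resource $d_i$:
Figure PCTCN2019104888-appb-000001
Equation (1)
where:
$\alpha = |P_{Cb}(U,d_i) - P_{Hcf}(U,d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U,d_i) + P_{Hcf}(U,d_i)|$, $\beta \ge 0$,
$P_{Cb}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the larger of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\min\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the smaller of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\alpha$ represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$P_1(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the algorithm of Equation (1).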
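In a further technical solution, the method further comprises:
using Equation (2) to calculate the final degree of preference $P(U,d_i)$ of user U for resource $d_i$, and taking the resource $d_i$ with the highest final degree of preference as the calculation result:
$P(U,d_i) = e^{-w} \times P_u(U,d_i) + (1 - e^{-w}) \times P_1(U,d_i)$
Equation (2)
where: $w \propto t$, and t denotes the number of the user's historical behavior records;
$P_u(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U,d_i)$ denotes the final degree of preference of user U for resource $d_i$ under the algorithm of Equation (2).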
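In a further technical solution, the method further comprises:
the user behavior log file is stored by the client in a database based on distributed file storage.
In a further technical solution, performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform comprises:
identifying and splitting the fields in the user behavior log file, removing invalid records from the user behavior log file, and extracting characteristic information according to statistical requirements.
In a further technical solution, the characteristic information comprises:
personal characteristics of the user: education level, major, occupation, age, gender, personality, interests and future learning plans;
explicit user behavior characteristics: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system;
implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.
In a second aspect, an embodiment of the present invention provides a terminal comprising units for executing the method according to the first aspect.
In a third aspect, an embodiment of the present invention provides a terminal comprising a processor, an input device, an output device and a memory that are connected to one another, wherein the memory is used to store application program code that supports the terminal in executing the method according to the first aspect, and the processor is configured to execute the method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program, the computer program comprising program instructions which, when executed by a processor, cause the processor to execute the method according to the first aspect.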
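Beneficial Effects
Compared with the prior art, the technical effects achievable by the present invention include the following:
For online education, user behavior logs are extracted and stored on Hadoop; the user behavior data is analyzed and computed with Mahout and processed with Hadoop's HDFS and MapReduce to produce recommendation results, thereby achieving user-based personalized recommendation.
By building a Hadoop data processing platform and using Apache Mahout, an open-source data mining algorithm library, user behavior data is analyzed and processed offline. The whole system is built on the MapReduce computing model and makes full use of the cloud platform's data processing capacity to compute users' recommendation results offline. Parallel and distributed processing improve the system's efficiency and scalability, solving the problems of insufficient computing power and overly long real-time recommendation times in traditional single-machine recommendation models.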
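Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the personalized recommendation method for an online education system provided by an embodiment of the present invention;
Fig. 2 is a flowchart of the processing performed by the Hadoop platform in the personalized recommendation method for an online education system provided by an embodiment of the present invention;
Fig. 3 is a schematic block diagram of a terminal 300 provided by another embodiment of the present invention;
Fig. 4 is a schematic diagram of the structure of the recommendation algorithm provided by another embodiment of the present invention.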
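Embodiments of the Invention
The technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments of the present invention; like reference numerals in the drawings denote like components. Obviously, the embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be understood that, when used in this specification and the appended claims, the terms "comprise" and "include" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.
It should also be understood that the terms used in this specification of the embodiments of the present invention are for the purpose of describing particular embodiments only and are not intended to limit the embodiments of the present invention. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.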
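Embodiment 1
Referring to Figs. 1-2, in a first aspect, an embodiment of the present invention provides a personalized recommendation method for an online education system, comprising the following steps:
S101, receiving a user behavior log file uploaded by a client;
In a specific implementation, the client collects the user's behavior information in real time, generates a user behavior log file and sends it to the system, and the system receives the user behavior log file uploaded by the client.
In a specific implementation, the user's behavior information includes the user's personal characteristics, explicit user behavior characteristics and implicit user behavior characteristics, where:
the personal characteristics of the user include: education level, major, occupation, age, gender, personality, interests and future learning plans;
the explicit user behavior characteristics include: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system;
the implicit user behavior characteristics include: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.
In one embodiment, the method further comprises:
S1011, the user behavior log file is stored by the client in a database based on distributed file storage.
In a specific implementation, the user behavior log file is collected mainly by a JavaScript script running on the client, and the client saves the user behavior log file in MongoDB (a database based on distributed file storage).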
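S102, transferring the user behavior log file to a Hadoop platform, and performing distributed storage and backup of the user behavior log file according to the features of the platform's HDFS (Hadoop Distributed File System);
In a specific implementation, the HDFS architecture is built on a specific set of nodes, which is determined by its own characteristics. These nodes comprise one master node, the NameNode, and multiple slave nodes, the DataNodes. The NameNode provides the metadata service within HDFS; the DataNodes provide storage blocks for HDFS. Files stored in HDFS are split into blocks, and the blocks are then replicated to multiple machines (DataNodes), so that multiple working copies of the data are maintained, processing can be redistributed when a node fails, and system reliability is improved.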
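The block-and-replica idea described above can be illustrated with a toy sketch. This is not HDFS code and does not reproduce HDFS's actual placement policy; the function names, the round-robin placement and the node labels are assumptions made purely for illustration.

```python
from typing import Dict, List


def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, datanodes: List[str],
                   replication: int) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical placement rule: real HDFS also considers racks and load.
    """
    if not 1 <= replication <= len(datanodes):
        raise ValueError("replication must be between 1 and the node count")
    n = len(datanodes)
    return {b: [datanodes[(b + r) % n] for r in range(replication)]
            for b in range(num_blocks)}


blocks = split_blocks(b"0123456789", block_size=4)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3"], replication=2)
```

Because every block lives on two different nodes in this sketch, losing any single node still leaves one copy of each block, which is the reliability property the paragraph describes.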
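S103, performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform to obtain filtered data;
In a specific implementation, the distributed computing framework of the Hadoop platform is MapReduce; on top of the MapReduce framework, Hive is used to analyze and preprocess the user behavior log file offline and to filter out clean data.
In one embodiment, step S103 is carried out as follows: on top of the MapReduce framework, Hive is used to identify and split the fields in the user behavior log file, remove invalid records from the user behavior log file, and extract characteristic information according to statistical requirements.
It should be noted that the fields to be identified are set by the technician according to actual statistical needs, and the present invention does not elaborate on them.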
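The split-and-filter step can be sketched in plain Python as follows. The source performs this with Hive on MapReduce and leaves the field layout to the implementer, so the tab-separated four-field format, the `Record` type and the event vocabulary below are all hypothetical.

```python
import math
from typing import Iterable, Iterator, NamedTuple, Optional


class Record(NamedTuple):
    user_id: str
    resource_id: str
    event: str
    dwell_seconds: float


# Hypothetical event vocabulary; not specified in the source.
VALID_EVENTS = {"view", "download", "rate", "search"}


def parse_line(line: str) -> Optional[Record]:
    """Split one log line into fields; return None for an invalid record."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 4:
        return None  # wrong number of fields
    user_id, resource_id, event, dwell = parts
    if not user_id or not resource_id or event not in VALID_EVENTS:
        return None
    try:
        seconds = float(dwell)
    except ValueError:
        return None  # non-numeric duration
    if not math.isfinite(seconds) or seconds < 0:
        return None
    return Record(user_id, resource_id, event, seconds)


def clean(lines: Iterable[str]) -> Iterator[Record]:
    """Yield only the well-formed records, dropping everything else."""
    for line in lines:
        record = parse_line(line)
        if record is not None:
            yield record
```

Writing the filter as a generator keeps memory use flat for large log files, which loosely mirrors how a map task would stream records.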
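In a specific implementation, analyzing the user behavior recorded in the user behavior log file makes it possible to pay more attention to the user's development, needs and growth, so as to provide reasonable recommendation services, keep recommendations accurate and rich, stimulate the user's initiative in learning, and improve user stickiness. The characteristic information includes:
personal characteristics of the user: education level, major, occupation, age, gender, personality, interests and future learning plans;
explicit user behavior characteristics: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system;
implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.
The characteristic information collected about user behavior is used to judge the user's degree of preference for resources and to produce a user-resource preference set, which provides the data set for the recommendation algorithms that follow.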
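S104, extracting the filtered data through Mahout, computing on the filtered data with Mahout to obtain a calculation result, and storing the calculation result in a database as a recommendation result;
Referring to Fig. 4, in a specific implementation, the embodiment of the present invention adopts the following recommendation algorithms, chosen to fit the characteristics of online education:
1) A recommendation algorithm based on hybrid collaborative filtering, comprising the following steps:
a. computing the similarity between users from the user behavior information using the Pearson correlation coefficient;
b. finding the set of neighbor users most similar to the target user, and predicting the target user's degree of preference for a course from the neighbor users' feedback on that course;
c. computing the similarity between courses from the target user's behavior records using the Euclidean distance formula;
d. finding the set of neighbor courses most similar to the courses the target user has watched, and predicting the target user's degree of preference for the neighbor courses from their popularity;
e. computing weights over the resulting set of candidate learning resources (courses and neighbor courses) to obtain the recommended learning resources, ranking them by degree of preference, and recommending the learning resources with the highest degree of preference to the user.
It should be noted that the recommendation algorithm based on hybrid collaborative filtering is a hybrid recommendation algorithm that fuses user-based collaborative filtering and item-based collaborative filtering. Steps a and b are the computation of the user-based collaborative filtering algorithm, steps c and d are the computation of the item-based collaborative filtering algorithm, and step e integrates the results of the two to produce the recommendation result of the hybrid collaborative filtering algorithm, so that the result matches the user's preferences more closely.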
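Steps a to c can be sketched as below. The source runs these computations in Mahout; this standalone Python version is only an illustration under stated assumptions: ratings are held in plain dictionaries, the distance-to-similarity mapping `1 / (1 + d)` is an assumed convention, and the prediction is a similarity-weighted average over positively correlated neighbors.

```python
import math
from typing import Dict, Sequence

Ratings = Dict[str, float]  # course id -> feedback score (assumed layout)


def pearson(a: Ratings, b: Ratings) -> float:
    """Pearson correlation over the courses both users have scored (step a)."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[c] for c in common) / n
    mean_b = sum(b[c] for c in common) / n
    cov = sum((a[c] - mean_a) * (b[c] - mean_b) for c in common)
    var_a = sum((a[c] - mean_a) ** 2 for c in common)
    var_b = sum((b[c] - mean_b) ** 2 for c in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant rater carries no correlation signal
    return cov / math.sqrt(var_a * var_b)


def predict(target: Ratings, neighbours: Dict[str, Ratings], course: str) -> float:
    """Similarity-weighted average of neighbours' scores for one course (step b)."""
    num = den = 0.0
    for other in neighbours.values():
        if course in other:
            s = pearson(target, other)
            if s > 0:
                num += s * other[course]
                den += s
    return num / den if den else 0.0


def euclidean_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Map the Euclidean distance between two course vectors to (0, 1] (step c)."""
    dist = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))
    return 1.0 / (1.0 + dist)
```

Steps d and e (popularity-based scoring and the final weighting) are not specified numerically in the source, so they are left out of the sketch rather than guessed.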
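2) A user-based recommendation algorithm driven by user-information similarity, which mainly comprises:
Based on the target user's registration information, the "personal characteristics of the user" are obtained. Following the idea of the k-means clustering algorithm, a set of similar users is found by clustering similar users together. Using the cosine distance measure, the most similar user, i.e. the user with the smallest cosine distance, is found within the similar-user set, and recommendations are made to the target user according to that user's degrees of preference for the learning resources.
It should be noted that this user-based recommendation algorithm driven by user-information similarity is mainly used to solve the user cold-start problem.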
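The final lookup inside a cluster can be sketched as follows. The sketch assumes the k-means step has already produced the target user's cluster and that each user's personal characteristics have been encoded as a numeric vector; both the encoding and the function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means the two vectors point the same way."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    if norm_x == 0 or norm_y == 0:
        return 1.0  # treat an all-zero profile as maximally uninformative
    return 1.0 - dot / (norm_x * norm_y)


def most_similar_user(target: Sequence[float],
                      cluster: Dict[str, Sequence[float]]) -> str:
    """Return the id of the cluster member with the smallest cosine distance."""
    return min(cluster, key=lambda uid: cosine_distance(target, cluster[uid]))
```

`most_similar_user` raises `ValueError` on an empty cluster, which a caller would need to handle, for example by falling back to popular resources.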
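3) A content-based recommendation algorithm driven by user behavior, comprising:
Based on the user's past behavior, including the courses or other learning resources the user has viewed, learning resources with content similar to what the user has viewed are recommended, for example other courses taught by the same teacher.
However, relying on a single recommendation algorithm always has many shortcomings. A few platforms combine several recommendation approaches, but they seldom take user behavior into account; the algorithms are combined rigidly and cannot transition smoothly between one another, so the recommendation results are unsatisfactory.
When a user performs a search, the user evidently has a strong purpose regarding some content and an immediate, strong need for it. At that point, content-based recommendation should be made mainly according to what the user clicks in the search results and the content and topics of the courses watched; as the number of searches keeps increasing, the share of content-based recommendation can be raised appropriately, so that recommendations are reasonable and both accurate and rich. For example, in a specific implementation, step S104 specifically comprises: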
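Step S1041, using Equation (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm, to calculate the initial degree of preference $P_1(U,d_i)$ of user U for resource $d_i$:
Figure PCTCN2019104888-appb-000002
where:
$\alpha = |P_{Cb}(U,d_i) - P_{Hcf}(U,d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U,d_i) + P_{Hcf}(U,d_i)|$, $\beta \ge 0$,
$P_{Cb}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the larger of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\min\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the smaller of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\alpha$ represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the smaller $\alpha$ is, the more similar the two algorithms' preference estimates for user U and resource $d_i$ are, and the more accurate the recommended preference is.
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the larger $\beta$ is, the larger the total preference under the two algorithms, and the more resource $d_i$ deserves to be recommended.
$P_1(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the algorithm of Equation (1).
It should be noted that the smaller $\alpha$ is, the closer the two degrees of preference of user U for resource $d_i$ computed under the two algorithms are. When $P_{Hcf}(U,d_i) = P_{Cb}(U,d_i)$, $\alpha = 0$, meaning that the degree of preference of user U for resource $d_i$ is the same under the content-based and hybrid collaborative filtering algorithms; in that case the degree of preference of user U for resource $d_i$ is simply the value given by the content-based recommendation algorithm (or, equivalently, by the hybrid collaborative filtering recommendation algorithm). The larger $\alpha$ is, the less similar the two preference estimates are, and the two algorithms should then be reconciled using different weight ratios. Equation (1) can therefore fuse the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm smoothly, bringing the recommendation results closer to the user's needs.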
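As a small worked illustration of the auxiliary quantities used by Equation (1), take the made-up scores 0.6 and 0.8 (these numbers are assumptions, not values from the source). Equation (1) itself is available here only as an image reference, so $P_1$ is deliberately not evaluated.

```latex
\begin{align*}
P_{Cb}(U,d_i) &= 0.6, \qquad P_{Hcf}(U,d_i) = 0.8 \\
\alpha &= |0.6 - 0.8| = 0.2 \\
\beta &= |0.6 + 0.8| = 1.4 \\
\max\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\} &= 0.8, \qquad
\min\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\} = 0.6
\end{align*}
```

If both scores were 0.8 instead, $\alpha$ would be 0 and, as the text states, the initial preference would simply equal that shared value.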
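Collaborative filtering algorithms take the user's historical behavior data as their basis for computation. New users, however, have no historical behavior records, which gives rise to the cold-start problem. Most recommendation algorithms handle cold start by recommending randomly, recommending the newest and most popular items, or recommending from the user's registration information, and switch to personalized recommendation only once enough user data has been collected; users are easily lost during this data-collection period. To solve the user cold-start problem, the embodiment of the present invention further includes step S1042 on the basis of step S1041:
Step S1042, using Equation (2) to calculate the final degree of preference $P(U,d_i)$ of user U for resource $d_i$, and taking the resource $d_i$ with the highest final degree of preference as the calculation result:
$P(U,d_i) = e^{-w} \times P_u(U,d_i) + (1 - e^{-w}) \times P_1(U,d_i)$
Equation (2)
where: $w \propto t$, and t denotes the number of the user's historical behavior records;
$P_u(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U,d_i)$ denotes the final degree of preference of user U for resource $d_i$ under the algorithm of Equation (2).
Equation (2) yields the final degree of preference $P(U,d_i)$ of user U for each resource $d_i$. The resources $d_i$ are ranked from highest to lowest final degree of preference $P(U,d_i)$, the resource $d_i$ with the highest final degree of preference is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
In another embodiment, at least one resource $d_i$ whose final degree of preference exceeds a preset threshold is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
It should be noted that, at the start, a newly registered user has no historical behavior records, so $w = 0$ and $P(U,d_i) = P_u(U,d_i)$; that is, recommendations for a new user rely mainly on the user-based recommendation algorithm driven by user-information similarity (the "user-based recommendation algorithm according to user characteristics" in Fig. 4). As the number t of historical behavior records grows, w increases and the weight of $P_1(U,d_i)$ increases, so the calculation gradually shifts to recommendation based on the user's historical behavior records. This solves the cold-start problem for new users smoothly, lets new users transition smoothly into established users, avoids losing new users, and improves user stickiness.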
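Equation (2) and the two selection rules (top-ranked resource, or every resource above a threshold) can be sketched as follows. The source only states that $w \propto t$, so the proportionality constant `k` and its default value are assumptions, as are the function names.

```python
import math
from typing import Dict, List


def final_preference(p_user_info: float, p_initial: float,
                     t: int, k: float = 0.1) -> float:
    """Equation (2): blend P_u and P_1 with weight e^{-w}, where w = k * t.

    `k` is a hypothetical proportionality constant (the source gives w ∝ t only).
    """
    if t < 0 or k <= 0:
        raise ValueError("t must be non-negative and k positive")
    decay = math.exp(-k * t)
    return decay * p_user_info + (1.0 - decay) * p_initial


def rank(scores: Dict[str, float]) -> List[str]:
    """Resource ids ordered from highest to lowest final preference."""
    return sorted(scores, key=scores.get, reverse=True)


def above_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Alternative embodiment: keep every resource scoring above a preset threshold."""
    return [rid for rid in rank(scores) if scores[rid] > threshold]
```

With `t = 0` the function returns `p_user_info` unchanged, and for very large `t` it converges to `p_initial`, matching the smooth hand-over from profile-based to behavior-based recommendation described above.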
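S105, if a trigger signal indicating that the client requests a recommendation is received, retrieving the recommendation result from the database and sending it to the client.
In a specific implementation, a trigger signal is generated when the user logs in to the client of the online education website; on receiving the trigger signal requesting a recommendation, the recommender system retrieves the recommendation result from the database and sends it to the client.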
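The store-offline, fetch-on-trigger pattern of steps S104 and S105 can be sketched with an in-process database. The source does not name the database that holds the recommendation results; SQLite, the table layout and the function names below are assumptions chosen to keep the example self-contained.

```python
import json
import sqlite3
from typing import List


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that holds precomputed recommendations."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recommendations ("
        "user_id TEXT PRIMARY KEY, resources TEXT NOT NULL)"
    )


def store_recommendations(conn: sqlite3.Connection, user_id: str,
                          resources: List[str]) -> None:
    """Offline side: save (or overwrite) a user's ranked resource ids."""
    conn.execute(
        "INSERT OR REPLACE INTO recommendations (user_id, resources) VALUES (?, ?)",
        (user_id, json.dumps(resources)),
    )


def fetch_recommendations(conn: sqlite3.Connection, user_id: str) -> List[str]:
    """Online side: called when the client's trigger signal (e.g. login) arrives."""
    row = conn.execute(
        "SELECT resources FROM recommendations WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Because the expensive computation happens offline, serving a request reduces to a single keyed lookup.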
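In the embodiment of the present invention, a Hadoop data processing platform is built and Apache Mahout, an open-source data mining algorithm library, is used to analyze and process user behavior data offline. The whole system is built on the MapReduce computing model and makes full use of the cloud platform's data processing capacity to compute users' recommendation results offline. Parallel and distributed processing improve the system's efficiency and scalability, solving the problems of insufficient computing power and overly long real-time recommendation times in traditional single-machine recommendation models.
In actual use, the basic performance of the recommender system includes: a response time to client requests within 2 seconds; support for millions of users accessing the system online simultaneously; and an average server CPU load of ≤50%;
high reliability: the system offers 7×24×365-hour high availability with reliability above 99.9999%, and ensures that data access services are accurate and no data is lost;
good scalability: the system can meet the needs of user growth over the next three years and supports the gradual integration of subsequent application system resources; when the number of users or the volume of data increases, existing system functions and structure are not affected, which makes later expansion of the system easier.
The online education system focuses on recommending personalized learning plans and suitable learning resources to users who need them. The design of user behavior analysis and personalized recommendation based on Hadoop and Mahout lets users meet their learning requirements and improve themselves through big-data analysis, while also producing substantial social benefits and promoting the rapid development of the online education industry.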
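Embodiment 2
An embodiment of the present invention provides a terminal. The terminal in this embodiment may include units for executing the method described in Embodiment 1.
A receiving unit, configured to receive a user behavior log file uploaded by a client;
In a specific implementation, the client collects the user's behavior information in real time, generates a user behavior log file and sends it to the system, and the system receives the user behavior log file uploaded by the client.
In a specific implementation, the user's behavior information includes the user's personal characteristics, explicit user behavior characteristics and implicit user behavior characteristics, where:
the personal characteristics of the user include: education level, major, occupation, age, gender, personality, interests and future learning plans;
the explicit user behavior characteristics include: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system;
the implicit user behavior characteristics include: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.
In one embodiment, the terminal further comprises:
A storage unit, configured to have the user behavior log file stored by the client in a database based on distributed file storage;
In a specific implementation, the user behavior log file is collected mainly by a JavaScript script running on the client, and the client saves the user behavior log file in MongoDB (a database based on distributed file storage).
A distributed storage unit, configured to transfer the user behavior log file to a Hadoop platform, and to perform distributed storage and backup of the user behavior log file according to the features of the platform's HDFS (Hadoop Distributed File System);
In a specific implementation, the HDFS architecture is built on a specific set of nodes, which is determined by its own characteristics. These nodes comprise one master node, the NameNode (only one), and multiple slave nodes, the DataNodes. The NameNode provides the metadata service within HDFS; the DataNodes provide storage blocks for HDFS. Files stored in HDFS are split into blocks, and the blocks are then replicated to multiple machines (DataNodes), so that multiple working copies of the data are maintained, processing can be redistributed when a node fails, and system reliability is improved.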
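A preprocessing unit, configured to perform offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform to obtain filtered data;
In a specific implementation, the distributed computing framework of the Hadoop platform is MapReduce; on top of the MapReduce framework, Hive is used to analyze and preprocess the user behavior log file offline and to filter out clean data.
In one embodiment, the preprocessing unit is specifically configured to: on top of the MapReduce framework, use Hive to identify and split the fields in the user behavior log file, remove invalid records from the user behavior log file, and extract characteristic information according to statistical requirements.
It should be noted that the fields to be identified are set by the technician according to actual statistical needs, and the present invention does not elaborate on them.
In a specific implementation, analyzing the user behavior recorded in the user behavior log file makes it possible to pay more attention to the user's development, needs and growth, so as to provide reasonable recommendation services, keep recommendations accurate and rich, stimulate the user's initiative in learning, and improve user stickiness. The characteristic information includes:
personal characteristics of the user: education level, major, occupation, age, gender, personality, interests and future learning plans;
explicit user behavior characteristics: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system;
implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.
The characteristic information collected about user behavior is used to judge the user's degree of preference for resources and to produce a user-resource preference set, which provides the data set for the recommendation algorithms that follow.
A calculation unit, configured to extract the filtered data through Mahout, compute on the filtered data with Mahout to obtain a calculation result, and store the calculation result in a database as a recommendation result;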
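Referring to Fig. 4, in a specific implementation, the recommendation algorithms used in the embodiment of the present invention are as follows:
1) A recommendation algorithm based on hybrid collaborative filtering, comprising the following steps:
a. computing the similarity between users from the user behavior information using the Pearson correlation coefficient;
b. finding the set of neighbor users most similar to the target user, and predicting the target user's degree of preference for a course from the neighbor users' feedback on that course;
c. computing the similarity between courses from the target user's behavior records using the Euclidean distance formula;
d. finding the set of neighbor courses most similar to the courses the target user has watched, and predicting the target user's degree of preference for the neighbor courses from their popularity;
e. computing weights over the resulting set of candidate learning resources (courses and neighbor courses) to obtain the recommended learning resources, ranking them by degree of preference, and recommending the learning resources with the highest degree of preference to the user.
It should be noted that the recommendation algorithm based on hybrid collaborative filtering is a hybrid recommendation algorithm that fuses user-based collaborative filtering and item-based collaborative filtering. Steps a and b are the computation of the user-based collaborative filtering algorithm, steps c and d are the computation of the item-based collaborative filtering algorithm, and step e integrates the results of the two to produce the recommendation result of the hybrid collaborative filtering algorithm, so that the result matches the user's preferences more closely.
2) A user-based recommendation algorithm driven by user-information similarity, which mainly comprises:
Based on the target user's registration information, the "personal characteristics of the user" are obtained. Following the idea of the k-means clustering algorithm, a set of similar users is found by clustering similar users together. Using the cosine distance measure, the most similar user, i.e. the user with the smallest cosine distance, is found within the similar-user set, and recommendations are made to the target user according to that user's degrees of preference for the learning resources.
It should be noted that this user-based recommendation algorithm driven by user-information similarity is mainly used to solve the user cold-start problem.
3) A content-based recommendation algorithm driven by user behavior, comprising:
Based on the user's past behavior, including the courses or other learning resources the user has viewed, learning resources with content similar to what the user has viewed are recommended, for example other courses taught by the same teacher.
However, relying on a single recommendation algorithm always has many shortcomings. A few platforms combine several recommendation approaches, but they seldom take user behavior into account; the algorithms are combined rigidly and cannot transition smoothly between one another, so the recommendation results are unsatisfactory.
When a user performs a search, the user evidently has a strong purpose regarding some content and an immediate, strong need for it. At that point, content-based recommendation should be made mainly according to what the user clicks in the search results and the content and topics of the courses watched; as the number of searches keeps increasing, the share of content-based recommendation can be raised appropriately, so that recommendations are reasonable and both accurate and rich. For example, in a specific implementation, the calculation unit specifically comprises: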
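A fusion calculation unit, configured to use Equation (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm, to calculate the initial degree of preference $P_1(U,d_i)$ of user U for resource $d_i$:
Figure PCTCN2019104888-appb-000003
where:
$\alpha = |P_{Cb}(U,d_i) - P_{Hcf}(U,d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U,d_i) + P_{Hcf}(U,d_i)|$, $\beta \ge 0$,
$P_{Cb}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the larger of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\min\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the smaller of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\alpha$ represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the smaller $\alpha$ is, the more similar the two algorithms' preference estimates for user U and resource $d_i$ are, and the more accurate the recommended preference is.
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms; the larger $\beta$ is, the larger the total preference under the two algorithms, and the more resource $d_i$ deserves to be recommended.
$P_1(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the algorithm of Equation (1).
It should be noted that the smaller $\alpha$ is, the closer the two degrees of preference of user U for resource $d_i$ computed under the two algorithms are. When $P_{Hcf}(U,d_i) = P_{Cb}(U,d_i)$, $\alpha = 0$, meaning that the degree of preference of user U for resource $d_i$ is the same under the content-based and hybrid collaborative filtering algorithms; in that case the degree of preference of user U for resource $d_i$ is simply the value given by the content-based recommendation algorithm (or, equivalently, by the hybrid collaborative filtering recommendation algorithm). The larger $\alpha$ is, the less similar the two preference estimates are, and the two algorithms should then be reconciled using different weight ratios. Equation (1) can therefore fuse the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm smoothly, bringing the recommendation results closer to the user's needs.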
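Collaborative filtering algorithms take the user's historical behavior data as their basis for computation. New users, however, have no historical behavior records, which gives rise to the cold-start problem. Most recommendation algorithms handle cold start by recommending randomly, recommending the newest and most popular items, or recommending from the user's registration information, and switch to personalized recommendation only once enough user data has been collected; users are easily lost during this data-collection period. To solve the user cold-start problem, the embodiment of the present invention further includes a final calculation unit in addition to the fusion calculation unit:
A final calculation unit, configured to use Equation (2) to calculate the final degree of preference $P(U,d_i)$ of user U for resource $d_i$, and to take the resource $d_i$ with the highest final degree of preference as the calculation result:
$P(U,d_i) = e^{-w} \times P_u(U,d_i) + (1 - e^{-w}) \times P_1(U,d_i)$
Equation (2)
where: $w \propto t$, and t denotes the number of the user's historical behavior records;
$P_u(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U,d_i)$ denotes the final degree of preference of user U for resource $d_i$ under the algorithm of Equation (2).
Equation (2) yields the final degree of preference $P(U,d_i)$ of user U for each resource $d_i$. The resources $d_i$ are ranked from highest to lowest final degree of preference $P(U,d_i)$, the resource $d_i$ with the highest final degree of preference is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
In another embodiment, at least one resource $d_i$ whose final degree of preference exceeds a preset threshold is taken as the calculation result, and the calculation result is stored in the database as the recommendation result.
It should be noted that, at the start, a newly registered user has no historical behavior records, so $w = 0$ and $P(U,d_i) = P_u(U,d_i)$; that is, recommendations for a new user rely mainly on the user-based recommendation algorithm driven by user-information similarity (the "user-based recommendation algorithm according to user characteristics" in Fig. 4). As the number t of historical behavior records grows, w increases and the weight of $P_1(U,d_i)$ increases, so the calculation gradually shifts to recommendation based on the user's historical behavior records. This solves the cold-start problem for new users smoothly, lets new users transition smoothly into established users, avoids losing new users, and improves user stickiness.
A sending unit, configured to retrieve the recommendation result from the database and send it to the client if a trigger signal indicating that the client requests a recommendation is received.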
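Embodiment 3
Referring to Fig. 3, a schematic block diagram of a terminal 300 provided by another embodiment of the present invention is shown. The terminal 300 in this embodiment, as shown in the figure, may include: one or more processors 301, one or more input devices 302, one or more output devices 303 and a memory 304. The processor 301, input device 302, output device 303 and memory 304 are connected via a bus 305. The memory 304 is used to store instructions, and the processor 301 is used to execute the instructions stored in the memory 304. The processor 301 is configured to execute:
receiving a user behavior log file uploaded by a client; transferring the user behavior log file to a Hadoop platform, and performing distributed storage and backup of the user behavior log file according to the HDFS features of the Hadoop platform; performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform to obtain filtered data; extracting the filtered data through Mahout, computing on the filtered data with Mahout to obtain a calculation result, and storing the calculation result in a database as a recommendation result; if a trigger signal indicating that the client requests a recommendation is received, retrieving the recommendation result from the database and sending it to the client.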
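Further, the processor is also configured to execute: extracting the filtered data through Mahout and computing on the filtered data with Mahout to obtain a calculation result comprises: using Equation (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm, to calculate the initial degree of preference $P_1(U,d_i)$ of user U for resource $d_i$:
Figure PCTCN2019104888-appb-000004
Figure PCTCN2019104888-appb-000005
where:
$\alpha = |P_{Cb}(U,d_i) - P_{Hcf}(U,d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U,d_i) + P_{Hcf}(U,d_i)|$, $\beta \ge 0$,
$P_{Cb}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the larger of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\min\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the smaller of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\alpha$ represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$P_1(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the algorithm of Equation (1).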
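Further, the processor is also configured to execute: using Equation (2) to calculate the final degree of preference $P(U,d_i)$ of user U for resource $d_i$, and taking the resource $d_i$ with the highest final degree of preference as the calculation result:
$P(U,d_i) = e^{-w} \times P_u(U,d_i) + (1 - e^{-w}) \times P_1(U,d_i)$
Equation (2)
where: $w \propto t$, and t denotes the number of the user's historical behavior records;
$P_u(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U,d_i)$ denotes the final degree of preference of user U for resource $d_i$ under the algorithm of Equation (2).
Further, the processor is also configured to execute: the user behavior log file is stored by the client in a database based on distributed file storage.
Further, the processor is also configured to execute: performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform comprises: identifying and splitting the fields in the user behavior log file, removing invalid records from the user behavior log file, and extracting characteristic information according to statistical requirements.
The characteristic information includes: personal characteristics of the user: education level, major, occupation, age, gender, personality, interests and future learning plans; explicit user behavior characteristics: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system; implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.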
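It should be understood that, in the embodiments of the present invention, the processor 301 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor or any conventional processor.
The input device 302 may include a touchpad, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint orientation information), a microphone and the like; the output device 303 may include a display (an LCD or the like), a speaker and the like.
The memory 304 may include read-only memory and random access memory, and provides instructions and data to the processor 301. Part of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device-type information.
In a specific implementation, the processor 301, input device 302 and output device 303 described in the embodiments of the present invention can execute the implementations described in the embodiments of a parameter adjustment method provided by the embodiments of the present invention, and can also execute the implementation of the terminal 300 described in the embodiments of the present invention, which will not be repeated here.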
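Another embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements:
receiving a user behavior log file uploaded by a client; transferring the user behavior log file to a Hadoop platform, and performing distributed storage and backup of the user behavior log file according to the HDFS features of the Hadoop platform; performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform to obtain filtered data; extracting the filtered data through Mahout, computing on the filtered data with Mahout to obtain a calculation result, and storing the calculation result in a database as a recommendation result; if a trigger signal indicating that the client requests a recommendation is received, retrieving the recommendation result from the database and sending it to the client.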
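Extracting the filtered data through Mahout and computing on the filtered data with Mahout to obtain a calculation result comprises: using Equation (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm, to calculate the initial degree of preference $P_1(U,d_i)$ of user U for resource $d_i$:
Figure PCTCN2019104888-appb-000006
where:
$\alpha = |P_{Cb}(U,d_i) - P_{Hcf}(U,d_i)|$, $\alpha \ge 0$,
$\beta = |P_{Cb}(U,d_i) + P_{Hcf}(U,d_i)|$, $\beta \ge 0$,
$P_{Cb}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
$P_{Hcf}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
$\max\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the larger of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\min\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the smaller of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
$\alpha$ represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
$P_1(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the algorithm of Equation (1).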
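Equation (2) is used to calculate the final degree of preference $P(U,d_i)$ of user U for resource $d_i$, and the resource $d_i$ with the highest final degree of preference is taken as the calculation result:
$P(U,d_i) = e^{-w} \times P_u(U,d_i) + (1 - e^{-w}) \times P_1(U,d_i)$
Equation (2)
where: $w \propto t$, and t denotes the number of the user's historical behavior records;
$P_u(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
$P(U,d_i)$ denotes the final degree of preference of user U for resource $d_i$ under the algorithm of Equation (2).
The method further comprises: the user behavior log file is stored by the client in a database based on distributed file storage.
Performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform comprises: identifying and splitting the fields in the user behavior log file, removing invalid records from the user behavior log file, and extracting characteristic information according to statistical requirements.
The characteristic information includes: personal characteristics of the user: education level, major, occupation, age, gender, personality, interests and future learning plans; explicit user behavior characteristics: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system; implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.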
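The computer-readable storage medium may be an internal storage unit of the terminal described in any of the foregoing embodiments, such as the terminal's hard disk or memory. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card fitted to the terminal. Further, the computer-readable storage medium may include both an internal storage unit of the terminal and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the terminal. The computer-readable storage medium may also be used to temporarily store data that has been output or is about to be output.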
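A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to function. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the terminal and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.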
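In the several embodiments provided by the present invention, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.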
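If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks and other media that can store program code.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.
The above are specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the present invention, and these modifications or replacements shall be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.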

Claims (9)

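  1. A personalized recommendation method for an online education system, characterized by comprising the following steps:
    receiving a user behavior log file uploaded by a client;
    transferring the user behavior log file to a Hadoop platform, and performing distributed storage and backup of the user behavior log file according to the HDFS features of the Hadoop platform;
    performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform to obtain filtered data;
    extracting the filtered data through Mahout, computing on the filtered data with Mahout to obtain a calculation result, and storing the calculation result in a database as a recommendation result;
    if a trigger signal indicating that the client requests a recommendation is received, retrieving the recommendation result from the database and sending it to the client.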
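  2. The personalized recommendation method for an online education system according to claim 1, characterized in that extracting the filtered data through Mahout and computing on the filtered data with Mahout to obtain a calculation result comprises:
    using Equation (1), which fuses the content-based recommendation algorithm and the hybrid collaborative filtering recommendation algorithm, to calculate the initial degree of preference $P_1(U,d_i)$ of user U for resource $d_i$:
    Figure PCTCN2019104888-appb-100001
    where:
    $\alpha = |P_{Cb}(U,d_i) - P_{Hcf}(U,d_i)|$, $\alpha \ge 0$,
    $\beta = |P_{Cb}(U,d_i) + P_{Hcf}(U,d_i)|$, $\beta \ge 0$,
    $P_{Cb}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the content-based recommendation algorithm;
    $P_{Hcf}(U,d_i)$ denotes the degree of preference of user U for resource $d_i$ under the hybrid collaborative filtering recommendation algorithm;
    $\max\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the larger of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
    $\min\{P_{Cb}(U,d_i), P_{Hcf}(U,d_i)\}$ denotes the smaller of the two degrees of preference of user U for resource $d_i$ obtained under the two algorithms;
    $\alpha$ represents the deviation between the degrees of preference of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
    $\beta$ represents the total preference value of user U for resource $d_i$ under the content-based and hybrid collaborative filtering algorithms;
    $P_1(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the algorithm of Equation (1).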
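  3. The personalized recommendation method for an online education system according to claim 2, characterized by further comprising:
    using Equation (2) to calculate the final degree of preference $P(U,d_i)$ of user U for resource $d_i$, and taking the resource $d_i$ with the highest final degree of preference as the calculation result:
    $P(U,d_i) = e^{-w} \times P_u(U,d_i) + (1 - e^{-w}) \times P_1(U,d_i)$
    Equation (2)
    where: $w \propto t$, and t denotes the number of the user's historical behavior records;
    $P_u(U,d_i)$ denotes the initial degree of preference of user U for resource $d_i$ under the recommendation algorithm based on user-information similarity;
    $P(U,d_i)$ denotes the final degree of preference of user U for resource $d_i$ under the algorithm of Equation (2).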
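  4. The personalized recommendation method for an online education system according to claim 3, characterized in that the method further comprises:
    the user behavior log file is stored by the client in a database based on distributed file storage.
  5. The personalized recommendation method for an online education system according to claim 1, characterized in that performing offline preprocessing on the user behavior log file according to the distributed computing framework of the Hadoop platform comprises:
    identifying and splitting the fields in the user behavior log file, removing invalid records from the user behavior log file, and extracting characteristic information according to statistical requirements.
  6. The personalized recommendation method for an online education system according to claim 5, characterized in that the characteristic information comprises:
    personal characteristics of the user: education level, major, occupation, age, gender, personality, interests and future learning plans;
    explicit user behavior characteristics: user rating feedback, downloaded resources, exercise records, searches for course resources, number of interactions with a course, duration of each interaction, and time spent online in the system;
    implicit user behavior characteristics: page dwell time, number of page visits, number of mouse movements and number of scroll-bar scrolls.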
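  7. A terminal, characterized by comprising units for executing the method according to any one of claims 1-6.
  8. A terminal comprising a processor, an input device, an output device and a memory that are connected to one another, characterized in that the memory is used to store application program code that supports the terminal in executing the method according to any one of claims 1-6, and the processor is configured to execute the method according to any one of claims 1-6.
  9. A computer-readable storage medium storing a computer program, the computer program comprising program instructions which, when executed by a processor, cause the processor to execute the method according to any one of claims 1-6.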
PCT/CN2019/104888 2019-05-29 2019-09-09 Personalized recommendation method, terminal and storage medium for an online education system WO2020237898A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910455421.6 2019-05-29
CN201910455421.6A CN110276018A (zh) 2019-05-29 2019-05-29 Personalized recommendation method, terminal and storage medium for an online education system

Publications (1)

Publication Number Publication Date
WO2020237898A1 true WO2020237898A1 (zh) 2020-12-03

Family

ID=67960151

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/104888 WO2020237898A1 (zh) 2019-05-29 2019-09-09 Personalized recommendation method, terminal and storage medium for an online education system

Country Status (2)

Country Link
CN (1) CN110276018A (zh)
WO (1) WO2020237898A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177181A (zh) * 2021-06-29 2021-07-27 长沙豆芽文化科技有限公司 Online teaching information push method and system based on interactively customized plans

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
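CN111292212A (zh) * 2020-03-04 2020-06-16 湖北文理学院 Personalized ideological and political education system
CN112559873B (zh) * 2020-12-21 2021-08-13 融易学控股(深圳)有限公司 User recommendation system based on smart education
CN113065060B (zh) * 2021-02-18 2022-11-29 山东师范大学 Deep-learning-based course recommendation method and system for education platforms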

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103886487A (zh) * 2014-03-28 2014-06-25 焦点科技股份有限公司 Personalized recommendation method and system for a distributed B2B platform
CN104021483A (zh) * 2014-06-26 2014-09-03 陈思恩 Passenger demand recommendation method
CN106982150A (zh) * 2017-03-27 2017-07-25 重庆邮电大学 Hadoop-based mobile Internet user behavior analysis method
CN107169572A (zh) * 2016-12-23 2017-09-15 福州大学 Mahout-based machine learning service assembly method
US10163061B2 (en) * 2015-06-18 2018-12-25 International Business Machines Corporation Quality-directed adaptive analytic retraining
CN109670116A (zh) * 2018-11-30 2019-04-23 内江亿橙网络科技有限公司 Big-data-based intelligent recommendation system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
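JP4743740B2 (ja) * 1999-07-16 2011-08-10 マイクロソフト インターナショナル ホールディングス ビー.ブイ. Method and system for creating automated alternative content recommendations
CN106874522A (zh) * 2017-03-29 2017-06-20 珠海习悦信息技术有限公司 Information recommendation method, device, storage medium and processor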

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103886487A (zh) * 2014-03-28 2014-06-25 焦点科技股份有限公司 Personalized recommendation method and system for a distributed B2B platform
CN104021483A (zh) * 2014-06-26 2014-09-03 陈思恩 Passenger demand recommendation method
US10163061B2 (en) * 2015-06-18 2018-12-25 International Business Machines Corporation Quality-directed adaptive analytic retraining
CN107169572A (zh) * 2016-12-23 2017-09-15 福州大学 Mahout-based machine learning service assembly method
CN106982150A (zh) * 2017-03-27 2017-07-25 重庆邮电大学 Hadoop-based mobile Internet user behavior analysis method
CN109670116A (zh) * 2018-11-30 2019-04-23 内江亿橙网络科技有限公司 Big-data-based intelligent recommendation system

Also Published As

Publication number Publication date
CN110276018A (zh) 2019-09-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19930195

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19930195

Country of ref document: EP

Kind code of ref document: A1