CN109033202A - Book recommendation method and system based on the Apriori algorithm - Google Patents

Info

Publication number: CN109033202A
Authority: CN (China)
Prior art keywords: module, data, book, Apriori algorithm, association rule
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201810696747.3A
Other languages: Chinese (zh)
Inventors: Cheng Yang (程阳), Zhang Yun (章韵)
Current Assignee: Nanjing University of Posts and Telecommunications (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Nanjing University of Posts and Telecommunications
Priority date: 2018-06-29 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2018-06-29
Publication date: 2018-12-18

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00: Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03: Data mining

Abstract

The invention discloses a book recommendation method based on the Apriori algorithm, comprising the following steps: obtaining book borrowing history data from a database and storing it in a distributed file system; performing association rule mining on the borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules; storing the obtained strong association rules in a data storage platform; and converting the strong association rules stored in the data storage platform into a user interface form. Based on users' borrowing history data, the invention analyzes the associations among the books that users borrow and recommends books with similar characteristics, effectively improving the accuracy of book recommendation.

Description

Book recommendation method and system based on the Apriori algorithm
Technical field
The present invention relates to the fields of distributed big-data processing and data mining, and in particular to a book recommendation method and system based on the Apriori algorithm.
Background
As society becomes increasingly information-driven, university libraries are going digital and the data they generate is growing explosively. Although existing library management technology alleviates this problem to some extent, it remains seriously deficient in processing the massive volume of user borrowing data effectively and in providing users with good search and recommendation services. Efficient and accurate book recommendation systems have emerged to recommend relevant books to readers promptly and accurately and to spare readers the tedious process of book retrieval.
At present, most university library recommendation systems use collaborative filtering, which recommends books by finding nearest neighbors; this approach suffers from problems such as poor scalability.
In view of this, it is necessary to improve existing book recommendation methods and systems to solve the above problems.
Summary of the invention
An object of the present invention is to provide a book recommendation method based on the Apriori algorithm.
To achieve the above object, the present invention adopts the following technical solution: a book recommendation method based on the Apriori algorithm, comprising the following steps:
Step 1: obtain the book borrowing history data from a database and store it in a distributed file system;
Step 2: perform association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;
Step 3: store the strong association rules obtained in Step 2 in a data storage platform;
Step 4: convert the strong association rules stored in the data storage platform into a user interface form.
As a further improvement of the present invention, the Apriori algorithm is one that runs on a distributed computing framework and performs its computation with multiple reduce programs in the reduce stage.
As a further improvement of the present invention, the data storage platform is an HBase database.
Another object of the present invention is to provide a book recommendation system based on the Apriori algorithm.
To achieve the above object, the present invention adopts the following technical solution: a book recommendation system based on the Apriori algorithm, comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module;
The book borrowing history data module stores users' book borrowing history data;
The data preprocessing module performs data cleaning and data format conversion on users' book borrowing history data;
The association rule mining module performs association rule mining on users' book borrowing history data based on the Apriori algorithm and obtains strong association rules;
The book recommendation module interacts with the association rule mining module and converts the strong association rules into a user interface form.
As a further improvement of the present invention, the data preprocessing module interacts with the database, cleans users' book borrowing history data, and converts that data into a format suitable for association rule mining.
As a further improvement of the present invention, the association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module.
As a further improvement of the present invention, the pruning module prunes the range of the dataset over which the algorithm iterates.
As a further improvement of the present invention, the distributed computing module performs association rule mining, in a distributed manner, on the data produced by the data preprocessing module.
As a further improvement of the present invention, the distributed storage module stores, in a distributed manner, the strong association rules obtained by the computing module.
As a further improvement of the present invention, the book recommendation module interacts with the distributed storage module and converts the strong association rules stored in the distributed storage module into a user interface form.
The beneficial effects of the present invention are as follows: based on users' book borrowing history data, the invention analyzes the associations among the books that users borrow and recommends books with similar characteristics, effectively improving the accuracy of book recommendation.
Description of the drawings
Fig. 1 is a flow diagram of the book recommendation method based on the Apriori algorithm according to the present invention.
Fig. 2 is a structural diagram of the book recommendation system based on the Apriori algorithm according to the present invention.
Specific embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the present invention provides a book recommendation method based on the Apriori algorithm (an association rule algorithm), which mainly comprises the following steps:
Step 1: obtain the book borrowing history data from a database and store it in a distributed file system;
Step 2: perform association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;
Step 3: store the strong association rules obtained in Step 2 in a data storage platform;
Step 4: convert the strong association rules stored in the data storage platform into a user interface form.
Specifically, first, when a user logs in to the book recommendation system, the system reads in real time the user's book borrowing history data stored in the database and stores it in the distributed file system (HDFS) as the data source for the distributed computing framework (MapReduce). The database is generally a traditional relational database.
Then, the improved Apriori algorithm program reads the massive book borrowing history data from the distributed file system, obtains the other resources the task requires, and performs association rule mining. The original Apriori algorithm is first improved using a pruning strategy and a distributed computing framework. Based on the pruning strategy, the improved Apriori algorithm continually narrows the range of the dataset that is scanned.
Next, the dataset is divided into data blocks under the distributed computing framework. In the map stage, each map function handles a different data block and converts its input into $\langle key, value \rangle$ pairs, where $key$ is a subset of the dataset and $value$ is the support of that subset. When the map stage ends, the shuffle program runs: it merges the pairs that share the same $key$ into the form $\langle key, \langle value_1, value_2, \ldots, value_m \rangle \rangle$ and sorts them by $key$, after which the shuffle program ends.
Multiple reduce programs are then executed in place of the single reduce program of a conventional distributed computing framework. The output of the shuffle program is the input to the reduce programs, and each reduce program computes the sum of the values in the value list. For each pair $\langle key, \langle value_1, value_2, \ldots, value_m \rangle \rangle$, the result is $\langle key, value \rangle$, where $value = \sum_{i=1}^{m} value_i$.
Finally, the improved Apriori algorithm mines association rules from users' book borrowing history data and obtains the strong association rules between items, so that users can be given prompt and accurate book recommendations.
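As an illustration only, the map, shuffle, and multi-reduce steps described above can be simulated in a single Python process. This is a minimal sketch and not the disclosed implementation: it does not use Hadoop, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `count_support`), the round-robin assignment of keys to reducers, the choice to emit a count of 1 per occurrence, and the sample transactions are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(block, k):
    """Emit (itemset, 1) for every k-item subset of each transaction in one data block."""
    for transaction in block:
        for subset in combinations(sorted(set(transaction)), k):
            yield subset, 1


def shuffle_phase(pairs):
    """Merge pairs with the same key into (key, [value_1, ..., value_m]), sorted by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(grouped, n_reducers=2):
    """Spread the keys over several reducers; each reducer sums the values of its keys."""
    partitions = [[] for _ in range(n_reducers)]
    for index, (key, values) in enumerate(grouped):
        partitions[index % n_reducers].append((key, values))  # hypothetical partitioning
    counts = {}
    for partition in partitions:  # each partition stands in for one reduce program
        for key, values in partition:
            counts[key] = sum(values)  # value = value_1 + ... + value_m
    return counts


def count_support(blocks, k, n_reducers=2):
    """Run map over every block, then shuffle, then reduce."""
    pairs = [pair for block in blocks for pair in map_phase(block, k)]
    return reduce_phase(shuffle_phase(pairs), n_reducers)


# Hypothetical borrowing transactions, already split into two data blocks.
blocks = [
    [["A", "B", "C"], ["A", "B"]],
    [["A", "C"], ["B", "C"], ["A", "B", "C"]],
]
pair_counts = count_support(blocks, 2)
```

Here the support is expressed as an absolute count; dividing by the number of transactions would give a relative support. In a real MapReduce job the partitioning of keys across reducers is handled by the framework rather than by user code.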
Next, the strong association rules are stored in the data storage platform, which may specifically be an HBase database.
Finally, the strong association rules stored in the data storage platform are presented to the user as a Web page, giving the user more intuitive recommendation results.
The present invention performs association rule mining with an improved Apriori algorithm that runs on a distributed computing framework (MapReduce) and uses multiple reduce programs for the computation in the reduce stage. Applying the parallelized, improved Apriori algorithm to book recommendation makes it better suited to mining massive data and effectively improves recommendation accuracy, meeting the demands of the rapidly growing data volumes of modern libraries. The final mining results are stored in the data storage platform for persistent storage.
As shown in Fig. 2, the present invention also provides a book recommendation system based on the Apriori algorithm, comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module.
The book borrowing history data module stores users' book borrowing history data.
The data preprocessing module works as follows. First, taking users' book borrowing history data as input, it cleans the raw data, removing duplicate borrowing records and records with obvious format errors, so that users can be given more effective, higher-quality book recommendations. Then it converts the cleaned data into a format suitable for association rule mining: each borrowing record becomes one row of data, and each row is assigned a number, TID.
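The cleaning and TID-numbering steps can be sketched as follows. The source does not specify the layout of a borrowing record, so this example assumes each raw record is a `(reader_id, [book titles])` pair; that layout, the function name `preprocess`, and the validity checks are hypothetical choices for illustration.

```python
def preprocess(records):
    """Drop malformed and duplicate borrowing records, then number the rest with a TID."""
    seen = set()
    cleaned = []
    for record in records:
        # Reject records whose format is obviously wrong (assumed layout: (reader_id, books)).
        if not isinstance(record, (list, tuple)) or len(record) != 2:
            continue
        reader_id, books = record
        if not isinstance(reader_id, str) or not reader_id.strip():
            continue
        if not isinstance(books, (list, tuple)):
            continue
        titles = tuple(sorted({b.strip() for b in books if isinstance(b, str) and b.strip()}))
        if not titles:
            continue
        key = (reader_id.strip(), titles)
        if key in seen:  # duplicate borrowing record
            continue
        seen.add(key)
        cleaned.append(key)
    # One row per record, numbered from 1 with a transaction ID (TID).
    return [(tid, reader, list(titles)) for tid, (reader, titles) in enumerate(cleaned, start=1)]


raw_records = [
    ("r1", ["B", "A"]),
    ("r1", ["A", "B"]),   # duplicate of the first record
    ("", ["A"]),          # missing reader id
    ("r2", "oops"),       # book list has the wrong type
    ("r2", ["C"]),
    None,                 # not a record at all
]
rows = preprocess(raw_records)
```

Each resulting row is `(TID, reader_id, books)`, which is the one-transaction-per-row shape that association rule mining typically consumes.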
The association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module, in which:
Pruning module: prunes the range of the dataset over which the algorithm iterates. It relies on an important property used by the pruning strategy, anti-monotonicity: once certain items are marked as infrequent, the itemsets containing those items need not be iterated over. This effectively narrows the iteration range and reduces the number of candidates.
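The anti-monotone property can be illustrated with the classic Apriori candidate-generation step, in which a k-item candidate is kept only if every one of its (k-1)-item subsets is already known to be frequent. The sketch below shows this standard pruning of candidates; the source describes pruning the scanned dataset range and does not give its exact procedure, so the function name `generate_candidates` and the sample itemsets are assumptions.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets into k-item candidates and prune by anti-monotonicity."""
    frequent_prev = {frozenset(itemset) for itemset in frequent_prev}
    candidates = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) != k:
                continue
            # Prune: any candidate with an infrequent (k-1)-subset cannot be frequent.
            if all(frozenset(sub) in frequent_prev for sub in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


# Hypothetical frequent 2-itemsets; {A, D} and {C, D} are NOT frequent.
frequent_pairs = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"B", "D"}]
triples = generate_candidates(frequent_pairs, 3)
```

In this example {A, B, D} and {B, C, D} are discarded without ever being counted against the dataset, because each contains an infrequent pair.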
Distributed computing module: performs association rule mining, in a distributed manner, on the data produced by the data preprocessing module, and obtains the strong association rules.
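Under the usual definition, a rule X → Y is "strong" when its support and confidence both meet user-chosen minimum thresholds. The source does not state its thresholds or how rules are derived from the itemset counts, so the following is a sketch under those assumptions; the function name `strong_rules`, the threshold values, and the counts are hypothetical, and the count table is assumed to contain every subset of each frequent itemset.

```python
from itertools import combinations


def strong_rules(support_counts, n_transactions, min_support, min_confidence):
    """Derive rules X -> Y whose support and confidence meet the given thresholds."""
    rules = []
    for itemset, count in support_counts.items():
        support = count / n_transactions
        if len(itemset) < 2 or support < min_support:
            continue
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # confidence(X -> Y) = count(X and Y together) / count(X)
                confidence = count / support_counts[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules


# Hypothetical support counts over 5 borrowing transactions.
counts = {
    frozenset({"A"}): 4,
    frozenset({"B"}): 4,
    frozenset({"A", "B"}): 3,
}
rules = strong_rules(counts, n_transactions=5, min_support=0.5, min_confidence=0.7)
```

With these numbers, both A → B and B → A have support 0.6 and confidence 0.75, so both pass; raising the confidence threshold to 0.8 would reject them.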
Distributed storage module: stores the strong association rules obtained by the distributed computing module, providing efficient persistent storage of the data.
The book recommendation module interacts with the association rule mining module, specifically with the distributed storage module: it visualizes the strong association rules stored in the distributed storage module and presents them in a Web page, so that users can see the book recommendation results more intuitively.
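The source does not say how stored rules are turned into a recommendation list before being shown on the Web page. One plausible approach, sketched here purely as an assumption, is to match each rule's antecedent against the books a user has already borrowed and rank the consequent books by rule confidence; the function name `recommend`, the rule tuple layout, and the sample rules are hypothetical.

```python
def recommend(borrowed, rules, top_n=3):
    """Return up to top_n books implied by rules whose antecedent the user has fully borrowed."""
    borrowed = set(borrowed)
    scores = {}
    for antecedent, consequent, confidence in rules:
        if set(antecedent) <= borrowed:
            for book in set(consequent) - borrowed:  # never recommend a book already borrowed
                scores[book] = max(scores.get(book, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [book for book, _ in ranked[:top_n]]


# Hypothetical strong rules as (antecedent, consequent, confidence).
sample_rules = [
    ({"A"}, {"B"}, 0.75),
    ({"A", "C"}, {"D"}, 0.9),
    ({"E"}, {"F"}, 1.0),
]
suggestions = recommend(["A", "C"], sample_rules)
```

The resulting list could then be rendered by whatever Web front end the system uses; that rendering step is outside the scope of this sketch.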
In summary, the Apriori algorithm has distinct advantages in data mining and is especially widely used in recommender systems. The present invention applies big-data processing and storage platforms to book recommendation: it improves the original Apriori algorithm mainly through a pruning strategy and a distributed computing framework, and applies the improved algorithm in the book recommendation system of the invention to mine association rules from users' book borrowing data, so that books can be recommended to users more promptly and accurately, effectively improving the accuracy of book recommendation.
The above embodiments merely illustrate the present invention and do not limit the technical solutions it describes. This specification should be understood from the standpoint of a person of ordinary skill in the art. Although the specification describes the invention in detail with reference to the above embodiments, a person of ordinary skill in the art will understand that the invention may still be modified or equivalently replaced, and that all technical solutions and improvements that do not depart from the spirit and scope of the invention fall within the scope of its claims.

Claims (10)

1. A book recommendation method based on the Apriori algorithm, characterized by comprising the following steps:
Step 1: obtain the book borrowing history data from a database and store it in a distributed file system;
Step 2: perform association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;
Step 3: store the strong association rules obtained in Step 2 in a data storage platform;
Step 4: convert the strong association rules stored in the data storage platform into a user interface form.
2. The book recommendation method based on the Apriori algorithm according to claim 1, characterized in that the Apriori algorithm is one that runs on a distributed computing framework and performs its computation with multiple reduce programs in the reduce stage.
3. The book recommendation method based on the Apriori algorithm according to claim 1, characterized in that the data storage platform is an HBase database.
4. A book recommendation system based on the Apriori algorithm, characterized by comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module;
The book borrowing history data module stores users' book borrowing history data;
The data preprocessing module performs data cleaning and data format conversion on users' book borrowing history data;
The association rule mining module performs association rule mining on users' book borrowing history data based on the Apriori algorithm and obtains strong association rules;
The book recommendation module interacts with the association rule mining module and converts the strong association rules into a user interface form.
5. The book recommendation system based on the Apriori algorithm according to claim 4, characterized in that the data preprocessing module interacts with the database, cleans users' book borrowing history data, and converts that data into a format suitable for association rule mining.
6. The book recommendation system based on the Apriori algorithm according to claim 4, characterized in that the association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module.
7. The book recommendation system based on the Apriori algorithm according to claim 6, characterized in that the pruning module prunes the range of the dataset over which the algorithm iterates.
8. The book recommendation system based on the Apriori algorithm according to claim 6, characterized in that the distributed computing module performs association rule mining, in a distributed manner, on the data produced by the data preprocessing module and obtains strong association rules.
9. The book recommendation system based on the Apriori algorithm according to claim 8, characterized in that the distributed storage module stores the strong association rules obtained by the distributed computing module.
10. The book recommendation system based on the Apriori algorithm according to claim 9, characterized in that the book recommendation module interacts with the distributed storage module and converts the strong association rules stored in the distributed storage module into a user interface form.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810696747.3A 2018-06-29 2018-06-29 Book recommendation method and system based on the Apriori algorithm

Publications (1)

Publication Number Publication Date
CN109033202A 2018-12-18

Family

ID=65520882

Country Status (1)

Country Link
CN (1) CN109033202A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102945240A (en) * 2012-09-11 2013-02-27 杭州斯凯网络科技有限公司 Method and device for realizing association rule mining algorithm supporting distributed computation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
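Hao Tianshu (郝天曙): "Research on Hadoop-based parallel data mining" (基于Hadoop的并行数据挖掘的研究), China Master's Theses Full-text Database, Information Science and Technology *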

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339427A (en) * 2020-03-23 2020-06-26 卓尔智联(武汉)研究院有限公司 Book information recommendation method, device and system and storage medium
CN111339427B (en) * 2020-03-23 2022-12-20 卓尔智联(武汉)研究院有限公司 Book information recommendation method, device and system and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181218