CN109033202A

CN109033202A - Book recommendation method and system based on the Apriori algorithm

Info

Publication number: CN109033202A
Application number: CN201810696747.3A
Authority: CN
Inventors: 程阳; 章韵
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2018-06-29
Filing date: 2018-06-29
Publication date: 2018-12-18

Abstract

The invention discloses a book recommendation method based on the Apriori algorithm, comprising the following steps: obtaining book borrowing history data from a database and storing it in a distributed file system; performing association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules; storing the obtained strong association rules in a data storage platform; and converting the strong association rules stored in the data storage platform into a user interface form. Based on users' book borrowing history data, the invention analyzes the associations between the books that users borrow and recommends books with similar features, effectively improving the accuracy of book recommendation.

Description

Book recommendation method and system based on the Apriori algorithm

Technical field

The present invention relates to the fields of big data distributed processing and data mining, and in particular to a book recommendation method and system based on the Apriori algorithm.

Background art

With the continuing growth of informatization in society, university libraries are going digital and the data they generate is growing explosively. Although existing library management technology alleviates this problem to some extent, serious shortcomings remain in processing this massive volume of user borrowing data effectively and in providing users with good search and recommendation services. Efficient and accurate book recommendation systems have emerged to recommend relevant books to readers in a timely and accurate manner and to simplify the book retrieval process.

At present, most university library recommendation systems use collaborative filtering, which recommends books by finding nearest neighbors. This approach has problems such as poor scalability.

In view of this, it is necessary to improve existing book recommendation methods and book recommendation systems to solve the above problems.

Summary of the invention

One object of the present invention is to provide a book recommendation method based on the Apriori algorithm.

To achieve this object, the present invention adopts the following technical solution: a book recommendation method based on the Apriori algorithm, comprising the following steps:

Step 1: obtain the book borrowing history data from a database and store it in a distributed file system;

Step 2: perform association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;

Step 3: store the strong association rules obtained in step 2 in a data storage platform;

Step 4: convert the strong association rules stored in the data storage platform into a user interface form.
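The steps above rely on the notion of a strong association rule without defining it. In the standard Apriori formulation, a rule A → B is called strong when both its support and its confidence reach user-chosen thresholds. The following is a minimal Python sketch under that assumption; the toy borrowing history, the threshold values, and the function names are hypothetical and do not come from the patent.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every book in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """support(A and B together) / support(A); 0.0 when A never occurs."""
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


def is_strong(antecedent: FrozenSet[str], consequent: FrozenSet[str],
              transactions: List[Transaction],
              min_support: float, min_confidence: float) -> bool:
    """A rule is strong when it meets both thresholds (assumed definition)."""
    joint = support(antecedent | consequent, transactions)
    conf = confidence(antecedent, consequent, transactions)
    return joint >= min_support and conf >= min_confidence


# Hypothetical borrowing history: one set of book IDs per reader.
history = [
    frozenset({"B1", "B2"}),
    frozenset({"B1", "B2", "B3"}),
    frozenset({"B1", "B3"}),
    frozenset({"B2"}),
]
a, b = frozenset({"B1"}), frozenset({"B2"})
print(support(a | b, history), confidence(a, b, history))
```

In this toy data the pair {B1, B2} appears in two of four transactions and B1 appears in three, so the rule B1 → B2 has support 0.5 and confidence 2/3.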

As a further improvement of the technical solution of the present invention, the Apriori algorithm is an Apriori algorithm that is based on a distributed computing framework and runs multiple reduce programs in the reduce stage.

As a further improvement of the technical solution of the present invention, the data storage platform is an HBase database.

Another object of the present invention is to provide a book recommendation system based on the Apriori algorithm.

To achieve this object, the present invention adopts the following technical solution: a book recommendation system based on the Apriori algorithm, comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module;

The book borrowing history data module is used to store users' book borrowing history data;

The data preprocessing module is used to perform data cleaning and data format conversion on users' book borrowing history data;

The association rule mining module performs association rule mining on users' book borrowing history data based on the Apriori algorithm and obtains strong association rules;

The book recommendation module is used to interact with the association rule mining module and to convert the strong association rules into a user interface form.

As a further improvement of the technical solution of the present invention, the data preprocessing module interacts with the database, performs data cleaning on users' book borrowing history data, and converts that data into a format suitable for association rule mining.

As a further improvement of the technical solution of the present invention, the association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module.

As a further improvement of the technical solution of the present invention, the pruning module is used to prune the range of the data set over which the algorithm iterates.

As a further improvement of the technical solution of the present invention, the distributed computing module is used to perform association rule mining, on a distributed platform, on the data produced by the data preprocessing module.

As a further improvement of the technical solution of the present invention, the distributed storage module is used to store the strong association rules obtained by the distributed computing module.

As a further improvement of the technical solution of the present invention, the book recommendation module is used to interact with the distributed storage module and to convert the strong association rules stored in the distributed storage module into a user interface form.

The beneficial effects of the present invention are as follows: based on users' book borrowing history data, the invention analyzes the associations between the books that users borrow and recommends books with similar features, effectively improving the accuracy of book recommendation.

Brief description of the drawings

Fig. 1 is a flow diagram of the book recommendation method based on the Apriori algorithm according to the present invention.

Fig. 2 is a structural diagram of the book recommendation system based on the Apriori algorithm according to the present invention.

Detailed description of the embodiments

To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings and specific embodiments.

As shown in Fig. 1, the present invention provides a book recommendation method based on the Apriori algorithm (an association rule algorithm), which mainly comprises the following steps:

Specifically, the user first logs in to the book recommendation system. The system reads, in real time, the user's book borrowing history data stored in the database and stores it in a distributed file system (HDFS), where it serves as the data source for the distributed computing framework (MapReduce). The database is typically a traditional relational database.

Next, the improved Apriori program reads the massive book borrowing history data from the distributed file system, obtains the other resources the task requires, and performs association rule mining. The original Apriori algorithm is first improved on the basis of a pruning strategy and a distributed computing framework. Using the pruning strategy, the improved Apriori algorithm continuously narrows the range of the data set that is scanned.

The distributed computing framework then splits the data set into different data blocks. In the map stage, each map function handles a different data block and converts the input data into <key, value> pairs, where the key is a subset of the data set and the value is the support of that subset. After the map stage ends, the shuffle program runs. It merges the <key, value> pairs that share the same key into the form <key, <value₁, value₂, …, value_m>> and sorts them by key, after which the shuffle program ends.

Multiple reduce programs are then executed in place of the single reduce program of a traditional distributed computing framework. The output of the shuffle program is the input of the reduce programs, and each reduce program computes the sum of the values in the value set of each <key, <value₁, value₂, …, value_m>> pair. The result is <key, value>, where

$$\mathrm{value} = \sum_{i=1}^{m} \mathrm{value}_i$$

The improved Apriori algorithm is then used to mine association rules from users' book borrowing history data, yielding the strong association rules between items and providing users with timely and accurate book recommendations.

The strong association rules are then stored in the data storage platform, which may specifically be an HBase database.
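The patent does not specify how a rule is laid out in HBase. HBase stores row keys, column qualifiers, and cell values as byte arrays, so one possible approach is to encode each rule as a row key plus a small map of columns. The sketch below only builds that structure and does not contact a database; the row key format, the column family name `rule`, and the function name `rule_to_row` are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple


def rule_to_row(antecedent: Iterable[str], consequent: Iterable[str],
                support: float, confidence: float) -> Tuple[bytes, Dict[bytes, bytes]]:
    """Encode one strong association rule as (row key, {column: value}).

    Hypothetical layout: the row key is "A1,A2=>C1", and the cells live in
    a single column family named "rule".
    """
    left = ",".join(sorted(antecedent))
    right = ",".join(sorted(consequent))
    row_key = f"{left}=>{right}".encode("utf-8")
    columns = {
        b"rule:antecedent": left.encode("utf-8"),
        b"rule:consequent": right.encode("utf-8"),
        b"rule:support": f"{support:.4f}".encode("utf-8"),
        b"rule:confidence": f"{confidence:.4f}".encode("utf-8"),
    }
    return row_key, columns


key, cells = rule_to_row({"B2", "B1"}, {"B3"}, support=0.5, confidence=0.75)
print(key, cells)
```

Starting the row key with the sorted antecedent means that all rules triggered by the same set of books sort next to each other, which suits a prefix scan at recommendation time. A real deployment would pass the key and the column map to whichever HBase client it uses.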

Finally, the strong association rules stored in the data storage platform are presented to the user as a Web page, so that the user sees the recommendation results more intuitively.

The present invention performs association rule mining with an improved Apriori algorithm that is based on a distributed computing framework (MapReduce) and runs multiple reduce programs in the reduce stage. Applying this parallelized, improved Apriori algorithm to book recommendation makes the method better suited to mining massive data and effectively improves recommendation accuracy, meeting the demands created by the sharp growth in modern library data volumes. The final mining results are stored in the data storage platform for persistent storage.
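The map, shuffle, and multi-reducer flow can be illustrated with a small single-process simulation in Python. This is a sketch, not the patented implementation: it does not use Hadoop, each map call emits a count of 1 per occurrence of a k-item subset (an assumption about how the per-block support is produced), and key groups are assigned to reducers round-robin instead of by a framework partitioner. All function names and the toy data blocks are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple

Itemset = Tuple[str, ...]
Group = Tuple[Itemset, List[int]]


def map_block(block: List[FrozenSet[str]], k: int) -> List[Tuple[Itemset, int]]:
    """Map: emit <k-item subset, 1> for every transaction in one data block."""
    pairs = []
    for transaction in block:
        for subset in combinations(sorted(transaction), k):
            pairs.append((subset, 1))
    return pairs


def shuffle(mapped: Iterable[Tuple[Itemset, int]]) -> List[Group]:
    """Shuffle: merge pairs that share a key, then sort the groups by key."""
    grouped: Dict[Itemset, List[int]] = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)
    return sorted(grouped.items())


def partition(groups: List[Group], num_reducers: int) -> List[List[Group]]:
    """Assign each key group to one of several reducers (round-robin here)."""
    parts: List[List[Group]] = [[] for _ in range(num_reducers)]
    for index, group in enumerate(groups):
        parts[index % num_reducers].append(group)
    return parts


def reduce_part(part: List[Group]) -> Dict[Itemset, int]:
    """Reduce: sum the value list of every key handled by this reducer."""
    return {key: sum(values) for key, values in part}


def count_support(blocks: List[List[FrozenSet[str]]], k: int,
                  num_reducers: int = 2) -> Dict[Itemset, int]:
    """Run map on every block, shuffle, then merge the reducers' outputs."""
    mapped = [pair for block in blocks for pair in map_block(block, k)]
    counts: Dict[Itemset, int] = {}
    for part in partition(shuffle(mapped), num_reducers):
        counts.update(reduce_part(part))
    return counts


# Two hypothetical data blocks of borrowing transactions.
blocks = [
    [frozenset({"B1", "B2"}), frozenset({"B1", "B2", "B3"})],
    [frozenset({"B1", "B3"}), frozenset({"B2"})],
]
print(count_support(blocks, k=2))
```

Because every key lands in exactly one reducer, the reducers' outputs never overlap and can be merged with a plain dictionary update. Dividing each count by the total number of transactions gives the relative support.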

As shown in Fig. 2, the present invention also provides a book recommendation system based on the Apriori algorithm, comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module.

The function of the book borrowing history data module is to store users' book borrowing history data.

The function of the data preprocessing module is as follows. First, taking users' book borrowing history data as input, it cleans the raw data by removing duplicate borrowing records and records whose format is obviously wrong, so that users can be given more effective, higher-quality book recommendations. Then, it performs data format conversion on the cleaned data, converting it into a format suitable for association rule mining: each borrowing record becomes one row of data, and every row is assigned a number (TID).
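The cleaning and format conversion described above can be sketched in Python as follows. The record shape is an assumption: each raw record is taken to be a (reader ID, book ID) pair, and the books borrowed by one reader are grouped into a single row before a TID is assigned. The patent does not state the raw schema, so the field names and the grouping rule are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def clean(records: Sequence[Sequence[object]]) -> List[Tuple[str, str]]:
    """Drop malformed and duplicate records, keeping first-seen order."""
    seen = set()
    cleaned: List[Tuple[str, str]] = []
    for record in records:
        if len(record) != 2:
            continue  # wrong number of fields
        reader_id, book_id = record
        if not isinstance(reader_id, str) or not isinstance(book_id, str):
            continue  # missing or non-text field
        pair = (reader_id.strip(), book_id.strip())
        if not pair[0] or not pair[1]:
            continue  # empty field after trimming
        if pair in seen:
            continue  # duplicate borrowing record
        seen.add(pair)
        cleaned.append(pair)
    return cleaned


def to_transactions(cleaned: List[Tuple[str, str]]) -> List[Tuple[int, List[str]]]:
    """Group books by reader and number each resulting row with a TID from 1."""
    by_reader: Dict[str, List[str]] = {}
    for reader_id, book_id in cleaned:
        by_reader.setdefault(reader_id, []).append(book_id)
    return [(tid, books) for tid, books in enumerate(by_reader.values(), start=1)]


# Hypothetical raw export containing a duplicate and three malformed rows.
raw = [
    ("u1", "B1"), ("u1", "B2"), ("u1", "B1"),
    ("u2", " B3 "), ("u2", ""), (None, "B4"), ("u2", "B1"), ("u3",),
]
rows = to_transactions(clean(raw))
print(rows)
```

The output is a list of (TID, books) rows, which is the transaction format that the mining step consumes.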

The association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module, in which:

Pruning module: prunes the range of the data set over which the algorithm iterates. It relies on an important property of the pruning strategy, anti-monotonicity: once an itemset is marked as infrequent, no itemset containing it needs to be iterated over. This effectively narrows the iteration range and reduces the number of candidate itemsets.
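A minimal Python sketch of pruning by anti-monotonicity follows. It shows two things: candidate k-itemsets are kept only if every (k-1)-item subset is already known to be frequent, and transactions that can no longer contain any candidate are dropped from later scans. The second part is one plausible reading of "narrowing the data set range", not a detail confirmed by the patent, and the function names and toy data are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List, Set


def generate_candidates(frequent_prev: Set[FrozenSet[str]], k: int) -> Set[FrozenSet[str]]:
    """Join frequent (k-1)-itemsets into k-itemsets, then prune.

    A candidate survives only if all of its (k-1)-item subsets are frequent,
    because any superset of an infrequent itemset is itself infrequent.
    """
    candidates: Set[FrozenSet[str]] = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) != k:
                continue
            subsets = (frozenset(s) for s in combinations(union, k - 1))
            if all(s in frequent_prev for s in subsets):
                candidates.add(union)
    return candidates


def prune_transactions(transactions: List[FrozenSet[str]],
                       candidates: Set[FrozenSet[str]], k: int) -> List[FrozenSet[str]]:
    """Keep only transactions that still contain at least one k-item candidate."""
    return [t for t in transactions
            if len(t) >= k and any(c <= t for c in candidates)]


# Hypothetical frequent 2-itemsets from an earlier pass.
frequent_2 = {
    frozenset({"B1", "B2"}), frozenset({"B1", "B3"}),
    frozenset({"B2", "B3"}), frozenset({"B1", "B4"}),
}
candidates_3 = generate_candidates(frequent_2, 3)

transactions = [
    frozenset({"B1", "B2", "B3", "B4"}), frozenset({"B1", "B2"}),
    frozenset({"B1", "B3", "B4"}), frozenset({"B1", "B2", "B3"}),
]
remaining = prune_transactions(transactions, candidates_3, 3)
print(candidates_3, len(remaining))
```

Here {B1, B2, B4} is rejected without any scan because its subset {B2, B4} is not frequent, so only {B1, B2, B3} remains as a 3-item candidate.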

The distributed computing module is used to perform association rule mining, on a distributed platform, on the data produced by the data preprocessing module, and to obtain strong association rules.

The distributed storage module is used to store the strong association rules obtained by the distributed computing module, achieving efficient persistent storage of the data.

The function of the book recommendation module is to interact with the association rule mining module, and specifically with the distributed storage module: it visualizes the strong association rules stored in the distributed storage module and presents them in a Web page, so that users see the book recommendation results more intuitively.
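One way the conversion from stored rules to a Web presentation could work is sketched below in Python. The patent does not describe the ranking or the page layout, so everything here is an assumption: rules are (antecedent, consequent, confidence) tuples, a rule fires when the reader has borrowed every book in its antecedent, results are ranked by confidence, and the output is a plain HTML list. The names `recommend` and `render_html` are hypothetical.

```python
from html import escape
from typing import Dict, List, Set, Tuple

Rule = Tuple[Tuple[str, ...], Tuple[str, ...], float]  # antecedent, consequent, confidence


def recommend(borrowed: Set[str], rules: List[Rule],
              titles: Dict[str, str], limit: int = 5) -> List[str]:
    """Return titles of books implied by the rules that the reader has not borrowed."""
    scored: Dict[str, float] = {}
    for antecedent, consequent, conf in rules:
        if set(antecedent) <= borrowed:
            for book in consequent:
                if book not in borrowed:
                    scored[book] = max(scored.get(book, 0.0), conf)
    # Highest confidence first; ties broken by book ID for stable output.
    ranked = sorted(scored, key=lambda book: (-scored[book], book))
    return [titles.get(book, book) for book in ranked[:limit]]


def render_html(recommended_titles: List[str]) -> str:
    """Render the recommendations as an HTML list, escaping the titles."""
    items = "".join(f"<li>{escape(title)}</li>" for title in recommended_titles)
    return f"<ul>{items}</ul>"


# Hypothetical rules and catalogue.
rules = [
    (("B1",), ("B2",), 0.8),
    (("B1", "B3"), ("B4",), 0.9),
    (("B2",), ("B1",), 0.7),
]
titles = {"B2": "Data Mining <2nd ed.>", "B4": "Distributed Systems"}
page = render_html(recommend({"B1"}, rules, titles))
print(page)
```

Escaping the titles keeps characters such as `<` in catalogue data from being interpreted as markup when the list is embedded in a page.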

In conclusion, the Apriori algorithm has distinct advantages in the field of data mining and is especially widely used in recommender systems. The present invention applies big data processing and storage platforms to book recommendation. It improves the original Apriori algorithm mainly on the basis of a pruning strategy and a distributed computing framework, and applies the improved algorithm in the book recommendation system of the invention to mine association rules from users' borrowing records. The system can recommend books to users more promptly and accurately, effectively improving the accuracy of book recommendation.

The above embodiments are intended only to illustrate the present invention and do not limit the technical solutions it describes. This specification should be understood from the standpoint of a person of ordinary skill in the art. Although the specification describes the invention in detail with reference to the above embodiments, a person of ordinary skill in the art should understand that the invention may still be modified or equivalently replaced, and that all technical solutions and improvements that do not depart from the spirit and scope of the invention fall within the scope of its claims.

Claims

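1. A book recommendation method based on the Apriori algorithm, characterized by comprising the following steps:

Step 1: obtain the book borrowing history data from a database and store it in a distributed file system;

Step 2: perform association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;

Step 3: store the strong association rules obtained in step 2 in a data storage platform;

Step 4: convert the strong association rules stored in the data storage platform into a user interface form.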

2. The book recommendation method based on the Apriori algorithm according to claim 1, characterized in that: the Apriori algorithm is an Apriori algorithm that is based on a distributed computing framework and runs multiple reduce programs in the reduce stage.

3. The book recommendation method based on the Apriori algorithm according to claim 1, characterized in that: the data storage platform is an HBase database.

4. A book recommendation system based on the Apriori algorithm, characterized by comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module;

The data preprocessing module is used to perform data cleaning and data format conversion on users' book borrowing history data;

The association rule mining module is used to perform association rule mining on users' book borrowing history data based on the Apriori algorithm and to obtain strong association rules;

The book recommendation module is used to interact with the association rule mining module and to convert the strong association rules into a user interface form.

5. The book recommendation system based on the Apriori algorithm according to claim 4, characterized in that: the data preprocessing module interacts with the database, performs data cleaning on users' book borrowing history data, and converts that data into a format suitable for association rule mining.

6. The book recommendation system based on the Apriori algorithm according to claim 4, characterized in that: the association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module.

7. The book recommendation system based on the Apriori algorithm according to claim 6, characterized in that: the pruning module is used to prune the range of the data set over which the algorithm iterates.

8. The book recommendation system based on the Apriori algorithm according to claim 6, characterized in that: the distributed computing module is used to perform association rule mining, on a distributed platform, on the data produced by the data preprocessing module, and to obtain strong association rules.

9. The book recommendation system based on the Apriori algorithm according to claim 8, characterized in that: the distributed storage module is used to store the strong association rules obtained by the distributed computing module.

10. The book recommendation system based on the Apriori algorithm according to claim 9, characterized in that: the book recommendation module is used to interact with the distributed storage module and to convert the strong association rules stored in the distributed storage module into a user interface form.