CN109033202A - Book recommendation method and system based on the Apriori algorithm - Google Patents
Book recommendation method and system based on the Apriori algorithm
- Publication number
- CN109033202A (application CN201810696747.3A)
- Authority
- CN
- China
- Prior art keywords
- module
- data
- book
- apriori algorithm
- association rule
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
Abstract
The invention discloses a book recommendation method based on the Apriori algorithm, comprising the following steps: obtaining book borrowing history data from a database and storing it in a distributed file system; performing association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules; storing the obtained strong association rules in a data storage platform; and converting the strong association rules stored in the data storage platform into a user interface form. Based on users' book borrowing history data, the invention analyzes the associations between the books that users borrow and then recommends books with similar features to users, effectively improving the accuracy of book recommendation.
Description
Technical field
The present invention relates to the fields of distributed big data processing and data mining, and in particular to a book recommendation method and system based on the Apriori algorithm.
Background
With the continuing growth of informatization in society, university libraries are becoming digital, and the data they generate is growing explosively. Although existing library management technology alleviates this problem to some extent, serious deficiencies remain in processing the massive volume of user borrowing data effectively and in providing users with good search and recommendation services. Efficient and accurate book recommendation systems have emerged to recommend relevant books to readers in a timely and accurate manner and to simplify the book retrieval process for readers.
At present, most university library book recommendation systems use collaborative filtering, which recommends books by finding nearest neighbors; this approach suffers from problems such as poor scalability.
In view of this, it is necessary to improve existing book recommendation methods and book recommendation systems to solve the above problems.
Summary of the invention
An object of the present invention is to provide a book recommendation method based on the Apriori algorithm.
To achieve the above object, the present invention adopts the following technical solution: a book recommendation method based on the Apriori algorithm, comprising the following steps:
Step 1: obtaining book borrowing history data from a database and storing it in a distributed file system;
Step 2: performing association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;
Step 3: storing the strong association rules obtained in Step 2 in a data storage platform;
Step 4: converting the strong association rules stored in the data storage platform into a user interface form.
As a further improvement of the technical solution of the present invention, the Apriori algorithm is an Apriori algorithm that is based on a distributed computing framework and that sets up multiple reduce programs in the reduce stage to perform the computation.
As a further improvement of the technical solution of the present invention, the data storage platform is an HBase database.
Another object of the present invention is to provide a book recommendation system based on the Apriori algorithm.
To achieve the above object, the present invention adopts the following technical solution: a book recommendation system based on the Apriori algorithm, comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module;
The book borrowing history data module is used to store users' book borrowing history data;
The data preprocessing module is used to perform data cleaning and data format conversion on users' book borrowing history data;
The association rule mining module performs association rule mining on users' book borrowing history data based on the Apriori algorithm and obtains strong association rules;
The book recommendation module is used to interact with the association rule mining module and to convert the strong association rules into a user interface form.
As a further improvement of the technical solution of the present invention, the data preprocessing module interacts with the database, performs data cleaning on users' book borrowing history data, and converts that data into a format suitable for association rule mining.
As a further improvement of the technical solution of the present invention, the association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module.
As a further improvement of the technical solution of the present invention, the pruning module is used to prune the range of the data set over which the algorithm iterates.
As a further improvement of the technical solution of the present invention, the distributed computing module is used to perform association rule mining, on a distributed platform, on the data obtained by the data preprocessing module.
As a further improvement of the technical solution of the present invention, the distributed storage module is used to store the strong association rules obtained by the distributed computing module.
As a further improvement of the technical solution of the present invention, the book recommendation module is used to interact with the distributed storage module and to convert the strong association rules stored in the distributed storage module into a user interface form.
The beneficial effects of the present invention are as follows: based on users' book borrowing history data, the invention analyzes the associations between the books that users borrow and then recommends books with similar features to users, effectively improving the accuracy of book recommendation.
Brief description of the drawings
Fig. 1 is a flow diagram of the book recommendation method based on the Apriori algorithm according to the present invention.
Fig. 2 is a structural diagram of the book recommendation system based on the Apriori algorithm according to the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the present invention provides a book recommendation method based on the Apriori algorithm (an association rule algorithm), which mainly comprises the following steps:
Step 1: obtaining book borrowing history data from a database and storing it in a distributed file system;
Step 2: performing association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;
Step 3: storing the strong association rules obtained in Step 2 in a data storage platform;
Step 4: converting the strong association rules stored in the data storage platform into a user interface form.
Specifically, first, when a user logs in to the book recommendation system, the system reads in real time the user's book borrowing history data stored in the database and stores it in the distributed file system (HDFS), where it serves as the data source for the distributed computing framework (MapReduce). The database is generally a traditional relational database.
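The export from the relational database can be sketched as follows. This is a minimal illustration, not the patented implementation: the table name `borrow_history`, its columns, and the tab-separated output format are all assumptions, SQLite stands in for the library's relational database, and a local file stands in for HDFS (in a real deployment the file could then be copied into HDFS, for example with `hdfs dfs -put`).

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_borrow_history(conn, out_path):
    """Dump (user_id, book_id) rows to a tab-separated file and return the row count.

    The table and column names are hypothetical; the patent does not specify a schema.
    """
    rows = conn.execute(
        "SELECT user_id, book_id FROM borrow_history ORDER BY user_id, book_id"
    )
    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t")
        for user_id, book_id in rows:
            writer.writerow([user_id, book_id])
            count += 1
    return count


def demo_export():
    """Build a tiny in-memory database and export it to a temporary file."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE borrow_history (user_id TEXT, book_id TEXT)")
    conn.executemany(
        "INSERT INTO borrow_history VALUES (?, ?)",
        [("u1", "b1"), ("u1", "b2"), ("u2", "b1")],
    )
    out_path = Path(tempfile.mkdtemp()) / "borrow_history.tsv"
    exported = export_borrow_history(conn, out_path)
    conn.close()
    return exported, out_path.read_text(encoding="utf-8").splitlines()
```

Writing one borrowing row per line keeps the file easy to split into blocks, which is what a MapReduce job needs as input.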
Next, the improved Apriori algorithm program reads the massive book borrowing history data from the distributed file system, obtains the other resources the task requires, and performs association rule mining. The original Apriori algorithm is first improved on the basis of a pruning strategy and a distributed computing framework. Using the pruning strategy, the improved Apriori algorithm continually narrows the range of the data set that has to be scanned. The data set is then split into separate data blocks under the distributed computing framework. In the map stage, each map function handles a different data block and converts the input data into <key, value> key-value pairs, where the key is a subset of the data set and the value is the support of that subset. When the map stage ends, the shuffle program is executed. The shuffle program merges the <key, value> pairs that share the same key into the form <key, <value1, value2, …, valuem>> and sorts them by key, at which point the shuffle program ends. Next, multiple reduce programs are executed in place of the single reduce program of a traditional distributed computing framework, with the output of the shuffle program serving as the input of the reduce programs. Each reduce program computes the sum of the values in the value set of each <key, <value1, value2, …, valuem>> pair, giving a result of the form <key, value>, where $value = \sum_{i=1}^{m} value_i$. Association rule mining is thus performed on users' book borrowing history data with the improved Apriori algorithm, yielding the strong association rules between items and providing users with timely and accurate book recommendations.
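The map, shuffle, and multi-reducer stages described above can be sketched as a single-process simulation. This is an illustration under stated assumptions, not Hadoop code and not the patent's actual program: each map call emits a count of 1 per occurrence of a k-item subset, and the round-robin assignment of keys to reducers is a hypothetical partitioning choice.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(block, k):
    """Emit (itemset, 1) for every k-item subset of each transaction in one data block."""
    for transaction in block:
        for itemset in combinations(sorted(set(transaction)), k):
            yield itemset, 1


def shuffle_phase(pairs):
    """Merge pairs that share a key into (key, [value1, ..., valuem]) and sort by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def partition(grouped, num_reducers):
    """Assign grouped keys to reducers round-robin (an assumed partitioning scheme)."""
    parts = [[] for _ in range(num_reducers)]
    for index, entry in enumerate(grouped):
        parts[index % num_reducers].append(entry)
    return parts


def reduce_phase(part):
    """Sum the value list of every key handled by one reducer."""
    return {key: sum(values) for key, values in part}


def count_support(blocks, k, num_reducers=2):
    """Run map over every block, shuffle, then merge the outputs of all reducers."""
    pairs = [pair for block in blocks for pair in map_phase(block, k)]
    counts = {}
    for part in partition(shuffle_phase(pairs), num_reducers):
        counts.update(reduce_phase(part))
    return counts
```

Because every key is routed to exactly one reducer, the reducers never have to coordinate, and merging their outputs is a plain dictionary update.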
Then, the strong association rules are stored in the data storage platform, which may specifically be an HBase database.
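One possible way to lay out a rule as an HBase row is sketched below. The patent does not describe a schema, so the row-key format and the `rule` column family are purely hypothetical. The function only builds the row key and the column-to-bytes mapping that an HBase client would expect (for instance, the shape accepted by happybase's `Table.put(row, data)`); it does not talk to a cluster.

```python
def rule_to_row(antecedent, consequent, support, confidence):
    """Encode one association rule as (row_key, columns) for an HBase-style put.

    The key format "a|b=>c" and the "rule" column family are illustrative assumptions.
    """
    left = "|".join(sorted(antecedent))
    right = "|".join(sorted(consequent))
    row_key = f"{left}=>{right}".encode("utf-8")
    columns = {
        b"rule:antecedent": left.encode("utf-8"),
        b"rule:consequent": right.encode("utf-8"),
        b"rule:support": str(support).encode("utf-8"),
        b"rule:confidence": str(confidence).encode("utf-8"),
    }
    return row_key, columns
```

Starting the row key with the sorted antecedent means that all rules triggered by the same set of borrowed books sit next to each other, so they could be fetched with a single prefix scan.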
Finally, the strong association rules stored in the data storage platform are presented to the user as a Web page, so that the user obtains more intuitive recommendation results.
The present invention performs association rule mining with an improved Apriori algorithm that is based on a distributed computing framework (MapReduce) and sets up multiple reduce programs in the reduce stage to carry out the computation. Applying the parallelized, improved Apriori algorithm to book recommendation makes it better suited to mining massive data and effectively improves recommendation accuracy, meeting the demands of the sharply increasing data volume of modern libraries. The final mining results are stored in the data storage platform for persistent storage.
As shown in Fig. 2, the present invention also provides a book recommendation system based on the Apriori algorithm, comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module.
The function of the book borrowing history data module is to store users' book borrowing history data.
The function of the data preprocessing module is as follows. First, taking users' book borrowing history data as input, the raw data is cleaned to remove duplicate borrowing records and records with obviously wrong formats, so that more effective, higher-quality book recommendations can be provided to users. Then, the cleaned data is converted into a data format suitable for association rule mining: each borrowing record becomes one row of data, and each row is numbered with a transaction ID (TID).
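The cleaning and format conversion just described can be sketched as follows. The raw record format is an assumption made for illustration (a comma-separated line holding a reader ID followed by the books borrowed together); the patent does not specify it.

```python
def preprocess(raw_records):
    """Clean raw borrowing records and number the surviving rows with a TID.

    Each raw record is assumed to look like "reader_id,book_id,book_id,...".
    Returns a list of (tid, sorted_book_ids) tuples.
    """
    seen = set()
    transactions = []
    for raw in raw_records:
        fields = [field.strip() for field in raw.split(",")]
        # Malformed: no books listed, or an empty field anywhere in the record.
        if len(fields) < 2 or any(not field for field in fields):
            continue
        reader, books = fields[0], tuple(sorted(set(fields[1:])))
        # Duplicate: the same reader with the same set of books was already kept.
        if (reader, books) in seen:
            continue
        seen.add((reader, books))
        transactions.append((len(transactions) + 1, list(books)))
    return transactions
```

Duplicates are detected per reader, so two different readers borrowing the same books still count as two transactions and both contribute to the support count.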
The association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module, where:
Pruning module: prunes the range of the data set over which the algorithm iterates. It relies on an important property of the pruning strategy, anti-monotonicity: once an item is marked as infrequent, itemsets containing that item no longer need to be iterated over, which effectively narrows the iteration range and reduces the number of candidates.
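The anti-monotone pruning described above corresponds to the candidate-generation step of the classic Apriori algorithm, sketched below. This is the textbook form of the technique rather than the patent's specific pruning module: a k-item candidate is kept only if every one of its (k-1)-item subsets is already known to be frequent.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Build k-item candidates from the frequent (k-1)-itemsets, pruning by anti-monotonicity.

    frequent_prev is a set of frozensets, each of size k - 1.
    """
    items = sorted({item for itemset in frequent_prev for item in itemset})
    candidates = set()
    for combo in combinations(items, k):
        # Keep the candidate only if no (k-1)-subset is infrequent.
        if all(frozenset(sub) in frequent_prev for sub in combinations(combo, k - 1)):
            candidates.add(frozenset(combo))
    return candidates
```

For example, if {A, D} is not frequent, then {A, B, D} is discarded without ever being counted against the data set.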
Distributed computing module: performs association rule mining, on a distributed platform, on the data obtained by the data preprocessing module, and obtains strong association rules.
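The patent does not define "strong" here. In the usual association rule terminology, a rule X → Y is strong when its support and its confidence both reach user-chosen minimum thresholds, and the sketch below assumes that standard definition. The threshold values and the dictionary of support counts are illustrative inputs, not values from the patent.

```python
from itertools import combinations


def strong_rules(support_counts, num_transactions, min_support, min_confidence):
    """Derive rules X -> Y whose support and confidence both meet the thresholds.

    support_counts maps frozenset itemsets to the number of transactions containing them.
    Returns a list of (antecedent, consequent, support, confidence) tuples.
    """
    rules = []
    for itemset, count in support_counts.items():
        support = count / num_transactions
        if len(itemset) < 2 or support < min_support:
            continue
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                antecedent_count = support_counts.get(antecedent)
                if not antecedent_count:
                    continue  # antecedent was never counted, so confidence is unknown
                confidence = count / antecedent_count
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules
```

Confidence is the share of transactions containing the antecedent that also contain the full itemset, so a rule such as {B} → {A} can be strong even when {A} → {B} is not.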
Distributed storage module: stores the strong association rules obtained by the distributed computing module, thereby providing persistent storage of the data.
The function of the book recommendation module is to interact with the association rule mining module, specifically with the distributed storage module: it visualizes the strong association rules stored in the distributed storage module and presents them in a Web page, so that users can see the book recommendation results more intuitively.
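Before rules can be shown on a page, they have to be turned into a list of books for a particular user. One simple way to do this is sketched below; the ranking by highest rule confidence and the `top_n` cutoff are assumptions, since the patent only states that the rules are visualized.

```python
def recommend(borrowed, rules, top_n=3):
    """Rank books not yet borrowed by the best confidence of any rule that fires.

    rules is an iterable of (antecedent, consequent, confidence) tuples, where
    antecedent and consequent are collections of book IDs.
    """
    borrowed = set(borrowed)
    scores = {}
    for antecedent, consequent, confidence in rules:
        # A rule fires when the user has borrowed every book in its antecedent.
        if set(antecedent) <= borrowed:
            for book in consequent:
                if book not in borrowed:
                    scores[book] = max(scores.get(book, 0.0), confidence)
    ranked = sorted(scores.items(), key=lambda entry: (-entry[1], entry[0]))
    return [book for book, _ in ranked[:top_n]]
```

The returned list of book IDs is what a Web layer would render; ties in confidence are broken by book ID so the output stays stable between page loads.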
In conclusion Apriori algorithm has its unique advantage in the field of data mining, using more especially in recommender system
To be extensive, big data processing and storage platform are applied to book recommendation by the present invention, are based primarily upon Pruning strategy and distributed meter
It calculates frame to improve original Apriori algorithm, and improved Apriori algorithm is applied to book recommendation of the invention
System carries out user's book borrowing and reading association rule mining, can be accurately much sooner user's Recommended Books, effectively increase
The accuracy rate of book recommendation.
The above embodiments are intended only to illustrate the present invention and not to limit the technical solutions it describes. This specification should be understood from the standpoint of those of ordinary skill in the art. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the present invention may still be modified or equivalently replaced, and all technical solutions and improvements that do not depart from the spirit and scope of the present invention shall fall within the scope of the claims of the present invention.
Claims (10)
1. A book recommendation method based on the Apriori algorithm, characterized by comprising the following steps:
Step 1: obtaining book borrowing history data from a database and storing it in a distributed file system;
Step 2: performing association rule mining on the book borrowing history data in the distributed file system based on the Apriori algorithm to obtain strong association rules;
Step 3: storing the strong association rules obtained in Step 2 in a data storage platform;
Step 4: converting the strong association rules stored in the data storage platform into a user interface form.
2. The book recommendation method based on the Apriori algorithm according to claim 1, characterized in that: the Apriori algorithm is an Apriori algorithm that is based on a distributed computing framework and that sets up multiple reduce programs in the reduce stage to perform the computation.
3. The book recommendation method based on the Apriori algorithm according to claim 1, characterized in that: the data storage platform is an HBase database.
4. A book recommendation system based on the Apriori algorithm, characterized by comprising a book borrowing history data module, a data preprocessing module, an association rule mining module, and a book recommendation module;
The book borrowing history data module is used to store users' book borrowing history data;
The data preprocessing module is used to perform data cleaning and data format conversion on users' book borrowing history data;
The association rule mining module is used to perform association rule mining on users' book borrowing history data based on the Apriori algorithm and to obtain strong association rules;
The book recommendation module is used to interact with the association rule mining module and to convert the strong association rules into a user interface form.
5. The book recommendation system based on the Apriori algorithm according to claim 4, characterized in that: the data preprocessing module interacts with the database, performs data cleaning on users' book borrowing history data, and converts that data into a data format suitable for association rule mining.
6. The book recommendation system based on the Apriori algorithm according to claim 4, characterized in that: the association rule mining module consists of a pruning module, a distributed computing module, and a distributed storage module.
7. The book recommendation system based on the Apriori algorithm according to claim 6, characterized in that: the pruning module is used to prune the range of the data set over which the algorithm iterates.
8. The book recommendation system based on the Apriori algorithm according to claim 6, characterized in that: the distributed computing module is used to perform association rule mining, on a distributed platform, on the data obtained by the data preprocessing module, and to obtain strong association rules.
9. The book recommendation system based on the Apriori algorithm according to claim 8, characterized in that: the distributed storage module is used to store the strong association rules obtained by the distributed computing module.
10. The book recommendation system based on the Apriori algorithm according to claim 9, characterized in that: the book recommendation module is used to interact with the distributed storage module and to convert the strong association rules stored in the distributed storage module into a user interface form.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810696747.3A CN109033202A (en) | 2018-06-29 | 2018-06-29 | A kind of book recommendation method and system based on Apriori algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109033202A true CN109033202A (en) | 2018-12-18 |
Family
ID=65520882
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810696747.3A Pending CN109033202A (en) | 2018-06-29 | 2018-06-29 | A kind of book recommendation method and system based on Apriori algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109033202A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111339427A (en) * | 2020-03-23 | 2020-06-26 | 卓尔智联(武汉)研究院有限公司 | Book information recommendation method, device and system and storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102945240A (en) * | 2012-09-11 | 2013-02-27 | 杭州斯凯网络科技有限公司 | Method and device for realizing association rule mining algorithm supporting distributed computation |
Non-Patent Citations (1)
Title |
---|
Hao Tianshu (郝天曙): "Research on Hadoop-based parallel data mining", China Master's Theses Full-text Database, Information Science and Technology * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111339427A (en) * | 2020-03-23 | 2020-06-26 | 卓尔智联(武汉)研究院有限公司 | Book information recommendation method, device and system and storage medium |
CN111339427B (en) * | 2020-03-23 | 2022-12-20 | 卓尔智联(武汉)研究院有限公司 | Book information recommendation method, device and system and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181218 |