CN110619084B - Method for recommending books according to borrowing behaviors of library readers - Google Patents
Method for recommending books according to borrowing behaviors of library readers
- Publication number: CN110619084B
- Application number: CN201910807495.1A
- Authority: CN (China)
- Prior art keywords: readers, borrowing, data, books, groups
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for recommending books according to the borrowing behavior of library readers, comprising the following steps: step 1, acquiring a reader borrowing set as the mining sample through a preliminary expert-led interview procedure, combined with library management permissions and interviews; step 2, preprocessing the data of the mining sample; step 3, clustering by book category to find the reading commonalities among readers, using the k-Means algorithm with book borrowing status as the parameter to divide the samples into four groups, each group representing a reading pattern; and step 4, building and training a model with the GRNN algorithm, with readers' book borrowing information as input and the four cluster groups as output. The method can be used to recommend books to a reader's personal digital library, to guide readers toward strongly correlated books, and to uncover hidden relationships between disciplines so as to optimize readers' disciplinary knowledge structure.
Description
Technical Field
The invention belongs to the technical field of library information management, and in particular relates to a method for recommending books according to the borrowing behavior of library readers.
Background
Libraries help cultivate character, improve literacy, foster a good study atmosphere, and broaden knowledge, and they are widely established in universities. With the rapid development of society and the continuous improvement of living standards, the current level of informatization of book borrowing can no longer meet readers' needs. In recent years, the rapid development of information technology has ushered in the era of big data, and data mining technology has been applied in many fields. Readers' borrowing records are rich in information; using data mining to extract the valuable data they contain lays a foundation for building an intelligent library.
Data mining is the process of extracting and exploring knowledge from large amounts of seemingly irregular data, searching for the knowledge hidden deep within it. Common data mining methods include classification, clustering, regression analysis, association rules, deviation analysis, and web mining; they offer advantages such as a high data utilization rate and good book classification results. However, data mining techniques also have a number of disadvantages, such as being unable to keep learning over a long period.
Neural network algorithms have strong self-learning capability, high running speed, and good adaptability, and they can handle problems with complex environmental information, unclear background knowledge, and ambiguous inference rules. Therefore, on the basis of data mining, books can be recommended to library readers by combining data mining with the GRNN (generalized regression neural network) algorithm.
Disclosure of Invention
The invention aims to provide a method for recommending books according to the borrowing behavior of library readers, which can make effective recommendations to readers and enhance their reading experience.
The technical scheme adopted by the invention is as follows: a method for recommending books according to the borrowing behavior of library readers, comprising the following steps:
step 1, acquiring a reader borrowing set as the mining sample through a preliminary expert-led interview procedure, combined with library management permissions and interviews;
step 2, preprocessing the data of the mining sample;
step 3, clustering by book category to find the reading commonalities among readers, using the k-Means algorithm with book borrowing status as the parameter to divide the samples into four groups, each group representing a reading pattern;
step 4, building and training a model with the GRNN algorithm, with readers' book borrowing information as input and the four cluster groups as output, and recommending the books contained in each of the four output cluster groups.
The invention is further characterized as follows.
In step 1, the borrowing data of library readers is taken as the object, MS SQL Server 2008 is used as the underlying data framework, and the data is preprocessed with the help of Excel, yielding 1000 transaction records of reader borrowing as the mining sample.
Step 2 specifically comprises the following steps:
step 2.1: first, cleaning the data of the mining sample: finding and correcting identifiable errors in the data file, checking data consistency, and handling invalid and missing values;
step 2.2: then converting the cleaned data: the books borrowed by readers are divided into 22 categories according to the Chinese Library Classification, and related book categories are predicted from readers' borrowing of books in certain categories;
step 2.3: finally, integrating the converted data: readers' borrowing histories are integrated into transaction data that meets the data mining requirements, each transaction representing one borrowing action of a reader and comprising a unique identifier and the set of the various books borrowed by that reader.
Step 3 specifically comprises the following steps:
step 3.1: clustering by book category, randomly selecting four data points as the initial centroid of each class, representing the four groups into which the data is finally divided;
step 3.2: assigning each remaining object to the class it is most similar to, according to the nearest-centroid principle;
step 3.3: after all objects have been assigned, computing for each class the mean of all its objects as the new centroid of that group;
step 3.4: reassigning all objects according to the nearest-centroid principle;
step 3.5: returning to step 3.3 until none of the generated classes changes any more.
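The assignment and update rules in steps 3.2 to 3.4 can be written compactly as follows. This is a sketch under the assumption that similarity is measured by Euclidean distance, which the text does not state. The symbols are introduced here for illustration only: $x_i$ is one object (a borrowing vector), $\mu_j$ is the centroid of class $j$, $c_i$ is the class assigned to $x_i$, and $C_j$ is the set of objects currently in class $j$.

```latex
\[
c_i = \arg\min_{j \in \{1,\dots,4\}} \lVert x_i - \mu_j \rVert^2,
\qquad
C_j = \{\, x_i : c_i = j \,\},
\qquad
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i .
\]
```

Under this reading, the stopping condition of step 3.5 is reached when no $c_i$ changes between two successive passes.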
The GRNN model training in step 4 specifically comprises the following steps:
step 4.1: dividing the 1000 samples into 20 subsets of equal size, selecting 19 subsets for training to build the GRNN model, using the remaining subset (50 samples) as the test set for verification, and calculating the mean square error between the test results and the true values;
step 4.2: setting the value range of the smoothing factor σ to [0.01, 1] and increasing it gradually in steps of 0.01;
step 4.3: reassigning the training and test samples and repeating steps 4.1 and 4.2, 20 times in total, to obtain 20 mean square error values for each smoothing factor, and calculating the average of these 20 values.
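The averaging described in step 4.3 can be expressed as below. The symbols are assumptions introduced for illustration: $T_k$ is the k-th held-out subset of 50 samples, $\hat{y}^{(k)}_{\sigma}(x_i)$ is the output of the GRNN trained on the other 19 subsets with smoothing factor $\sigma$, and $y_i$ is the true value.

```latex
\[
E_k(\sigma) = \frac{1}{50} \sum_{i \in T_k} \bigl( \hat{y}^{(k)}_{\sigma}(x_i) - y_i \bigr)^2,
\qquad
\bar{E}(\sigma) = \frac{1}{20} \sum_{k=1}^{20} E_k(\sigma),
\qquad
\sigma \in \{0.01, 0.02, \dots, 1.00\}.
\]
```

Read this way, the procedure resembles 20-fold cross-validation over a grid of 100 candidate values of $\sigma$.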
The beneficial effects of the invention are as follows: in the method for recommending books according to the borrowing behavior of library readers, a cluster analysis model divides readers into several groups with similar attributes, and readers' reading patterns are mined by analysing the cluster model; GRNN analysis is then performed on the characteristics of these groups to obtain the correlations in book borrowing, so that effective recommendations can be made to readers, their reading experience is enhanced, and the efficiency of book management and the intelligence level of the library are improved.
Drawings
FIG. 1 is a flowchart of a K-means algorithm in a method for book recommendation based on library reader borrowing behavior in accordance with the present invention;
FIG. 2 is a block diagram of the GRNN algorithm in the method of book recommendation based on library reader borrowing.
Detailed Description
The invention will be described in detail with reference to the accompanying drawings and detailed description.
The invention provides a method for recommending books according to library reader borrowing behaviors, which is implemented according to the following steps:
Step 1: acquiring a reader borrowing set as the mining sample through a preliminary expert-led interview procedure, combined with library management permissions and interviews.
The data set takes library readers' borrowing data as the study object, uses MS SQL Server 2008 as the underlying data framework, and preprocesses the data with the help of Excel. Processing the raw data yields 1000 transaction records of reader borrowing, which serve as the mining sample.
Step 2: before mining, the data must be preprocessed, for example by screening and denoising. This is carried out as follows:
(1) Data cleaning includes detecting null values, checking the relevance and uniqueness of the sample data, finding and correcting identifiable errors in the data file, checking data consistency, and handling invalid and missing values.
(2) Data conversion divides the books borrowed by readers into 22 categories according to the Chinese Library Classification, and related book categories are predicted from readers' borrowing of books in certain categories.
(3) Data integration consolidates readers' borrowing history lists into transaction data that meets the data mining requirements; each transaction represents one borrowing action of a reader and comprises a unique identifier and the set of the various books borrowed by that reader.
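The three preprocessing sub-steps above could be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(reader_id, call_number)`, the sample values, and the helper names `clean`, `to_category`, and `integrate` are all hypothetical, and the sketch assumes that the top-level class is the first letter of the call number and that one transaction is built per reader.

```python
from collections import defaultdict

# Hypothetical raw loan records: (reader_id, call_number).
RAW_LOANS = [
    ("R001", "TP312/45"),
    ("R001", "I247.5/12"),
    ("R002", "O13/7"),
    ("R002", None),        # missing call number: dropped during cleaning
    ("R001", "TP312/45"),  # exact duplicate: dropped during cleaning
    ("", "H31/9"),         # missing reader id: dropped during cleaning
]


def clean(records):
    """Drop records with missing fields and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for reader_id, call_number in records:
        if not reader_id or not call_number:
            continue
        key = (reader_id, call_number)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned


def to_category(call_number):
    """Assumption: the top-level class is the first letter of the call number."""
    return call_number[0].upper()


def integrate(records):
    """Group cleaned records into one transaction (a category set) per reader."""
    transactions = defaultdict(set)
    for reader_id, call_number in records:
        transactions[reader_id].add(to_category(call_number))
    return dict(transactions)


transactions = integrate(clean(RAW_LOANS))
for reader_id, categories in sorted(transactions.items()):
    print(reader_id, sorted(categories))
```

In a real deployment the cleaning rules would depend on the library system's schema; the point of the sketch is only the order of operations: clean, convert to categories, then integrate into transactions.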
Step 3: clustering is carried out by book category to find the reading commonalities among readers. Using the k-Means algorithm with book borrowing status as the parameter, the samples are divided into four groups, each representing a reading pattern.
The k-Means cluster analysis, shown in FIG. 1, is carried out as follows:
(1) Randomly select four data points as the initial centroid of each class, representing the four groups into which the data is finally divided;
(2) Assign each remaining object to the class it is most similar to, according to the nearest-centroid principle;
(3) After all objects have been assigned, compute for each class the mean of all its objects as the new centroid of that group;
(4) Reassign all objects according to the nearest-centroid principle;
(5) Return to step (3) until none of the generated classes changes any more.
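The five steps above can be sketched in plain Python as follows. This is an illustrative sketch, not the patent's code: squared Euclidean distance is assumed as the similarity measure, the toy borrowing vectors are invented, and the names `k_means`, `squared_distance`, and `mean_point` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k=4, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # step (1): random initial centroids
    assignment = None
    for _ in range(max_iter):
        # steps (2) and (4): assign every object to its nearest centroid
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:  # step (5): classes unchanged, stop
            break
        assignment = new_assignment
        # step (3): recompute each centroid as the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty class keeps its previous centroid
                centroids[c] = mean_point(members)
    return assignment, centroids


# Hypothetical 0/1 borrowing vectors over three book categories.
points = [
    (1, 0, 0), (1, 0, 0), (0, 1, 0), (0, 1, 0),
    (0, 0, 1), (0, 0, 1), (1, 1, 0), (1, 1, 0),
]
assignment, centroids = k_means(points, k=4, seed=0)
print(assignment)
```

Because the initial centroids are random, different seeds can yield different groupings; running the algorithm several times and keeping the best result is a common way to reduce this sensitivity.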
Step 4: a model is built and trained with the GRNN algorithm, with readers' borrowed-book information as input and the four cluster groups as output, and the books contained in each of the four output cluster groups are recommended. On this basis, collection development and information services can be carried out in a targeted way, librarians' knowledge of the relevant disciplines can be strengthened, and subject consultation can be provided accurately.
The model built with the GRNN algorithm is shown in FIG. 2. The neurons of the input layer act as distribution units; their number depends on the feature data of the samples, and they pass the learning samples to the pattern layer without processing. The number of neurons in the pattern layer depends on the total number of learning samples, each neuron corresponding to a different sample, and the pattern layer is connected to the summation layer through the connection coefficients Y_i. The summation layer performs summation with two different types of neurons and passes the results to the output layer. Each neuron in the output layer divides the two results to obtain the final output; the number of output neurons depends on the output types of the samples, and the final output corresponds to the serial number of the estimated result.
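The layer structure described above corresponds to the standard GRNN estimate, a Gaussian-kernel-weighted average of the training targets. The sketch below is one possible reading, not the patent's implementation: the toy data are invented, the targets are assumed to be one-hot vectors over cluster-1 to cluster-4, the predicted cluster is taken as the largest output, and the name `grnn_predict` is hypothetical.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Return the GRNN output vector for one input x."""
    # Pattern layer: one neuron per training sample, Gaussian in squared distance.
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in train_x]
    # Subtracting the smallest distance rescales every activation by the same
    # factor, which cancels in the final division and avoids a 0/0 underflow.
    d_min = min(sq_dists)
    activations = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]

    # Summation layer, first neuron type: plain sum of the activations.
    denominator = sum(activations)
    # Summation layer, second neuron type: sums weighted by the targets Y_i.
    n_outputs = len(train_y[0])
    numerators = [
        sum(a * y[j] for a, y in zip(activations, train_y))
        for j in range(n_outputs)
    ]

    # Output layer: divide the weighted sums by the plain sum.
    return [num / denominator for num in numerators]


# Hypothetical toy data: 3 features, one-hot targets for cluster-1..cluster-4.
train_x = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
train_y = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

output = grnn_predict((1, 0, 0), train_x, train_y, sigma=0.1)
predicted_cluster = output.index(max(output)) + 1
print(predicted_cluster)
```

A GRNN has no iterative weight training: the training samples themselves form the pattern layer, so the smoothing factor sigma is the only quantity that needs tuning.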
Examples
Borrowing data of readers at the library of Xi'an Polytechnic University is selected as the study object. Each loan by a reader produces one record; a program using the stored procedures provided by SQL Server 2008 integrates the record set so that each reader's borrowing records form one transaction, with the reader's ID, the borrowed book categories, the number of books borrowed, and the book classification numbers shown in the same row. A book category is marked '1' if the reader has borrowed from it and '0' otherwise. The reader borrowing transaction data is shown in Table 1.
Table 1 Reader borrowing transaction data
Clustering is carried out by book category to find the reading commonalities among readers. Using the k-Means algorithm with book borrowing status as the parameter, the samples are divided into 4 groups, each representing one reading pattern.
Table 2 Clustering of book borrowing patterns
The GRNN samples are college students' book borrowing information obtained by librarians. The book categories comprise I1, I2, TM, TN, TQ, H3, TP, O1, and O4, and there are 1000 samples in total. I1, I2, TM, TN, TQ, H3, TP, O1, and O4 are taken as the feature values, so each sample has 9 feature values and corresponds to one of 4 cluster types, namely cluster-1, cluster-2, cluster-3, and cluster-4. Part of the sample data is shown in Table 3.
Table 3 Book borrowing sample data
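The '1'/'0' marking of borrowed categories over the nine feature categories listed above can be sketched as a simple encoding function. The function name `encode` and the sample reader are hypothetical; only the nine category codes come from the text.

```python
# The nine feature categories named in the example, in a fixed column order.
FEATURES = ["I1", "I2", "TM", "TN", "TQ", "H3", "TP", "O1", "O4"]


def encode(borrowed_categories):
    """Return a 9-element 0/1 vector: 1 if the category was borrowed, else 0."""
    borrowed = set(borrowed_categories)
    return [1 if feature in borrowed else 0 for feature in FEATURES]


# Hypothetical reader who borrowed books in classes TP and O1.
sample = encode(["TP", "O1"])
print(sample)
```

Each encoded row, paired with its cluster label, would form one of the samples used to train the GRNN.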
Next, a model is built and trained on the samples with the GRNN to complete the nonlinear mapping from input parameters to output results. The GRNN model takes the 9 feature values as input parameters and cluster-1, cluster-2, cluster-3, and cluster-4 as output results. The essence of training the network is to search for the optimal smoothing factor so as to improve the model's evaluation performance. The GRNN training procedure is as follows:
1. The 1000 samples are divided into 20 subsets of equal size; 19 subsets are selected for training to build the GRNN model, the remaining subset (50 samples) is used as the test set for verification, and the mean square error between the test results and the true values is calculated.
2. The value range of the smoothing factor σ is set to [0.01, 1], and σ is increased gradually in steps of 0.01.
3. The training and test samples are reassigned and steps 1 and 2 are repeated, 20 times in total, to obtain 20 mean square error values for each smoothing factor, and the average of these 20 values is calculated.
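The search for the smoothing factor described in the three steps above can be sketched as a grid search with rotating held-out subsets. This is a scaled-down illustration under stated assumptions, not the patent's code: it uses 40 synthetic samples and 4 subsets instead of 1000 samples and 20 subsets, a scalar target, and simply picks the grid value with the lowest average error; the names `grnn_predict`, `mean_cv_mse`, and `search_sigma` are hypothetical.

```python
import math
import random


def grnn_predict(x, train_x, train_y, sigma):
    """Scalar GRNN estimate: kernel-weighted average of the training targets."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in train_x]
    d2_min = min(d2)  # shift for numerical stability; cancels in the ratio
    w = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    return sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w)


def mean_cv_mse(xs, ys, sigma, n_folds):
    """Average test MSE over n_folds rotations of the held-out subset."""
    fold_size = len(xs) // n_folds
    fold_mses = []
    for k in range(n_folds):
        lo, hi = k * fold_size, (k + 1) * fold_size
        test_x, test_y = xs[lo:hi], ys[lo:hi]
        train_x, train_y = xs[:lo] + xs[hi:], ys[:lo] + ys[hi:]
        errors = [
            (grnn_predict(x, train_x, train_y, sigma) - y) ** 2
            for x, y in zip(test_x, test_y)
        ]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / n_folds


def search_sigma(xs, ys, sigmas, n_folds):
    """Evaluate every candidate sigma and return the best one with all scores."""
    scores = {s: mean_cv_mse(xs, ys, s, n_folds) for s in sigmas}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic stand-in data (assumption): 40 two-feature samples, labels 1.0 or 2.0.
rng = random.Random(0)
xs = [(rng.random(), rng.random()) for _ in range(40)]
ys = [1.0 if x[0] + x[1] < 1.0 else 2.0 for x in xs]

# Candidate grid from the text: 0.01 to 1.00 in steps of 0.01.
sigmas = [round(0.01 * i, 2) for i in range(1, 101)]
best_sigma, scores = search_sigma(xs, ys, sigmas, n_folds=4)
print(best_sigma, scores[best_sigma])
```

Setting `n_folds=20` on 1000 samples reproduces the 19-to-1 split described in the text, at a correspondingly higher computational cost.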
The above training steps show that the model performs well when the smoothing factor σ is less than or equal to 0.1. To improve the generalization ability of the model, the optimal smoothing factor was finally set to 0.1, for which the corresponding minimum mean square error is 3.4e-4. Therefore, σ = 0.1 was used as the optimal smoothing factor for the book borrowing model.
The training accuracy of the intelligent analysis algorithm for library readers' borrowing behavior is 94.00%.
The intelligent analysis algorithm for library readers' borrowing behavior studies the borrowing behavior recorded in the library's circulation data, mines and analyzes the data with the K-means cluster analysis algorithm and the GRNN algorithm, and divides readers into several groups with similar attributes, that is, with similar borrowing preferences. The results of the cluster analysis and the GRNN algorithm can be used to recommend books to a reader's personal digital library, to guide readers toward strongly correlated books, and to uncover hidden relationships between disciplines so as to optimize readers' disciplinary knowledge structure. In practical work such as database procurement, book acquisition, and discipline development, knowing the borrowing tendencies of each reader group helps to carry out collection development and information services in a targeted way, to provide subject consultation accurately, and to promote the application of the mining results.
Claims (1)
1. A method for recommending books according to library reader borrowing behavior, comprising the steps of:
step 1, acquiring a reader borrowing set as the mining sample through a preliminary expert-led interview procedure, combined with library management permissions and interviews; taking the borrowing data of library readers as the object, using MS SQL Server 2008 as the underlying data framework, and preprocessing the data with the help of Excel to obtain 1000 transaction records of reader borrowing as the mining sample;
step 2, preprocessing the data of the mining sample; specifically comprising the following steps:
step 2.1: first, cleaning the data of the mining sample: finding and correcting identifiable errors in the data file, checking data consistency, and handling invalid and missing values;
step 2.2: then converting the cleaned data: the books borrowed by readers are divided into 22 categories according to the Chinese Library Classification, and related book categories are predicted from readers' borrowing of books in certain categories;
step 2.3: finally, integrating the converted data: readers' borrowing histories are integrated into transaction data that meets the data mining requirements, each transaction representing one borrowing action of a reader and comprising a unique identifier and the set of the various books borrowed by that reader;
step 3, clustering by book category to find the reading commonalities among readers, using the k-Means algorithm with book borrowing status as the parameter to divide the samples into four groups, each group representing a reading pattern; specifically comprising the following steps:
step 3.1: clustering by book category, randomly selecting four data points as the initial centroid of each class, representing the four groups into which the data is finally divided;
step 3.2: assigning each remaining object to the class it is most similar to, according to the nearest-centroid principle;
step 3.3: after all objects have been assigned, computing for each class the mean of all its objects as the new centroid of that group;
step 3.4: reassigning all objects according to the nearest-centroid principle;
step 3.5: returning to step 3.3 until none of the generated classes changes any more;
step 4, building and training a model with the GRNN algorithm, with readers' borrowed-book information as input and the four cluster groups as output, and recommending the books contained in each of the four output cluster groups; the GRNN model training specifically comprising the following steps:
step 4.1: dividing the 1000 samples into 20 subsets of equal size, selecting 19 subsets for training to build the GRNN model, using the remaining subset (50 samples) as the test set for verification, and calculating the mean square error between the test results and the true values;
step 4.2: setting the value range of the smoothing factor σ to [0.01, 1] and increasing it gradually in steps of 0.01;
step 4.3: reassigning the training and test samples and repeating steps 4.1 and 4.2, 20 times in total, to obtain 20 mean square error values for each smoothing factor, and calculating the average of these 20 values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910807495.1A CN110619084B (en) | 2019-08-29 | 2019-08-29 | Method for recommending books according to borrowing behaviors of library readers |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910807495.1A CN110619084B (en) | 2019-08-29 | 2019-08-29 | Method for recommending books according to borrowing behaviors of library readers |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110619084A (en) | 2019-12-27 |
CN110619084B (en) | 2023-05-05 |
Family
ID=68922622
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201910807495.1A | Active | CN110619084B (en) | 2019-08-29 | 2019-08-29 | Method for recommending books according to borrowing behaviors of library readers |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110619084B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111667171A (en) * | 2020-06-04 | 2020-09-15 | 广州博高信息科技有限公司 | Big data-based group reading behavior analysis method, device, equipment and medium |
CN112069390B (en) * | 2020-07-15 | 2023-09-26 | 西安工程大学 | User book borrowing behavior analysis and interest prediction method based on space-time dimension |
CN111859094A (en) * | 2020-08-10 | 2020-10-30 | 广州驰兴通用技术研究有限公司 | Information analysis method and system based on cloud computing |
CN113590945B (en) * | 2021-07-26 | 2023-07-28 | 西安工程大学 | Book recommendation method and device based on user borrowing behavior-interest prediction |
CN117575287B (en) * | 2024-01-15 | 2024-03-26 | 北京家音顺达数据技术有限公司 | Sharing book borrowing circulation method and system for subway station |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104050517A (en) * | 2014-06-27 | 2014-09-17 | 哈尔滨工业大学 | Photovoltaic power generation forecasting method based on GRNN |
CN104539484A (en) * | 2014-12-31 | 2015-04-22 | 深圳先进技术研究院 | Method and system for dynamically estimating network connection reliability |
CN105760547A (en) * | 2016-03-16 | 2016-07-13 | 中山大学 | Book recommendation method and system based on user clustering |
CN109459669A (en) * | 2019-01-09 | 2019-03-12 | 国网上海市电力公司 | 10kV one-phase earthing failure in electric distribution network Section Location |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160092512A1 (en) * | 2014-09-26 | 2016-03-31 | Kobo Inc. | System and method for using book recognition to recommend content items for a user |
Non-Patent Citations (1)
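Title |
---|
"Empirical Study on Reader Borrowing Behavior and Collection Resource Recommendation in University Libraries"; Wang Qin; China Master's Theses Full-text Database (Information Science and Technology); 2019-02-15; pp. 33-42 * |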
Also Published As
Publication number | Publication date |
---|---|
CN110619084A (en) | 2019-12-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110619084B (en) | Method for recommending books according to borrowing behaviors of library readers | |
Chiroma et al. | Progress on artificial neural networks for big data analytics: a survey | |
CN110956273A (en) | Credit scoring method and system integrating multiple machine learning models | |
CN110377605B (en) | Sensitive attribute identification and classification method for structured data | |
CN111222847A (en) | Open-source community developer recommendation method based on deep learning and unsupervised clustering | |
CN116386899A (en) | Graph learning-based medicine disease association relation prediction method and related equipment | |
CN103246685B (en) | The method and apparatus that the attribution rule of object instance is turned to feature | |
Jastrzebska et al. | Fuzzy cognitive map-driven comprehensive time-series classification | |
Wen et al. | MapReduce-based BP neural network classification of aquaculture water quality | |
CN101702172A (en) | Data discretization method based on category-attribute relation dependency | |
CN117611918A (en) | Marine organism classification method based on hierarchical neural network | |
Wang et al. | Temporal dual-attributed network generation oriented community detection model | |
CN102004801A (en) | Information classification method | |
CN116188174A (en) | Insurance fraud detection method and system based on modularity and mutual information | |
CN105512249A (en) | Noumenon coupling method based on compact evolution algorithm | |
CN113868597A (en) | Regression fairness measurement method for age estimation | |
CN106446160A (en) | Content polymerization method and system oriented to mobile internet self-adaptive increments | |
Wang et al. | Power load forecasting using data mining and knowledge discovery technology | |
Sun et al. | An evaluation model for the teaching reform of the physical education industry | |
Wang et al. | Hierarchical attention network for short-term runoff forecasting | |
Tu | Analysis and prediction method of student behavior mining based on campus big data | |
Aher et al. | Prediction of course selection by student using combination of data mining algorithms in E-learning | |
Tang | Research on Intelligent Data Mining and Knowledge Discovery Method Based on Software Information System | |
CN118114812B (en) | Shale gas yield prediction method, computer equipment and storage medium | |
CN113205274B (en) | Quantitative ranking method for construction quality |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||