CN106649583A - Book borrowing data association rule analysis method based on SAS - Google Patents


Publication number
CN106649583A
Authority
CN
China
Prior art keywords: data, rule, sas, mining, reader
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611014422.XA
Other languages
Chinese (zh)
Inventor
王学杰
李永刚
汪征
刘树峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Huabo Shengxun Information Technologies Co Ltd
Original Assignee
Anhui Huabo Shengxun Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Huabo Shengxun Information Technologies Co Ltd
Priority to CN201611014422.XA
Publication of CN106649583A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a book borrowing data association rule analysis method based on SAS. The results obtained by association mining with this method were compared with the library's actual work and with reader survey results, and were found to be similar. The results provide an important reference for library management. The association rules mined with this method can be used to offer borrowing suggestions and recommended books to readers, thereby providing personalized service for specific readers. Borrowing circulation records and digital resource usage data can be mined to find hidden relationships and to derive borrowing trends for books and documents, document utilization rates, and reader preference parameters. This provides a decision basis for book purchasing and for adding or removing document materials, supplies information and support for discipline construction, offers personalized information services to readers, and improves service quality.

Description

A book borrowing data association rule analysis method based on SAS
Technical field
The present invention relates to the technical field of information management, and more particularly to a book borrowing data association rule analysis method based on SAS.
Background art
Data mining is also called knowledge discovery in databases (KDD). Data mining technology began to attract wide attention in the twentieth century, mainly because computer technology, and database management in particular, had become complex and difficult, and because the data in databases grew so quickly that looking up information manually became extremely difficult. Data mining is very useful for discovering and describing hidden patterns in relational tables, and its algorithms allow patterns to be found automatically. Data mining is now widely used in telecommunications, e-commerce, and marketing management, where it has substantially improved management efficiency, service levels, and economic benefit.
As the information resources of university digital libraries grow richer and computer technology is applied to daily management tasks, the volume of data is growing geometrically, and several problems in data processing have been exposed:
1. In building the collection, book purchasing goals are vague and poorly targeted; purchases often follow personal preference and fail to reflect the direction of discipline construction or the development goals of the school.
2. Resource utilization is low and retrieval is inaccurate; useful information cannot be obtained within a reasonable time, and there is a large amount of redundancy.
3. In most cases, the information in existing databases cannot be easily accessed or analyzed, and therefore does not receive enough attention or is not fully used.
4. Some databases grow so quickly that even system administrators often do not know which information in the system can be used for the problem currently at hand, or how the data in the system relates to that problem.
5. Information is underdeveloped and underused: large volumes of data receive only routine maintenance and are not developed further; Internet information resources are not screened, filtered, or reorganized, which wastes them; and electronic document resources see insufficient secondary development.
These problems make it possible to apply data mining in university digital libraries, and the growing maturity of data mining technology and products makes its application to the management of information resources in university digital libraries feasible.
Applying data mining technology has the following advantages:
Using data mining techniques such as clustering and association rules, borrowing circulation records and digital resource usage data can be mined to find hidden relationships and to derive borrowing trends for books and documents, document utilization rates, and reader preference parameters. This provides a decision basis for book purchasing and for adding or removing document materials, supplies information and support for discipline construction, offers personalized information services to readers, and improves service quality.
Using data mining techniques such as clustering, correlations among the library's resources can be found: clustering determines the similarity between data items, groups the most similar items together, and integrates them. Strengthening the integration of unstructured data such as text, graphics, video, and audio, as well as combined multimedia data, enriches information resources and improves resource utilization and retrieval accuracy.
Summary of the invention
The object of the present invention is to provide a book borrowing data association rule analysis method based on SAS.
The object of the present invention can be achieved through the following technical solution:
A book borrowing data association rule analysis method based on SAS, comprising:
Step 1: save the data exported from the book management system as text and import it into a database; convert, merge, and filter the data with query statements to remove redundant information in the library data that is irrelevant to data mining, and retain the important attributes relevant to data mining;
Step 2: extract data from the database, save the extracted data as text, then import it into an Excel sheet and save it in the reader borrowing data preprocessing folder; preprocess the reader ID number and call number data to obtain a preprocessed table of reader ID numbers and call numbers;
Step 3: import the preprocessed borrowing data into the SAS database and perform association rule mining to obtain association rule mining results;
Step 4: sample the call numbers in the data set; the sampling mode is cluster sampling with the number of clusters set to 13, and sample data is generated after the run;
Step 5: select the ASSOCIATION analysis model of SAS and set the minimum support of the rules to 10%;
Step 6: set the maximum number of items contained in one association rule to 4;
Step 7: set the minimum confidence of the rules to 10%;
Step 8: perform association rule analysis on the sample data to obtain the association rules;
Step 9: delete the rules that do not satisfy the condition "support >= 10% and confidence >= 60%" to obtain the expected results.
Beneficial effects of the present invention:
With the book borrowing data association rule analysis method based on SAS provided by the present invention, the results obtained by association mining were compared with the library's actual work and with reader survey results, and were found to be similar. The results provide an important reference for library management. The association rules mined with this method can be used to offer borrowing suggestions and recommended books to readers, thereby providing personalized service for specific readers. Borrowing circulation records and digital resource usage data can be mined to find hidden relationships and to derive borrowing trends for books and documents, document utilization rates, and reader preference parameters. This provides a decision basis for book purchasing and for adding or removing document materials, supplies information and support for discipline construction, offers personalized information services to readers, and improves service quality.
Brief description of the drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the method of the present invention.
Detailed description
The core of the present invention is to provide a book borrowing data association rule analysis method based on SAS.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, the present invention provides a book borrowing data association rule analysis method based on SAS, comprising the following steps:
Step 1: save the data exported from the book management system as text and import it into a SQL Server database; convert, merge, and filter the data with query statements to remove redundant information in the library data that is irrelevant to data mining, and retain the important attributes relevant to data mining, such as the operation date, reader ID number, call number, and document barcode number.
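The cleaning described in Step 1 can be illustrated with a minimal Python sketch. It is not the SQL Server procedure of the patent: the tab-separated layout, the column names (`operation_date`, `reader_no`, `call_no`, `barcode`, `operator`), and the rule of dropping incomplete or repeated rows are all assumptions made for illustration, since the source names only the four attributes to keep.

```python
import csv
import io

# Hypothetical column names for a tab-separated export; the source names only
# the four attributes to keep, not the layout of the exported file.
KEEP = ["operation_date", "reader_no", "call_no", "barcode"]

def clean_export(text):
    """Keep only the mining-relevant columns; drop incomplete and repeated rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    seen = set()
    rows = []
    for record in reader:
        row = tuple((record.get(col) or "").strip() for col in KEEP)
        if "" in row or row in seen:  # skip incomplete rows and exact repeats
            continue
        seen.add(row)
        rows.append(dict(zip(KEEP, row)))
    return rows

# Made-up sample export: one duplicate row and one row with no call number.
sample = (
    "operation_date\treader_no\tcall_no\tbarcode\toperator\n"
    "2016-03-01\t1001\tTP311\tB0001\tstaff1\n"
    "2016-03-01\t1001\tTP311\tB0001\tstaff1\n"
    "2016-03-02\t1002\t\tB0002\tstaff2\n"
)
cleaned = clean_export(sample)
print(cleaned)
```

Only the first data row survives, and the `operator` column, which stands in for an attribute irrelevant to mining, is discarded.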
Step 2: extract data from the database. The data extracted from SQL Server is saved as text under the name JYZHSQH52, then imported into an Excel sheet and saved as JYZHSQH52.XLS in the reader borrowing data preprocessing folder. The reader ID number and call number data are preprocessed with SQL. The SQL program is as follows:
-- Create the table sheet1$
CREATE TABLE [master].[dbo].[sheet1$] (
    [suoqu] varchar(255) NULL,   -- call number (meaning assumed from the column name)
    [reader_no] bigint NULL      -- reader ID number
);
The reader ID number and call number data are then preprocessed with the following SQL statement:
SELECT [reader_no], [suoqu] FROM [master].[dbo].[sheet1$] ORDER BY [reader_no];
After the statement runs, the preprocessed table of reader ID numbers and call numbers is obtained.
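The preprocessing query of Step 2 can be tried out without SQL Server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so the dialect differs from the T-SQL in the source; the column names `suoqu` and `reader_no` come from the CREATE TABLE statement above, while the sample rows and the secondary sort on `suoqu` (added only to make the output order deterministic) are assumptions.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "sheet1$" (suoqu TEXT, reader_no INTEGER)')

# Made-up borrowing rows: (call number, reader ID number).
conn.executemany(
    'INSERT INTO "sheet1$" (suoqu, reader_no) VALUES (?, ?)',
    [("TP311", 1002), ("I247", 1001), ("TP312", 1001)],
)

# Same shape as the query in the text: reader ID and call number, ordered by reader.
rows = conn.execute(
    'SELECT reader_no, suoqu FROM "sheet1$" ORDER BY reader_no, suoqu'
).fetchall()
conn.close()
print(rows)
```

Ordering by reader places all of one reader's call numbers on adjacent rows, which is convenient when the rows are later grouped per reader.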
Step 3: import the preprocessed borrowing data into the SAS database and perform association rule mining to obtain association rule mining results.
After entering the SAS system, import JYZHSQH52.XLS, whose data is stored in the sheet Sheet1$. The program is as follows:
/* Import the Excel sheet into the SAS data set WORK.JYZHSQH52 */
proc import out=work.jyzhsqh52
    datafile="E:\zzhsqh.xls"
    dbms=excel replace;
    sheet="Sheet1$";
    getnames=yes;   /* use the first row as variable names */
    mixed=no;
    scantext=yes;
    usedate=yes;
run;
In SAS, select SASUSER as the permanent library and name the member JYZHSQH52. Create a file named JYZHSQH52.SAS on drive C to save the data produced while running SAS, so that it can be called repeatedly in later data mining. This creates a database named SASUSER.JYZHSQH52 in the SAS system; opening it shows a table consisting of reader ID numbers and call numbers.
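Association analysis works on transactions, that is, groups of items that occur together. The source does not state how the two-column table is grouped, so the following Python sketch assumes that all call numbers borrowed by one reader form one transaction; the function name `build_transactions` and the sample rows are hypothetical.

```python
from collections import defaultdict

def build_transactions(rows):
    """Group (reader_no, call_no) pairs into one set of call numbers per reader."""
    baskets = defaultdict(set)
    for reader_no, call_no in rows:
        baskets[reader_no].add(call_no)  # a set ignores repeated loans of one item
    return dict(baskets)

# Made-up rows in the same shape as the reader ID / call number table.
rows = [(1001, "I247"), (1001, "TP312"), (1002, "TP311"), (1001, "I247")]
transactions = build_transactions(rows)
print(transactions)
```

Using sets means a reader who borrows the same call number twice contributes it only once, which matches the usual presence-or-absence view of items in association rule mining.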
Step 4: after the SASUSER.JYZHSQH52 database has been created, sample the call numbers in the data set. The sampling mode is cluster sampling with the number of clusters set to 13; 66 sample records are generated after the run.
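The source does not say how the 13 clusters are defined or how the 66 sample records arise, so the following Python sketch shows only the general idea of cluster sampling: records are grouped into clusters, whole clusters are drawn at random, and every record of a drawn cluster is kept. The cluster key (the leading letter of the call number), the function name `cluster_sample`, and the sample data are assumptions; this is not the SAS sampling procedure itself.

```python
import random
from collections import defaultdict

def cluster_sample(records, cluster_key, n_clusters, seed=0):
    """Draw n_clusters whole clusters at random and return all of their records."""
    clusters = defaultdict(list)
    for record in records:
        clusters[cluster_key(record)].append(record)
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    keys = sorted(clusters)
    chosen = rng.sample(keys, min(n_clusters, len(keys)))
    return [record for key in sorted(chosen) for record in clusters[key]]

# Made-up (reader ID, call number) records.
records = [(1001, "I247"), (1001, "TP312"), (1002, "TP311"), (1003, "H319"), (1004, "O13")]

# Hypothetical cluster definition: the leading letter of the call number.
sample = cluster_sample(records, lambda r: r[1][0], n_clusters=2)
print(sample)
```

With the settings of Step 4, `n_clusters` would be 13; the value 2 is used here only because the sample data is tiny.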
Step 5: select the ASSOCIATION analysis model of SAS and set the minimum support of the rules to 10%.
Step 6: set the maximum number of items contained in one association rule to 4.
Step 7: set the minimum confidence of the rules to 10%.
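The source does not define support or confidence. Under the standard definitions used in association rule mining, for a set of transactions T and a rule X ⇒ Y, the two thresholds set in Steps 5 and 7 refer to the following quantities:

```latex
\[
s(Z) = \frac{\lvert \{\, t \in T : Z \subseteq t \,\} \rvert}{\lvert T \rvert},
\qquad
\operatorname{support}(X \Rightarrow Y) = s(X \cup Y),
\qquad
\operatorname{confidence}(X \Rightarrow Y) = \frac{s(X \cup Y)}{s(X)}
\]
```

As a worked example with made-up numbers: if 100 transactions are analyzed, 20 of them contain X, and 12 contain both X and Y, then the support is 12/100 = 12% and the confidence is 12/20 = 60%.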
Step 8: perform association rule analysis on the sample data to obtain the association rules.
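The source performs this analysis with the SAS ASSOCIATION model and does not describe the algorithm behind it. As an illustration only, the following Python sketch mines rules with a simple level-wise (Apriori-style) search using the thresholds of Steps 5 to 7: minimum support 10%, at most 4 items per rule, and minimum confidence 10%. The function name `mine_rules` and the sample transactions are hypothetical.

```python
from itertools import combinations

def mine_rules(transactions, min_support=0.10, max_items=4, min_confidence=0.10):
    """Return (antecedent, consequent, support, confidence) tuples."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    if n == 0:
        return []

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for b in baskets if itemset <= b) / n

    # Level-wise search: keep the frequent itemsets of one size, then join
    # them into candidates that are one item larger, up to max_items items.
    frequent = {}
    level = {frozenset([item]) for b in baskets for item in b}
    for size in range(1, max_items + 1):
        current = {s: support(s) for s in level}
        current = {s: sup for s, sup in current.items() if sup >= min_support}
        if not current:
            break
        frequent.update(current)
        level = {a | b for a in current for b in current if len(a | b) == size + 1}

    # Split each frequent itemset into antecedent and consequent.
    rules = []
    for itemset, sup in frequent.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, confidence))
    return rules

# Made-up transactions: one set of call numbers per reader.
transactions = [{"TP311", "TP312"}, {"TP311", "TP312", "I247"}, {"TP311"}, {"I247"}]
rules = mine_rules(transactions)
for lhs, rhs, sup, conf in rules:
    print(sorted(lhs), "==>", sorted(rhs), round(sup, 2), round(conf, 2))
```

Every subset of a frequent itemset is itself frequent, so the lookup `frequent[lhs]` always succeeds. The low confidence threshold of 10% lets many weak rules through at this stage.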
Step 9: among the rules produced, only those with support >= 10% and confidence >= 60% have practical significance; rules whose support and confidence do not meet the data mining requirements are deleted.
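The filtering in Step 9 amounts to a simple threshold test on each rule. A minimal Python sketch follows; the dictionary layout of a rule, the function name `filter_rules`, and the three sample rules are made up for illustration and are not results from the patent.

```python
def filter_rules(rules, min_support=0.10, min_confidence=0.60):
    """Keep only rules whose support and confidence both reach the thresholds."""
    return [
        r for r in rules
        if r["support"] >= min_support and r["confidence"] >= min_confidence
    ]

# Hypothetical mined rules with their support and confidence.
rules = [
    {"rule": "TP312 ==> TP311", "support": 0.50, "confidence": 1.00},
    {"rule": "TP311 ==> I247", "support": 0.25, "confidence": 0.33},  # confidence too low
    {"rule": "H319 ==> I247", "support": 0.05, "confidence": 0.80},   # support too low
]
kept = filter_rules(rules)
print([r["rule"] for r in kept])
```

Both conditions must hold at once, so a rule with high confidence but rare support is removed just like a common but unreliable one.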
With the book borrowing data association rule analysis method based on SAS provided by the present invention, the results obtained by association mining were compared with the library's actual work and with reader survey results, and were found to be similar. The results provide an important reference for library management. The association rules mined with this method can be used to offer borrowing suggestions and recommended books to readers, thereby providing personalized service for specific readers. Borrowing circulation records and digital resource usage data can be mined to find hidden relationships and to derive borrowing trends for books and documents, document utilization rates, and reader preference parameters. This provides a decision basis for book purchasing and for adding or removing document materials, supplies information and support for discipline construction, offers personalized information services to readers, and improves service quality.
The above content merely illustrates and explains the concept of the present invention. Those skilled in the art may make various modifications or additions to the described specific embodiments, or substitute them in a similar manner; as long as these do not depart from the concept of the invention or exceed the scope defined by the claims, they all fall within the protection scope of the present invention.

Claims (1)

1. A book borrowing data association rule analysis method based on SAS, characterized by comprising:
Step 1: saving the data exported from the book management system as text and importing it into a database; converting, merging, and filtering the data with query statements to remove redundant information in the library data that is irrelevant to data mining, and retaining the important attributes relevant to data mining;
Step 2: extracting data from the database, saving the extracted data as text, then importing it into an Excel sheet and saving it in the reader borrowing data preprocessing folder; preprocessing the reader ID number and call number data to obtain a preprocessed table of reader ID numbers and call numbers;
Step 3: importing the preprocessed borrowing data into the SAS database and performing association rule mining to obtain association rule mining results;
Step 4: sampling the call numbers in the data set, the sampling mode being cluster sampling with the number of clusters set to 13, sample data being generated after the run;
Step 5: selecting the ASSOCIATION analysis model of SAS and setting the minimum support of the rules to 10%;
Step 6: setting the maximum number of items contained in one association rule to 4;
Step 7: setting the minimum confidence of the rules to 10%;
Step 8: performing association rule analysis on the sample data to obtain the association rules;
Step 9: deleting the rules that do not satisfy the condition "support >= 10% and confidence >= 60%" to obtain the expected results.
CN201611014422.XA 2016-11-17 2016-11-17 Book borrowing data association rule analysis method based on SAS Pending CN106649583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611014422.XA CN106649583A (en) 2016-11-17 2016-11-17 Book borrowing data association rule analysis method based on SAS

Publications (1)

Publication Number Publication Date
CN106649583A true CN106649583A (en) 2017-05-10

Family

ID=58807582

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611014422.XA Pending CN106649583A (en) 2016-11-17 2016-11-17 Book borrowing data association rule analysis method based on SAS

Country Status (1)

Country Link
CN (1) CN106649583A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254019A (en) * 2011-07-25 2011-11-23 上海应用技术学院 Method for generating literature association semantic based on multi-information fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
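欧阳烽: "Information resource management of university digital libraries based on data mining", China Master's Theses Full-text Database, Information Science and Technology (monthly) *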

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107741993A (en) * 2017-11-06 2018-02-27 佛山市章扬科技有限公司 A kind of method of University Digital Library data mining
CN109344320A (en) * 2018-08-03 2019-02-15 昆明理工大学 A kind of book recommendation method based on Apriori
CN116662673A (en) * 2023-07-28 2023-08-29 西安银信博锐信息科技有限公司 User preference data analysis method based on data monitoring
CN116662673B (en) * 2023-07-28 2023-11-03 西安银信博锐信息科技有限公司 User preference data analysis method based on data monitoring

Similar Documents

Publication Publication Date Title
CN101593200B (en) Method for classifying Chinese webpages based on keyword frequency analysis
JP5536851B2 (en) Method and system for symbolic linking and intelligent classification of information
US20140201035A1 (en) Using model information groups in searching
CN109271477A (en) A kind of method and system by internet building taxonomy library
CN101021857A (en) Video searching system based on content analysis
CN111192176B (en) Online data acquisition method and device supporting informatization assessment of education
CN105095436A (en) Automatic modeling method for data of data sources
CN111353005A (en) Drug research and development reporting document management method and system
CN106649583A (en) Book borrowing data association rule analysis method based on SAS
CN105335506A (en) Electronic archive compiling-studying method and system
CN106874368B (en) RTB bidding advertisement position value analysis method and system
CN111143394B (en) Knowledge data processing method, device, medium and electronic equipment
CN116010552A (en) Engineering cost data analysis system and method based on keyword word library
KR102575507B1 (en) Article writing soulution using artificial intelligence and device using the same
CN113407678B (en) Knowledge graph construction method, device and equipment
CN110941952A (en) Method and device for perfecting audit analysis model
Lehmberg Web table integration and profiling for knowledge base augmentation
El Haddadi et al. Mining unstructured data for a competitive intelligence system XEW
Acker et al. The Neil deGrasse Tyson Problem: Methods for Exploring Base Memes in Web Archives
Lin et al. Realtime event summarization from tweets with inconsistency detection
Hood et al. Indexing terms in the LISA database on CD-ROM
CN110704421A (en) Data processing method, device, equipment and computer readable storage medium
Goode et al. A Toolkit for the Analysis of the NIME Proceedings Archive
Bugla Name identification in scientific publications
US20230334231A1 (en) Labeled clustering preprocessing for natural language processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170510