CN107741993A - A method for data mining in university digital libraries - Google Patents

A method for data mining in university digital libraries

Info

Publication number
CN107741993A
Authority
CN
China
Prior art keywords
data
information
books
reader
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711077156.XA
Other languages
Chinese (zh)
Inventor
崔垒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan Zhangyang Technology Co Ltd
Original Assignee
Foshan Zhangyang Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan Zhangyang Technology Co Ltd filed Critical Foshan Zhangyang Technology Co Ltd
Priority to CN201711077156.XA priority Critical patent/CN107741993A/en
Publication of CN107741993A publication Critical patent/CN107741993A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the present invention provides a method for data mining in university digital libraries. It addresses the following problem: universities have large student populations and many complex school systems; each independent subsystem has its own database, so its information is locked inside that subsystem; the subsystem databases cannot interact or share information and can offer only simple query, insert, update, delete, and statistics functions. The subsystems thus become information islands and application islands, so that school resources are seriously wasted and cannot be used effectively and reasonably. The object of the present invention is to provide a method for data mining in university digital libraries that avoids these problems by using the large volume of borrowing-related data generated in the digital library, mining from it the useful information of interest, and providing a decision-making basis for teaching management and the optimal allocation of resources.

Description

A method for data mining in university digital libraries
Technical field
The present invention relates to the field of library data retrieval, and more particularly to a method for data mining in university digital libraries.
Background art
The rapid development of computer science, and in particular the rapid progress of database and network technology, has made the ways in which people obtain and disseminate information ever broader, faster, and more varied. The widespread use of bar codes and credit cards has accelerated informatization in fields such as commerce, insurance, and finance; the whole world seems to have entered an entirely new information age overnight. As information technology develops, the volume of information to be stored and disseminated keeps growing and its forms and types keep multiplying, and the mechanisms of the traditional library clearly cannot meet these needs. The concept of the digital library was therefore proposed. A digital library is a repository of digitized information that can store large amounts of information in many forms; users can conveniently access it over a network to obtain this information, and neither the storage of the information nor user access is restricted by geography.
Data mining is applied fairly widely in profit-making businesses, but its application in the non-profit education sector is very limited. Education informatization is an indispensable part of China's informatization drive and the only route to modernizing education. The China Education and Research Network and the national broadband satellite distance-education network, both already built and in use, the "Digital Campus" construction projects at universities, and the "Campus Interconnectivity" information project for ordinary primary schools are all important parts of China's education informatization. Colleges and universities are the top priority of the education sector, and the scale of their Digital Campus construction is evident. The development of the Digital Campus inevitably brings the predicament of "massive information, scant knowledge", and how to make careful use of this information has become a practical problem that must be faced. Data mining can conveniently, quickly, and efficiently extract implicit, useful information from the vast body of Digital Campus information and provide it to decision-makers as a basis for decisions. This has not only theoretical significance but also important practical value for the informatization of the education sector.
Therefore, a method for data mining in university digital libraries is needed that can use the large volume of borrowing-related data generated in a digital library to obtain information of interest, and provide a decision-making basis for teaching management and the optimal allocation of resources.
Summary of the invention
The object of the present invention is to provide a method for data mining in university digital libraries that uses the large volume of borrowing-related data generated in the digital library, mines from it the useful information of interest, and provides a decision-making basis for teaching management and the optimal allocation of resources.
A method for data mining in university digital libraries, the method comprising:
Step S101: obtain the information base data of the library's borrowing history;
Step S102: preprocess the data;
Step S103: mine the data;
Step S104: store the results in the association database;
Step S105: output book recommendation information.
Specifically, in step S101 (obtain the information base data of the library's borrowing history), the information base comprises a reader information base and a book information base.
Specifically, the reader information base includes: the reader's category, age, major, and hobbies and interests.
Specifically, the book information base includes: the book's call number, bar code, title, publisher, publication date, and author.
Specifically, step S102 (preprocess the data) includes integrating, physically or logically, data that differ in format, source, geographical location, and characteristics into a data set with a unified standard.
Specifically, step S102 (preprocess the data) includes cleaning data that contain errors and omissions.
Specifically, step S102 (preprocess the data) further includes standardizing all data in a unified way: finding the common features of the data, then finding a suitable description method and applying reduction and transformation to the data.
Specifically, step S103 (mine the data) includes mining with the grouped Apriori algorithm.
As can be seen from the above technical solution, an embodiment of the present invention provides a method for data mining in university digital libraries. It addresses the following problem: universities have large student populations and many complex school systems; each independent subsystem has its own database, so its information is locked inside that subsystem; the subsystem databases cannot interact or share information and can offer only simple query, insert, update, delete, and statistics functions. The subsystems thus become information islands and application islands, so that school resources are seriously wasted and cannot be used effectively and reasonably. The object of the present invention is to provide a method for data mining in university digital libraries that avoids these problems by using the large volume of borrowing-related data generated in the digital library, mining from it the useful information of interest, and providing a decision-making basis for teaching management and the optimal allocation of resources.
Brief description of the drawings
Some specific embodiments of the present invention are described in detail below, by way of example and not limitation, with reference to the accompanying drawings. Identical reference numerals in the drawings denote identical or similar components or parts. Those skilled in the art should understand that the drawings are not necessarily drawn to scale. In the drawings:
Fig. 1 is a flowchart of a method for data mining in university digital libraries according to an embodiment of the present invention.
Embodiment
The process by which a reader traditionally borrows books and periodicals is roughly as follows. The reader logs in to the book-lending system. In the first case the reader has no specific title in mind (usually the reader knows the general direction but has not settled on a particular title). By browsing the library home page, the reader may consult the ranking of the most-borrowed titles over a recent period, or browse the library's newly shelved books, to decide which title to borrow, then logs in to the library's book retrieval system and fills in the borrowing form to complete the loan. In the other case the reader already knows exactly which book to borrow, logs in to the library's book retrieval system directly, and fills in the borrowing form to complete the loan. A careful look at this process shows that most of its steps are often uncertain for the reader. If suitable recommendations are given to the reader at these moments, they not only help the reader settle on a need quickly and cut the time wasted on aimless queries, but also make the whole process feel as if someone were accompanying the reader, giving a better interactive experience.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, a flowchart of a method for data mining in university digital libraries according to an embodiment of the present application, the method comprises:
Step S101: obtain the information base data of the library's borrowing history;
Step S102: preprocess the data;
Data of various formats, sources, geographical locations, and characteristics are integrated, physically or logically, into a data set with a unified standard.
Step S103: mine the data;
It should be noted that data mining has the following characteristics. First, the data source must be a large volume of real, original application data. Such data are very large in scale and contain many incomplete or fuzzy items, and even erroneous items (noise), so data preprocessing is needed during data mining. Second, the purpose of data mining is to discover potentially valuable knowledge: it exists to make hidden knowledge easy to find, and the knowledge mined should be easy to understand and apply. Third, data mining is targeted: although its methods are varied, it usually analyzes a particular problem, and the knowledge mined is specific to that problem.
Step S104: store the results in the association database;
Step S105: output book recommendation information.
That is, obtain the books the reader is interested in.
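Steps S104 and S105 can be illustrated with a minimal sketch. The text does not specify the schema of the association database or how recommendations are ranked, so everything here is an assumption: the table name `rules`, its columns, the helper names `store_rules` and `recommend`, and the choice to rank candidate books by the highest confidence of any matching rule. Book identifiers are assumed to contain no commas.

```python
import sqlite3

def store_rules(conn, rules):
    """S104 (sketch): persist mined rules; the table layout is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rules ("
        "antecedent TEXT, consequent TEXT, support REAL, confidence REAL)"
    )
    conn.executemany(
        "INSERT INTO rules VALUES (?, ?, ?, ?)",
        [(",".join(sorted(a)), ",".join(sorted(c)), s, conf)
         for a, c, s, conf in rules],
    )
    conn.commit()

def recommend(conn, borrowed, limit=5):
    """S105 (sketch): suggest books implied by rules whose antecedent
    is fully contained in the reader's borrowed set."""
    borrowed = set(borrowed)
    scores = {}
    for antecedent, consequent, confidence in conn.execute(
        "SELECT antecedent, consequent, confidence FROM rules"
    ):
        if set(antecedent.split(",")) <= borrowed:
            for book in consequent.split(","):
                if book not in borrowed:
                    # Keep the best confidence seen for each candidate book.
                    scores[book] = max(scores.get(book, 0.0), confidence)
    # Highest confidence first; ties broken by identifier for determinism.
    return sorted(scores, key=lambda b: (-scores[b], b))[:limit]
```

Each rule is passed in as a tuple `(antecedent, consequent, support, confidence)` whose first two elements are sets of book identifiers. An in-memory database (`sqlite3.connect(":memory:")`) is enough to try the sketch.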
Further, in step S101 (obtain the information base data of the library's borrowing history), the information base comprises a reader information base and a book information base.
Further, the reader information base includes: the reader's category, age, major, and hobbies and interests.
Every reader who appears in the borrowing system, or who queries or browses the library through the catalog search system, is a potential service object to whom personalized service can be provided. These potential service objects belong to different categories (undergraduates, master's students, doctoral students, teachers, researchers, administrative staff, general faculty and staff, and so on), and therefore have different category attributes.
Further, the book information base includes: the book's call number, bar code, title, publisher, publication date, and author.
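The two information bases can be pictured as simple record types. The sketch below uses the attributes listed above; the field names, the `reader_id` key, the `Loan` record, and the helper `transactions_by_reader` are hypothetical, since the text does not describe how loans are linked to readers and books.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Reader:
    reader_id: str                 # hypothetical key (e.g. library card number)
    category: str                  # undergraduate, teacher, researcher, ...
    age: int
    major: str
    interests: tuple[str, ...]

@dataclass(frozen=True)
class Book:
    call_number: str
    barcode: str
    title: str
    publisher: str
    publication_date: date
    author: str

@dataclass(frozen=True)
class Loan:
    reader_id: str                 # assumed link to Reader
    barcode: str                   # assumed link to Book
    borrowed_on: date

def transactions_by_reader(loans):
    """Collect each reader's borrowed barcodes into one transaction."""
    grouped = defaultdict(set)
    for loan in loans:
        grouped[loan.reader_id].add(loan.barcode)
    return {reader: frozenset(books) for reader, books in grouped.items()}
```

Treating one reader's borrowing history as one transaction is only one possible choice; a transaction could equally be a single visit or a time window.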
Further, step S102 (preprocess the data) includes integrating, physically or logically, data that differ in format, source, geographical location, and characteristics into a data set with a unified standard.
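As a rough illustration of this integration step, the sketch below merges a CSV export and a list of dictionaries, which use different column names, into one list of records with a unified schema. The source formats, the column names, and the two mapping tables are assumptions made for the example; the text does not name the actual sources.

```python
import csv
import io

# Hypothetical mappings: source column name -> unified column name.
CIRCULATION_MAP = {"card_no": "reader_id", "book_barcode": "barcode", "date": "borrowed_on"}
BRANCH_MAP = {"ReaderID": "reader_id", "Barcode": "barcode", "LoanDate": "borrowed_on"}

def unify(record, mapping):
    """Rename one record's fields to the unified schema and trim whitespace."""
    return {unified: str(record[source]).strip() for source, unified in mapping.items()}

def integrate(csv_text, branch_rows):
    """Merge a CSV export and in-memory rows into one unified record list."""
    rows = [unify(r, CIRCULATION_MAP) for r in csv.DictReader(io.StringIO(csv_text))]
    rows += [unify(r, BRANCH_MAP) for r in branch_rows]
    return rows
```

Keeping the mappings as data rather than code means a further source can be added by supplying one more dictionary.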
Further, step S102 (preprocess the data) includes cleaning data that contain errors and omissions.
The raw data obtained from the data sources inevitably contain assorted errors and omissions. For example, in some tables the library card attribute should be a 14-digit integer, but some records show 0. Such records make up a very small share of the total data, so deleting them does not affect the overall result of data mining; they are therefore simply deleted.
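The library card example can be written as a short filter. This is a sketch under the assumption that records are dictionaries and that the card number is held in a field named `reader_id`; both the record shape and the field name are hypothetical.

```python
import re

# The text states that a valid library card number is a 14-digit integer.
CARD_PATTERN = re.compile(r"\d{14}")

def clean(records):
    """Drop records whose card number is not exactly 14 digits (e.g. '0')."""
    return [
        r for r in records
        if CARD_PATTERN.fullmatch(str(r.get("reader_id", "")))
    ]
```

Deleting is appropriate here only because, as the text notes, the faulty records are a very small fraction of the data; with a larger share, repair or imputation would need to be considered instead.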
Further, step S102 (preprocess the data) also includes standardizing all data in a unified way: finding the common features of the data, then finding a suitable description method and applying reduction and transformation to the data.
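The text does not say which common features or which description method are used. As one hedged illustration, the sketch below reduces a call number to its leading class letters (so that many individual titles collapse into a small number of subject classes) and converts dates in several formats to ISO form. Both functions, the assumed call-number shape, and the list of date formats are assumptions made for the example.

```python
import re
from datetime import datetime

def generalize_call_number(call_number):
    """Reduce a call number to its leading letter class, e.g. 'TP311.13/45' -> 'TP'.
    Assumes call numbers begin with class letters; returns 'UNKNOWN' otherwise."""
    match = re.match(r"[A-Za-z]+", call_number.strip())
    return match.group(0).upper() if match else "UNKNOWN"

def normalize_date(text):
    """Convert a date written in one of a few assumed formats to ISO 8601."""
    for fmt in ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d"):
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")
```

Generalizing items before mining trades detail for density: rules over subject classes have higher support than rules over single titles, but they recommend a class rather than a specific book.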
Further, step S103 (mine the data) includes mining with the grouped Apriori algorithm.
The Apriori algorithm mines association rules from the data in two main steps. First, it iterates repeatedly to compute all frequent itemsets; each frequent itemset must satisfy the condition that its support is greater than or equal to the minimum support threshold set in advance by the user. Second, from these frequent itemsets it generates the rules whose confidence is greater than or equal to the minimum confidence set by the user. Searching for frequent itemsets is the core of Apriori and accounts for the vast majority of the algorithm's computation. Apriori has two shortcomings. First, it scans the database frequently: every iteration scans the database, causing considerable I/O overhead. Second, it generates a large number of potential candidate itemsets.
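The two steps can be sketched in a few lines. This is the plain (ungrouped) Apriori, shown as a baseline, not the patent's implementation. It uses the standard definitions: the support of an itemset is the fraction of transactions containing it, and the confidence of a rule X ⇒ Y is support(X ∪ Y) / support(X). The function names and the representation of transactions as `frozenset` objects are choices made for the example.

```python
from itertools import combinations

def support_counts(transactions, candidates):
    """One full database scan: count the transactions containing each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}

def apriori(transactions, min_support):
    """Step 1: return {itemset: support} for every frequent itemset."""
    n = len(transactions)
    items = {frozenset([i]) for t in transactions for i in t}
    current = {c: k / n for c, k in support_counts(transactions, items).items()
               if k / n >= min_support}
    frequent, size = {}, 1
    while current:
        frequent.update(current)
        size += 1
        prev = list(current)
        # Join: merge pairs of frequent (size-1)-itemsets into size-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == size}
        # Prune: every (size-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, size - 1))}
        counts = support_counts(transactions, candidates)
        current = {c: k / n for c, k in counts.items() if k / n >= min_support}
    return frequent

def association_rules(frequent, min_confidence):
    """Step 2: derive rules (antecedent, consequent, support, confidence)."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, sup, confidence))
    return rules
```

Note that `support_counts` walks the whole transaction list once per iteration, and the join can generate many candidates before pruning; these are the two costs the text identifies as shortcomings.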
There are two directions for improving Apriori: controlling the number of database scans, and controlling the size of the potential candidate sets. We improve the algorithm in both respects to raise its execution efficiency. For the first shortcoming, the database is grouped to reduce the number of records scanned, which lowers I/O overhead. For the second, we prune first and then join: Apriori joins first and then prunes, whereas grouped Apriori does the opposite. This reduces the base set before the join by deleting the non-matching items, so it effectively reduces the number of joins and improves the algorithm's efficiency.
Database grouping works as follows. During the first scan of the database, the occurrences of each item are counted to produce the candidate 1-itemsets C1, and the transaction database D is grouped by the number of items in each transaction: the set of transactions with i items is denoted Di, so D is divided into N groups D1, D2, ..., DN (N being the maximum number of items in a transaction). When the candidate 2-itemsets C2 are generated from the frequent 1-itemsets L1 and each candidate in C2 is counted, the whole database D need not be scanned; only D2 to DN are scanned. By analogy, the number of records scanned decreases with each pass.
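The grouping idea rests on a simple observation: a transaction with fewer than k items cannot contain a k-itemset, so groups D1 to D(k-1) can be skipped when counting k-item candidates. The sketch below illustrates this; the function names are hypothetical, and the returned `scanned` value is included only to make the saving visible.

```python
from collections import defaultdict

def group_by_size(transactions):
    """Partition D into groups D_i, where D_i holds the transactions with i items."""
    groups = defaultdict(list)
    for t in transactions:
        groups[len(t)].append(t)
    return dict(groups)

def count_candidates(groups, candidates, k):
    """Count k-item candidates while scanning only groups D_k .. D_N.
    Returns the counts and the number of transactions actually scanned."""
    counts = {c: 0 for c in candidates}
    scanned = 0
    for size, members in groups.items():
        if size < k:
            continue  # shorter transactions cannot contain a k-itemset
        for t in members:
            scanned += 1
            for c in candidates:
                if c <= t:
                    counts[c] += 1
    return counts, scanned
```

The saving grows with k: in borrowing data where most readers have borrowed only a few titles, the later passes touch a small fraction of the records.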
Grouped Apriori prunes first and then joins. Joining first directly produces many non-frequent subsets; grouped Apriori avoids producing many of them.
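The text does not state which pruning criterion is applied before the join. One criterion that is known to be safe, used here purely as an assumption, relies on a counting argument: a frequent k-itemset has k frequent (k-1)-subsets, and each of its items appears in k-1 of them. An item that occurs in fewer than k-1 of the frequent (k-1)-itemsets therefore cannot belong to any frequent k-itemset, and itemsets containing it can be dropped before joining. The function name `prune_then_join` is hypothetical.

```python
from collections import Counter

def prune_then_join(frequent_prev, k):
    """Build k-item candidates from frequent (k-1)-itemsets, pruning before the join.
    Assumed criterion: drop itemsets containing an item that occurs in fewer
    than k-1 of the frequent (k-1)-itemsets."""
    occurrences = Counter(item for itemset in frequent_prev for item in itemset)
    survivors = [s for s in frequent_prev
                 if all(occurrences[i] >= k - 1 for i in s)]
    # Join only the survivors, so fewer pairs are combined.
    return {a | b for a in survivors for b in survivors if len(a | b) == k}
```

For example, with frequent 2-itemsets {A,B}, {A,C}, {B,C}, {C,D} and k = 3, item D occurs only once, so {C,D} is removed before the join and the candidates {A,C,D} and {B,C,D} are never generated.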
As can be seen from the above technical solution, an embodiment of the present invention provides a method for data mining in university digital libraries. It addresses the following problem: universities have large student populations and many complex school systems; each independent subsystem has its own database, so its information is locked inside that subsystem; the subsystem databases cannot interact or share information and can offer only simple query, insert, update, delete, and statistics functions. The subsystems thus become information islands and application islands, so that school resources are seriously wasted and cannot be used effectively and reasonably. The object of the present invention is to provide a method for data mining in university digital libraries that avoids these problems by using the large volume of borrowing-related data generated in the digital library, mining from it the useful information of interest, and providing a decision-making basis for teaching management and the optimal allocation of resources.
So far, those skilled in the art will recognize that, although multiple exemplary embodiments of the present invention have been shown and described in detail herein, many other variations or modifications consistent with the principles of the invention can be directly determined or derived from this disclosure without departing from the spirit and scope of the invention. The scope of the invention should therefore be understood and deemed to cover all such other variations or modifications.

Claims (8)

1. A method for data mining in university digital libraries, characterized in that the method comprises:
Step S101: obtain the information base data of the library's borrowing history;
Step S102: preprocess the data;
Step S103: mine the data;
Step S104: store the results in the association database;
Step S105: output book recommendation information.
2. The method according to claim 1, characterized in that, in step S101 (obtain the information base data of the library's borrowing history), the information base comprises a reader information base and a book information base.
3. The method according to claim 2, characterized in that the reader information base includes: the reader's category, age, major, and hobbies and interests.
4. The method according to claim 2, characterized in that the book information base includes: the book's call number, bar code, title, publisher, publication date, and author.
5. The method according to claim 1, characterized in that step S102 (preprocess the data) includes integrating, physically or logically, data that differ in format, source, geographical location, and characteristics into a data set with a unified standard.
6. The method according to claim 1, characterized in that step S102 (preprocess the data) includes cleaning data that contain errors and omissions.
7. The method according to claim 1, characterized in that step S102 (preprocess the data) further includes standardizing all data in a unified way: finding the common features of the data, then finding a suitable description method and applying reduction and transformation to the data.
8. The method according to claim 1, characterized in that step S103 (mine the data) includes mining with the grouped Apriori algorithm.
CN201711077156.XA 2017-11-06 2017-11-06 A kind of method of University Digital Library data mining Pending CN107741993A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711077156.XA CN107741993A (en) 2017-11-06 2017-11-06 A kind of method of University Digital Library data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711077156.XA CN107741993A (en) 2017-11-06 2017-11-06 A kind of method of University Digital Library data mining

Publications (1)

Publication Number Publication Date
CN107741993A true CN107741993A (en) 2018-02-27

Family

ID=61234056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711077156.XA Pending CN107741993A (en) 2017-11-06 2017-11-06 A kind of method of University Digital Library data mining

Country Status (1)

Country Link
CN (1) CN107741993A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183727A (en) * 2014-05-29 2015-12-23 上海研深信息科技有限公司 Method and system for recommending book
CN105760547A (en) * 2016-03-16 2016-07-13 中山大学 Book recommendation method and system based on user clustering
CN106202184A (en) * 2016-06-27 2016-12-07 华中科技大学 A kind of books personalized recommendation method towards libraries of the universities and system
CN106649583A (en) * 2016-11-17 2017-05-10 安徽华博胜讯信息科技股份有限公司 Book borrowing data association rule analysis method based on SAS
CN107153850A (en) * 2016-03-04 2017-09-12 上海阿法迪智能标签系统技术有限公司 The acquisition methods and system of books relevant information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
司贯中等: "分组 Apriori 在图书借阅系统中的应用研究", 《微处理机》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344320A (en) * 2018-08-03 2019-02-15 昆明理工大学 A kind of book recommendation method based on Apriori
CN115794801A (en) * 2022-12-23 2023-03-14 东南大学 Data analysis method for mining chain relation of automatic driving accident cause
CN115794801B (en) * 2022-12-23 2023-08-15 东南大学 Data analysis method for mining cause chain relation of automatic driving accidents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180227)