CN106682072A - Knowledge management based data mining method for digital archives - Google Patents

Publication number
CN106682072A
Authority
CN
China
Prior art keywords
data
knowledge
archives
digital archives
digital
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611013730.0A
Other languages
Chinese (zh)
Inventor
王学杰
杨乃平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Huabo Shengxun Information Technologies Co Ltd
Original Assignee
Anhui Huabo Shengxun Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-11-17
Filing date
2016-11-17
Publication date
2017-05-17
Application filed by Anhui Huabo Shengxun Information Technologies Co Ltd
Priority to CN201611013730.0A
Publication of CN106682072A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a knowledge management based data mining method for digital archives. The method creates the conditions for digital archives to implement knowledge management, and is a method and strategy for coordinating and managing a cluster of information processing technologies. It is based on networks and digital resources and on the coordination and cooperation of multiple information technologies, uses the implementation of mining algorithms and mining models as its means, organizes and discovers the knowledge resources that already exist in the digital archives, and aims to provide management objects for knowledge management, so that the digital archives use knowledge effectively and achieve knowledge innovation.

Description

A knowledge management based data mining method for digital archives
Technical field
The present invention relates to the field of knowledge management, and in particular to a knowledge management based data mining method for digital archives.
Background
Digital archives are the new organizational form that traditional physical archives take in the information age, and the inevitable result of their continued innovation and development. They are key to meeting the challenges of the knowledge economy, extending the functions of traditional physical archives, satisfying user needs, and providing personalized and diversified services; they are also a new opportunity to raise society's archival awareness. How to refine and mine, from the vast digitized resources of digital archives, valuable and effective information that supports knowledge accumulation and knowledge innovation is therefore an important problem for the future construction of digital archives. Data mining is an effective way to solve this problem. It is a current focus of computer science, and its results are already widely applied in library and information science.
Data mining is a very broad interdisciplinary field that originated in computer science. Although it has been applied in many fields, and practice in the library and information communities has fully verified its value, the archival community still regards data mining as an abstruse technology and theory, and many archivists have only a vague notion of the concept. So what is data mining? Data mining is the process of extracting, from large amounts of incomplete, noisy, fuzzy, and random data, the information and knowledge that is hidden in the data, not known to people in advance, and yet potentially useful. The purpose of this process is to discover the "knowledge gold mine" hidden in the silt of massive data, so it is more apt to define data mining as "knowledge mining in data". For this reason data mining is also called knowledge mining, knowledge extraction, and so on.
According to the mining task, data mining methods can be divided into several types, including concept description, association analysis, classification analysis, cluster analysis, and deviation detection, as follows:
Concept description gathers a class of interrelated data through analysis and comparison, summarizes the relevant features of the objects in that class, and describes the large body of information about the class; the descriptions are abstract and meaningful. There are two types: characteristic description and discriminative description.
1) Characteristic description describes what the objects of a class have in common. For example, the archive database of an archive holds a large amount of basic user information, such as name, age, occupation, and usage preferences. Describing the historical researchers among these users might well yield the following result: they are mainly university teachers and students, and their purpose is to compile local chronicles and to write historical research articles.
2) Discriminative description describes the differences between two or more classes of objects. For example, comparing the features of enterprise users with those of historical researchers might yield the following rule: enterprise users mainly use archive information on production management and on research and development management, with the aim of obtaining economic and social benefits.
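The two kinds of concept description can be illustrated with a minimal Python sketch. The user records, the field names (`group`, `occupation`, `purpose`), and the "most common value" summary are all assumptions made for illustration; the patent does not specify a data model or an algorithm.

```python
from collections import Counter

# Hypothetical user records; field names and values are illustrative only.
USERS = [
    {"group": "historian", "occupation": "university teacher", "purpose": "research article"},
    {"group": "historian", "occupation": "student", "purpose": "research article"},
    {"group": "historian", "occupation": "university teacher", "purpose": "local chronicle"},
    {"group": "enterprise", "occupation": "engineer", "purpose": "production management"},
    {"group": "enterprise", "occupation": "manager", "purpose": "production management"},
]


def characterize(records, group, fields):
    """Characteristic description: the most common value of each field in one group."""
    members = [r for r in records if r["group"] == group]
    return {f: Counter(r[f] for r in members).most_common(1)[0][0] for f in fields}


def discriminate(records, group_a, group_b, fields):
    """Discriminative description: fields whose dominant value differs between groups."""
    a = characterize(records, group_a, fields)
    b = characterize(records, group_b, fields)
    return {f: (a[f], b[f]) for f in fields if a[f] != b[f]}
```

Calling `characterize(USERS, "historian", ["occupation", "purpose"])` summarizes what historical researchers have in common, while `discriminate(USERS, "historian", "enterprise", ["purpose"])` reports how the two groups differ.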
Association analysis describes the correlations that exist between data items in a database, that is, it mines the relationships hidden between data items. Specifically, if two or more data items are associated in some way, one of them can be predicted from the others. Association analysis can find the associations between the different kinds of archive information that users consult, and so analyze and predict users' usage patterns.
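One common way to quantify such associations is with the support and confidence of a rule, as in association rule mining. The sketch below is an illustration under that assumption; the patent names no specific measure, and the session data and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of seeing `consequent` given `antecedent`."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical usage sessions: the archive categories consulted together.
SESSIONS = [
    {"land deeds", "maps"},
    {"land deeds", "maps", "tax rolls"},
    {"land deeds"},
    {"tax rolls"},
]
```

With this data, `confidence(SESSIONS, {"maps"}, {"land deeds"})` is 1.0, since every session that consulted maps also consulted land deeds; a rule like this is the kind of association that could be used to predict what a user will request next.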
Classification analysis gathers the data in a database together in an orderly way, which helps people grasp things comprehensively. It can be divided into classification of structured data, such as data in a relational database, and classification of unstructured data, such as text data. The process is as follows: the data in a data set are classified using a set of features that distinguish the classes; a model describing these data is then found, and data are assigned to the different classes according to this model, which can also be used to predict unknown data. Using the data in an existing user archive database, classification analysis can reveal the relationship between user characteristics and usage behaviour, and classify the data by how strongly they influence user behaviour, in order to predict future user behaviour.
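The "learn a model from labelled data, then predict unknown data" process can be sketched with a nearest-centroid classifier. This technique is chosen here only because it is short; the patent does not name a classification algorithm, and the features and labels below are hypothetical.

```python
from collections import defaultdict


def train_centroids(samples):
    """Build the model: one mean feature vector (centroid) per class label.

    `samples` is a list of (feature_vector, label) pairs.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, features):
    """Assign an unseen feature vector to the class with the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical features: (visits per month, share of requests for production files).
TRAINING = [
    ((2, 0.1), "historian"),
    ((3, 0.2), "historian"),
    ((12, 0.9), "enterprise"),
    ((10, 0.8), "enterprise"),
]
```

After `model = train_centroids(TRAINING)`, a call such as `predict(model, (4, 0.3))` assigns a new user to the class whose typical behaviour is closest.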
Cluster analysis is the process of dividing the data in a database into different classes. It differs from classification analysis in that it places data into classes without a classification model known in advance. The aim of clustering is to partition the data set reasonably on the principle of maximizing similarity within a class and minimizing similarity between classes; put simply, differences within a class are minimized and differences between classes are maximized, so that similar data are grouped together and certain rules can be derived.
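A minimal sketch of this principle is k-means, which repeatedly assigns each point to its nearest centroid and then moves each centroid to the mean of its members. k-means is used here as an assumed example; the patent does not specify a clustering algorithm. The initial centroids are simply the first `k` points so that the run is deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; returns (assignment, centroids)."""
    centroids = [list(p) for p in points[:k]]  # deterministic start: first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Hypothetical two-feature usage profiles forming two obvious groups.
POINTS = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
```

Unlike the classification case, no labels are supplied: `kmeans(POINTS, 2)` discovers the two groups from the distances alone.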
Deviation detection is the process of finding abnormal cases in a database and analyzing the deviating data. Its focus is on finding abnormal changes in the data; such changes may be caused by errors, but are more likely the result of natural changes such as data updates. The value of deviation detection is that it can effectively exclude a large amount of irrelevant data. For example, before producing a compilation and research result, an archive first searches its user information database and combines the result with the existing resources in the archive database; it then uses data mining techniques and a model to exclude irrelevant users, takes the remaining users as the focus, and formulates a targeted compilation and research strategy.
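A simple way to flag deviating values is a z-score test: values that lie more than a chosen number of standard deviations from the mean are reported. This is an assumed stand-in for the unspecified detection model in the patent; the threshold of 2.0 and the sample counts are illustrative.

```python
from statistics import mean, pstdev


def deviations(values, threshold=2.0):
    """Return the indices of values whose z-score exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing deviates
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical daily access counts for one record series; the last day is unusual.
DAILY_ACCESSES = [10, 12, 11, 9, 10, 11, 60]
```

The flagged indices can then be inspected to decide whether the deviation is an error, a natural change, or data that should be excluded from further analysis.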
Data mining in knowledge management based digital archives must first establish the relationship between digital archive resources, knowledge management, and data mining. Organizing and discovering the knowledge resources of digital archives is the basis for modern, scientific management and for fast, high-quality service. Implementing knowledge management in digital archives is an inevitable requirement for meeting the challenges of the knowledge economy, maximizing the potential of the archives' knowledge resources, and ultimately achieving knowledge innovation. Digital archives that do not implement knowledge management cannot meet the needs of future development, and knowledge management that lacks objects to manage is like a river without a source. Data mining is an effective way to organize and discover knowledge resources in digital archives; it creates the conditions for digital archives to implement knowledge management and is the connecting stage that joins the two seamlessly. Data mining here should not be regarded as a pure information processing technology: it is a method and strategy for coordinating and managing a cluster of information processing technologies. Data mining in knowledge management based digital archives is a process that is based on networks and digitized resources, relies on the coordination and cooperation of multiple information technologies, uses the implementation of mining algorithms and mining models as its means, organizes and discovers the knowledge resources that already exist in the digital archives, and aims to provide management objects for knowledge management, so that digital archives use knowledge effectively and achieve knowledge innovation.
The main mining objects in knowledge management based digital archives include:
1) Fixed resources in digital archives. This is the explicit knowledge in digital archives, that is, knowledge recorded on a material carrier. It includes digitized holdings, existing electronic files, retrieval tools, compilation and research results, the laws, regulations, rules, and industry standards relating to digital archive work, the research results and technical data produced in the course of digital archive construction, and other relevant knowledge that contributes to the development of digital archives.
2) Intellectual resources in digital archives. This is the implicit (tacit) knowledge in digital archives: the large body of uncodified intellectual resources held in the minds of archive managers and staff, policy and regulation researchers, information technology personnel, external collaborators, and others. It includes management methods, computer processing techniques, problem-solving ability, and so on. Because people are the core of knowledge management and its most active factor, mining this knowledge is also a focus of knowledge mining in digital archives.
3) User usage behaviour information, which has two aspects: usage information and feedback information. Usage information is produced when users carry out concrete usage actions to solve practical problems and to meet scientific, research, and production needs. It includes the content accessed, access frequency, access time, and so on, and reflects users' personalized and diversified demand for digitized resources and their usage patterns. Feedback information covers the problems and situations, requirements, suggestions, evaluations, and benefits that archive users encounter in the continuous activity of using archives. Mining these data makes it possible to analyze and predict future usage trends and to propose management decisions on that basis, providing a basis for improving the service level of digital archives.
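The usage information named in item 3 (content accessed, access frequency, access time) could be summarized from an access log roughly as follows. The log layout, the identifiers, and the function names are assumptions for illustration; the patent does not describe a log format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access log: (user id, record id, ISO 8601 timestamp).
ACCESS_LOG = [
    ("u1", "deed-1901", "2016-03-01T09:15"),
    ("u2", "deed-1901", "2016-03-01T09:40"),
    ("u1", "map-0042", "2016-03-02T14:05"),
    ("u3", "deed-1901", "2016-03-03T09:55"),
]


def access_frequency(log):
    """Count how often each record was accessed."""
    return Counter(record for _, record, _ in log)


def accesses_by_hour(log):
    """Count accesses per hour of day to show when the archive is used."""
    return Counter(datetime.fromisoformat(ts).hour for _, _, ts in log)
```

Summaries like these are one possible input to the trend analysis and management decisions that the text describes.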
Summary of the invention
The object of the invention is to provide a knowledge management based data mining method for digital archives.
The object of the invention can be achieved through the following technical solution:
A knowledge management based data mining method for digital archives, comprising the following steps:
Step 1, determine the topic: determine the data target to be mined;
Step 2, define the requirements: define the topic determined in step 1 and make the requirements and purpose of the data mining explicit;
Step 3, collect data: while the topic is being defined, collect and extract explicit and implicit knowledge from the archive database, and apply concept description to it to summarize the features relevant to the requirements;
Step 4, analyze and form a result: through cluster analysis, form different requirement classification models according to similarity and difference, and place the data into the different classes; combine the requirement classification models with user usage information, perform difference analysis and deviation detection, exclude the large amount of irrelevant data, and form a mining result;
Step 5, evaluate the mining result: the mining result may contain irrelevant data or may fail to meet the requirements; if it does not meet the mining requirements and purpose, return to step 3 and repeat the mining process; otherwise, go to step 6;
Step 6, apply the result: once evaluation shows that the mining result meets the data mining requirements, it can be used by the knowledge management of the digital archives and is then added to the original database, achieving knowledge innovation in the archives.
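The control flow of the six steps, including the return from step 5 to step 3, can be sketched as a loop. The function name `mine_topic`, the callables passed in, and the `max_rounds` limit are all assumptions: the patent describes the steps only in prose and does not say how many repetitions are allowed.

```python
def mine_topic(topic, collect, analyse, evaluate, knowledge_base, max_rounds=5):
    """Run the six-step workflow; return the accepted result or None.

    `collect`, `analyse`, and `evaluate` are caller-supplied callables that stand in
    for steps 3, 4, and 5. `max_rounds` is an added safeguard against endless loops.
    """
    requirement = {"topic": topic}                # steps 1 and 2: topic and requirements
    for round_no in range(1, max_rounds + 1):
        data = collect(requirement, round_no)     # step 3: collect explicit/implicit knowledge
        result = analyse(data)                    # step 4: cluster, compare, drop irrelevant data
        if evaluate(result, requirement):         # step 5: does the result meet the requirements?
            knowledge_base.append(result)         # step 6: add the result to the original database
            return result
    return None                                   # no acceptable result within max_rounds
```

A caller would supply its own collection, analysis, and evaluation functions; each failed evaluation sends the loop back to data collection, mirroring the return to step 3.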
Beneficial effects of the invention:
The knowledge management based data mining method for digital archives provided by the invention creates the conditions for digital archives to implement knowledge management. The invention is a method and strategy for coordinating and managing a cluster of information processing technologies. The data mining of the invention is based on networks and digitized resources, relies on the coordination and cooperation of multiple information technologies, uses the implementation of mining algorithms and mining models as its means, organizes and discovers the knowledge resources that already exist in the digital archives, and aims to provide management objects for knowledge management, so that digital archives use knowledge effectively and achieve knowledge innovation.
Description of the drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the invention.
Detailed description of the embodiments
The core of the invention is to provide a knowledge management based data mining method for digital archives.
To help those skilled in the art better understand the solution, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the invention.
As shown in Fig. 1, the invention provides a knowledge management based data mining method for digital archives, comprising the following steps:
Step 1, determine the topic: determine the data target to be mined.
Step 2, define the requirements: define the topic determined in step 1 and make the requirements and purpose of the data mining explicit.
Step 3, collect data: while the topic is being defined, collect and extract explicit and implicit knowledge from the archive database, and apply concept description to it to summarize the features relevant to the requirements.
Step 4, analyze and form a result: through cluster analysis, form different requirement classification models according to similarity and difference, and place the data into the different classes; combine the requirement classification models with user usage information, perform difference analysis and deviation detection, exclude the large amount of irrelevant data, and form a mining result.
Step 5, evaluate the mining result: the mining result may contain irrelevant data or may fail to meet the requirements; if it does not meet the mining requirements and purpose, return to step 3 and repeat the mining process; otherwise, go to step 6.
Step 6, apply the result: once evaluation shows that the mining result meets the data mining requirements, it can be used by the knowledge management of the digital archives and is then added to the original database, achieving knowledge innovation in the archives.
The knowledge management based data mining method for digital archives provided by the invention creates the conditions for digital archives to implement knowledge management. The invention is a method and strategy for coordinating and managing a cluster of information processing technologies. The data mining of the invention is based on networks and digitized resources, relies on the coordination and cooperation of multiple information technologies, uses the implementation of mining algorithms and mining models as its means, organizes and discovers the knowledge resources that already exist in the digital archives, and aims to provide management objects for knowledge management, so that digital archives use knowledge effectively and achieve knowledge innovation.
The above is only an example and explanation of the concept of the invention. Those skilled in the art may make various modifications or supplements to the specific embodiments described, or substitute them in similar ways; provided these do not depart from the concept of the invention or go beyond the scope defined by the claims, they all fall within the protection scope of the invention.

Claims (1)

1. A knowledge management based data mining method for digital archives, characterized by comprising the following steps:
Step 1, determine the topic: determine the data target to be mined;
Step 2, define the requirements: define the topic determined in step 1 and make the requirements and purpose of the data mining explicit;
Step 3, collect data: while the topic is being defined, collect and extract explicit and implicit knowledge from the archive database, and apply concept description to it to summarize the features relevant to the requirements;
Step 4, analyze and form a result: through cluster analysis, form different requirement classification models according to similarity and difference, and place the data into the different classes; combine the requirement classification models with user usage information, perform difference analysis and deviation detection, exclude the large amount of irrelevant data, and form a mining result;
Step 5, evaluate the mining result: the mining result may contain irrelevant data or may fail to meet the requirements; if it does not meet the mining requirements and purpose, return to step 3 and repeat the mining process; otherwise, go to step 6;
Step 6, apply the result: once evaluation shows that the mining result meets the data mining requirements, it can be used by the knowledge management of the digital archives and is then added to the original database, achieving knowledge innovation in the archives.
CN201611013730.0A 2016-11-17 2016-11-17 Knowledge management based data mining method for digital archives Pending CN106682072A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611013730.0A CN106682072A (en) 2016-11-17 2016-11-17 Knowledge management based data mining method for digital archives

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611013730.0A CN106682072A (en) 2016-11-17 2016-11-17 Knowledge management based data mining method for digital archives

Publications (1)

Publication Number Publication Date
CN106682072A true CN106682072A (en) 2017-05-17

Family

ID=58839685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611013730.0A Pending CN106682072A (en) 2016-11-17 2016-11-17 Knowledge management based data mining method for digital archives

Country Status (1)

Country Link
CN (1) CN106682072A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506404A (en) * 2017-08-03 2017-12-22 广东省科技基础条件平台中心 A kind of scientific research public information monitoring method
CN107886650A (en) * 2017-12-17 2018-04-06 江西睿创科技有限公司 A kind of people's livelihood archives self-service query integrated machine system
CN113610194A (en) * 2021-09-09 2021-11-05 重庆数字城市科技有限公司 Automatic classification method for digital files
CN117251587A (en) * 2023-11-17 2023-12-19 北京因朵数智档案科技产业发展有限公司 Intelligent information mining method for digital archives

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘志勇等: "关联规则数据挖掘在图书馆个性化服务中的应用研究", 《电子设计工程》 *
罗艳等: "一个数字档案馆中的数据挖掘系统工作流程", 《广西科学院学报》 *
黄小忠等: "基于知识管理的数字档案馆中的数据挖掘", 《档案学通讯》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506404A (en) * 2017-08-03 2017-12-22 广东省科技基础条件平台中心 A kind of scientific research public information monitoring method
CN107886650A (en) * 2017-12-17 2018-04-06 江西睿创科技有限公司 A kind of people's livelihood archives self-service query integrated machine system
CN113610194A (en) * 2021-09-09 2021-11-05 重庆数字城市科技有限公司 Automatic classification method for digital files
CN113610194B (en) * 2021-09-09 2023-08-11 重庆数字城市科技有限公司 Automatic classification method for digital files
CN117251587A (en) * 2023-11-17 2023-12-19 北京因朵数智档案科技产业发展有限公司 Intelligent information mining method for digital archives

Similar Documents

Publication Publication Date Title
Holdaway Harness oil and gas big data with analytics: Optimize exploration and production with data-driven models
Silwattananusarn et al. Data mining and its applications for knowledge management: a literature review from 2007 to 2012
Rogalewicz et al. Methodologies of knowledge discovery from data and data mining methods in mechanical engineering
CN106682072A (en) Knowledge management based data mining method for digital archives
Vo et al. A multi-core approach to efficiently mining high-utility itemsets in dynamic profit databases
Nohuddin et al. A case study in knowledge acquisition for logistic cargo distribution data mining framework
Hikmawati et al. How to determine minimum support in association rule
Vu et al. FTKHUIM: a fast and efficient method for mining top-K high-utility itemsets
Jabbour et al. On maximal frequent itemsets mining with constraints
El Wakil et al. Data management for construction processes using fuzzy approach
Kachaoui et al. Challenges and benefits of deploying big data storage solution
Singh et al. Knowledge based retrieval scheme from big data for aviation industry
Shao et al. Mining range associations for classification and characterization
Wang et al. Distinguishing investment changes in metro construction project based on a factor space algorithm
Chawla et al. Reverse apriori approach—an effective association rule mining algorithm
Popov et al. Expert system of selection of competitive options of systems of underground development of ore deposits
Liu et al. Mining top-k high average-utility itemsets based on breadth-first search
Huchard et al. Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures
Ertek et al. Data mining of project management data: An analysis of applied research studies
Martins et al. Data Collection, Storage, and Retrieval
Miri et al. Association Rules Mining for STO Dataset of HSES Knowledge Portal System
Pullagura et al. Crime Rate Prediction in Tamil Nadu Using Machine Learning
CN118093546A (en) Offshore oil and gas field well main data management and storage method and system
Gajera et al. Improvisation in frequent pattern mining technique
Hu et al. Research on Readers’ Behavior of College Libraries Based on Data Mining Technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2017-05-17)