CN106682072A - Knowledge management based data mining method for digital archives - Google Patents
Knowledge management based data mining method for digital archives
- Publication number
- CN106682072A
- Authority
- CN
- China
- Prior art keywords
- data
- knowledge
- archives
- digital archives
- digital
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Fuzzy Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data mining method for digital archives based on knowledge management. The method creates the conditions for digital archives to implement knowledge management and is a method and strategy for coordinating and managing a cluster of information processing technologies. It is founded on networks and digitized resources, relies on the coordination and cooperation of multiple information technologies, takes the implementation of mining algorithms and mining models as its means, and aims to organize and discover the knowledge resources already present in digital archives and to provide management objects for implementing knowledge management, so that digital archives use knowledge effectively and achieve knowledge innovation.
Description
Technical field
The present invention relates to the field of information management, and in particular to a data mining method for digital archives based on knowledge management.
Background art
Digital archives are the new organizational form that traditional physical archives take in the information age and the inevitable result of their continuous innovation and development. They are the key to meeting the challenges of the knowledge economy, extending the functions of traditional physical archives, satisfying user needs, and providing personalized and diversified services, and they also offer a new opportunity to raise public awareness of archives. An important problem for the future construction of digital archives is therefore how to refine and mine, from the vast volume of digitized resources they hold, valuable and effective information that supports knowledge accumulation and knowledge innovation with data. Data mining technology is an effective way to solve this problem. Data mining is a current focus of the computing field, and its results are also widely applied in library and information science.
Data mining is a very broad interdisciplinary field that originated in computer science. It has been applied in many fields, and practice in the library and information community has fully verified its value, yet in the archival community it is still regarded as an abstruse technology and theory, and many archivists have only a vague understanding of the concept. So what is data mining? Data mining is the process of extracting, from large amounts of incomplete, noisy, fuzzy, and random data, the information and knowledge that is implicit in the data, not known in advance, and potentially useful. The purpose of this process is to discover the "gold mine of knowledge" hidden in the silt of massive data, so it is more appropriate to define data mining as "knowledge mining in data". For this reason data mining is also referred to as knowledge mining, knowledge extraction, and so on.
According to the mining task, data mining methods can be divided into several types, including concept description, association analysis, classification analysis, cluster analysis, and deviation detection, as follows:
Concept description gathers a class of interrelated data through analysis and comparison, summarizes the relevant features of that class of objects, and describes the bulk of information about the class. These descriptions are abstract and meaningful. There are two types: characteristic description and discriminant description.
1) Characteristic description describes what the objects of a class have in common. For example, the archive database of an archive holds a large amount of basic user information, such as name, age, occupation, and usage preferences. Describing historical researchers is likely to yield the following result: they are mainly university teachers and students whose purpose is compiling local chronicles and writing historical research articles.
2) Discriminant description describes the differences between two or more classes of objects. For example, comparing the features of enterprise users with those of historical researchers may yield the following rule: enterprise users mainly use archival information on production management and research and development management, with the purpose of obtaining economic and social benefits.
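The two kinds of concept description above can be illustrated with a minimal Python sketch. The user records, the field names (`group`, `occupation`, `purpose`), and the helper functions are hypothetical and chosen only for illustration; the patent does not prescribe a data layout or an algorithm.

```python
from collections import Counter

# Hypothetical user records; the field names are assumptions for illustration.
users = [
    {"group": "historical researcher", "occupation": "university teacher",
     "purpose": "compiling local chronicles"},
    {"group": "historical researcher", "occupation": "student",
     "purpose": "writing research articles"},
    {"group": "historical researcher", "occupation": "university teacher",
     "purpose": "writing research articles"},
    {"group": "enterprise user", "occupation": "engineer",
     "purpose": "production management"},
    {"group": "enterprise user", "occupation": "manager",
     "purpose": "R&D management"},
    {"group": "enterprise user", "occupation": "engineer",
     "purpose": "R&D management"},
]


def characterize(records, group, attribute):
    """Characteristic description: count the values of one attribute in a group."""
    return Counter(r[attribute] for r in records if r["group"] == group)


def discriminate(records, group_a, group_b, attribute):
    """Discriminant description: values seen in only one of the two groups."""
    a = set(characterize(records, group_a, attribute))
    b = set(characterize(records, group_b, attribute))
    return a - b, b - a


occupations = characterize(users, "historical researcher", "occupation")
only_enterprise, only_history = discriminate(
    users, "enterprise user", "historical researcher", "purpose")
```

Here `occupations` summarizes what historical researchers have in common, while `only_enterprise` and `only_history` list the purposes that distinguish the two groups.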
Association analysis describes the correlations that exist between data items in a database, that is, it mines the relationships hidden between data items. Specifically, if two or more data items are associated in some way, one of them can be predicted from the others. Association analysis can discover associations between the different archival information that users consult, and so analyze and predict users' usage patterns.
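As a rough illustration of association analysis, the sketch below computes the standard support and confidence measures for pairwise rules over sets of archive categories consulted together. The session data, the thresholds, and the function names are assumptions made for this example and do not come from the patent.

```python
from itertools import combinations

# Hypothetical sessions: the archive categories one user consulted together.
sessions = [
    {"personnel files", "land deeds"},
    {"personnel files", "land deeds", "local chronicles"},
    {"land deeds", "local chronicles"},
    {"personnel files", "land deeds"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (x, y, support, confidence) for every rule x -> y that passes."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            c = confidence({x}, {y}, transactions)
            if c >= min_confidence:
                rules.append((x, y, s, c))
    return rules


rules = pair_rules(sessions)
```

With these thresholds, a rule such as "personnel files implies land deeds" is kept because every session containing the first category also contains the second.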
Classification analysis groups the data in a database in an orderly way, which helps people grasp things comprehensively. It can be divided into classification of structured data, such as data in a relational database, and classification of unstructured data, such as text data. The process is as follows: the data in a data set are classified into different categories according to a set of features, a model that describes these data is found, and data are assigned to the categories according to the model, which can then be used to predict unknown data. Using the data in an existing user archive database, classification analysis can reveal the relationship between user characteristics and usage behavior and classify the data by how strongly each characteristic influences user behavior, in order to predict future user behavior.
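The train-then-predict process described above can be sketched with a nearest-centroid classifier, chosen here only because it is short; the patent does not name a classification algorithm. The two features (visits per month, share of requests for historical holdings) and all values are hypothetical.

```python
from math import dist
from statistics import fmean

# Hypothetical labelled users: (visits per month, share of historical requests).
training = [
    ((12.0, 0.9), "historical researcher"),
    ((10.0, 0.8), "historical researcher"),
    ((3.0, 0.1), "enterprise user"),
    ((2.0, 0.2), "enterprise user"),
]


def fit_centroids(samples):
    """Build the model: one mean feature vector per known category."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(fmean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Assign an unknown record to the category with the closest centroid."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


model = fit_centroids(training)
new_user_category = predict(model, (11.0, 0.7))
```

In practice the features would need to be scaled to comparable ranges before distances are compared; that step is omitted to keep the sketch small.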
Cluster analysis is the process of dividing the data in a database into different data classes. Unlike classification analysis, it places data into classes without a previously known classification model. The purpose of clustering is to partition the data set reasonably according to the principle of maximizing similarity within a class and minimizing similarity between classes. Put simply, differences within a class are minimized and differences between classes are maximized, so that similar data are organized together and certain rules can be derived.
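One common way to pursue the "small differences within a class, large differences between classes" principle is k-means clustering. The sketch below is a plain-Python version with a deterministic farthest-first initialization; the choice of k-means, the initialization, and the sample points are assumptions for illustration, not the method fixed by the patent.

```python
from math import dist
from statistics import fmean


def farthest_first(points, k):
    """Pick k starting centroids that are spread far apart (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def k_means(points, k, iterations=20):
    """Group points so that each one belongs to its nearest centroid."""
    centroids = farthest_first(points, k)
    assignment = []
    for _ in range(iterations):
        # Assign every point to the closest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(fmean(col) for col in zip(*members))
    return centroids, assignment


# Hypothetical two-dimensional usage features for six users.
points = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]
centroids, assignment = k_means(points, k=2)
```

No class labels are supplied: the two groups emerge from the distances alone, which is the difference from classification analysis noted above.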
Deviation detection is the process of finding abnormal conditions in a database and analyzing the deviating data. Its focus is on discovering abnormal changes in the data. A data change in the database may be caused by an error, but is more likely the result of a natural change such as a data update. The significance of deviation detection is that it can effectively exclude large amounts of irrelevant data. For example, before producing a compilation and research result, an archive first searches its user information database, combines the findings with the existing resources in its archive database, uses data mining with a model to exclude irrelevant users, takes the remaining users as the focus, and formulates a targeted compilation and research strategy.
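Deviation detection can be sketched with a simple robust rule: flag values whose distance from the median exceeds a multiple of the median absolute deviation. The rule, the threshold of 3, and the daily access counts are assumptions for illustration. The function only flags candidates; deciding whether a flagged value is an error or a natural change is left to the analyst.

```python
from statistics import median


def deviations(values, threshold=3.0):
    """Return the indexes of values that deviate strongly from the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread at all, so nothing can be called a deviation
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


# Hypothetical number of accesses to one archive series on eight days.
daily_access = [20, 22, 19, 21, 23, 20, 95, 18]
flagged = deviations(daily_access)
```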
Data mining in digital archives based on knowledge management must first clarify the relationship among digital archive resources, knowledge management, and data mining. Organizing and discovering the knowledge resources of digital archives is the basis on which digital archives achieve modern scientific management and provide fast, high-quality service. Implementing knowledge management in digital archives is an inevitable requirement for meeting the challenges of the knowledge economy, maximizing the potential of their knowledge resources, and ultimately achieving knowledge innovation. Digital archives that do not implement knowledge management cannot meet the needs of future development, and knowledge management that lacks objects to manage is like water without a source. Data mining is an effective way to organize and discover knowledge resources in digital archives. It creates the conditions for digital archives to implement knowledge management and is the connecting stage that links the two seamlessly. Here data mining should not be regarded purely as an information processing technology: it is a method and strategy for coordinating and managing a cluster of information processing technologies. Data mining in digital archives based on knowledge management is a process that is founded on networks and digitized resources, relies on the coordination and cooperation of multiple information technologies, takes the implementation of mining algorithms and mining models as its means, and aims to organize and discover the knowledge resources that already exist in digital archives and to provide management objects for the implementation of knowledge management, so that digital archives use knowledge effectively and achieve knowledge innovation.
The main mining objects in digital archives based on knowledge management include:
1) Fixed resources in digital archives. This is the explicit knowledge present in digital archives, that is, knowledge recorded on a material carrier, including: digitized collection resources, existing electronic records, retrieval tools, compilation and research results, the laws, regulations, rules, and industry standards related to digital archive work, the research results and technical materials produced in the course of digital archive construction, and other relevant knowledge that contributes to the development of digital archives.
2) Intellectual resources in digital archives. This is the implicit knowledge present in digital archives: the large body of uncodified intellectual resources stored in the minds of archive managers, policy and regulation researchers, information technology staff, and external collaborators, including management methods, computer processing techniques, and problem-solving ability. Because people are the core of knowledge management and its most active factor, mining this part of knowledge is also the focus of knowledge mining in digital archives.
3) User usage behavior information. This covers two aspects: usage information and feedback information. Usage information is the information generated when users carry out concrete usage behavior in order to solve practical problems and meet needs in science, research, production, and the like. It includes accessed content, access frequency, and access time, and reflects users' personalized and diversified needs for digitized resources and their patterns of use. Feedback information consists of the problems and circumstances that archive users encounter in the ongoing activity of using archives, together with their requirements, suggestions, evaluations, and the benefits obtained. Mining these data makes it possible to analyze and predict future trends in user usage and to propose management decisions on that basis, providing a foundation for improving the service level of digital archives.
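The usage information named in item 3 (accessed content, access frequency, access time) can be aggregated from an access log in a few lines. The log layout, identifiers, and timestamps below are hypothetical; a real archive system would supply its own record format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access log: (user id, accessed holding, ISO timestamp).
access_log = [
    ("user-1", "fonds-A", "2016-03-01T09:15"),
    ("user-2", "fonds-A", "2016-03-01T09:40"),
    ("user-1", "fonds-B", "2016-03-02T14:05"),
    ("user-3", "fonds-A", "2016-03-03T09:30"),
]


def usage_profile(log):
    """Count accesses per holding, per user, and per hour of the day."""
    by_item = Counter(item for _, item, _ in log)
    by_user = Counter(user for user, _, _ in log)
    by_hour = Counter(datetime.fromisoformat(ts).hour for _, _, ts in log)
    return by_item, by_user, by_hour


by_item, by_user, by_hour = usage_profile(access_log)
```

Counts of this kind are the raw material for the trend analysis and prediction mentioned above.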
Summary of the invention
The object of the present invention is to provide a data mining method for digital archives based on knowledge management.
The object of the present invention can be achieved through the following technical solution:
A data mining method for digital archives based on knowledge management, comprising the following steps:
Step 1, determine the topic: determine the data target to be mined;
Step 2, define the requirements: define the topic determined in step 1 and make the requirements and purpose of the data mining explicit;
Step 3, collect data: while the topic is being defined, collect and extract explicit knowledge and implicit knowledge from the archive database, and apply concept description to summarize the features relevant to the requirements;
Step 4, analyze and form a result: through cluster analysis, form different requirement classification models according to similarity and difference and place the data into the different classes; combine the requirement classification models with user usage information, perform difference analysis and deviation detection, exclude large amounts of irrelevant data, and form a mining result;
Step 5, evaluate the mining result: the mining result may contain irrelevant data or may fail to meet the requirements; if it does not meet the mining requirements and purpose, return to step 3 and repeat the mining process; otherwise, go to step 6;
Step 6, after evaluation the mining result meets the data mining requirements and can be used for knowledge management in the digital archives; it is then added to the original database, achieving knowledge innovation in the archives.
Beneficial effects of the present invention:
The data mining method for digital archives based on knowledge management provided by the present invention creates the conditions for digital archives to implement knowledge management. The invention is a method and strategy for coordinating and managing a cluster of information processing technologies. The data mining is founded on networks and digitized resources, relies on the coordination and cooperation of multiple information technologies, takes the implementation of mining algorithms and mining models as its means, and aims to organize and discover the knowledge resources already present in digital archives and to provide management objects for implementing knowledge management, so that digital archives use knowledge effectively and achieve knowledge innovation.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the present invention.
Detailed description
The core of the present invention is to provide a data mining method for digital archives based on knowledge management.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 1, the present invention provides a data mining method for digital archives based on knowledge management, comprising the following steps:
Step 1, determine the topic: determine the data target to be mined.
Step 2, define the requirements: define the topic determined in step 1 and make the requirements and purpose of the data mining explicit.
Step 3, collect data: while the topic is being defined, collect and extract explicit knowledge and implicit knowledge from the archive database, and apply concept description to summarize the features relevant to the requirements.
Step 4, analyze and form a result: through cluster analysis, form different requirement classification models according to similarity and difference and place the data into the different classes; combine the requirement classification models with user usage information, perform difference analysis and deviation detection, exclude large amounts of irrelevant data, and form a mining result.
Step 5, evaluate the mining result: the mining result may contain irrelevant data or may fail to meet the requirements; if it does not meet the mining requirements and purpose, return to step 3 and repeat the mining process; otherwise, go to step 6.
Step 6, after evaluation the mining result meets the data mining requirements and can be used for knowledge management in the digital archives; it is then added to the original database, achieving knowledge innovation in the archives.
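The control flow of the six steps, in particular the return from step 5 to step 3, can be sketched as a loop. Everything below is a hypothetical illustration: the function names, the `max_rounds` safety limit, and the toy collect, analyze, and evaluate callables are assumptions, and the patent does not specify how each step is implemented.

```python
def run_mining(requirements, collect, analyze, evaluate, knowledge_base,
               max_rounds=5):
    """Run steps 3 to 6; steps 1 and 2 are assumed to yield `requirements`."""
    for round_number in range(1, max_rounds + 1):
        data = collect(requirements, round_number)  # step 3: collect data
        result = analyze(data, requirements)        # step 4: analyze, filter
        if evaluate(result, requirements):          # step 5: evaluate result
            knowledge_base.extend(result)           # step 6: enrich database
            return result, round_number
    raise RuntimeError("no acceptable mining result within max_rounds")


# Toy stand-ins for the real steps (all hypothetical).
def collect(requirements, round_number):
    pool = ["land deed 1901", "weather log", "land deed 1902", "land survey 1903"]
    return pool[: 2 * round_number]  # widen the collection on every round


def analyze(data, requirements):
    return [record for record in data if requirements["topic"] in record]


def evaluate(result, requirements):
    return len(result) >= requirements["min_results"]


knowledge_base = []
requirements = {"topic": "land", "min_results": 3}  # output of steps 1 and 2
result, rounds = run_mining(requirements, collect, analyze, evaluate,
                            knowledge_base)
```

In this toy run the first round yields too few relevant records, so the loop returns to data collection and succeeds on the second round. The `max_rounds` limit is an added safeguard against an endless loop and is not part of the described method.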
The data mining method for digital archives based on knowledge management provided by the present invention creates the conditions for digital archives to implement knowledge management. The invention is a method and strategy for coordinating and managing a cluster of information processing technologies. The data mining is founded on networks and digitized resources, relies on the coordination and cooperation of multiple information technologies, takes the implementation of mining algorithms and mining models as its means, and aims to organize and discover the knowledge resources already present in digital archives and to provide management objects for implementing knowledge management, so that digital archives use knowledge effectively and achieve knowledge innovation.
The above is merely an example and explanation of the concept of the present invention. Those skilled in the art may make various modifications or additions to the specific embodiments described, or substitute them in a similar manner; provided these do not depart from the concept of the invention or exceed the scope defined by the claims, they all fall within the scope of protection of the present invention.
Claims (1)
1. A data mining method for digital archives based on knowledge management, characterized in that it comprises the following steps:
Step 1, determine the topic: determine the data target to be mined;
Step 2, define the requirements: define the topic determined in step 1 and make the requirements and purpose of the data mining explicit;
Step 3, collect data: while the topic is being defined, collect and extract explicit knowledge and implicit knowledge from the archive database, and apply concept description to summarize the features relevant to the requirements;
Step 4, analyze and form a result: through cluster analysis, form different requirement classification models according to similarity and difference and place the data into the different classes; combine the requirement classification models with user usage information, perform difference analysis and deviation detection, exclude large amounts of irrelevant data, and form a mining result;
Step 5, evaluate the mining result: the mining result may contain irrelevant data or may fail to meet the requirements; if it does not meet the mining requirements and purpose, return to step 3 and repeat the mining process; otherwise, go to step 6;
Step 6, after evaluation the mining result meets the data mining requirements and can be used for knowledge management in the digital archives; it is then added to the original database, achieving knowledge innovation in the archives.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611013730.0A CN106682072A (en) | 2016-11-17 | 2016-11-17 | Knowledge management based data mining method for digital archives |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611013730.0A CN106682072A (en) | 2016-11-17 | 2016-11-17 | Knowledge management based data mining method for digital archives |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106682072A true CN106682072A (en) | 2017-05-17 |
Family
ID=58839685
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611013730.0A Pending CN106682072A (en) | 2016-11-17 | 2016-11-17 | Knowledge management based data mining method for digital archives |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682072A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107506404A (en) * | 2017-08-03 | 2017-12-22 | 广东省科技基础条件平台中心 | A kind of scientific research public information monitoring method |
CN107886650A (en) * | 2017-12-17 | 2018-04-06 | 江西睿创科技有限公司 | A kind of people's livelihood archives self-service query integrated machine system |
CN113610194A (en) * | 2021-09-09 | 2021-11-05 | 重庆数字城市科技有限公司 | Automatic classification method for digital files |
CN117251587A (en) * | 2023-11-17 | 2023-12-19 | 北京因朵数智档案科技产业发展有限公司 | Intelligent information mining method for digital archives |
- 2016-11-17 CN CN201611013730.0A (CN106682072A): Pending
Non-Patent Citations (3)
Title |
---|
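刘志勇 et al.: "Application of association rule data mining in personalized library services" (关联规则数据挖掘在图书馆个性化服务中的应用研究), 《电子设计工程》 * |
罗艳 et al.: "Workflow of a data mining system in a digital archive" (一个数字档案馆中的数据挖掘系统工作流程), 《广西科学院学报》 * |
黄小忠 et al.: "Data mining in digital archives based on knowledge management" (基于知识管理的数字档案馆中的数据挖掘), 《档案学通讯》 * |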
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107506404A (en) * | 2017-08-03 | 2017-12-22 | 广东省科技基础条件平台中心 | A kind of scientific research public information monitoring method |
CN107886650A (en) * | 2017-12-17 | 2018-04-06 | 江西睿创科技有限公司 | A kind of people's livelihood archives self-service query integrated machine system |
CN113610194A (en) * | 2021-09-09 | 2021-11-05 | 重庆数字城市科技有限公司 | Automatic classification method for digital files |
CN113610194B (en) * | 2021-09-09 | 2023-08-11 | 重庆数字城市科技有限公司 | Automatic classification method for digital files |
CN117251587A (en) * | 2023-11-17 | 2023-12-19 | 北京因朵数智档案科技产业发展有限公司 | Intelligent information mining method for digital archives |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170517 |