CN113641663A - Big data management method and system based on DAMA theory - Google Patents

Publication number
CN113641663A
Authority
CN
China
Prior art keywords
data
service
accessible
management
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111213648.3A
Other languages
Chinese (zh)
Other versions
CN113641663B (en)
Inventor
周文群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jinhongrui Information Technology Co ltd
Original Assignee
Beijing Jinhongrui Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jinhongrui Information Technology Co ltd
Priority to CN202111213648.3A
Publication of CN113641663A
Application granted
Publication of CN113641663B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G06F16/25 Integrating or interfacing systems involving database management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a big data management method based on DAMA theory, comprising the following steps: obtaining accessible data resources; accessing the accessible data resources, organizing and comprehensively managing them on a preset big data base platform, and generating a service management result; and receiving the service management result and transmitting it to a consumption port. The invention covers governance across the full data life cycle and provides, among other functions, a multi-level data standard management system, a flexible and extensible data source adaptation method and a rapid data service release mode. It thereby offers an efficient and scientific solution for big data governance and realizes a unified governance system across big data platforms.

Description

Big data management method and system based on DAMA theory
Technical Field
The invention relates to the technical field of data processing systems and big data management, in particular to a big data management method and a big data management system based on a DAMA theory.
Background
At present, data processing systems are used in administration, commerce, finance and many other industries, which creates demand for varied components and for unified data. To meet different business requirements, data may be stored in different components and processed along multiple computation paths, such as real-time computation and offline computation. Most existing data processing systems take a similarly narrow approach to data processing: they cannot govern data from its original source through to its final point of use, and they struggle to provide unified control over platform products built to differing specifications.
DAMA theory is a body of knowledge for data management and mainly provides a theoretical framework for data management work.
Disclosure of Invention
The invention provides a big data management method and a big data management system based on DAMA theory, which aim to solve the above problems.
The invention provides a big data management method based on DAMA theory, which is characterized by comprising the following steps:
step 1: obtaining accessible data resources;
step 2: accessing the accessible data resources, organizing and comprehensively managing the accessible data resources based on a preset big data base platform, and generating a service management result;
step 3: receiving the service management result and transmitting the service management result to a consumption port; wherein
the consumption port includes one or more of a user port, an application port and an analysis port.
As an embodiment of the present technical solution, the following steps are further included before step 2:
step 100: calculating the sampling capacity of the accessible data resources;
[Formula image 001, not reproduced in this text]
wherein the symbols of the formula (images 002 to 012) represent, in order:
the sampling capacity of the accessible data resources;
the sampling capacity of a given batch of accessible data resources, the batch index ranging over the total number of batches of accessible data resources;
a given item of resource data within that batch, the item index ranging over the total number of resource data;
the sampling capacity of that item of resource data within that batch;
the capacity occupied by that item of resource data within that batch during sampling;
step 101: calculating a loss difference between a preset memory access capacity and the sampling capacity;
[Formula image 013, not reproduced in this text]
wherein the symbols of the formula (images 014 to 016) represent, in order:
the loss difference;
the capacity difference of a given batch of accessible data resources;
the memory access capacity;
step 102: dividing the loss difference to determine a division result;
[Formula image 017, not reproduced in this text]
wherein image 018 is the division result and image 019 represents the predicted loss difference;
when the division result takes the value shown in image 020, i.e. the condition in image 021 holds, the divided loss difference is a first-type division result;
when the division result takes the value shown in image 022, i.e. the condition in image 023 holds, the divided loss difference is a second-type division result;
when the division result takes the value shown in image 024, i.e. the condition in image 025 holds, the divided loss difference is a third-type division result;
when the division result takes the value shown in image 026, i.e. the condition in image 027 holds, the divided loss difference is a fourth-type division result;
step 103: when the divided loss difference is a first-type division result, performing incremental processing on the accessible data resources according to the loss difference to determine incremental data;
when the divided loss difference is a second-type or third-type division result, acquiring the data capacity of the accessible data resources, and determining the corresponding batch data or full data according to the data capacity;
and when the divided loss difference is a fourth-type division result, accessing the accessible data resources in real time to determine real-time data.
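Step 103 is in effect a dispatch from the type of division result to a data access mode. The following Python sketch illustrates that dispatch under stated assumptions: the names `AccessMode`, `choose_access_mode` and `full_access_limit` are hypothetical, and because the threshold formulas are not reproduced in this text, the function takes the division-result type (1 to 4) as an input rather than computing it. The rule for choosing between batch and full access is likewise an assumption, since the text only says the choice depends on the data capacity.

```python
from enum import Enum


class AccessMode(Enum):
    INCREMENTAL = "incremental"
    BATCH = "batch"
    FULL = "full"
    REAL_TIME = "real_time"


def choose_access_mode(division_type: int, data_capacity: int,
                       full_access_limit: int) -> AccessMode:
    """Map a division-result type (1 to 4) to a data access mode.

    Assumption: for types 2 and 3, full access is used when the data
    capacity fits within `full_access_limit`, and batch access otherwise.
    The source text does not state this rule.
    """
    if division_type == 1:
        return AccessMode.INCREMENTAL
    if division_type in (2, 3):
        if data_capacity <= full_access_limit:
            return AccessMode.FULL
        return AccessMode.BATCH
    if division_type == 4:
        return AccessMode.REAL_TIME
    raise ValueError(f"unknown division-result type: {division_type}")
```

Taking the type as an input keeps the sketch independent of the unreproduced formulas; a caller would supply whatever classification step 102 produces.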
As an embodiment of the present technical solution, step 2 includes:
step 201: obtaining accessible data resources, accessing the accessible data resources to a big data unit through a preset data access mode, and determining multi-source heterogeneous data resources; wherein
the data access mode comprises batch data access, real-time data access, full data access and incremental data access;
step 202: uniformly converging the multi-source heterogeneous data resources, determining service data, and transmitting the service data to a one-stop data organization management unit;
step 203: organizing and managing, comprehensively utilizing and security-controlling the service data, and generating a corresponding service management result.
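The uniform convergence in step 202 can be pictured as mapping records from differently shaped sources onto one shared schema. The sketch below is only an illustration of that idea: the source names, field names and the `FIELD_MAPS` table are hypothetical and do not come from the patent.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical per-source mappings: source field name -> unified field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "relational_db": {"cust_id": "customer_id", "amt": "amount"},
    "message_queue": {"customerId": "customer_id", "value": "amount"},
}


def converge(source: str, records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rename the fields of each record to the unified schema.

    Fields without a mapping are dropped, and the source name is kept
    on every row so that provenance is not lost.
    """
    mapping = FIELD_MAPS[source]
    unified = []
    for record in records:
        row = {mapping[key]: value for key, value in record.items() if key in mapping}
        row["source"] = source
        unified.append(row)
    return unified
```

Rows from both hypothetical sources end up with the same `customer_id` and `amount` fields and can then be handled as one body of service data.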
As an embodiment of the present technical solution, step 203 further includes the following steps:
step S1: acquiring service data, and configuring service basic information according to the service data; wherein
the service basic information at least comprises a data source and an organization role;
step S2: carrying out standard design on the service basic information based on a preset standard design tool to determine standard data; wherein
the standard design comprises business modeling, standard definition and physical table creation;
step S3: processing, refining and storing the standard data in a preset task processing model based on a preset visual tool to generate target data;
step S4: periodically detecting the target data based on a preset data detection period and generating a quality report; wherein
the data detection period comprises a data demand response period and a data service construction period;
step S5: when the quality report is qualified, issuing a data service in a preset data service mode; wherein
the data service mode is a mode of authorizing the corresponding data through a preset rule;
step S6: receiving a service result of the data service, organizing technical assets through the service result, assisting a user in constructing business assets through the technical assets, and generating a corresponding service management result.
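Steps S4 and S5 describe a quality gate followed by rule-based authorization. A minimal Python sketch of that gate is given below, assuming a data service consists of rows, a preset rule selecting which rows are exposed, and a set of authorized roles. The class `DataService`, the function `publish_if_qualified` and the registry dictionary are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Row = Dict[str, object]


@dataclass
class DataService:
    """A published data service: data, a preset rule, and authorized roles."""
    name: str
    rows: List[Row]
    rule: Callable[[Row], bool]  # preset rule selecting the rows that are exposed
    authorized_roles: Set[str] = field(default_factory=set)

    def query(self, role: str) -> List[Row]:
        # Only authorized roles may read; the rule then filters the rows.
        if role not in self.authorized_roles:
            raise PermissionError(f"role {role!r} is not authorized for {self.name!r}")
        return [row for row in self.rows if self.rule(row)]


def publish_if_qualified(report_qualified: bool, service: DataService,
                         registry: Dict[str, DataService]) -> bool:
    """Register the service only when the quality report is qualified."""
    if not report_qualified:
        return False
    registry[service.name] = service
    return True
```

Keeping the rule and the role set on the service itself means a consumer can only ever see the rows that the preset rule allows.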
As an embodiment of the present invention, step S4 includes:
step S401: receiving asset data and security control data, determining the receiving time, and determining a data demand response period from the receiving time;
step S402: dividing out a data service construction period according to the data demand response period;
step S403: periodically detecting the target data according to the data service construction period, and generating a quality report.
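The periodic detection of step S403 can be sketched as a schedule of detection times plus a check that produces a report. The sketch below is an assumption-laden illustration: the text does not say how the periods are derived or what the quality report measures, so the fixed `period`, the completeness metric and the `min_completeness` threshold are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def detection_schedule(start: datetime, period: timedelta, runs: int) -> List[datetime]:
    """Return the times of the first `runs` periodic detections."""
    return [start + i * period for i in range(runs)]


def quality_report(rows: List[Dict[str, Optional[object]]],
                   required_fields: List[str],
                   min_completeness: float = 0.95) -> Dict[str, object]:
    """Hypothetical quality check: share of required fields that are filled."""
    total = len(rows) * len(required_fields)
    filled = sum(
        1 for row in rows for name in required_fields if row.get(name) is not None
    )
    completeness = filled / total if total else 1.0
    return {"completeness": completeness, "qualified": completeness >= min_completeness}
```

A scheduler would call `quality_report` at each time returned by `detection_schedule`; only the report's `qualified` flag is needed by the later publishing step.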
As an embodiment of the present invention, step S6 includes:
step S601: receiving a service result of the data service, organizing technical assets through the service result, assisting a user in constructing business assets through the technical assets, and generating a corresponding service management result;
step S602: receiving the service result of the data service based on the big data center, and determining a corresponding service data type; wherein
the service data types comprise theme data, integration data and real-time data;
step S603: determining a data service according to the service data type, and organizing the technical assets through the data service;
step S604: automatically summarizing, through the technical assets, the asset metadata, applicable standards and data lineage graph of each asset;
step S605: constructing the business assets from the asset metadata, the applicable standards and the data lineage graph, and generating a corresponding service management result.
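The per-asset summary of metadata, applicable standards and lineage can be illustrated with a small data structure. In the sketch below the lineage graph is assumed to be a mapping from each asset to its direct upstream assets; the function names `upstream` and `summarize_asset` and all asset names are hypothetical.

```python
from typing import Dict, List, Set


def upstream(lineage: Dict[str, List[str]], asset: str) -> Set[str]:
    """Collect every direct and indirect upstream asset of `asset`.

    `lineage` maps an asset name to the names of its direct upstream assets.
    """
    seen: Set[str] = set()
    stack = list(lineage.get(asset, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(lineage.get(node, []))
    return seen


def summarize_asset(asset: str, metadata: Dict[str, Dict[str, str]],
                    standards: Dict[str, List[str]],
                    lineage: Dict[str, List[str]]) -> Dict[str, object]:
    """Gather metadata, applicable standards and lineage for one asset."""
    return {
        "asset": asset,
        "metadata": metadata.get(asset, {}),
        "standards": standards.get(asset, []),
        "upstream": sorted(upstream(lineage, asset)),
    }
```

The `seen` set also guards against cycles in the lineage mapping, so the traversal terminates even if two assets reference each other.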
A big data governance system based on DAMA theory is characterized by comprising:
a data resource module: for acquiring accessible data resources;
a data management platform module: the system is used for accessing the accessible data, organizing and comprehensively managing the accessible data based on a preset big data base platform, and generating a service management result;
a data consumption module: used for receiving the service management result and transmitting the service management result to a consumption port; wherein
the consumption port includes one or more of a user port, an application port and an analysis port.
As an embodiment of the present technical solution, the accessible data resource at least includes one or more of a relational database data source, a big data source, a file server data source, a message middleware data source, an interface data source, and a search engine data source.
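One common way to support several kinds of data source in an extensible manner is an adapter registry, in which each source type registers its own reader. The sketch below shows that pattern only as an illustration; the patent does not describe its adaptation mechanism at this level, and `register_adapter`, `read_source` and the stand-in reader are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Reader = Callable[[str], Iterable[dict]]

# Registry of readers keyed by source type (hypothetical type names).
_ADAPTERS: Dict[str, Reader] = {}


def register_adapter(source_type: str) -> Callable[[Reader], Reader]:
    """Decorator that registers a reader for one source type."""
    def decorator(reader: Reader) -> Reader:
        _ADAPTERS[source_type] = reader
        return reader
    return decorator


@register_adapter("file_server")
def read_file_server(location: str) -> Iterable[dict]:
    # Stand-in reader: a real adapter would open and parse the file.
    return [{"location": location, "kind": "file"}]


def read_source(source_type: str, location: str) -> List[dict]:
    """Look up the adapter for `source_type` and read from `location`."""
    try:
        reader = _ADAPTERS[source_type]
    except KeyError:
        raise ValueError(f"no adapter registered for {source_type!r}") from None
    return list(reader(location))
```

Supporting a new source type then only requires registering another reader; the calling code in `read_source` does not change.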
As an embodiment of the present technical solution, the data management platform module includes:
a data access unit: used for obtaining accessible data resources, accessing the accessible data resources to a big data unit through a preset data access mode, and determining multi-source heterogeneous data resources; wherein
the data access mode comprises a batch data access mode, a real-time data access mode, a full data access mode and an incremental data access mode;
a big data unit: used for uniformly converging the multi-source heterogeneous data resources to determine service data;
a data management unit: used for managing, comprehensively utilizing and security-controlling the service data and generating a corresponding service management result.
As an embodiment of the present technical solution, the data management unit further includes:
a one-stop data organization management unit: used for carrying out service mining on the service data, determining mining data and carrying out organization management on the mining data; wherein the organization management comprises standard design of data, data development, quality evaluation and asset estimation;
a data comprehensive utilization unit: used for comprehensively utilizing the service data; wherein the comprehensive utilization comprises comprehensive data retrieval, asset navigation, service navigation, a data cockpit and a knowledge base;
a data security management and control unit: used for carrying out security control on the service data; wherein the security management and control comprises user management, service management, audit management, log audit and panoramic operation and maintenance.
The invention has the following beneficial effects:
The embodiment of the invention provides a big data governance method based on DAMA theory. Accessible data resources are obtained, and the different accessible data resources are connected to a data management platform and processed according to their data access mode. The accessible data resources are organized and comprehensively managed on a preset big data base platform to generate a service management result, which is then received and transmitted to a consumption port; the consumption port includes one or more of a user port, an application port and an analysis port. Organization management covers standard design, data development, data quality and data asset estimation. Comprehensive utilization covers comprehensive retrieval, asset navigation, service navigation and service development. Together these realize comprehensive governance of the data. The data is transmitted through the big data base platform to one-stop data organization management, and security management and control of the data provides user management, service management, audit management, log audit and panoramic operation and maintenance, so that data consumption is realized. Based on a domain-oriented abstract design method, a unified governance system across big data platforms is realized, and an efficient and scientific solution for big data governance is provided in a "tool + knowledge + operation" mode.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a big data governance method based on DAMA theory in the embodiment of the present invention;
FIG. 2 is a block diagram of a big data governance system based on DAMA theory in an embodiment of the present invention;
FIG. 3 is a block diagram of a big data governance system based on DAMA theory in an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly or indirectly connected to the other element.
It will be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, as used herein, refer to an orientation or positional relationship indicated in the drawings that is solely for the purpose of facilitating the description and simplifying the description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and is therefore not to be construed as limiting the invention.
Moreover, it is noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions, and "a plurality" means two or more unless specifically limited otherwise. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Example 1:
As shown in fig. 1, an embodiment of the present invention provides a big data governance method based on DAMA theory, which is characterized by including:
step 1: obtaining accessible data resources;
step 2: accessing the accessible data resources, organizing and comprehensively managing the accessible data resources based on a preset big data base platform, and generating a service management result;
step 3: receiving the service management result and transmitting the service management result to a consumption port; wherein
the consumption port includes one or more of a user port, an application port and an analysis port.
The working principle and the beneficial effects of the technical scheme are as follows:
The embodiment of the invention provides a big data governance method based on DAMA theory. Accessible data resources are obtained, and the different accessible data resources are connected to a data management platform and processed according to their data access mode. The accessible data resources are organized and comprehensively managed on a preset big data base platform to generate a service management result, which is then received and transmitted to a consumption port; the consumption port includes one or more of a user port, an application port and an analysis port. Organization management covers standard design, data development, data quality and data asset estimation. Comprehensive utilization covers comprehensive retrieval, asset navigation, service navigation and service development. Together these realize comprehensive governance of the data. The data is transmitted through the big data base platform to one-stop data organization management, and security management and control of the data provides user management, service management, audit management, log audit and panoramic operation and maintenance, so that data consumption is realized. Based on a domain-oriented abstract design method, a unified governance system across big data platforms is realized, and an efficient and scientific solution for big data governance is provided in a "tool + knowledge + operation" mode.
Example 2:
This technical solution provides an embodiment in which the following steps are further included before step 2:
step 100: calculating the sampling capacity of the accessible data resources;
[Formula image 001, not reproduced in this text]
wherein the symbols of the formula (images 002 to 012) represent, in order:
the sampling capacity of the accessible data resources;
the sampling capacity of a given batch of accessible data resources, the batch index ranging over the total number of batches of accessible data resources;
a given item of resource data within that batch, the item index ranging over the total number of resource data;
the sampling capacity of that item of resource data within that batch;
the capacity occupied by that item of resource data within that batch during sampling;
step 101: calculating a loss difference between a preset memory access capacity and the sampling capacity;
[Formula image 013, not reproduced in this text]
wherein the symbols of the formula (images 014 to 016) represent, in order:
the loss difference;
the capacity difference of a given batch of accessible data resources;
the memory access capacity;
step 102: dividing the loss difference to determine a division result;
[Formula image 017, not reproduced in this text]
wherein image 018 is the division result and image 019 represents the predicted loss difference;
when the division result takes the value shown in image 020, i.e. the condition in image 021 holds, the divided loss difference is a first-type division result;
when the division result takes the value shown in image 022, i.e. the condition in image 023 holds, the divided loss difference is a second-type division result;
when the division result takes the value shown in image 024, i.e. the condition in image 025 holds, the divided loss difference is a third-type division result;
when the division result takes the value shown in image 026, i.e. the condition in image 027 holds, the divided loss difference is a fourth-type division result;
step 103: when the divided loss difference is a first-type division result, performing incremental processing on the accessible data resources according to the loss difference to determine incremental data;
when the divided loss difference is a second-type or third-type division result, acquiring the data capacity of the accessible data resources, and determining the corresponding batch data or full data according to the data capacity;
and when the divided loss difference is a fourth-type division result, accessing the accessible data resources in real time to determine real-time data.
The working principle and the beneficial effects of the technical scheme are as follows:
This technical solution calculates the sampling capacity of the accessible data resources and, from that sampling capacity, the loss difference between the preset memory access capacity and the sampling capacity. The loss difference is then divided to determine a division result; the symbols and threshold conditions involved are given as formula images that are not reproduced in this text. When the divided loss difference is a first-type division result, the loss value of the data is too large, so the data is processed incrementally and incremental data is determined. When the divided loss difference is a second-type or third-type division result, the accessible data resources are within the threshold range and full access can be performed. Classifying data resources according to these different conditions allows multiple processes to run simultaneously, which improves data processing efficiency and reduces running cost. When the divided loss difference is a fourth-type division result, the loss difference has reached its minimum, so the data is regarded as close to ideal and is accessed in real time to determine real-time data, which improves the accuracy and flexibility of the data.
Example 3:
This technical solution provides an embodiment in which step 2 includes:
step 201: obtaining accessible data resources, accessing the accessible data resources to a big data unit through a preset data access mode, and determining multi-source heterogeneous data resources; wherein
the data access mode comprises batch data access, real-time data access, full data access and incremental data access;
step 202: uniformly converging the multi-source heterogeneous data resources, determining service data, and transmitting the service data to a one-stop data organization management unit;
step 203: organizing and managing, comprehensively utilizing and security-controlling the service data, and generating a corresponding service management result.
The working principle and the beneficial effects of the technical scheme are as follows:
In this technical solution, accessible data resources are obtained and accessed to the big data unit through a preset data access mode, and multi-source heterogeneous data resources are determined. The data access modes comprise batch data access, real-time data access, full data access and incremental data access; offering different access modes improves data transmission efficiency and the flexibility of process operation. The multi-source heterogeneous data resources are uniformly converged to determine service data, which is transmitted to the one-stop data organization management unit. The service data is then organized and managed, comprehensively utilized and security-controlled to generate a corresponding service management result. This achieves comprehensive management of the service, improves data utilization and speeds up work processes.
Example 4:
This technical solution provides an embodiment in which step 203 further includes the following steps:
step S1: acquiring service data, and configuring service basic information according to the service data; wherein
the service basic information at least comprises a data source and an organization role;
step S2: based on a preset standard design tool, carrying out standard design on the service basic information to determine standard data; wherein
the standard design comprises business modeling, standard definition and physical table creation;
step S3: based on a preset visual tool, processing, refining and storing the standard data in a preset task processing model to generate target data;
step S4: periodically detecting the target data based on a preset data detection period and generating a quality report;
step S5: when the quality report is qualified, issuing a data service in a preset data service mode; wherein
the data service mode is a mode of authorizing corresponding data through a preset rule;
step S6: receiving a service result of the data service, organizing technical assets through the service result, assisting a user in constructing business assets through the technical assets, and generating a corresponding business management result.
The working principle and the beneficial effects of the technical scheme are as follows:
This technical solution is based on DAMA theory and best practice. Service data are obtained, the data governance system is started, and basic information such as data sources and organization roles is configured. Standard design is then carried out on the service basic information with a preset standard design tool, which is used for business modeling, standard definition and physical table creation. A data access task is configured so that multi-source heterogeneous data are uniformly converged, and the standard data are processed, refined and stored in a preset task processing model, using a preset visual tool, to generate target data. The target data are detected periodically based on a preset data detection period: a quality detection job is set up, the detection task is executed periodically, and a quality report is output. When the quality report is qualified, a data service is issued in a preset data service mode, namely a "data + rule + authorization" mode in which the corresponding data are authorized through a preset rule. The service result of the data service is received, technical assets are automatically organized through the service result, and the user is assisted in constructing business assets through the technical assets. For each asset, information such as asset metadata, applicable standards and a data lineage graph is automatically summarized. Together with data modeling, data development, data management and data sharing, this provides a systematic management scheme for the full data life cycle and supports the scenario requirements of data organization management, comprehensive data utilization and data security control.
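Steps S4 and S5 (the quality gate followed by publication in the "data + rule + authorization" mode) can be sketched as below. This is an illustrative sketch under stated assumptions: `QualityReport`, `DataService`, `check_quality` and `publish` are hypothetical names, and treating a report as qualified only when no row fails is an assumption, since the text does not define the qualification criterion.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set

Row = dict


@dataclass
class QualityReport:
    total: int
    failed: int

    @property
    def qualified(self) -> bool:
        # Assumption: qualified means no row failed any quality rule.
        return self.failed == 0


def check_quality(rows: List[Row],
                  rules: List[Callable[[Row], bool]]) -> QualityReport:
    """Count the rows that violate at least one quality rule."""
    failed = sum(1 for row in rows if not all(rule(row) for rule in rules))
    return QualityReport(total=len(rows), failed=failed)


@dataclass
class DataService:
    rows: List[Row]                    # the data
    row_filter: Callable[[Row], bool]  # the preset rule
    authorized_users: Set[str]         # the authorization

    def query(self, user: str) -> List[Row]:
        if user not in self.authorized_users:
            raise PermissionError(f"{user} is not authorized")
        return [r for r in self.rows if self.row_filter(r)]


def publish(rows: List[Row], report: QualityReport,
            row_filter: Callable[[Row], bool],
            authorized_users: Set[str]) -> Optional[DataService]:
    """Issue the data service only when the quality report is qualified."""
    if not report.qualified:
        return None
    return DataService(rows, row_filter, authorized_users)


rows = [{"id": 1, "amount": 5}, {"id": 2, "amount": 7}]
rules = [lambda r: r["id"] is not None, lambda r: r["amount"] >= 0]
report = check_quality(rows, rules)
service = publish(rows, report, row_filter=lambda r: r["amount"] > 5,
                  authorized_users={"analyst"})
```

Here `service.query("analyst")` returns only the rows allowed by the rule, while any other user is rejected.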
Example 5:
The present technical solution provides an embodiment in which step S4 includes:
step S401: receiving asset data and safety control data, determining the receiving duration, and determining a data demand response period from the receiving duration;
step S402: dividing a data service construction period according to the data demand response period, and passing on the data service construction period;
step S403: periodically detecting the target data through the data service construction period, and generating a quality report.
The working principle and the beneficial effects of the technical scheme are as follows:
In this technical solution, asset data and safety control data are received, the receiving duration is determined, and the data demand response period is determined from the receiving duration. Establishing this data response rule allows data to be mined more flexibly and in a more targeted way. The data service construction period is divided according to the data demand response period and passed on. The data demand response period is used for periodically acquiring data, and the data service construction period is used for mining and constructing data. The target data are periodically detected through the data service construction period and a quality report is generated, which improves data accuracy. Based on DAMA theory and best practice, the solution covers full-life-cycle management of data and provides functions such as a multi-level data standard management system, a flexible and extensible data source adaptation method, and a rapid data service release mode.
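The period handling in steps S401 to S403 can be sketched as follows. The text does not give the formulas, so this sketch makes two loud assumptions: the data demand response period is taken to equal the receiving duration, and the data service construction period is taken to be a fixed multiple of it. The function names and the `multiple` parameter are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def demand_response_period(receive_start: datetime,
                           receive_end: datetime) -> timedelta:
    # Assumption: the response period equals the receiving duration.
    return receive_end - receive_start


def construction_period(response_period: timedelta,
                        multiple: int = 4) -> timedelta:
    # Assumption: the construction period is a fixed multiple of the
    # response period; the original text does not specify the rule.
    return response_period * multiple


def detection_times(start: datetime, period: timedelta,
                    count: int) -> List[datetime]:
    """Return the next `count` periodic quality-detection times."""
    return [start + period * k for k in range(1, count + 1)]


start = datetime(2021, 10, 19, 0, 0)
end = datetime(2021, 10, 19, 0, 15)
response = demand_response_period(start, end)   # 15 minutes
period = construction_period(response)          # 60 minutes
schedule = detection_times(end, period, 3)
```

With these inputs the detections fall one hour apart, starting one hour after the data finished arriving.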
Example 6:
The present technical solution provides an embodiment in which step S6 includes:
step S601: receiving a service result of the data service, organizing technical assets through the service result, assisting a user in constructing business assets through the technical assets, and generating a corresponding business management result;
step S602: receiving a service result of the data service based on the big data center, and determining a corresponding service data type; wherein
the service data types comprise subject data, integration data and real-time data;
step S603: determining a data service according to the service data type, and organizing technical assets through the data service;
step S604: automatically summarizing asset metadata, applicable standards and a data lineage graph for each asset through the technical assets;
step S606: constructing the business assets through the asset metadata, the applicable standards and the data lineage graph, and generating corresponding business management results.
The working principle and the beneficial effects of the technical scheme are as follows:
In this technical solution, a service result of the data service is received, technical assets are organized through the service result, and the user is assisted in constructing business assets through the technical assets, so that a corresponding business management result is generated and the user is helped to build the business. Specifically, the service result of the data service is received based on the big data center and the corresponding service data type is determined, the service data types comprising subject data, integration data and real-time data. A data service is determined according to the service data type, and technical assets are organized through the data service. Asset metadata, applicable standards and a data lineage graph are automatically summarized for each asset through the technical assets. The business assets are then constructed through the asset metadata, the applicable standards and the data lineage graph, and corresponding business management results are generated.
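The per-asset summary of step S604 (asset metadata, applicable standards and a lineage graph) can be sketched as below. The representation is an assumption: the lineage graph is modeled as a dictionary from an asset to its direct upstream assets, and the asset names and helper functions are hypothetical.

```python
from typing import Dict, List, Set


def upstream(lineage: Dict[str, List[str]], asset: str) -> Set[str]:
    """Collect every direct and indirect upstream asset of `asset`."""
    seen: Set[str] = set()
    stack = list(lineage.get(asset, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(lineage.get(node, []))
    return seen


def summarize_asset(asset: str,
                    metadata: Dict[str, dict],
                    standards: Dict[str, List[str]],
                    lineage: Dict[str, List[str]]) -> dict:
    """Gather metadata, applicable standards and lineage for one asset."""
    return {
        "asset": asset,
        "metadata": metadata.get(asset, {}),
        "standards": sorted(standards.get(asset, [])),
        "lineage": sorted(upstream(lineage, asset)),
    }


# Hypothetical catalog: sales_report <- orders_clean <- orders_raw
lineage = {"sales_report": ["orders_clean"], "orders_clean": ["orders_raw"]}
metadata = {"sales_report": {"owner": "finance"}}
standards = {"sales_report": ["amount_in_cny"]}
summary = summarize_asset("sales_report", metadata, standards, lineage)
```

The `seen` set also guards against cycles, so a malformed lineage graph cannot cause an endless traversal.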
Example 7:
a big data governance system based on DAMA theory, comprising:
a data resource module: for acquiring accessible data resources;
a data management platform module: for accessing the accessible data resources, organizing and comprehensively managing the accessible data resources based on a preset big data base platform, and generating a service management result;
a data consumption module: for receiving the service management result and transmitting the service management result to a consumption port; wherein
the consumption ports include at least one of a user port, an application port, and an analysis port.
The working principle and the beneficial effects of the technical scheme are as follows:
This technical solution provides a big data governance system based on DAMA theory, comprising a data resource module, a data management platform module and a data consumption module. The data resource module is used for acquiring accessible data resources. The data management platform module is used for accessing the accessible data resources, organizing and comprehensively managing them based on a preset big data base platform, and generating a service management result. The data consumption module is used for receiving the service management result and transmitting it to a consumption port; the consumption ports include at least one of a user port, an application port, and an analysis port.
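The flow through the three modules can be sketched as a small pipeline. This is only an illustration of the data flow: the function names mirror the module names, but the record shape, the aggregate computed by the platform, and the use of callbacks as "ports" are all assumptions.

```python
from typing import Callable, Dict, List


def data_resource_module() -> List[dict]:
    """Acquire accessible data resources (hard-coded sample records)."""
    return [
        {"source": "relational_db", "value": 3},
        {"source": "file_server", "value": 4},
    ]


def data_management_platform_module(resources: List[dict]) -> dict:
    """Organize the resources and produce a service management result."""
    return {
        "record_count": len(resources),
        "total": sum(r["value"] for r in resources),
    }


def data_consumption_module(result: dict,
                            ports: Dict[str, Callable[[dict], None]]) -> None:
    """Deliver the service management result to every consumption port."""
    for deliver in ports.values():
        deliver(result)


# Each port is modeled as an inbox list with its append method as callback.
received: Dict[str, List[dict]] = {"user": [], "application": [], "analysis": []}
ports = {name: inbox.append for name, inbox in received.items()}
data_consumption_module(
    data_management_platform_module(data_resource_module()), ports
)
```

After the call, each of the user, application and analysis inboxes holds one copy of the same result.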
Example 8:
The present technical solution provides an embodiment in which the accessible data resources include at least one of a relational database data source, a big data source, a file server data source, a message middleware data source, an interface data source, and a search engine data source.
The working principle and the beneficial effects of the technical scheme are as follows:
The accessible data resources include at least one of a relational database data source, a big data source, a file server data source, a message middleware data source, an interface data source and a search engine data source. Heterogeneous data from these multiple sources are uniformly converged and services are built on them, which improves the utilization and commercial value of the data.
Example 9:
This technical solution provides an embodiment in which the data management platform module includes:
a data access unit: for obtaining the accessible data resources, accessing the accessible data resources into a big data unit through a preset data access mode, and determining multi-source heterogeneous data resources; wherein
the data access mode comprises a batch data access mode, a real-time data access mode, a full data access mode and an incremental data access mode;
a big data unit: for uniformly converging the multi-source heterogeneous data resources to determine service data;
a data management unit: for managing, comprehensively utilizing and safely controlling the service data and generating a corresponding service management result.
The working principle and the beneficial effects of the technical scheme are as follows:
the data management platform module comprises a data access unit, a big data unit and a data management unit, wherein the data access unit is used for acquiring accessible data resources, accessing the accessible data resources to the big data unit through a preset data access mode and determining multi-source heterogeneous data resources, and the data access mode comprises a batch data access mode, a real-time data access mode, a full data access mode and an incremental data access mode; the big data unit is used for uniformly converging multi-source heterogeneous data resources and determining service data; the data management unit is used for managing, comprehensively utilizing and safely controlling the service data and generating a corresponding service management result.
Example 10:
The technical solution provides an embodiment in which the data management unit further includes:
a one-stop data organization management unit: for carrying out service mining on the service data, determining the mined data, and carrying out organization management on the mined data; wherein
the organization management comprises standard design of data, data development, quality evaluation and asset estimation;
a data comprehensive utilization unit: for comprehensively utilizing the service data; wherein
the comprehensive utilization comprises comprehensive retrieval of data, asset navigation, service navigation, a data cockpit and a knowledge base;
a data security management and control unit: for carrying out safety control on the service data; wherein
the safety management and control comprises user management, service management, audit management, log audit and panoramic operation and maintenance.
The working principle and the beneficial effects of the technical scheme are as follows:
In the data management unit of this technical solution, the one-stop data organization management unit carries out service mining on the service data, determines the mined data and carries out organization management on the mined data, thereby providing original data for the data service. The organization management comprises standard design of data, data development, quality evaluation and asset estimation, and is used to judge the value of the data so that its commercial use can be maximized. The data comprehensive utilization unit comprehensively utilizes the service data, including comprehensive retrieval of data, asset navigation, service navigation, a data cockpit and a knowledge base. The data security management and control unit carries out safety control on the service data, including user management, service management, audit management, log audit and panoramic operation and maintenance.
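Two of the listed security controls, user management and log audit, can be illustrated with a minimal sketch. The class `SecurityControl`, its `grants` table and the shape of the audit entries are hypothetical; the original text names the functions but does not describe how they are implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SecurityControl:
    # user name -> names of the data services the user may call
    grants: Dict[str, Set[str]]
    # one (user, service, allowed) entry per access attempt
    audit_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def access(self, user: str, service: str) -> bool:
        """Check the grant and record the attempt, allowed or not."""
        allowed = service in self.grants.get(user, set())
        self.audit_log.append((user, service, allowed))
        return allowed


control = SecurityControl(grants={"analyst": {"sales_report"}})
first = control.access("analyst", "sales_report")
second = control.access("guest", "sales_report")
```

Denied attempts are logged as well as granted ones, which is what makes the log usable for later audit.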
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A big data governance method based on DAMA theory is characterized by comprising the following steps:
step 1: obtaining accessible data resources;
step 2: accessing the accessible data resources, organizing and comprehensively managing the accessible data resources based on a preset big data base platform, and generating a service management result;
step 3: receiving the service management result and transmitting the service management result to a consumption port; wherein
the consumption ports include at least one of a user port, an application port, and an analysis port.
2. The big data governance method based on DAMA theory as claimed in claim 1, wherein before said step 2, the method further comprises:
step 100: calculating the sampling capacity of the accessible data resources:
[formula image not reproduced]
wherein [symbol] represents the sampling capacity of the accessible data resources; [symbol] represents the sample size of the [batch index]-th batch of accessible data resources, with [batch index range], where [symbol] represents the total number of batches of accessible data resources; [symbol] represents the [item index]-th resource data in the [batch index]-th batch of accessible data resources, with [item index range], where [symbol] represents the total number of resource data; [symbol] represents the sampling capacity of the [item index]-th resource data in the [batch index]-th batch of accessible data resources; and [symbol] represents the capacity occupied during sampling by the [item index]-th resource data in the [batch index]-th batch of accessible data resources;
step 101: calculating a loss difference between a preset memory access capacity and the sampling capacity:
[formula image not reproduced]
wherein [symbol] represents the loss difference; [symbol] represents the capacity difference of the [batch index]-th batch of accessible data resources; and [symbol] represents the memory access capacity;
step 102: dividing the loss difference to determine a division result:
[formula image not reproduced]
wherein [symbol] is the division result and [symbol] represents the predicted loss difference;
when the division result is [value], i.e. [condition], the divided loss difference is the first-type division result;
when the division result is [value], i.e. [condition], the divided loss difference is the second-type division result;
when the division result is [value], i.e. [condition], the divided loss difference is the third-type division result;
when the division result is [value], i.e. [condition], the divided loss difference is the fourth-type division result;
step 103: when the divided loss difference is the first-type division result, the loss difference of the accessible data resources is increased to determine incremental data;
when the divided loss difference is the second-type or third-type division result, acquiring the data capacity of the accessible data resources, and determining corresponding batch data or full data according to the data capacity;
when the divided loss difference is the fourth-type division result, performing real-time access on the accessible data resources to determine real-time data.
3. The big data governance method based on the DAMA theory as claimed in claim 1, wherein said step 2 comprises:
step 201: obtaining the accessible data resources, accessing the accessible data resources into a big data unit through a preset data access mode, and determining multi-source heterogeneous data resources; wherein
the data access mode comprises batch data access, real-time data access, full data access and incremental data access;
step 202: uniformly converging the multi-source heterogeneous data resources, determining service data, and transmitting the service data to a one-stop data organization management unit;
step 203: managing, comprehensively utilizing and safely controlling the service data, and generating a corresponding service management result.
4. The big data governance method based on DAMA theory as claimed in claim 3, wherein said step 203 further comprises the steps of:
step S1: acquiring service data, and configuring service basic information according to the service data; wherein
the service basic information at least comprises a data source and an organization role;
step S2: based on a preset standard design tool, carrying out standard design on the service basic information to determine standard data; wherein
the standard design comprises business modeling, standard definition and physical table creation;
step S3: based on a preset visual tool, processing, refining and storing the standard data in a preset task processing model to generate target data;
step S4: periodically detecting the target data based on a preset data detection period and generating a quality report; wherein
the data detection period comprises a data demand response period and a data service construction period;
step S5: when the quality report is qualified, issuing a data service in a preset data service mode; wherein
the data service mode is a mode of authorizing corresponding data through a preset rule;
step S6: receiving a service result of the data service, organizing technical assets through the service result, assisting a user in constructing business assets through the technical assets, and generating a corresponding business management result.
5. The big data governance method based on the DAMA theory as claimed in claim 4, wherein said step S4 includes:
step S401: receiving asset data and safety control data, determining the receiving duration, and determining a data demand response period from the receiving duration;
step S402: dividing a data service construction period according to the data demand response period, and passing on the data service construction period;
step S403: periodically detecting the target data through the data service construction period, and generating a quality report.
6. The big data governance method based on the DAMA theory as claimed in claim 4, wherein said step S6 includes:
step S601: receiving a service result of the data service, organizing technical assets through the service result, assisting a user in constructing business assets through the technical assets, and generating a corresponding business management result;
step S602: receiving a service result of the data service based on the big data center, and determining a corresponding service data type; wherein the service data types comprise subject data, integration data and real-time data;
step S603: determining a data service according to the service data type, and organizing technical assets through the data service;
step S604: automatically summarizing asset metadata, applicable standards and a data lineage graph for each asset through the technical assets;
step S606: constructing the business assets through the asset metadata, the applicable standards and the data lineage graph, and generating corresponding business management results.
7. A big data governance system based on DAMA theory is characterized by comprising:
a data resource module: for acquiring accessible data resources;
a data management platform module: for accessing the accessible data resources, organizing and comprehensively managing the accessible data resources based on a preset big data base platform, and generating a service management result;
a data consumption module: for receiving the service management result and transmitting the service management result to a consumption port; wherein
the consumption ports include at least one of a user port, an application port, and an analysis port.
8. The DAMA-theory based big data governance system in accordance with claim 7, wherein said accessible data resources comprise at least one of relational database data sources, big data sources, file server data sources, message middleware data sources, interface data sources, and search engine data sources.
9. The DAMA-theory based big data governance system according to claim 7, wherein said data management platform module comprises:
a data access unit: for obtaining the accessible data resources, accessing the accessible data resources into a big data unit through a preset data access mode, and determining multi-source heterogeneous data resources; wherein
the data access mode comprises a batch data access mode, a real-time data access mode, a full data access mode and an incremental data access mode;
a big data unit: for uniformly converging the multi-source heterogeneous data resources to determine service data;
a data management unit: for managing, comprehensively utilizing and safely controlling the service data and generating a corresponding service management result.
10. The DAMA theory-based big data governance system according to claim 9, wherein said data management unit further comprises:
a one-stop data organization management unit: for carrying out service mining on the service data, determining the mined data, and carrying out organization management on the mined data; wherein
the organization management comprises standard design of data, data development, quality evaluation and asset estimation;
a data comprehensive utilization unit: for comprehensively utilizing the service data; wherein
the comprehensive utilization comprises comprehensive retrieval of data, asset navigation, service navigation, a data cockpit and a knowledge base;
a data security management and control unit: for carrying out safety control on the service data; wherein
the safety management and control comprises user management, service management, audit management, log audit and panoramic operation and maintenance.
CN202111213648.3A 2021-10-19 2021-10-19 Big data management method and system based on DAMA theory Active CN113641663B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111213648.3A CN113641663B (en) 2021-10-19 2021-10-19 Big data management method and system based on DAMA theory

Publications (2)

Publication Number Publication Date
CN113641663A true CN113641663A (en) 2021-11-12
CN113641663B CN113641663B (en) 2022-01-18

Family

ID=78427385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111213648.3A Active CN113641663B (en) 2021-10-19 2021-10-19 Big data management method and system based on DAMA theory

Country Status (1)

Country Link
CN (1) CN113641663B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303408A (en) * 2023-05-24 2023-06-23 中数通信息有限公司 DAMA data frame-based data governance process management method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170053242A1 (en) * 2015-08-18 2017-02-23 Satish Ayyaswami System and Method for a Big Data Analytics Enterprise Framework
CN110543537A (en) * 2019-08-22 2019-12-06 广东省城乡规划设计研究院 Intelligent planning space-time cloud GIS platform based on Docker container and micro-service architecture
CN110781236A (en) * 2019-10-29 2020-02-11 山西云时代技术有限公司 Method for constructing government affair big data management system
CN111159985A (en) * 2019-12-24 2020-05-15 平安养老保险股份有限公司 Data export method, data export device, computer equipment and computer-readable storage medium
CN111783107A (en) * 2020-07-09 2020-10-16 杭州安恒信息技术股份有限公司 Multi-source trusted data access method, device and equipment
CN112100457A (en) * 2020-09-22 2020-12-18 国网辽宁省电力有限公司电力科学研究院 Multi-source heterogeneous data integration method based on metadata
CN112364003A (en) * 2020-11-09 2021-02-12 南威软件股份有限公司 Big data management method, device, equipment and medium for different industries
CN112699175A (en) * 2021-01-15 2021-04-23 广州汇智通信技术有限公司 Data management system and method thereof

Also Published As

Publication number Publication date
CN113641663B (en) 2022-01-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant