CN113434488A - Data element operating system and operating method based on data circulation - Google Patents

Data element operating system and operating method based on data circulation

Publication number
CN113434488A
Authority
CN
China
Prior art keywords
data
platform
circulation
data element
operating system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110984704.7A
Other languages
Chinese (zh)
Inventor
陆志鹏
王希勤
朱立锋
郑曦
周崇毅
国丽
李勇
乔亲旺
胡成盛
胡俊
谢冬水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHINA ELECTRONIC INFORMATION INDUSTRY GROUP Co
Original Assignee
CHINA ELECTRONIC INFORMATION INDUSTRY GROUP Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-08-26
Filing date
2021-08-26
Publication date
2021-09-24
Application filed by CHINA ELECTRONIC INFORMATION INDUSTRY GROUP Co
Priority to CN202110984704.7A
Publication of CN113434488A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 - Design, administration or maintenance of databases
    • G06F16/211 - Schema design and management
    • G06F16/212 - Schema design and management with details for data modelling support
    • G06F16/215 - Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/28 - Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 - Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a data element operating system and method based on data circulation. The system comprises: a data cleaning and treatment platform, used for cleaning and processing collected data to form data resources and storing the data resources in a data warehouse; a data resource management platform, used for overall management of the data warehouse, implementing unified scheduling of data resources, data source tracing and warehouse model management; a data element management platform, used for managing the development and production of data elements, providing a data element development environment, developing data elements based on the data resources, performing unified scheduling management of the data elements, and performing hierarchical classification and element shelving of the generated data elements; and a data circulation platform, used for providing data element transaction services to data application developers. The invention lays an important foundation for integrating the aggregation, processing, pricing and circulation transaction of data elements.

Description

Data element operating system and operating method based on data circulation
Technical Field
The invention relates to the technical field of data management, in particular to a data element operating system and method based on data circulation.
Background
With the development of information technology, data management technology has steadily matured.
Data management technology comprises a data technology framework whose core links are data aggregation, data processing, data warehouse development, and data analysis and mining, and it is an important technical basis for comprehensively promoting data elementization.
In data aggregation, technologies such as "lake and warehouse integration" and "stream and batch integration" are mainly used to aggregate and store massive data on a distributed architecture, combining real-time and offline modes, and the underlying data processing architecture is migrating to cloud platforms and distributed systems. In data processing, the main techniques are data standard formulation, standard mapping, quality auditing, and cleaning and conversion, which are becoming increasingly automated, intelligent and efficient as artificial intelligence is integrated more deeply. In data warehouse development, data models are mainly used to sort out, analyze and organize data by business, generating a base library and a topic library. In data analysis and mining, traditional statistical analysis and machine learning algorithms are mainly used for feature discovery and analysis. Among these, technologies such as knowledge graphs, feature engineering and natural language processing play an increasingly important role in mining the value of data.
Disclosure of Invention
The invention provides a data element operating system and method based on data circulation, aiming to solve the technical problem of how to realize the aggregation, processing, pricing and circulation transaction of data elements.
The data element operating system based on data circulation comprises the following components:
a data cleaning and treatment platform, used for cleaning and processing the collected data to form data resources and storing the data resources in the data warehouse;
a data resource management platform, used for overall management of the data warehouse, implementing unified scheduling of data resources, data source tracing and warehouse model management;
a data element management platform, used for managing the development and production of data elements, providing a data element development environment, developing data elements based on the data resources, performing unified scheduling management of the data elements, and performing hierarchical classification and element shelving of the generated data elements;
and a data circulation platform, used for providing data element transaction services to data application developers, or for realizing sharing and opening directly through a data hub.
According to some embodiments of the invention, the system further comprises:
a data aggregation platform, used for registering the accessed data sources, aggregating the data, and classifying and grading the aggregated data.
In some embodiments of the present invention, the data aggregation platform implements real-time aggregation of data by using a stream computation technique, and performs offline aggregation of data by using a batch processing technique.
According to some embodiments of the invention, the data resource management platform is used for cataloging and managing the collected data to form a standard library catalog and a catalog of a resource library.
In some embodiments of the invention, the data element management platform is further configured to catalog the generated data elements.
According to some embodiments of the present invention, on the data circulation platform, rights to the data elements are confirmed, the data elements are evaluated according to preset platform rules, and the data elements are released through a platform portal.
In some embodiments of the present invention, on the data circulation platform, a data application developer develops a data product by subscribing to a data element, the data circulation platform provides a charging and settlement function support for the data element, and the data application developer provides a production maintenance service for the data element through the data circulation platform.
According to some embodiments of the invention, the data application developer performs element design and model debugging based on sample data synchronously mapped by the data element management platform.
According to the data element operation method of the embodiment of the invention, the method adopts the data element operation system based on data circulation to perform data element operation, and the method comprises the following steps:
S100, aggregating the original data;
S200, processing the original data to provide data resources for data element development;
S300, developing the data resources into data elements;
S400, confirming rights to and pricing the data elements through the data circulation platform;
and S500, carrying out circulation transactions of the data elements.
According to an embodiment of the present invention, a computer-readable storage medium stores a signal-mapped computer program, which when executed by at least one processor, implements a data element operating method as described above.
The data element operating system and the operating method based on data circulation provided by the invention have the following advantages:
The invention constructs a software-defined data element operating system. Drawing on the definition of a traditional operating system and built around the full data element process of data aggregation, data governance, data element development and data element transaction, the invention creates a data element operating system that defines infrastructure resources downward and data governance tools upward.
First, the data element operating system covers not only three production platforms, namely the data cleaning and treatment platform, the data resource management platform and the data element management platform, but also a data circulation platform for data element transactions; this "front shop, back factory" mode of cooperation greatly improves data management efficiency. Second, the data element operating system is responsible for scheduling and managing the data element process together with the hardware resources, software resources, data resources and data elements of the database, laying an important foundation for integrating the aggregation, processing, pricing and circulation transaction of data elements.
Drawings
FIG. 1 is a diagram illustrating a data element operating system based on data circulation, according to an embodiment of the present invention;
FIG. 2 is a business flow diagram of a data element operating system based on data circulation according to an embodiment of the present invention;
FIG. 3 is a data flow diagram of a data element operating system based on data circulation according to an embodiment of the present invention;
FIG. 4 is a flow chart of a data element operating method based on data circulation according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a data element operating system based on data circulation according to an embodiment of the present invention.
Reference numerals:
data element operating system 100;
data aggregation platform 10, data cleaning and treatment platform 20, data resource management platform 30, data element management platform 40 and data circulation platform 50.
Detailed Description
To further explain the technical means and effects of the present invention adopted to achieve the intended purpose, the present invention will be described in detail with reference to the accompanying drawings and preferred embodiments.
The description of the method flow in the present specification and the steps of the flow chart in the drawings of the present specification are not necessarily strictly performed by the step numbers, and the execution order of the method steps may be changed. Moreover, certain steps may be omitted, multiple steps may be combined into one step execution, and/or a step may be broken down into multiple step executions.
As shown in fig. 1 and 5, a data element operating system 100 based on data circulation according to an embodiment of the present invention includes: a data cleaning and treatment platform 20, a data resource management platform 30, a data element management platform 40, and a data circulation platform 50.
The data cleaning and treatment platform 20 is used for cleaning and processing the collected data to form data resources, and storing the data resources in the data warehouse.
The data resource management platform 30 is used for overall management of the data warehouse, and realizes functions of unified scheduling of data resources, data tracing and warehouse model management.
The data element management platform 40 is used for managing development and production of data elements, providing a data element development environment, developing data elements based on data resources, performing unified scheduling management on the data elements, and performing hierarchical classification and element shelving on the generated data elements.
The data circulation platform 50 is used for providing data element transaction services for data application developers, or realizing sharing and opening directly through a data hub.
According to some embodiments of the invention, the system further comprises a data aggregation platform 10, used for registering the accessed data sources, aggregating the data, and classifying and grading the aggregated data.
In some embodiments of the present invention, the data aggregation platform 10 implements real-time aggregation of data by using a stream computation technique, and performs offline aggregation of data by using a batch processing technique.
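The two aggregation paths can be illustrated with a minimal Python sketch. It is not taken from the patent: the class `AggregationPlatform`, the `DataSource` fields and the grade labels are hypothetical names chosen for illustration, and a real deployment would sit on a stream processing engine and a batch scheduler rather than plain Python iterables.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class DataSource:
    """A registered data source (hypothetical structure)."""
    name: str
    grade: str  # assumed sensitivity grade, e.g. "public" or "restricted"


@dataclass
class AggregationPlatform:
    """Toy stand-in for the data aggregation platform 10."""
    sources: dict[str, DataSource] = field(default_factory=dict)

    def register(self, source: DataSource) -> None:
        # A source must be registered before data is accepted from it.
        self.sources[source.name] = source

    def _tag(self, source_name: str, record: dict) -> dict:
        # Attach provenance and the source's grade to each record.
        source = self.sources[source_name]  # KeyError if unregistered
        return {**record, "source": source.name, "grade": source.grade}

    def aggregate_stream(self, source_name: str,
                         records: Iterable[dict]) -> Iterator[dict]:
        # Real-time path: hand each record on as soon as it arrives.
        for record in records:
            yield self._tag(source_name, record)

    def aggregate_batch(self, source_name: str,
                        records: Iterable[dict]) -> list[dict]:
        # Offline path: materialise the whole batch before handing it on.
        return [self._tag(source_name, record) for record in records]
```

The only difference between the two paths in this sketch is laziness: the stream path is a generator that emits records one at a time, while the batch path returns the complete list.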
According to some embodiments of the present invention, the data resource management platform 30 performs cataloging and catalog management on the collected data to form a standard library catalog and a resource library catalog.
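As a rough illustration of cataloging into two libraries, the following Python sketch keeps one catalog with entries labelled by library. The names `CatalogEntry` and `Catalog`, and the labels "standard" and "resource", are assumptions made for this example; the patent does not specify a catalog schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One cataloged dataset (hypothetical schema)."""
    dataset: str
    library: str      # assumed labels: "standard" or "resource"
    description: str


class Catalog:
    """Toy catalog covering both the standard and the resource library."""

    LIBRARIES = ("standard", "resource")

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def add(self, entry: CatalogEntry) -> None:
        # Reject entries that belong to neither library.
        if entry.library not in self.LIBRARIES:
            raise ValueError(f"unknown library: {entry.library}")
        self._entries[entry.dataset] = entry

    def by_library(self, library: str) -> list[str]:
        # Dataset names in one library, sorted for stable output.
        return sorted(e.dataset for e in self._entries.values()
                      if e.library == library)
```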
In some embodiments of the present invention, data element management platform 40 is also used to catalog generated data elements.
According to some embodiments of the present invention, on the data circulation platform 50, rights to the data elements are confirmed, the data elements are evaluated according to preset platform rules, and the data elements are released through the platform portal.
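The order of rights confirmation, evaluation and release can be sketched as a small state machine in Python. This is an illustrative assumption rather than the patent's implementation: the `Status` values, the `DataElement` fields and the idea of passing the platform rule as a scoring function are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    DRAFT = "draft"
    RIGHTS_CONFIRMED = "rights_confirmed"
    EVALUATED = "evaluated"
    RELEASED = "released"


@dataclass
class DataElement:
    name: str
    owner: Optional[str] = None
    score: Optional[float] = None
    status: Status = Status.DRAFT


def _require(element: DataElement, expected: Status) -> None:
    # Each step is only valid from the state produced by the previous step.
    if element.status is not expected:
        raise ValueError(
            f"{element.name}: expected {expected.value}, "
            f"got {element.status.value}")


def confirm_rights(element: DataElement, owner: str) -> None:
    _require(element, Status.DRAFT)
    element.owner = owner
    element.status = Status.RIGHTS_CONFIRMED


def evaluate(element: DataElement,
             rule: Callable[[DataElement], float]) -> None:
    # `rule` stands in for the preset platform rule (assumed to be a score).
    _require(element, Status.RIGHTS_CONFIRMED)
    element.score = rule(element)
    element.status = Status.EVALUATED


def release(element: DataElement, portal: list[str]) -> None:
    # `portal` is a plain list standing in for the platform portal listing.
    _require(element, Status.EVALUATED)
    portal.append(element.name)
    element.status = Status.RELEASED
```

Calling `release` on an element whose rights have not been confirmed raises `ValueError`, which reflects the ordering described above.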
In some embodiments of the present invention, on the data circulation platform 50, a data application developer develops a data product by subscribing to a data element, the data circulation platform 50 provides a charging settlement function support for the data element, and the data application developer provides a production maintenance service for the data element through the data circulation platform 50.
According to some embodiments of the invention, a data application developer performs element design and model debugging based on sample data synchronously mapped by the data element management platform 40.
As shown in fig. 2 to fig. 4, according to the data element operation method of the embodiment of the present invention, the method uses the data element operation system 100 based on data circulation as described above to perform data element operation, and the method includes:
S100, aggregating the original data through the data aggregation platform 10;
S200, processing the original data through the data cleaning and treatment platform 20 and the data resource management platform 30, to provide data resources for data element development;
S300, developing the existing data resources, through the data resource management platform 30, into data elements for data element circulation transactions;
S400, confirming rights to and pricing the data elements through the data circulation platform 50;
and S500, realizing circulation transactions of the data elements through the data circulation platform 50.
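Steps S100 to S500 can be read as a linear pipeline. The Python sketch below chains five toy functions in that order. Everything inside the functions is an assumption made for illustration (records as dictionaries with `id` and `value`, cleaning as null removal plus de-duplication, a data element as a simple aggregate, a fixed price); the patent does not prescribe any of these details.

```python
from statistics import mean


def s100_aggregate(sources):
    """S100: merge raw records from every source into one list."""
    return [record for source in sources for record in source]


def s200_process(raw_records):
    """S200: clean raw data into data resources.

    Assumed cleaning rule: drop records with a missing value and keep
    only the first record seen for each id.
    """
    seen, resources = set(), []
    for record in raw_records:
        if record["value"] is None or record["id"] in seen:
            continue
        seen.add(record["id"])
        resources.append(record)
    return resources


def s300_develop(resources, name):
    """S300: derive a data element, here a simple aggregate."""
    return {
        "name": name,
        "record_count": len(resources),
        "mean_value": mean(r["value"] for r in resources),
    }


def s400_confirm_and_price(element, owner, price):
    """S400: attach an owner and a price to the element (a listing)."""
    return {"element": element, "owner": owner, "price": price}


def s500_trade(listing, buyer):
    """S500: record a circulation transaction for a listed element."""
    return {
        "buyer": buyer,
        "seller": listing["owner"],
        "element": listing["element"]["name"],
        "amount": listing["price"],
    }
```

Note that the buyer in this sketch receives the derived element, not the underlying raw records, which matches the description of the data element as the unit of circulation.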
According to the computer-readable storage medium of an embodiment of the present invention, a computer program for signal mapping is stored in the computer-readable storage medium, and the computer program is executed by at least one processor to implement the data element operation method as described above.
The data circulation-based data element operating system 100 and the operating method according to the present invention will be described in detail with reference to the accompanying drawings. It is to be understood that the following description is only exemplary in nature and should not be taken as a specific limitation on the invention.
Fig. 1 is an overall architecture diagram of the present invention. As shown in fig. 1, the data element operating system 100 of the present invention includes the data cleaning and treatment platform 20, the data resource management platform 30, the data element management platform 40 and the data circulation platform 50, and is used for scheduling and managing the data element process together with the hardware resources, software resources, data resources and data elements of the data vault. The data cleaning and treatment platform 20 cleans and processes the collected data to form data resources, which are stored in a data warehouse. The data resource management platform 30 mainly performs overall management of the data warehouse, implementing functions such as unified scheduling of data resources, data tracing and warehouse model management. The data element management platform 40 performs overall management of the whole process of element development and production, provides a complete element development environment, develops data elements based on data resources, performs unified scheduling management of the data elements, and performs hierarchical classification and element shelving of the generated data elements. Relying on the data circulation platform 50, data elements are offered to data application developers through a data element transaction service, and can also be shared and opened directly through a data hub.
The system business process as shown in fig. 2 includes the following steps:
A100, data aggregation: the accessed data sources are registered on the data aggregation platform 10; real-time aggregation of data is realized using stream computing technology, and offline aggregation of large-scale data is realized using batch processing technology; the aggregated data are classified and graded.
A200, data processing: the aggregation tasks are managed on the data cleaning and treatment platform 20; on the data resource management platform 30, cataloging and catalog management are performed on the collected data to form a standard library catalog and a resource library catalog.
A300, data element development: on the data element management platform 40, data element design and element model development are carried out based on the data resources formed in the first two steps, data elements are produced in the system platform environment, and the generated data elements are cataloged.
A400, data element transaction: on the data circulation platform 50, rights to the data elements are confirmed, the elements are evaluated according to the platform rules, and the elements are released through the platform portal.
A500, data product transaction: on the data circulation platform 50, a data product developer develops a data product by subscribing to data elements, the data circulation platform 50 provides charging and settlement support for the elements, and the element developer provides production maintenance services for the data elements through the data circulation platform 50.
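The subscription, charging and settlement described in A500 can be sketched as a small ledger. The Python example below is a hypothetical illustration: the class `CirculationLedger`, per-call pricing and the settlement output format are assumptions, since the source only states that the platform supports charging and settlement.

```python
from collections import defaultdict
from decimal import Decimal


class CirculationLedger:
    """Toy charging and settlement ledger for subscribed data elements."""

    def __init__(self) -> None:
        self.prices: dict[str, Decimal] = {}   # element -> price per call
        self.owners: dict[str, str] = {}       # element -> element developer
        self.subscriptions: set[tuple[str, str]] = set()
        self.usage: dict[tuple[str, str], int] = defaultdict(int)

    def list_element(self, element: str, owner: str,
                     price_per_call: Decimal) -> None:
        self.prices[element] = price_per_call
        self.owners[element] = owner

    def subscribe(self, subscriber: str, element: str) -> None:
        # Only listed elements can be subscribed to.
        if element not in self.prices:
            raise KeyError(f"element not listed: {element}")
        self.subscriptions.add((subscriber, element))

    def record_use(self, subscriber: str, element: str,
                   calls: int = 1) -> None:
        # Usage is metered only for active subscriptions.
        if (subscriber, element) not in self.subscriptions:
            raise PermissionError(f"{subscriber} has no subscription")
        self.usage[(subscriber, element)] += calls

    def settle(self) -> tuple[dict[str, Decimal], dict[str, Decimal]]:
        """Return (charges per subscriber, payouts per owner), then reset."""
        charges: dict[str, Decimal] = defaultdict(Decimal)
        payouts: dict[str, Decimal] = defaultdict(Decimal)
        for (subscriber, element), calls in self.usage.items():
            amount = self.prices[element] * calls
            charges[subscriber] += amount
            payouts[self.owners[element]] += amount
        self.usage.clear()
        return dict(charges), dict(payouts)
```

`Decimal` is used instead of `float` so that monetary amounts add up exactly; the sketch pays the full charge to the element developer and deliberately leaves out any platform fee, which the source does not mention.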
The system data flow is shown in fig. 3: the original data are aggregated through the data aggregation platform 10 and processed through the data cleaning and treatment platform 20 and the data resource management platform 30, providing standard, uniform, high-quality usable data resources for data element development; the existing data resources are developed through the data resource management platform 30 into an intermediate form for circulation transactions, namely the "data element", and rights confirmation and pricing are performed on the data elements; the data elements are then traded in circulation through the data circulation platform 50.
In summary, the present invention constructs a data element operating system 100 with the data cleaning and treatment platform 20, the data resource management platform 30, the data element management platform 40 and the data circulation platform 50 at its core, which schedules and manages the data element process together with the hardware resources, software resources, data resources and data elements of the database. A process-oriented data governance workflow is formed, comprising data aggregation, data cleaning and treatment, data element development and data element transaction, in which 20 data governance procedures are carried out in order, covering the whole life cycle of data element application.
On the one hand, the data element model and the application model effectively make the data inventory clear and the data usable, and provide theoretical support for the path to forming data elements. On the other hand, by building the pricing mechanism and the security auditing mechanism into the data element operating system 100, the pricing and security auditing model performs quantitative accounting of data elements while laying a foundation, in both technology and institutional rules, for the efficient and secure circulation of data as a form of asset.
Driven by demand and using application scenario construction as the starting point, the invention targets key fields such as governance modernization, high-quality development, livelihood services, and scientific and technological innovation, builds four application ecosystems by developing and utilizing data elements, and fully releases the supporting role of data elementization in digital industrialization and industrial digitization.
The data element operating system 100 and the operating method based on data circulation provided by the invention have the following advantages:
The invention constructs a software-defined data element operating system 100. Drawing on the definition of a traditional operating system and built around the full data element process of data aggregation, data governance, data element development and data element transaction, the invention creates a data element operating system 100 that defines infrastructure resources downward and data governance tools upward.
First, the data element operating system 100 covers not only three production platforms, namely the data cleaning and treatment platform 20, the data resource management platform 30 and the data element management platform 40, but also a data circulation platform 50 for data element transactions; this "front shop, back factory" mode of cooperation greatly improves data management efficiency. Second, the data element operating system 100 is responsible for scheduling and managing the data element process together with the hardware resources, software resources, data resources and data elements of the database, laying an important foundation for integrating the aggregation, processing, pricing and circulation transaction of data elements.
While the invention has been described in connection with specific embodiments, the accompanying drawings and description are illustrative only, and the invention may be embodied in other specific forms without departing from its spirit or scope.

Claims (10)

1. A data element operating system based on data circulation, comprising:
a data cleaning and treatment platform, used for cleaning and processing the collected data to form data resources and storing the data resources in the data warehouse;
a data resource management platform, used for overall management of the data warehouse, implementing unified scheduling of data resources, data source tracing and warehouse model management;
a data element management platform, used for managing the development and production of data elements, providing a data element development environment, developing data elements based on the data resources, performing unified scheduling management of the data elements, and performing hierarchical classification and element shelving of the generated data elements;
and a data circulation platform, used for providing data element transaction services to data application developers, or for realizing sharing and opening directly through a data hub.
2. The data circulation-based data element operating system of claim 1, wherein the system further comprises:
a data aggregation platform, used for registering the accessed data sources, aggregating the data, and classifying and grading the aggregated data.
3. The data circulation-based data element operating system of claim 2, wherein the data aggregation platform adopts a stream computing technology to realize real-time aggregation of data, and adopts a batch processing technology to perform offline aggregation of data.
4. The data circulation-based data element operating system as claimed in claim 2, wherein the data resource management platform is used for performing cataloging and catalog management on the collected data to form a standard library catalog and a resource library catalog.
5. The data circulation-based data element operating system of claim 1, wherein the data element management platform is further configured to catalog the generated data elements.
6. The data circulation-based data element operating system of claim 1, wherein, on the data circulation platform, rights to the data elements are confirmed, the data elements are evaluated according to preset platform rules, and the data elements are released through a platform portal.
7. The data circulation-based data element operating system of claim 6, wherein on the data circulation platform, a data application developer develops data products by subscribing to data elements, the data circulation platform provides charging settlement function support for the data elements, and the data application developer provides production maintenance services for the data elements through the data circulation platform.
8. The data circulation-based data element operating system of claim 7, wherein the data application developer performs element design and model debugging based on the sample data synchronously mapped by the data element management platform.
9. A data element operating method using the data circulation-based data element operating system according to any one of claims 1 to 8, the method comprising:
S100, aggregating the original data;
S200, processing the original data to provide data resources for data element development;
S300, developing the data resources into data elements;
S400, confirming rights to and pricing the data elements through the data circulation platform;
and S500, carrying out circulation transactions of the data elements.
10. A computer-readable storage medium storing a signal-mapped computer program which, when executed by at least one processor, implements the data element operating method of claim 9.
CN202110984704.7A 2021-08-26 2021-08-26 Data element operating system and operating method based on data circulation Pending CN113434488A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110984704.7A CN113434488A (en) 2021-08-26 2021-08-26 Data element operating system and operating method based on data circulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110984704.7A CN113434488A (en) 2021-08-26 2021-08-26 Data element operating system and operating method based on data circulation

Publications (1)

Publication Number Publication Date
CN113434488A true CN113434488A (en) 2021-09-24

Family

ID=77797866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110984704.7A Pending CN113434488A (en) 2021-08-26 2021-08-26 Data element operating system and operating method based on data circulation

Country Status (1)

Country Link
CN (1) CN113434488A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114186244A (en) * 2022-01-26 2022-03-15 中国电子信息产业集团有限公司 Data element operation framework and system
CN115203263A (en) * 2022-09-14 2022-10-18 中国电子信息产业集团有限公司 Data element acquisition method, system, device and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200202440A1 (en) * 2017-12-08 2020-06-25 Nasdaq Technology Ab Systems and methods for electronic continuous trading of variant inventories
CN111597173A (en) * 2020-04-02 2020-08-28 上海瀚之友信息技术服务有限公司 Data warehouse system
CN111858560A (en) * 2020-07-24 2020-10-30 厦门至恒融兴信息技术有限公司 Financial data automated testing and monitoring system based on data warehouse
CN112001766A (en) * 2020-04-13 2020-11-27 陶光灿 Multi-level and multi-state large scientific instrument sharing platform based on big data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
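Lu Zhipeng (陆志鹏): "Reflections on the path to realizing the marketization of data elements" (数据要素市场化实现路径的思考), China Development Observation (《中国发展观察》) *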

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210924