CN112100181A - A sand table-based data resource management method - Google Patents

A sand table-based data resource management method

Info

Publication number
CN112100181A
Authority
CN
China
Prior art keywords
data
business
source
classification
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011000586.3A
Other languages
Chinese (zh)
Other versions
CN112100181B (en)
Inventor
王大维
李伟
李钊
王丽霞
田小蕾
冉冉
高强
齐俊
刘育博
刘中彦
孙岩
毛辉
孟令俐
黄运起
李玉林
柳树泽
梁明
秦宾
李斌
曹国强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Nari Information and Communication Technology Co
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Nari Information and Communication Technology Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd, Nari Information and Communication Technology Co
Priority to CN202011000586.3A
Publication of CN112100181A
Application granted
Publication of CN112100181B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of data resource management and relates in particular to a sand table-based data resource management method. Existing data resources commonly suffer from missing standards, heavy redundancy, and poor data quality, while the demand for developing and using those resources is dynamic, diverse, and specialized. The sand table-based method unifies data resource management standards, defines the data resource layers, and builds a data resource catalog, so that every specialty and every level is given data resources that are queryable, understandable, and easy to obtain. It supports cross-specialty and cross-level data acquisition, data sharing, and data tracing.

Description

A sand table-based data resource management method

Technical field

The invention belongs to the technical field of data resource management and relates in particular to a sand table-based data resource management method.

Background

As information systems in China have been built out and applied ever more deeply, a large volume of basic data resources and a rich set of data resource requirements have accumulated. These resources are scattered, varied in type, inconsistent in standards, and poorly shared, and they lack unified enterprise-level, tiered management and sharing of results. Data resource services have a high barrier to use, data that is hard to understand, and services that are hard to obtain, so they cannot meet the requirement that enterprise data assets be visible, searchable, traceable, and retrievable. Based on a sand table, the invention provides every specialty and every level with a data resource management method that is queryable, understandable, and easy to obtain.

Summary of the invention

Existing data resources commonly suffer from missing standards, heavy redundancy, and poor data quality, while the demand for developing and using those resources is dynamic, diverse, and specialized. The sand table-based data resource management method unifies data resource management standards, defines the data resource layers, and builds a data resource catalog. It provides every specialty and every level with data resources that are queryable, understandable, and easy to obtain, and it supports cross-specialty and cross-level data acquisition, data sharing, and data tracing.

The invention comprises the following steps:

Database metadata is traversed with a binary-tree method and a recursive formula to obtain the basic information of the data resources, and data synchronization tasks are established. Because log information is large in volume and time-sensitive, it is synchronized in full-plus-incremental mode; other metadata, which is updated less often, is synchronized in full once a week. A dt field is added to distinguish and identify different systems; when a data synchronization task is created, the system abbreviation is written into the dt field.
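The synchronization rule above can be sketched as a small planning function. This is a minimal illustration under stated assumptions, not the patented implementation: the `SyncTask` type, the `build_sync_task` helper, the table names, and the "daily" schedule for log data are all hypothetical; only the full-plus-incremental versus weekly-full split and the dt field come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncTask:
    source_table: str  # metadata table to copy (hypothetical name)
    mode: str          # "full+incremental" or "full"
    schedule: str      # how often the task runs
    dt: str            # system abbreviation written into the dt field


def build_sync_task(source_table: str, system_abbr: str, is_log: bool) -> SyncTask:
    """Pick the sync mode described in the text for one metadata table."""
    if is_log:
        # Log data is large and time-sensitive: full load plus increments.
        # The "daily" schedule is an assumption; the text gives no interval.
        return SyncTask(source_table, "full+incremental", "daily", system_abbr)
    # Other metadata changes rarely: reload it in full once a week.
    return SyncTask(source_table, "full", "weekly", system_abbr)


log_task = build_sync_task("operation_log", "ECP", is_log=True)
meta_task = build_sync_task("table_columns", "ECP", is_log=False)
```

Writing the system abbreviation into `dt` at task creation keeps rows from different source systems distinguishable after they land in one shared store.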

Recursive formula: $a_{n+k} = m_1 a_{n+k-1} + m_2 a_{n+k-2} + \dots + m_k a_n$
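One way to read the recursive formula is as a k-th order linear recurrence, in which each new term is a weighted sum of the previous k terms. That reading is an assumption, because the operator after the first term is garbled in the source. The sketch below evaluates such a recurrence; `linear_recurrence` and its parameters are illustrative names only.

```python
from typing import List, Sequence


def linear_recurrence(coeffs: Sequence[float],
                      initial: Sequence[float],
                      count: int) -> List[float]:
    """Return the first `count` terms of a_{n+k} = m_1 a_{n+k-1} + ... + m_k a_n.

    coeffs  holds m_1 .. m_k
    initial holds the k starting terms a_0 .. a_{k-1}
    """
    k = len(coeffs)
    if len(initial) != k:
        raise ValueError("need exactly k initial terms")
    terms = list(initial)
    while len(terms) < count:
        window = terms[-k:]  # a_n .. a_{n+k-1}
        # m_1 pairs with the newest term, m_k with the oldest.
        terms.append(sum(m * a for m, a in zip(coeffs, reversed(window))))
    return terms[:count]


# With k = 2 and m_1 = m_2 = 1 the recurrence is the Fibonacci sequence.
fib = linear_recurrence([1, 1], [0, 1], 8)
```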

(1) Data resource catalog classification

To address the high barrier to use, hard-to-understand data, and hard-to-obtain services that affect data resources, the invention builds a classification system with a source layer, a shared layer, and an analysis layer, giving every specialty and every level a data resource sand table that is queryable and understandable. Starting from the inventoried system function menus, and following the principles of completeness, independence, and irreplaceability, a data resource catalog classification is formed for each layer.

Source layer: positioned close to the source business systems. Its table structures are kept essentially the same as those of the source business systems. It mainly stores the existing (stock) and incremental data of the source business systems, and its data scope is a subset of those systems. Data resources are organized by specialty, system, first-level business function, and second-level business function, which keeps the catalog close to the business systems.

Shared layer: stores the SG-CIM model tables and the standard tables not covered by the model. Data from the source business systems is held under unified naming conventions and definitions to support upper-layer applications and services. The data resource classification covers two parts, SG-CIM model table data and standard table data. SG-CIM model resources are classified by the first-level and second-level subject domains of the SG-CIM model; standard table resources are classified by system, first-level category, and second-level category, and the system a data table belongs to is attached as a label.

Analysis layer: stores the result tables produced by processing shared-layer data according to business logic, forming analytical services such as data sets, labels, and indicators. Resources are classified by analysis topic, first-level topic, and second-level topic, and each data table's source system, owning department, and using department are attached as labels.
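The layer-specific classification levels described above can be captured in a small lookup table. The sketch below is illustrative only: the dictionary keys, the `catalog_path` helper, and the sample category names are assumptions; the ordering of levels per layer follows the text.

```python
# Classification levels per layer, in the order given in the text.
LAYER_LEVELS = {
    "source": ["specialty", "system", "level1_function", "level2_function"],
    "shared_sgcim": ["level1_subject_domain", "level2_subject_domain"],
    "shared_standard": ["system", "level1_category", "level2_category"],
    "analysis": ["analysis_topic", "level1_topic", "level2_topic"],
}


def catalog_path(layer: str, **parts: str) -> str:
    """Build a catalog path for a table, requiring every level of its layer."""
    levels = LAYER_LEVELS[layer]
    missing = [level for level in levels if level not in parts]
    if missing:
        raise ValueError(f"missing levels for {layer}: {missing}")
    return "/".join([layer] + [parts[level] for level in levels])


# Hypothetical category names, used only to show the call shape.
path = catalog_path(
    "source",
    specialty="Marketing",
    system="ECP",
    level1_function="Customer management",
    level2_function="Customer files",
)
```

Rejecting a path with a missing level is one way to enforce the completeness principle at the moment a table is catalogued.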

(2) Data resource table review

The business information recorded for a data table includes, but is not limited to, the following:

Business description: must not simply repeat the Chinese name and must state the business meaning of the table. If the table stores business entity information, briefly explain the business entity and related terms; if it stores document information, briefly describe the relevant steps of the document workflow; if it stores report information, briefly describe the purpose of the report. The description is maintained by business personnel. For example, the business description of the C_CONS table is: an organization or individual that has lawfully established a power supply and consumption relationship with a power supply enterprise is called an electricity customer, or user for short, and different electricity service addresses are treated as different users. Records are created by business processes such as new installation, capacity increase, and archiving of service changes, and they include electricity attributes such as user number, user name, service address, electricity category, supply voltage, load nature, and contract capacity.

Figure BDA0002694157300000031

(3) Data resource field review

The business information recorded for a data table field includes, but is not limited to, the following:

Field description: describes the business meaning of the field, covering in detail the data item's business meaning, purpose, and the business activity it belongs to. It is maintained by business personnel. For example, the field NOTE_T是PE_CODE has the business description "the type of bill printed by default for an electricity customer's monthly electricity charge, following the State Grid enterprise-level marketing management code set; mainly ordinary invoices, value-added tax invoices, receipts, and so on".

Figure BDA0002694157300000041

Figure BDA0002694157300000051

(4) Data resource labels

To help users understand the data resources better, data labels are applied to the data tables using the data resource management component. The labels include, but are not limited to, the following:

Specialty: the name of the specialty the data table belongs to;

Business department: the name of the business department the data table belongs to, for example "Business department: Finance Department";

Source business system: the name and abbreviation of the business system the table's data comes from; when there are several source systems, list all of them separated by semicolons, for example "Source system: [ECP]";

Source data table name: the Chinese and English names of the data tables the table's data comes from; when there are several source tables, list all of them separated by semicolons, for example "Source data table name: DATA_ID_MAP_INFO/数据标识映射表 (data identifier mapping table)";

Negative list data: if the table or field is included in the negative list, add the "negative list data" label;

Data table type label: depending on the table type, add a label such as detailed business data, code data, indicator data, or report data, for example "Data table type label: indicator data";

Supported application scenario label: if a source-layer or shared-layer data table supports application development, add application scenario labels. A table may carry several scenario labels, and each scenario name must match the scenario name used in the analysis-layer data resource classification, for example "real-time view of enterprise resources" or "distribution of industrial-sector units by industry";

Custom labels: each organization may define its own labels according to its specific needs and business scenarios.
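The label conventions above, in particular the semicolon-separated listing of multiple source systems or source tables, can be illustrated with a pair of helpers. This is a sketch under assumptions: `format_label` and `parse_label` are hypothetical names, the square-bracket style seen in the "[ECP]" example is not modeled, and "ERP" is an invented second system used only for illustration.

```python
from typing import List, Tuple


def format_label(name: str, values: List[str]) -> str:
    """Render a label as 'name: v1;v2', listing every value in full."""
    return f"{name}: " + ";".join(values)


def parse_label(text: str) -> Tuple[str, List[str]]:
    """Split 'name: v1;v2' back into its name and list of values."""
    name, _, raw = text.partition(":")
    values = [value.strip() for value in raw.split(";") if value.strip()]
    return name.strip(), values


label = format_label("Source system", ["ECP", "ERP"])
name, systems = parse_label(label)
```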

(5) Mapping between data resource classification and tables

Using the data table information produced by the review, the mapping from information systems, subject domains, and application topics to data tables is completed.

Mapping data categories to business tables: starting from the lowest-level data catalog, the correspondence between data categories and data tables is worked out, and each business table is attached to a lowest-level (leaf) category. The resulting correspondence must be complete and accurate. Details are as follows:
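The two requirements in this step, that every business table is attached and that it is attached only to a lowest-level category, lend themselves to an automated check. The sketch below is one possible check, not the method claimed in the patent; `check_mapping`, the dictionary layout, and the sample category names are assumptions.

```python
from typing import Dict, List, Set


def check_mapping(children: Dict[str, List[str]],
                  table_to_category: Dict[str, str],
                  all_tables: Set[str]) -> List[str]:
    """Report tables that are unmapped or attached to a non-leaf category.

    children maps each category to its sub-categories; a category with
    no sub-categories is a leaf.
    """
    known = set(children) | {c for kids in children.values() for c in kids}
    problems = []
    for table in sorted(all_tables):
        category = table_to_category.get(table)
        if category is None:
            problems.append(f"{table}: not attached to any category")
        elif category not in known:
            problems.append(f"{table}: unknown category {category}")
        elif children.get(category):
            problems.append(f"{table}: attached to non-leaf category {category}")
    return problems


# Hypothetical two-level catalog with one leaf category.
catalog = {"Marketing": ["Customer files"], "Customer files": []}
ok = check_mapping(catalog, {"C_CONS": "Customer files"}, {"C_CONS"})
bad = check_mapping(catalog, {"C_CONS": "Marketing"}, {"C_CONS", "X_TABLE"})
```

An empty result means the mapping is complete and every table hangs off a leaf category.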

Source layer: data resource classification and table mapping

Figure BDA0002694157300000061

Figure BDA0002694157300000071

Shared layer: data resource classification and table mapping

Figure BDA0002694157300000072

Analysis layer: data resource review standards

Figure BDA0002694157300000073

Figure BDA0002694157300000081

Beneficial effects of the invention

The invention manages data resources effectively and avoids the problems of scattered resources, varied types, inconsistent standards, low sharing, a high barrier to using data resource services, hard-to-understand data, and hard-to-obtain services.

Through the sand table-based data resource management method, the invention establishes data resources that are queryable, understandable, and easy to obtain, making data resources visible, searchable, traceable, and retrievable.

(1) A data resource catalog is established, classified by the business type of the data resources and built in layers. This makes data resources easier to see, search, and understand, and supports quick query, location, and use of data resources.

(2) Data resource management standards are unified, which helps users understand and use data resources and avoids disagreements between business personnel and developers caused by hard-to-understand terminology.

Description of the drawings

The invention is further described below with reference to the drawings and specific embodiments. The scope of protection of the invention is not limited to the following description.

Figures 1, 2, and 3 are flow charts of data resource information collection according to the invention.

Detailed description

(1) Data resource information collection

Static analysis. The advantage of this method is that it is not subject to human factors, and its accuracy does not depend on how detailed the documentation is or on test cases and sample data. The method is based on compiler principles: it scans and parses the source code and statically analyzes and enumerates the paths involved in the program logic, giving an objective picture of how data flows.

Using static analysis, data resource information is collected in the form of a binary tree. Traversal is one of the most basic operations on a tree. Traversing a binary tree means visiting all of its nodes according to a fixed rule and order so that every node is visited exactly once. Because a binary tree is a nonlinear structure, traversal in effect converts its nodes into a linear sequence.

Preorder traversal: visit the binary tree in the order root node -> left subtree -> right subtree, as shown in Figure 1.

Preorder traversal: visit the root node; recursively traverse the left subtree in preorder; recursively traverse the right subtree in preorder.

Preorder traversal result: ABDFECGHI

Reasoning:

First visit the root node A, which has a left and a right subtree. Because the procedure is recursive, the left subtree also follows the order "root, then left, then right", so B is visited next, then D. F has a branch below it, and the same order applies: visit F, then E. The large left subtree is now finished. The right subtree is visited last: visit its root C, then C's left child G; G has no left subtree, so visit G's right child H; finally visit C's right child I.

Inorder traversal: visit in the order left subtree -> root node -> right subtree, as shown in Figure 2.

Inorder traversal: recursively traverse the left subtree in inorder; visit the root node; recursively traverse the right subtree in inorder.

Inorder traversal result: DBEFAGHCI

Postorder traversal: visit in the order left subtree -> right subtree -> root node, as shown in Figure 3.

Postorder traversal: recursively traverse the left subtree in postorder; recursively traverse the right subtree in postorder; visit the root node.

Postorder traversal result: DEFBHGICA
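The three traversal orders can be reproduced with a short recursive sketch. Figures 1 to 3 are not available in this text, so the tree shape below is an assumption reconstructed from the stated preorder and inorder results, which together fix the shape when node labels are distinct. `Node`, `preorder`, `inorder`, and `postorder` are illustrative names, not part of the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node: Optional[Node]) -> List[str]:
    """Root, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.label] + preorder(node.left) + preorder(node.right)


def inorder(node: Optional[Node]) -> List[str]:
    """Left subtree, then root, then right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.label] + inorder(node.right)


def postorder(node: Optional[Node]) -> List[str]:
    """Left subtree, then right subtree, then root."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.label]


# Tree shape inferred from the traversal results in the text (assumption):
# B has children D and F, F has left child E, C has children G and I,
# and G has only a right child H.
tree = Node(
    "A",
    Node("B", Node("D"), Node("F", Node("E"))),
    Node("C", Node("G", None, Node("H")), Node("I")),
)

print("".join(preorder(tree)))   # ABDFECGHI
print("".join(inorder(tree)))    # DBEFAGHCI
print("".join(postorder(tree)))  # DEFBHGICA
```

Each function visits every node exactly once, which is the property the text requires of a traversal.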

(2) Data resource information review and inventory

On the basis of the collected table metadata, the business and management information of the data is supplemented and completed, including the Chinese name, business description, responsible department, and negative list data information, to form a complete and usable data resource system.

(3) Maintenance of basic data resource information

Based on the results of the review and inventory, the basic information of the data resources is maintained, including table descriptions, field descriptions, Chinese table names, Chinese field names, primary key flags, and negative list information.
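The maintenance items listed above can be checked for gaps before a table is published to the catalog. The sketch below is one hypothetical way to do so: the dictionary layout, the key names, and the `missing_items` helper are assumptions, and the rule that a description must not simply repeat the Chinese name is taken from the table review requirements earlier in the text.

```python
from typing import Any, Dict, List

REQUIRED_TABLE_KEYS = ("cn_name", "description", "negative_list")
REQUIRED_FIELD_KEYS = ("cn_name", "description", "is_primary_key", "negative_list")


def missing_items(table: Dict[str, Any]) -> List[str]:
    """List the maintenance items still missing for one table record."""
    gaps = []
    for key in REQUIRED_TABLE_KEYS:
        if table.get(key) in (None, ""):
            gaps.append(f"table.{key}")
    # The business description must add meaning beyond the Chinese name.
    if table.get("description") and table.get("description") == table.get("cn_name"):
        gaps.append("table.description repeats the Chinese name")
    for field in table.get("fields", []):
        for key in REQUIRED_FIELD_KEYS:
            if field.get(key) in (None, ""):
                gaps.append(f"{field.get('name', '?')}.{key}")
    return gaps


# Hypothetical record; CONS_NO is an invented field name for illustration.
record = {
    "name": "C_CONS",
    "cn_name": "用电客户",
    "description": "用电客户",
    "negative_list": False,
    "fields": [
        {"name": "CONS_NO", "cn_name": "用户编号", "description": "",
         "is_primary_key": True, "negative_list": False},
    ],
}
gaps = missing_items(record)
```

A flag stored as `False` counts as filled in; only `None` or an empty string is reported as missing.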

(4) Data resource catalog construction

Starting from the inventoried system function menus and following the principles of completeness, independence, and irreplaceability, data resource catalogs are built along the classification dimensions of the catalog: a source-layer data resource catalog classified by system module, a shared-layer data resource catalog classified by business domain, and an analysis-layer data resource catalog classified by business application.

(5) Construction, publication, and application of data resource services

Data resources are further processed into business-oriented services so that data resource services can be displayed visually, managed centrally, and shared. This addresses the cumbersome procedures business personnel face when accessing data resources, hard-to-understand terminology, and inconvenient data request and use, and it makes better use of the value of data resource services. Business personnel can quickly search for and subscribe to data resource services. Combined with the results of the enterprise-level negative list review and construction, the method responds quickly to business needs, provides sharing and application of data resource services, and manages data resources across their full business life cycle.

It should be understood that the above specific description is only intended to illustrate the invention and is not limited to the technical solutions described in the embodiments. Persons of ordinary skill in the art should understand that the invention may still be modified or equivalently substituted to achieve the same technical effect; any such variant that meets the needs of use falls within the scope of protection of the invention.

Claims (1)

1. A sand table-based data resource management method, characterized by comprising the following steps:
Database metadata is traversed with a binary-tree method and a recursive formula to obtain the basic information of the data resources, and data synchronization tasks are established. Because log information is large in volume and time-sensitive, it is synchronized in full-plus-incremental mode; other metadata, which is updated less often, is synchronized in full once a week. A dt field is added to distinguish and identify different systems; when a data synchronization task is created, the system abbreviation is written into the dt field.
Recursive formula: $a_{n+k} = m_1 a_{n+k-1} + m_2 a_{n+k-2} + \dots + m_k a_n$
(1) Data resource catalog classification
To address the high barrier to use, hard-to-understand data, and hard-to-obtain services that affect data resources, the invention builds a classification system with a source layer, a shared layer, and an analysis layer, giving every specialty and every level a data resource sand table that is queryable and understandable. Starting from the inventoried system function menus, and following the principles of completeness, independence, and irreplaceability, a data resource catalog classification is formed for each layer.
Source layer: positioned close to the source business systems. Its table structures are kept essentially the same as those of the source business systems. It mainly stores the existing (stock) and incremental data of the source business systems, and its data scope is a subset of those systems. Data resources are organized by specialty, system, first-level business function, and second-level business function, which keeps the catalog close to the business systems.
Shared layer: stores the SG-CIM model tables and the standard tables not covered by the model. Data from the source business systems is held under unified naming conventions and definitions to support upper-layer applications and services. The data resource classification covers two parts, SG-CIM model table data and standard table data. SG-CIM model resources are classified by the first-level and second-level subject domains of the SG-CIM model; standard table resources are classified by system, first-level category, and second-level category, and the system a data table belongs to is attached as a label.
Analysis layer: stores the result tables produced by processing shared-layer data according to business logic, forming analytical services such as data sets, labels, and indicators. Resources are classified by analysis topic, first-level topic, and second-level topic, and each data table's source system, owning department, and using department are attached as labels.
(2) Data resource table review
The business information recorded for a data table includes, but is not limited to, the following:
Business description: must not simply repeat the Chinese name and must state the business meaning of the table. If the table stores business entity information, briefly explain the business entity and related terms; if it stores document information, briefly describe the relevant steps of the document workflow; if it stores report information, briefly describe the purpose of the report. The description is maintained by business personnel. For example, the business description of the C_CONS table is: an organization or individual that has lawfully established a power supply and consumption relationship with a power supply enterprise is called an electricity customer, or user for short, and different electricity service addresses are treated as different users. Records are created by business processes such as new installation, capacity increase, and archiving of service changes, and they include electricity attributes such as user number, user name, service address, electricity category, supply voltage, load nature, and contract capacity.
Figure FDA0002694157290000031
(3) Data resource field sorting
The business information recorded for a data table field includes, but is not limited to, the following:
Field description: States the business meaning of the field, describing in detail the business meaning, purpose, and associated business activity of the data item. It is maintained by business personnel. For example, the business description of the field NOTE_TYPE_CODE is "the type of bill printed by default for an electricity customer's monthly electricity charge, following the State Grid enterprise-level marketing management code set; mainly ordinary invoices, value-added tax invoices, receipts, etc.".
(4) Data resource labels
To help users understand data resources better, data tables are labelled using the data resource management function. The labels include, but are not limited to, the following:
Discipline: the name of the business discipline to which the data table belongs;
Business department: the name of the business department to which the data table belongs, for example "Business department: Finance Department";
Source business system: the name and abbreviation of the business system from which the table's data originates; if several source systems are involved, all of them must be listed, separated by semicolons, for example "Source system: [ECP]";
Source data table name: the Chinese and English names of the data table from which the table's data originates; if several source tables are involved, all of them must be listed, separated by semicolons, for example "Source data table name: DATA_ID_MAP_INFO/数据标识映射表" (data identification mapping table);
Negative list data: if the table or one of its fields is included in the negative list, the label "negative list data" must be added;
Data table type label: according to the type of the data table, add a label such as detailed business data, coding data, indicator data, or report data, for example "Data table type label: indicator data";
Supported application scenario label: for source layer and shared layer data tables that support application construction, add application scenario labels. Multiple scenario labels are allowed, and each scenario name must match the scenario name used in the analysis layer data resource classification, for example "real-time grasp of enterprise resources" and "distribution of industry sector units by industry";
Custom labels: each unit may define its own labels according to specific needs, business scenarios, and so on.
(5) Data resource classification and table mapping
Based on the results of the data table sorting, establish the mapping between data tables and information systems, subject areas, and application themes.
Mapping data categories to business tables: taking the last-level data catalog as the basis, sort out the correspondence between data categories and data tables, and attach each business table to a last-level (leaf) data category. The resulting correspondence between data categories and data tables must be complete and accurate. Details are as follows:
Source layer data resource classification and table mapping
Figure FDA0002694157290000051
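The requirement that every business table be attached to a last-level category, completely and accurately, can be illustrated with a small consistency check. This is a sketch under stated assumptions: the nested-dictionary catalog, the category names, the tables `T_BILL_DEMO` and `T_ORPHAN_DEMO`, and the function names are hypothetical; only `C_CONS` comes from the text.

```python
def leaf_categories(catalog: dict) -> set:
    """Collect the slash-joined paths of all last-level categories."""
    leaves = set()

    def walk(node: dict, path: tuple) -> None:
        for name, children in node.items():
            current = path + (name,)
            if children:
                walk(children, current)           # descend into sub-categories
            else:
                leaves.add("/".join(current))     # no children: a leaf category

    walk(catalog, ())
    return leaves


def check_mapping(catalog: dict, tables: list, table_to_category: dict) -> dict:
    """Report tables with no category and tables attached above leaf level."""
    leaves = leaf_categories(catalog)
    return {
        "unmapped": sorted(t for t in tables if t not in table_to_category),
        "not_leaf": sorted(
            t for t, c in table_to_category.items() if c not in leaves
        ),
    }


# Hypothetical catalog and mapping, for illustration only.
catalog = {"Marketing": {"Customer": {}, "Billing": {}}}
tables = ["C_CONS", "T_BILL_DEMO", "T_ORPHAN_DEMO"]
table_to_category = {
    "C_CONS": "Marketing/Customer",
    "T_BILL_DEMO": "Marketing",  # attached to a non-leaf category
}

report = check_mapping(catalog, tables, table_to_category)
print(report)
```

An empty `unmapped` list corresponds to the completeness requirement, and an empty `not_leaf` list to the requirement that tables hang only on last-level categories.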
Shared layer data resource classification and table mapping
Figure FDA0002694157290000052
Analysis layer data resource sorting standards and specifications
Figure FDA0002694157290000053
Figure FDA0002694157290000061
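The labelling conventions in (4), in particular the semicolon-separated lists of source systems and source tables and the "English name/Chinese name" form of a source table entry, can be sketched as a small parser. The dictionary keys, the function names, the second source system `ERP_DEMO`, and the "yes"/"no" encoding of the negative list flag are assumptions made for this example; the source does not prescribe a storage format.

```python
def split_multi(value: str) -> list:
    """Split a semicolon-separated label value, accepting ';' and full-width '；'."""
    normalized = value.replace("；", ";")
    return [part.strip() for part in normalized.split(";") if part.strip()]


def parse_source_table(entry: str) -> tuple:
    """Split 'ENGLISH_NAME/Chinese name' into an (english, chinese) pair."""
    english, _, chinese = entry.partition("/")
    return english.strip(), chinese.strip()


def build_labels(raw: dict) -> dict:
    """Normalize raw label strings into structured values (keys are hypothetical)."""
    return {
        "department": raw.get("department", "").strip(),
        "source_systems": split_multi(raw.get("source_systems", "")),
        "source_tables": [
            parse_source_table(entry)
            for entry in split_multi(raw.get("source_tables", ""))
        ],
        "table_type": raw.get("table_type", "").strip(),
        "scenarios": split_multi(raw.get("scenarios", "")),
        # Assumed encoding: the flag is stored as the string "yes" or "no".
        "negative_list": raw.get("negative_list", "").strip().lower() == "yes",
    }


raw = {
    "department": "Finance Department",
    "source_systems": "ECP; ERP_DEMO",                  # ERP_DEMO is made up
    "source_tables": "DATA_ID_MAP_INFO/数据标识映射表",  # example from the text
    "table_type": "indicator data",
    "negative_list": "no",
}

labels = build_labels(raw)
print(labels["source_systems"])
print(labels["source_tables"])
```

Keeping multi-valued labels as lists rather than raw strings makes it straightforward to compare scenario names against those used in the analysis layer classification.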
CN202011000586.3A 2020-09-22 2020-09-22 A sand table-based data resource management method Active CN112100181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011000586.3A CN112100181B (en) 2020-09-22 2020-09-22 A sand table-based data resource management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011000586.3A CN112100181B (en) 2020-09-22 2020-09-22 A sand table-based data resource management method

Publications (2)

Publication Number Publication Date
CN112100181A true CN112100181A (en) 2020-12-18
CN112100181B CN112100181B (en) 2024-06-11

Family

ID=73755743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011000586.3A Active CN112100181B (en) 2020-09-22 2020-09-22 A sand table-based data resource management method

Country Status (1)

Country Link
CN (1) CN112100181B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712299A (en) * 2021-01-21 2021-04-27 网思科技股份有限公司 Resource management method, system, storage medium and electronic device
CN112989132A (en) * 2021-03-29 2021-06-18 国网宁夏电力有限公司电力科学研究院 Data directory establishing method for enterprise data inventory
CN114255073A (en) * 2021-12-10 2022-03-29 国网江西省电力有限公司信息通信分公司 Marketing census method based on data center station
CN114443779A (en) * 2021-12-20 2022-05-06 航天科工网络信息发展有限公司 Data resource management method and system based on data directory
CN115952160A (en) * 2023-01-10 2023-04-11 数据易(北京)信息技术有限公司 Data checking method
CN119671687A (en) * 2025-02-21 2025-03-21 浙江飞猪网络技术有限公司 A resource publishing method and device based on LLM model

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060129415A1 (en) * 2004-12-13 2006-06-15 Rohit Thukral System for linking financial asset records with networked assets
US20090327230A1 (en) * 2008-06-27 2009-12-31 Microsoft Corporation Structured and unstructured data models
US20110158405A1 (en) * 2009-12-31 2011-06-30 The Industry & Academy Cooperation in Chungnam National University (IAC) Key management method for scada system
CN103379136A (en) * 2012-04-17 2013-10-30 中国移动通信集团公司 Compression method and decompression method of log acquisition data, compression apparatus and decompression apparatus of log acquisition data
CN105183911A (en) * 2015-10-12 2015-12-23 国家电网公司 Data source binary tree based source tracing method for abnormal data of power system
CN105893593A (en) * 2016-04-18 2016-08-24 国网山东省电力公司信息通信公司 Data fusion method
CN108280562A (en) * 2017-12-06 2018-07-13 国网浙江省电力有限公司 A kind of method of specification electric power enterprise data resource
CN109376188A (en) * 2018-09-13 2019-02-22 智恒科技股份有限公司 A kind of wisdom water utilities big data fusion method and system based on subject area
CN109947863A (en) * 2017-12-17 2019-06-28 隽会云 Traffic statistics analysis system for mobile network
CN111080261A (en) * 2019-12-19 2020-04-28 国网安徽省电力有限公司信息通信分公司 Visual data asset management system based on big data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
P. BURAI et al.: "Individual Tree Species Classification Using Airborne Hyperspectral Imagery And Lidar Data", 《WORKSHOP ON HYPERSPECTRAL IMAGING AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS)》, vol. 2019, no. 10, pages 1 - 4, XP033646802, DOI: 10.1109/WHISPERS.2019.8921016 *
徐涛: "结构化大数据存储与查询优化关键技术", 《中国博士学位论文全文数据库 信息科技辑》, vol. 2018, no. 5, pages 138 - 5 *
李志等: "基于数据中台的电力企业数据资产管理方法研究", 《电力信息与通信技术》, vol. 18, no. 7, pages 76 - 81 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712299A (en) * 2021-01-21 2021-04-27 网思科技股份有限公司 Resource management method, system, storage medium and electronic device
CN112712299B (en) * 2021-01-21 2023-11-24 网思科技股份有限公司 Resource management method, system, storage medium and electronic device
CN112989132A (en) * 2021-03-29 2021-06-18 国网宁夏电力有限公司电力科学研究院 Data directory establishing method for enterprise data inventory
CN114255073A (en) * 2021-12-10 2022-03-29 国网江西省电力有限公司信息通信分公司 Marketing census method based on data center station
CN114255073B (en) * 2021-12-10 2025-05-23 国网江西省电力有限公司信息通信分公司 Marketing general investigation method based on data center
CN114443779A (en) * 2021-12-20 2022-05-06 航天科工网络信息发展有限公司 Data resource management method and system based on data directory
CN115952160A (en) * 2023-01-10 2023-04-11 数据易(北京)信息技术有限公司 Data checking method
CN115952160B (en) * 2023-01-10 2024-04-26 数据易(北京)信息技术有限公司 Data checking method
CN119671687A (en) * 2025-02-21 2025-03-21 浙江飞猪网络技术有限公司 A resource publishing method and device based on LLM model

Also Published As

Publication number Publication date
CN112100181B (en) 2024-06-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant