WO2024027071A1 - A data monitoring method and system - Google Patents

A data monitoring method and system

Info

Publication number
WO2024027071A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
classification
model
fused
monitoring
Prior art date
Application number
PCT/CN2022/138505
Other languages
English (en)
French (fr)
Inventor
李庆
韩国权
肖益
黄海峰
吕灏
陈轮
曹扬
支婷
Original Assignee
中电科大数据研究院有限公司
太极计算机股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中电科大数据研究院有限公司 and 太极计算机股份有限公司
Publication of WO2024027071A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition

Definitions

  • the present invention relates to the field of computer technology, and in particular, to a data monitoring method and system.
  • Embodiments of the present invention provide a data monitoring method and system to solve the problem that data is difficult to govern during the data classification, grading and quality control work of current data governance platforms.
  • an embodiment of the present invention provides a data monitoring method, including:
  • the data monitoring model is obtained based on classifying and grading the sample fusion data of the current data governance platform and then generating a knowledge graph and quality knowledge base.
  • the data monitoring model includes a data analysis model and a quality audit model
  • the step of inputting the fusion data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring results output by the data monitoring model includes:
  • the classification and grading data of the fused data are input into the quality audit model, and the data monitoring results are output based on the quality knowledge base.
  • the data analysis model includes a data feature extraction model, a data classification model and a data grading model
  • the knowledge graph includes a classification graph and a grading graph
  • the classification data is input into the data grading model, and the classification and grading data of the fused data is output based on the grading graph.
  • inputting the fused data and the corresponding data feature values into the data classification model, and outputting the classification data based on the classification graph, includes:
  • based on the classification graph, the weighted fused data is subjected to weight-value index clustering to obtain the classification data.
  • inputting the classification data into the data grading model, and outputting the classification and grading data of the fused data based on the grading graph, includes:
  • the grading values corresponding to the classification data are partitioned based on the grading graph to obtain the classification and grading data of the fused data.
  • inputting the classification and grading data of the fused data into the quality audit model, and outputting the data monitoring results based on the quality knowledge base, includes:
  • based on the quality knowledge base, the classification and grading data of the fused data is retrieved, screened, deleted or retained to obtain the data monitoring results.
  • the data monitoring model is obtained based on classifying and grading the sample fusion data of the current data governance platform to generate a knowledge graph and quality knowledge base, including:
  • embodiments of the present invention provide a data monitoring system, including:
  • the data determination unit is used to determine the fused data of the current data governance platform to be monitored
  • a data monitoring unit configured to input the fusion data of the current data governance platform to be monitored into the data monitoring model, and obtain the data monitoring results output by the data monitoring model;
  • the data monitoring model is obtained based on classifying and grading the sample fusion data of the current data governance platform to generate a knowledge graph and a quality knowledge base.
  • embodiments of the present invention provide an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • when the processor executes the program, the steps of any data monitoring method provided in the first aspect above are implemented.
  • embodiments of the present invention provide a non-transitory computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the steps of any data monitoring method provided in the first aspect above are implemented.
  • in the data monitoring method and system provided by embodiments of the present invention, the method obtains the data monitoring results output by the data monitoring model by inputting the fused data of the current data governance platform to be monitored into the data monitoring model, wherein the data monitoring model is obtained by classifying and grading the sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
  • the invention effectively solves the problem that data is difficult to govern during the data classification, grading and quality control work of current data governance platforms.
  • Figure 1 is a schematic flow chart of a data monitoring method provided by the present invention
  • FIG. 2 is a block diagram of the data monitoring model provided by the present invention.
  • Figure 3 is a block diagram of the data analysis model provided by the present invention.
  • Figure 4 is a schematic structural diagram of the knowledge graph provided by the present invention.
  • Figure 5 is a schematic structural diagram of a data monitoring system provided by the present invention.
  • Figure 6 is a schematic structural diagram of the electronic device provided by the present invention.
  • Figure 1 is a schematic flow chart of a data monitoring method provided by an embodiment of the present invention. As shown in Figure 1, the method includes:
  • Step 110: determine the fused data of the current data governance platform to be monitored;
  • Step 120: input the fused data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring results output by the data monitoring model;
  • the data monitoring model is obtained based on classifying and grading the sample fusion data of the current data governance platform to generate a knowledge graph and a quality knowledge base.
  • the present invention first obtains a data monitoring model through classification and grading processing of sample fusion data, and then inputs the fusion data of the current data governance platform into the data monitoring model to obtain the data monitoring results of the current data governance platform.
  • the method provided by the embodiment of the present invention effectively improves the data governance level of the current data governance platform by inputting the fused data of the current data governance platform into the data monitoring model obtained after classification and grading processing.
  • the data monitoring model 200 includes a data analysis model 210 and a quality audit model 220;
  • the step of inputting the fusion data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring results output by the data monitoring model includes:
  • the classification and grading data of the fused data are input into the quality audit model, and the data monitoring results are output based on the quality knowledge base.
  • the data monitoring model is implemented through a data analysis model and a quality audit model.
  • the data analysis model sequentially classifies and grades the fused data to be monitored based on a knowledge graph that includes a classification graph and a grading graph.
  • the quality audit model then performs quality monitoring on the classified and graded fused data based on the quality knowledge base to obtain the monitoring results.
  • the data analysis model 300 includes a data feature extraction model 310, a data classification model 320 and a data grading model 330;
  • the knowledge graph 400 includes a classification graph 410 and a grading graph 420;
  • the classification data is input to the data grading model 330, and the classification and grading data of the fused data is output based on the grading graph 420.
  • the data analysis model is implemented through a data feature extraction model, a data classification model and a data grading model.
  • the data feature extraction model calculates the data feature values of the fused data to be monitored and uses them as the weight values of the fused data; both are input to the data classification model, which is based on the classification graph and outputs the classification data after weight-value indexing and clustering of the weighted fused data in the classification graph.
  • the data grading model grades each category of the classification data separately based on the grading graph to obtain the classification and grading data of the fused data.
  • the fused data and the corresponding data feature values are input to the data classification model 320, and the classification data is output based on the classification graph 410, including:
  • based on the classification graph, the weighted fused data is subjected to weight-value index clustering to obtain the classification data.
  • the data classification model classifies the fused data based on the classification graph; that is, it uses the data feature values corresponding to the fused data as the weight values of the fused data, weights the fused data, and then indexes by weight value in the classification graph to obtain the classification data.
  • the classification data is input to the data grading model 330, and the classification and grading data of the fused data is output based on the grading graph 420, including:
  • the grading values corresponding to the classification data are partitioned based on the grading graph to obtain the classification and grading data of the fused data.
  • the data grading model grades the classification data according to the grading scheme of the grading graph and the grading value corresponding to the classification data, wherein the grading value is obtained by multiplying the weight value corresponding to the classification data by a preset grading coefficient.
  • based on the quality knowledge base, the classification and grading data of the fused data is retrieved, screened, deleted or retained to obtain the data monitoring results.
  • the quality audit model retrieves, screens, deletes or retains the classification and grading data of the fused data based on the quality knowledge base, wherein the quality knowledge base can be iteratively updated based on quality experience.
  • the data monitoring model is obtained by classifying and grading the sample fusion data of the current data governance platform to generate a knowledge graph and quality knowledge base, including:
  • the knowledge graph and the quality knowledge base in the data monitoring model are both obtained by computing over the sample fused data using the Nebula, Spark or Flink mode of an existing knowledge computing engine.
  • a data monitoring system provided by the present invention is described below.
  • the data monitoring system described below and the data monitoring method described above may be referred to in correspondence with each other.
  • Figure 5 is a schematic structural diagram of a data monitoring system provided by an embodiment of the present invention. As shown in Figure 5, the system includes a data determination unit 510 and a data monitoring unit 520;
  • the data determination unit 510 is used to determine the fused data of the current data governance platform to be monitored
  • the data monitoring unit 520 is used to input the fusion data of the current data governance platform to be monitored into the data monitoring model, and obtain the data monitoring results output by the data monitoring model;
  • the data monitoring model is obtained based on classifying and grading the sample fusion data of the current data governance platform to generate a knowledge graph and a quality knowledge base.
  • the system provided by the embodiments of the present invention effectively improves the data governance level of the current data governance platform by inputting the fused data of the current data governance platform into the data monitoring model obtained after classification and grading processing.
  • the data monitoring unit includes a data analysis module and a quality audit module
  • the data monitoring model includes a data analysis model and a quality audit model
  • the data analysis module is used to input the fusion data of the current data governance platform to be monitored into the data analysis model, and output the classification and grading data of the fusion data based on the knowledge graph;
  • the quality audit module is used to input the classification and grading data of the fusion data into the quality audit model, and output the data monitoring results based on the quality knowledge base.
  • the data analysis module includes a feature extraction module, a data classification module and a data grading module;
  • the data analysis model includes a data feature extraction model, a data classification model and a data grading model;
  • the feature extraction module is used to input the fused data of the current data governance platform to be monitored into the data feature extraction model, and output the data feature values corresponding to the fused data;
  • the data classification module is used to input the fused data and the corresponding data feature values to the data classification model, and output the classification data based on the classification graph;
  • the data grading module is used to input the classification data into the data grading model, and output the classification and grading data of the fused data based on the grading graph.
  • the data classification module is specifically used to:
  • based on the classification graph, subject the weighted fused data to weight-value index clustering to obtain the classification data.
  • the data grading module is specifically used to:
  • partition the grading values corresponding to the classification data based on the grading graph to obtain the classification and grading data of the fused data.
  • the quality audit module is specifically used to:
  • based on the quality knowledge base, retrieve, screen, delete or retain the classification and grading data of the fused data to obtain the data monitoring results.
  • the data monitoring model is obtained by classifying and grading the sample fusion data of the current data governance platform to generate a knowledge graph and quality knowledge base, including:
  • embodiments of the present invention obtain the data monitoring results output by the data monitoring model by inputting the fused data of the current data governance platform into the data monitoring model, wherein the data monitoring model is obtained by classifying and grading the sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
  • the invention effectively solves the problem that data is difficult to govern during the data classification, grading and quality control work of current data governance platforms.
  • Figure 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
  • the electronic device may include a processor 610, a communications interface 620, a memory 630 and a communication bus 640, wherein the processor 610, the communications interface 620 and the memory 630 communicate with each other through the communication bus 640.
  • the processor 610 can call logical instructions in the memory 630 to execute a data monitoring method, which includes: determining the fused data of the current data governance platform to be monitored; inputting the fused data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring results output by the data monitoring model; wherein the data monitoring model is obtained by classifying and grading the sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
  • the above-mentioned logical instructions in the memory 630 can be implemented in the form of software functional units and can be stored in a computer-readable storage medium when sold or used as an independent product.
  • the technical solution of the present invention essentially or the part that contributes to the existing technology or the part of the technical solution can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
  • embodiments of the present invention also provide a computer program product.
  • the computer program product includes a computer program stored on a non-transitory computer-readable storage medium.
  • the computer program includes program instructions; when the program instructions are executed by a computer, the computer can execute the data monitoring method provided by the above method embodiments.
  • the method includes: determining the fused data of the current data governance platform to be monitored; inputting the fused data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring results output by the data monitoring model; wherein the data monitoring model is obtained by classifying and grading the sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
  • embodiments of the present invention also provide a non-transitory computer-readable storage medium on which a computer program is stored.
  • when executed by a processor, the computer program implements the data monitoring method provided by the above embodiments.
  • the method includes: determining the fused data of the current data governance platform to be monitored; inputting the fused data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring results output by the data monitoring model; wherein the data monitoring model is obtained by classifying and grading the sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
  • the device embodiments described above are only illustrative.
  • the units described as separate components may or may not be physically separated.
  • the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without creative effort.
  • each embodiment can be implemented by software plus a necessary general hardware platform, and of course, it can also be implemented by hardware.
  • the computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., including a number of instructions to cause a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods described in various embodiments or certain parts of the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

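Embodiments of the present invention provide a data monitoring method and system. The method includes: determining the fused data of the current data governance platform to be monitored; and inputting the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model, wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base. The invention effectively solves the problem that data is difficult to govern during the data classification, grading, and quality control work of current data governance platforms.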

Description

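A data monitoring method and system
Technical Field
The present invention relates to the field of computer technology, and in particular to a data monitoring method and system.
Background
With the development of computer technology, government data has become complex and highly varied. To share and exchange data effectively and in compliance with regulations, government data governance requires a large amount of data classification and grading work. The data classification, grading, and data quality management functions provided by traditional data governance platforms require extensive expert knowledge and have a high barrier to use. As a result, once a data governance project moves into routine operation in its later stages, ordinary maintenance staff are not equipped to handle it, which ultimately makes the project difficult to conclude or even brings it to a halt. A data monitoring method and system is therefore urgently needed to solve the problem that data is difficult to govern during the data classification, grading, and quality control work of current data governance platforms.
Summary of the Invention
Embodiments of the present invention provide a data monitoring method and system to solve the problem that data is difficult to govern during the data classification, grading, and quality control work of current data governance platforms.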
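In a first aspect, an embodiment of the present invention provides a data monitoring method, including:
determining the fused data of the current data governance platform to be monitored;
inputting the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model;
wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
Preferably, the data monitoring model includes a data analysis model and a quality audit model;
inputting the fused data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring result output by the data monitoring model includes:
inputting the fused data of the current data governance platform to be monitored into the data analysis model, and outputting the classification and grading data of the fused data based on the knowledge graph;
inputting the classification and grading data of the fused data into the quality audit model, and outputting the data monitoring result based on the quality knowledge base.
Preferably, the data analysis model includes a data feature extraction model, a data classification model, and a data grading model; the knowledge graph includes a classification graph and a grading graph;
inputting the fused data of the current data governance platform to be monitored into the data analysis model, and outputting the classification and grading data of the fused data based on the knowledge graph, includes:
inputting the fused data of the current data governance platform to be monitored into the data feature extraction model, and outputting the data feature values corresponding to the fused data;
inputting the fused data and the corresponding data feature values into the data classification model, and outputting classification data based on the classification graph;
inputting the classification data into the data grading model, and outputting the classification and grading data of the fused data based on the grading graph.
Preferably, inputting the fused data and the corresponding data feature values into the data classification model, and outputting classification data based on the classification graph, includes:
weighting the fused data using the data feature values corresponding to the fused data as weights, to obtain weighted fused data;
performing weight-value index clustering on the weighted fused data based on the classification graph, to obtain the classification data.
Preferably, inputting the classification data into the data grading model, and outputting the classification and grading data of the fused data based on the grading graph, includes:
multiplying the weight value corresponding to the classification data by a preset grading coefficient to obtain the grading value corresponding to the classification data;
partitioning the grading values corresponding to the classification data based on the grading graph, to obtain the classification and grading data of the fused data.
Preferably, inputting the classification and grading data of the fused data into the quality audit model, and outputting the data monitoring result based on the quality knowledge base, includes:
performing retrieval, screening, deletion, or retention operations on the classification and grading data of the fused data based on the quality knowledge base, to obtain the data monitoring result.
Preferably, the data monitoring model being obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base includes:
acquiring the sample fused data of the current data governance platform;
performing, by means of a knowledge computing engine, graph computation with Nebula, offline computation with Spark, or real-time computation with Flink on the sample fused data, to generate the knowledge graph and the quality knowledge base.
In a second aspect, an embodiment of the present invention provides a data monitoring system, including:
a data determination unit, configured to determine the fused data of the current data governance platform to be monitored;
a data monitoring unit, configured to input the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model;
wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any data monitoring method provided in the first aspect above.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any data monitoring method provided in the first aspect above.
In the data monitoring method and system provided by embodiments of the present invention, the method inputs the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model, wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base. The invention effectively solves the problem that data is difficult to govern during the data classification, grading, and quality control work of current data governance platforms.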
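Brief Description of the Drawings
To explain the technical solutions of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a schematic flow chart of a data monitoring method provided by the present invention;
Figure 2 is a block diagram of the data monitoring model provided by the present invention;
Figure 3 is a block diagram of the data analysis model provided by the present invention;
Figure 4 is a schematic structural diagram of the knowledge graph provided by the present invention;
Figure 5 is a schematic structural diagram of a data monitoring system provided by the present invention;
Figure 6 is a schematic structural diagram of the electronic device provided by the present invention.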
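Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
A data monitoring method and system provided by the present invention are described below with reference to Figures 1 to 6.
An embodiment of the present invention provides a data monitoring method. Figure 1 is a schematic flow chart of the data monitoring method provided by an embodiment of the present invention. As shown in Figure 1, the method includes:
Step 110: determining the fused data of the current data governance platform to be monitored;
Step 120: inputting the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model;
wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
Specifically, the present invention first obtains the data monitoring model by classifying and grading the sample fused data, and then inputs the fused data of the current data governance platform into the data monitoring model to obtain the data monitoring result of the current data governance platform.
Compared with the prior art, the method provided by the embodiment of the present invention effectively improves the data governance level of the current data governance platform by inputting the fused data of the current data governance platform into the data monitoring model obtained after classification and grading.
Based on any of the above embodiments, as shown in Figure 2, the data monitoring model 200 includes a data analysis model 210 and a quality audit model 220;
inputting the fused data of the current data governance platform to be monitored into the data monitoring model to obtain the data monitoring result output by the data monitoring model includes:
inputting the fused data of the current data governance platform to be monitored into the data analysis model, and outputting the classification and grading data of the fused data based on the knowledge graph;
inputting the classification and grading data of the fused data into the quality audit model, and outputting the data monitoring result based on the quality knowledge base.
Specifically, the data monitoring model is implemented through the data analysis model and the quality audit model. The data analysis model classifies and then grades the fused data to be monitored based on a knowledge graph comprising a classification graph and a grading graph; the quality audit model performs quality monitoring on the classified and graded fused data based on the quality knowledge base to obtain the monitoring result.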
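The two-stage structure described above (a data analysis model feeding a quality audit model) can be sketched as a simple function composition. This is only an illustrative sketch: `GradedRecord`, `run_monitoring`, `toy_analyze`, and `toy_audit` are hypothetical names that do not appear in the patent, and the toy stand-ins use arbitrary rules in place of the knowledge graph and the quality knowledge base.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class GradedRecord:
    """One fused-data record after classification and grading (hypothetical shape)."""
    record: dict
    category: str
    grade: int


def run_monitoring(
    fused_data: Iterable[dict],
    analyze: Callable[[Iterable[dict]], List[GradedRecord]],
    audit: Callable[[List[GradedRecord]], List[GradedRecord]],
) -> List[GradedRecord]:
    """Stage 1 (data analysis model) feeds stage 2 (quality audit model)."""
    graded = analyze(fused_data)
    return audit(graded)


# Hypothetical stand-ins for the two models, for illustration only.
def toy_analyze(rows: Iterable[dict]) -> List[GradedRecord]:
    # Assumed rule: the category comes from a "kind" field, the grade from a "public" flag.
    return [
        GradedRecord(r, r.get("kind", "unknown"), 1 if r.get("public") else 3)
        for r in rows
    ]


def toy_audit(graded: List[GradedRecord]) -> List[GradedRecord]:
    # Assumed rule: keep only records whose category is known.
    return [g for g in graded if g.category != "unknown"]


result = run_monitoring(
    [{"kind": "tax", "public": False}, {"public": True}],
    toy_analyze,
    toy_audit,
)
```

Passing the two stages in as callables keeps the pipeline independent of how each model is actually built, which matches the description's separation of the knowledge graph (used in analysis) from the quality knowledge base (used in auditing).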
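Based on any of the above embodiments, as shown in Figure 3, the data analysis model 300 includes a data feature extraction model 310, a data classification model 320, and a data grading model 330;
as shown in Figure 4, the knowledge graph 400 includes a classification graph 410 and a grading graph 420;
inputting the fused data of the current data governance platform to be monitored into the data analysis model 300, and outputting the classification and grading data of the fused data based on the knowledge graph 400, includes:
inputting the fused data of the current data governance platform to be monitored into the data feature extraction model 310, and outputting the data feature values corresponding to the fused data;
inputting the fused data and the corresponding data feature values into the data classification model 320, and outputting classification data based on the classification graph 410;
inputting the classification data into the data grading model 330, and outputting the classification and grading data of the fused data based on the grading graph 420.
Specifically, the data analysis model is implemented through the data feature extraction model, the data classification model, and the data grading model. The data feature extraction model calculates data feature values for the fused data to be monitored and uses them as the weight values of the fused data; both are input to the data classification model, which is based on the classification graph and outputs the classification data after weight-value indexing and clustering of the weighted fused data in the classification graph. The data grading model grades each category of the classification data separately based on the grading graph, thereby obtaining the classification and grading data of the fused data.
Based on any of the above embodiments, with reference to Figures 3 and 4, inputting the fused data and the corresponding data feature values into the data classification model 320, and outputting classification data based on the classification graph 410, includes:
weighting the fused data using the data feature values corresponding to the fused data as weights, to obtain weighted fused data;
performing weight-value index clustering on the weighted fused data based on the classification graph, to obtain the classification data.
Specifically, the data classification model classifies the fused data based on the classification graph; that is, it uses the data feature values corresponding to the fused data as the weight values of the fused data, weights the fused data, and then indexes by weight value in the classification graph to obtain the classification data.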
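The description does not define the weighting or the "weight-value index clustering" in detail, so the following is only one possible reading, sketched under stated assumptions: the classification graph is modeled as a hypothetical mapping from each category to a half-open weight interval, each record's feature value is attached as its weight, and records are grouped under the category whose interval contains that weight. The names `classify` and `ClassificationGraph` and the interval representation are illustrative and not taken from the patent.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical classification graph: category -> half-open weight interval [low, high).
ClassificationGraph = Dict[str, Tuple[float, float]]


def classify(
    records: Sequence[Any],
    feature_values: Sequence[float],
    graph: ClassificationGraph,
) -> Dict[str, List[dict]]:
    """Attach each record's feature value as its weight, then group records
    under the category whose weight interval contains that weight."""
    # Weighting step: pair every record with its feature value.
    weighted = [{"record": r, "weight": w} for r, w in zip(records, feature_values)]

    # Indexing/clustering step: look up the matching interval in the graph.
    clusters: Dict[str, List[dict]] = defaultdict(list)
    for item in weighted:
        for category, (low, high) in graph.items():
            if low <= item["weight"] < high:
                clusters[category].append(item)
                break
        else:
            # No interval matched; keep the record visible instead of dropping it.
            clusters["unclassified"].append(item)
    return dict(clusters)


graph = {"personal": (0.0, 0.5), "financial": (0.5, 1.0)}
classified = classify(["name", "account", "note"], [0.2, 0.7, 1.4], graph)
```

Records that fall outside every interval are collected under an assumed `"unclassified"` key so that gaps in the graph surface during later auditing.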
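Based on any of the above embodiments, with reference to Figures 3 and 4, inputting the classification data into the data grading model 330, and outputting the classification and grading data of the fused data based on the grading graph 420, includes:
multiplying the weight value corresponding to the classification data by a preset grading coefficient to obtain the grading value corresponding to the classification data;
partitioning the grading values corresponding to the classification data based on the grading graph, to obtain the classification and grading data of the fused data.
Specifically, the data grading model grades the classification data according to the grading scheme of the grading graph and the grading value corresponding to the classification data, wherein the grading value corresponding to the classification data is obtained by multiplying the weight value corresponding to the classification data by the preset grading coefficient.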
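The grading step is stated concretely: grading value = weight value × preset grading coefficient, followed by partitioning on the grading graph. A minimal sketch follows, assuming the grading graph can be reduced to a sorted list of boundary values with one level label per resulting interval. The function name `grade`, the boundaries, the coefficient, and the level labels are all hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def grade(
    weight: float,
    coefficient: float,
    boundaries: Sequence[float],
    levels: Sequence[str],
) -> str:
    """Compute grading value = weight * coefficient, then map it to a level.

    `boundaries` must be sorted ascending, and `levels` must contain exactly
    one more entry than `boundaries` (one label per interval).
    """
    if len(levels) != len(boundaries) + 1:
        raise ValueError("levels must have exactly one more entry than boundaries")
    grading_value = weight * coefficient
    # bisect_right returns how many boundaries are <= grading_value,
    # which is the index of the interval the value falls into.
    return levels[bisect_right(boundaries, grading_value)]


# Assumed grading graph: values below 2.0 are L1, values in [2.0, 5.0) are L2, the rest L3.
boundaries = [2.0, 5.0]
levels = ["L1", "L2", "L3"]
grades = [grade(w, 10.0, boundaries, levels) for w in (0.125, 0.25, 0.75)]
```

With the assumed coefficient of 10.0, the three weights give grading values 1.25, 2.5, and 7.5, which fall into the first, second, and third intervals respectively.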
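Based on any of the above embodiments, inputting the classification and grading data of the fused data into the quality audit model, and outputting the data monitoring result based on the quality knowledge base, includes:
performing retrieval, screening, deletion, or retention operations on the classification and grading data of the fused data based on the quality knowledge base, to obtain the data monitoring result.
Specifically, the quality audit model retrieves, screens, deletes, or retains the classification and grading data of the fused data based on the quality knowledge base, wherein the quality knowledge base can be iteratively updated according to quality experience.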
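The audit step can be sketched as rule-based screening. The patent does not specify how the quality knowledge base is represented, so this sketch assumes it is a dictionary of named predicate rules: a record that passes every rule is retained, and a record that fails any rule is marked for deletion together with the names of the failed rules. The function `audit`, the rule names, and the record fields are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical quality knowledge base: rule name -> predicate a record must satisfy.
QualityKnowledgeBase = Dict[str, Callable[[dict], bool]]


def audit(graded: List[dict], knowledge_base: QualityKnowledgeBase) -> Dict[str, List[dict]]:
    """Screen each classified-and-graded record against every rule, then
    retain records that pass all rules and mark the rest for deletion."""
    report: Dict[str, List[dict]] = {"retained": [], "deleted": []}
    for item in graded:
        # Retrieval/screening: collect the names of all rules this record fails.
        failed = [name for name, rule in knowledge_base.items() if not rule(item)]
        target = "deleted" if failed else "retained"
        report[target].append({**item, "failed_rules": failed})
    return report


kb: QualityKnowledgeBase = {
    "has_value": lambda it: it["value"] not in (None, ""),
    "grade_known": lambda it: it["grade"] in {"L1", "L2", "L3"},
}
report = audit(
    [{"value": "42", "grade": "L2"}, {"value": "", "grade": "L9"}],
    kb,
)
```

Because the rules live in an ordinary dictionary, adding or replacing an entry is enough to model the iterative updating of the knowledge base mentioned above.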
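Based on any of the above embodiments, the data monitoring model being obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base includes:
acquiring the sample fused data of the current data governance platform;
performing, by means of a knowledge computing engine, graph computation with Nebula, offline computation with Spark, or real-time computation with Flink on the sample fused data, to generate the knowledge graph and the quality knowledge base.
Specifically, the knowledge graph and the quality knowledge base in the data monitoring model are both obtained by computing over the sample fused data using the Nebula, Spark, or Flink mode of an existing knowledge computing engine.
A data monitoring system provided by the present invention is described below. The data monitoring system described below and the data monitoring method described above may be referred to in correspondence with each other.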
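Figure 5 is a schematic structural diagram of the data monitoring system provided by an embodiment of the present invention. As shown in Figure 5, the system includes a data determination unit 510 and a data monitoring unit 520;
the data determination unit 510 is configured to determine the fused data of the current data governance platform to be monitored;
the data monitoring unit 520 is configured to input the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model;
wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
Compared with the prior art, the system provided by the embodiment of the present invention effectively improves the data governance level of the current data governance platform by inputting the fused data of the current data governance platform into the data monitoring model obtained after classification and grading.
Based on any of the above embodiments, the data monitoring unit includes a data analysis module and a quality audit module;
the data monitoring model includes a data analysis model and a quality audit model;
the data analysis module is configured to input the fused data of the current data governance platform to be monitored into the data analysis model, and output the classification and grading data of the fused data based on the knowledge graph;
the quality audit module is configured to input the classification and grading data of the fused data into the quality audit model, and output the data monitoring result based on the quality knowledge base.
Based on any of the above embodiments, the data analysis module includes a feature extraction module, a data classification module, and a data grading module;
the data analysis model includes a data feature extraction model, a data classification model, and a data grading model;
the feature extraction module is configured to input the fused data of the current data governance platform to be monitored into the data feature extraction model, and output the data feature values corresponding to the fused data;
the data classification module is configured to input the fused data and the corresponding data feature values into the data classification model, and output classification data based on the classification graph;
the data grading module is configured to input the classification data into the data grading model, and output the classification and grading data of the fused data based on the grading graph.
Based on any of the above embodiments, the data classification module is specifically configured to:
weight the fused data using the data feature values corresponding to the fused data as weights, to obtain weighted fused data;
perform weight-value index clustering on the weighted fused data based on the classification graph, to obtain the classification data.
Based on any of the above embodiments, the data grading module is specifically configured to:
multiply the weight value corresponding to the classification data by a preset grading coefficient to obtain the grading value corresponding to the classification data;
partition the grading values corresponding to the classification data based on the grading graph, to obtain the classification and grading data of the fused data.
Based on any of the above embodiments, the quality audit module is specifically configured to:
perform retrieval, screening, deletion, or retention operations on the classification and grading data of the fused data based on the quality knowledge base, to obtain the data monitoring result.
Based on any of the above embodiments, the data monitoring model being obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base includes:
acquiring the sample fused data of the current data governance platform;
performing, by means of a knowledge computing engine, graph computation with Nebula, offline computation with Spark, or real-time computation with Flink on the sample fused data, to generate the knowledge graph and the quality knowledge base.
In summary, embodiments of the present invention input the fused data of the current data governance platform into a data monitoring model to obtain the data monitoring result output by the data monitoring model, wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base. The invention effectively solves the problem that data is difficult to govern during the data classification, grading, and quality control work of current data governance platforms.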
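Figure 6 is a schematic structural diagram of the electronic device provided by an embodiment of the present invention. As shown in Figure 6, the electronic device may include a processor 610, a communications interface 620, a memory 630, and a communication bus 640, wherein the processor 610, the communications interface 620, and the memory 630 communicate with each other through the communication bus 640. The processor 610 can call logical instructions in the memory 630 to execute the data monitoring method, which includes: determining the fused data of the current data governance platform to be monitored; and inputting the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model, wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
In addition, when the logical instructions in the memory 630 are implemented in the form of software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
In another aspect, an embodiment of the present invention further provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions. When the program instructions are executed by a computer, the computer can execute the data monitoring method provided by the above method embodiments, which includes: determining the fused data of the current data governance platform to be monitored; and inputting the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model, wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
In yet another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the data monitoring method provided by the above embodiments, which includes: determining the fused data of the current data governance platform to be monitored; and inputting the fused data of the current data governance platform to be monitored into a data monitoring model to obtain the data monitoring result output by the data monitoring model, wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without creative effort.
From the description of the above implementations, a person skilled in the art can clearly understand that each implementation can be realized by software plus a necessary general-purpose hardware platform, or alternatively by hardware. Based on this understanding, the above technical solution in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.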

Claims (10)

  1. 一种数据监控方法,其特征在于,该方法包括:
    确定待监控的当前数据治理平台的融合数据;
    将所述待监控的当前数据治理平台的融合数据输入至数据监控模型中,得到所述数据监控模型输出的数据监控结果;
    其中,所述数据监控模型是基于对当前数据治理平台的样本融合数据进行分类分级后生成知识图谱和质量知识库后得到的。
  2. 根据权利要求1所述的数据监控方法,其特征在于,所述数据监控模型包括数据分析模型和质量稽核模型;
    所述将所述待监控的当前数据治理平台的融合数据输入至数据监控模型中,得到所述数据监控模型输出的数据监控结果,包括:
    将所述待监控的当前数据治理平台的融合数据输入至所述数据分析模型,基于所述知识图谱输出所述融合数据的分类分级数据;
    将所述融合数据的分类分级数据输入至所述质量稽核模型,基于所述质量知识库输出所述数据监控结果。
  3. 根据权利要求2所述的数据监控方法,其特征在于,所述数据分析模型包括数据特征提取模型、数据分类模型和数据分级模型;所述知识图谱包括分类图谱和分级图谱;
    将所述待监控的当前数据治理平台的融合数据输入至所述数据分析模型,基于所述知识图谱输出所述融合数据的分类分级数据,包括:
    将所述待监控的当前数据治理平台的融合数据输入至所述数据特征提取模型,输出所述融合数据对应的数据特征值;
    将所述融合数据和对应的数据特征值输入至所述数据分类模型,基于所述分类图谱输出分类数据;
    将所述分类数据输入至所述数据分级模型,基于所述分级图谱输出所述融合数据的分类分级数据。
  4. 根据权利要求3所述的数据监控方法,其特征在于,
    将所述融合数据和对应的数据特征值输入至所述数据分类模型,基于所述分类图谱输出分类数据,包括:
    将融合数据对应的数据特征值作为权重对所述融合数据进行加权处理,得到加权处理后的融合数据;
    基于所述分类图谱将所述加权处理后的融合数据进行权重值索引聚类,得到分类数据。
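  5. The data monitoring method according to claim 3, characterized in that
    the inputting of the classified data into the data grading model and outputting of the classified and graded data of the fused data based on the grading graph comprises:
    multiplying the weight value corresponding to the classified data by a preset grading coefficient to obtain a grading value corresponding to the classified data;
    partitioning the data according to the grading value corresponding to the classified data based on the grading graph, to obtain the classified and graded data of the fused data.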
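  6. The data monitoring method according to claim 2, characterized in that
    the inputting of the classified and graded data of the fused data into the quality audit model and outputting of the data monitoring result based on the quality knowledge base comprises:
    performing retrieval, screening, deletion, or retention operations on the classified and graded data of the fused data based on the quality knowledge base, to obtain the data monitoring result.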
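  7. The data monitoring method according to claim 1, characterized in that the data monitoring model being obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base comprises:
    obtaining the sample fused data of the current data governance platform;
    performing, by a knowledge computing engine, graph computing (Nebula), offline computing (Spark), or real-time computing (Flink) on the sample fused data, to generate the knowledge graph and the quality knowledge base.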
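  8. A data monitoring system, characterized by comprising:
    a data determination unit, configured to determine fused data of a current data governance platform to be monitored;
    a data monitoring unit, configured to input the fused data of the current data governance platform to be monitored into a data monitoring model to obtain a data monitoring result output by the data monitoring model;
    wherein the data monitoring model is obtained by classifying and grading sample fused data of the current data governance platform and then generating a knowledge graph and a quality knowledge base.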
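  9. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the data monitoring method according to any one of claims 1 to 7.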
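  10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the data monitoring method according to any one of claims 1 to 7.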
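
As an informal illustration only, and not part of the claims, the weighting, weight-index clustering, and grading steps recited in claims 4 and 5 might be sketched as below. The application does not give concrete thresholds, labels, or a coefficient, so `CLASS_BOUNDS`, `CLASS_LABELS`, `GRADING_COEFFICIENT`, `GRADE_BOUNDS`, and `GRADE_LABELS` are all assumed values, and simple threshold bucketing stands in for whatever clustering the classification graph and grading graph actually define.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed stand-in for the classification graph: weight thresholds and labels.
CLASS_BOUNDS = [0.3, 0.7]
CLASS_LABELS = ["general", "important", "core"]

# Assumed preset grading coefficient and stand-in for the grading graph.
GRADING_COEFFICIENT = 10.0
GRADE_BOUNDS = [3.0, 7.0]
GRADE_LABELS = ["L1", "L2", "L3"]

Weighted = Tuple[str, float]
Graded = Tuple[str, str, float, str]


def classify(records: List[str], feature_values: List[float]) -> Dict[str, List[Weighted]]:
    """Attach each feature value to its record as a weight, then bucket by weight."""
    clusters: Dict[str, List[Weighted]] = defaultdict(list)
    for record, weight in zip(records, feature_values):
        label = CLASS_LABELS[bisect_right(CLASS_BOUNDS, weight)]
        clusters[label].append((record, weight))
    return dict(clusters)


def grade(clusters: Dict[str, List[Weighted]]) -> List[Graded]:
    """Grading value = weight * preset coefficient; partition by grade thresholds."""
    graded: List[Graded] = []
    for label, members in clusters.items():
        for record, weight in members:
            grade_value = weight * GRADING_COEFFICIENT
            level = GRADE_LABELS[bisect_right(GRADE_BOUNDS, grade_value)]
            graded.append((record, label, grade_value, level))
    return graded
```

Calling `grade(classify(records, feature_values))` yields one tuple per record containing the record, its assumed class label, its grading value, and its assumed grade label.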
PCT/CN2022/138505 2022-08-01 2022-12-13 Data monitoring method and system WO2024027071A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210913441.5A CN114969018B (zh) 2022-08-01 2022-08-01 Data monitoring method and system
CN202210913441.5 2022-08-01

Publications (1)

Publication Number Publication Date
WO2024027071A1 (zh) 2024-02-08

Family

ID=82969442

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/138505 WO2024027071A1 (zh) 2022-08-01 2022-12-13 Data monitoring method and system

Country Status (3)

Country Link
CN (1) CN114969018B (zh)
LU (1) LU505740B1 (zh)
WO (1) WO2024027071A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114969018B (zh) * 2022-08-01 2022-11-08 太极计算机股份有限公司 Data monitoring method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111858236A (zh) * 2020-06-23 2020-10-30 深圳精匠云创科技有限公司 Knowledge graph monitoring method and apparatus, computer device, and storage medium
US20220147822A1 (en) * 2021-01-22 2022-05-12 Beijing Baidu Netcom Science And Technology Co., Ltd. Training method and apparatus for target detection model, device and storage medium
CN114706994A (zh) * 2022-03-21 2022-07-05 华迪计算机集团有限公司 Knowledge-base-based operation and maintenance management system and method
CN114969018A (zh) * 2022-08-01 2022-08-30 太极计算机股份有限公司 Data monitoring method and system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
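CN109086458A (zh) * 2018-09-12 2018-12-25 杭州格原信息技术有限公司 Search engine system for the survey and design industry
CN111460167A (zh) * 2020-03-19 2020-07-28 平安国际智慧城市科技股份有限公司 Method for locating pollutant-discharging objects based on a knowledge graph, and related device
CN112182234B (zh) * 2020-07-29 2022-06-28 长江勘测规划设计研究有限责任公司 Method for constructing a knowledge graph of river-basin flood control planning data
CN112256887B (zh) * 2020-10-28 2022-06-24 福建亿榕信息技术有限公司 Intelligent supply chain management method based on a knowledge graph
CN112990656B (zh) * 2021-02-05 2023-05-05 南方电网调峰调频发电有限公司信息通信分公司 Health evaluation system and health evaluation method for IT equipment monitoring data
CN113505241B (zh) * 2021-07-15 2023-06-30 润建股份有限公司 Knowledge-graph-based intelligent diagnosis method for electricity-use safety hazards
CN114818707A (zh) * 2022-03-02 2022-07-29 北京航空航天大学 Knowledge-graph-based autonomous driving decision-making method and system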

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Introduction to Knowledge Graph Construction Technology and Its Application in Banking Business-Oracle", DOC IN, 30 April 2019 (2019-04-30), XP093136302, Retrieved from the Internet <URL:https://www.docin.com/p-2200880633.html> [retrieved on 20240229] *
全国信息技术标准化技术委员会大数据标准工作组 中国电子技术标准化研究院 (NON-OFFICIAL TRANSLATION: CHINA NATIONAL INFORMATION TECHNOLOGY STANDARDIZATION NETWORK BIG D: "数据治理工具图谱研究报告(2021版) (Non-official translation: Data Governance Tool Map Research Report (2021 Edition))", HTTP://WWW.CESI.CN/IMAGES/EDITOR/20211103/20211103160022359.PDF, 31 October 2021 (2021-10-31) *

Also Published As

Publication number Publication date
LU505740B1 (en) 2024-06-17
CN114969018A (zh) 2022-08-30
CN114969018B (zh) 2022-11-08
LU505740A1 (en) 2024-02-19

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22953859; Country of ref document: EP; Kind code of ref document: A1)