CN116303385A - Data auditing method and device, electronic equipment and storage medium - Google Patents

Data auditing method and device, electronic equipment and storage medium

Info

Publication number
CN116303385A
Authority
CN
China
Prior art keywords
auditing
data
data flow
flow graph
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310140103.7A
Other languages
Chinese (zh)
Inventor
刘建波
董冰
程赓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Tower Co Ltd
Original Assignee
China Tower Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Tower Co Ltd
Priority to CN202310140103.7A
Publication of CN116303385A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a data auditing method and device, electronic equipment, and a storage medium. The method comprises: acquiring a data flow graph from a data flow graph library, wherein the data flow graph comprises at least one of the following: a system data flow graph, a system library table structure graph, and a system deployment graph, wherein the system data flow graph comprises an intersystem data flow graph or a service line data flow graph, the system library table structure graph represents the data tables of a database matched with at least one service system, and the system deployment graph represents the hardware and software deployment corresponding to at least one service system; determining an audit target in the at least one service system according to the data flow graph, and determining a preset audit rule matched with the audit target; and performing data auditing on the audit target according to the preset audit rule to obtain a data auditing result, wherein the data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph. The invention can improve the effect of data auditing.

Description

Data auditing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer information management systems, and in particular, to a data auditing method, apparatus, device, and storage medium.
Background
Today, business processes in many industries span multiple business systems. As a business develops, long-running systems generate large volumes of business data, and inconsistent data standards and inconsistent data among the business systems directly affect the accuracy and consistency of information.
In the prior art, data problems are addressed by after-the-fact data auditing: data in an offline state is audited to help a user find data problems and to determine whether abnormal data exists in the target business data.
Therefore, the prior art has the problem of poor data auditing effect in the business system.
Disclosure of Invention
The embodiment of the invention provides a data auditing method, a device, electronic equipment and a storage medium, which are used for solving the problem of poor data auditing effect in the prior art.
In a first aspect, an embodiment of the present invention provides a data auditing method, including:
acquiring a data flow graph in a data flow graph library, wherein the data flow graph comprises at least one of the following: a system data flow graph, a system library table structure graph and a system deployment graph, wherein the system data flow graph comprises an intersystem data flow graph or a service line data flow graph, the system library table structure graph represents a data table of at least one service system matching database, and the system deployment graph represents a hardware and software deployment graph corresponding to at least one service system;
determining an audit target in the at least one business system according to the data flow graph, and determining a preset audit rule matched with the audit target;
and performing data auditing on the audit target according to the preset audit rule to obtain a data auditing result, wherein the data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph.
Optionally, before the acquiring of the data flow graph in the data flow graph library, the method further includes:
acquiring first operation information, wherein the first operation information represents the selection of a base element by a user and the selection of a set position of the base element;
acquiring second operation information based on the first operation information, wherein the second operation information is used for adding elements corresponding to auditing nodes between the basic elements, and the auditing nodes are used for configuring auditing rules;
creating a first data flow graph according to the first operation information and the second operation information, and adding the first data flow graph into the data flow graph library;
wherein the base element is at least one of: nodes representing entities in a business system, edges representing associations between multiple nodes, and groups representing sets of similar nodes.
Optionally, the determining an audit target according to the data flow graph and a preset audit rule matching the audit target includes at least one of the following:
setting a first business entity in a first target business system as the auditing target under the condition that the data flow diagram is the intersystem data flow diagram; setting an auditing rule configured by a user at the first business entity as the preset auditing rule, wherein the preset auditing rule is used for judging the existence of the business entity;
setting the attribute of a second service entity in a second target service system as the auditing target under the condition that the data flow graph is the service line data flow graph; setting an auditing rule configured by a user at the second service entity as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute value of the second service entity;
setting metadata information in a third target service system matching database as the auditing target under the condition that the data flow graph is the system library table structure graph; setting a database auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute of the metadata information;
setting the hardware and software architecture corresponding to a fourth target service system as the auditing target under the condition that the data flow graph is the system deployment graph; setting a deployment auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for judging the execution conditions of the software and hardware;
the first target service system, the second target service system, the third target service system and the fourth target service system are service systems in the at least one service system, the first service entity is any service entity in the first target service system, and the second service entity is any service entity in the second target service system.
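The four cases above amount to a lookup from graph type to an audit target and a rule category. A minimal sketch follows, assuming a simple enum and lookup table; the names `GraphType`, `AuditPlan`, and `determine_audit_plan` are hypothetical and do not come from the embodiment, and the entry wording is paraphrased from the text.

```python
from dataclasses import dataclass
from enum import Enum


class GraphType(Enum):
    INTERSYSTEM = "intersystem data flow graph"
    SERVICE_LINE = "service line data flow graph"
    LIBRARY_TABLE = "system library table structure graph"
    DEPLOYMENT = "system deployment graph"


@dataclass(frozen=True)
class AuditPlan:
    target: str     # what is audited
    rule_kind: str  # what the preset auditing rule checks


# Hypothetical lookup table paraphrasing the four cases in the text.
AUDIT_PLANS = {
    GraphType.INTERSYSTEM: AuditPlan("business entity", "existence"),
    GraphType.SERVICE_LINE: AuditPlan(
        "business entity attribute", "attribute value consistency"
    ),
    GraphType.LIBRARY_TABLE: AuditPlan(
        "database metadata", "metadata attribute consistency"
    ),
    GraphType.DEPLOYMENT: AuditPlan(
        "hardware and software architecture", "execution condition"
    ),
}


def determine_audit_plan(graph_type: GraphType) -> AuditPlan:
    """Pick the audit target and rule category from the graph type."""
    return AUDIT_PLANS[graph_type]
```

In a real system the rule itself would be whatever the user configured on the audit node; the table only records which kind of check each graph type calls for.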
Optionally, the data auditing process is performed on the auditing target according to the preset auditing rule, and a data auditing result is generated, including at least one of the following:
under the condition that the data flow graph is the system library table structure graph and the metadata information is subjected to data auditing, collecting the full metadata in the business system matching database, and carrying out consistency judgment on the full metadata to obtain a first data auditing result, wherein the first data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
under the condition that the data flow graph is the service line data flow graph and data auditing is performed on real-time data corresponding to the second service entity, collecting incremental data among a plurality of service systems, copying the incremental data to an in-memory database, and auditing the incremental data in the in-memory database with a preset auditing model to obtain a second data auditing result, wherein the second data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
and under the condition that the data flow graph is the intersystem data flow graph or the service line data flow graph and data auditing is performed on offline data, determining an auditing script that corresponds to the auditing rule, querying target data in the database matched with the business system according to the auditing script, generating an auditing result file, and generating a third data auditing result according to the auditing result file, wherein the third data auditing result is used for displaying abnormal data or analysis results in the data flow graph.
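The offline branch above (rule, then script, then query, then result file, then result) can be sketched as follows. This is only an illustration under stated assumptions: the rule name, the SQL script, the `orders` and `customers` tables, and the CSV result format are all hypothetical, and an in-memory SQLite database stands in for the database matched with the business system.

```python
import csv
import os
import sqlite3
import tempfile

# Hypothetical mapping: each auditing rule has one corresponding SQL script.
AUDIT_SCRIPTS = {
    "order_has_customer": (
        "SELECT o.id FROM orders o "
        "LEFT JOIN customers c ON c.id = o.customer_id "
        "WHERE c.id IS NULL ORDER BY o.id"
    ),
}


def run_offline_audit(conn, rule_name, result_path):
    """Query the database with the rule's script and write a result file."""
    rows = conn.execute(AUDIT_SCRIPTS[rule_name]).fetchall()
    with open(result_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["rule", "abnormal_id"])
        for (row_id,) in rows:
            writer.writerow([rule_name, row_id])
    return result_path


def load_audit_result(result_path):
    """Turn the result file into a summary a graph view could display."""
    with open(result_path, newline="") as f:
        records = list(csv.DictReader(f))
    return {
        "abnormal_count": len(records),
        "abnormal_ids": [int(r["abnormal_id"]) for r in records],
    }


def demo():
    conn = sqlite3.connect(":memory:")  # stand-in for the business database
    conn.executescript(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY);"
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);"
        "INSERT INTO customers VALUES (1);"
        "INSERT INTO orders VALUES (10, 1), (11, 2);"
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = run_offline_audit(
            conn, "order_has_customer", os.path.join(tmp, "audit.csv")
        )
        result = load_audit_result(path)
    conn.close()
    return result
```

Here order 11 refers to a customer that does not exist, so `demo()` reports one abnormal record.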
Optionally, the first data audit result includes a first data audit result or a second data audit result;
under the condition that the data flow graph is the system library table structure graph and data auditing is performed on the metadata information, the collecting of the full metadata in the database matched with the business system and the consistency judgment on the full metadata to obtain a first data auditing result comprise:
performing consistency judgment on the full metadata, and generating a first data audit result under the condition that the full metadata meets a first preset condition, wherein the first data audit result indicates that the full metadata does not contain abnormal data;
generating a second data audit result under the condition that the full metadata does not meet the first preset condition, wherein the second data audit result indicates that the full metadata contains abnormal data;
the first preset condition indicates that a target attribute of the database table corresponding to the full metadata matches a preset target attribute, and the target attribute comprises at least one of the following: database table structure, field type, and field length.
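The consistency judgment on table structure, field type, and field length can be illustrated with a small sketch. It assumes SQLite as the database and uses the declared column type (which includes the length, for example `VARCHAR(32)`) as the attribute to compare; the function names and the `site` table are hypothetical.

```python
import sqlite3


def collect_metadata(conn, table):
    """Collect column name -> declared type for one table.

    The table name is interpolated directly, so it must come from a trusted
    source. PRAGMA table_info rows are (cid, name, type, notnull, dflt, pk).
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {name: col_type.upper() for _, name, col_type, *_ in rows}


def audit_metadata(actual, expected):
    """Return mismatches; an empty list means the preset condition is met."""
    issues = []
    for column, expected_type in expected.items():
        if column not in actual:
            issues.append(f"missing column: {column}")
        elif actual[column] != expected_type:
            issues.append(
                f"type mismatch on {column}: "
                f"{actual[column]} != {expected_type}"
            )
    for column in actual:
        if column not in expected:
            issues.append(f"unexpected column: {column}")
    return issues


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE site (id INTEGER, name VARCHAR(32), code CHAR(8))"
    )
    # Hypothetical preset target attributes for the table.
    expected = {"id": "INTEGER", "name": "VARCHAR(64)", "code": "CHAR(8)"}
    issues = audit_metadata(collect_metadata(conn, "site"), expected)
    conn.close()
    return issues
```

An empty list corresponds to the result indicating no abnormal data; a non-empty list corresponds to the result indicating that abnormal data exists.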
Optionally, after the data auditing process is performed on the auditing target according to the preset auditing rule, and the data auditing result is generated, the method further includes:
Acquiring third operation information, wherein the third operation information represents that a user modifies auditing rules matched with auditing nodes in the data flow graph;
and updating the data flow diagram according to the third operation information.
In a second aspect, an embodiment of the present invention provides a data auditing apparatus, including:
the first acquisition module is used for acquiring a data flow graph in the data flow graph library, and the data flow graph comprises at least one of the following: a system data flow graph, a system library table structure graph and a system deployment graph, wherein the system data flow graph comprises an intersystem data flow graph or a service line data flow graph, the system library table structure graph represents a data table of at least one service system matching database, and the system deployment graph represents a hardware and software deployment graph corresponding to at least one service system;
the determining module is used for determining an auditing target in the at least one business system according to the data flow graph and determining a preset auditing rule matched with the auditing target;
the processing module is used for conducting data auditing processing on the auditing target according to the preset auditing rule to obtain a data auditing result, and the data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph.
Optionally, the apparatus further comprises:
the second acquisition module is used for acquiring first operation information, wherein the first operation information represents the selection of a base element by a user and the selection of a set position of the base element;
the third acquisition module is used for acquiring second operation information based on the first operation information, wherein the second operation information is used for adding elements corresponding to auditing nodes between the basic elements, and the auditing nodes are used for configuring auditing rules;
the creation module is used for creating a first data flow graph according to the first operation information and the second operation information and adding the first data flow graph into the data flow graph library;
wherein the base element is at least one of: nodes representing entities in a business system, edges representing associations between multiple nodes, and groups representing sets of similar nodes.
Optionally, the determining module includes:
the first determining unit is used for setting a first business entity in a first target business system as the auditing target under the condition that the data flow diagram is the intersystem data flow diagram; setting an auditing rule configured by a user at the first business entity as the preset auditing rule, wherein the preset auditing rule is used for judging the existence of the business entity;
The second determining unit is used for setting the attribute of a second service entity in a second target service system as the auditing target under the condition that the data flow diagram is the service line data flow diagram; setting an auditing rule configured by a user at the second service entity as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute value of the second service entity;
the third determining unit is used for setting metadata information in a third target service system matching database as the auditing target under the condition that the data flow graph is the system base table structure graph; setting a database auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute of the metadata information;
a fourth determining unit, configured to set the hardware and software architecture corresponding to a fourth target service system as the auditing target when the data flow graph is the system deployment graph, and to set a deployment auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for judging the execution conditions of the software and hardware;
The first target service system, the second target service system, the third target service system and the fourth target service system are service systems in the at least one service system, the first service entity is any service entity in the first target service system, and the second service entity is any service entity in the second target service system.
Optionally, the processing module includes:
the first processing unit is used for collecting the full metadata in the business system matching database and carrying out consistency judgment on the full metadata under the condition that the data flow graph is the system library table structure graph and the metadata information is subjected to data auditing processing, so as to obtain a first data auditing result, wherein the first data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
the second processing unit is used for collecting incremental data among a plurality of service systems under the condition that the data flow graph is the service line data flow graph and data auditing processing is carried out on real-time data corresponding to the second service entity, copying the incremental data to a memory database, and carrying out data auditing processing on the incremental data in the memory database by adopting a preset auditing model to obtain a second data auditing result, wherein the second data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
the third processing unit is used for determining an auditing script that corresponds to the auditing rule under the condition that the data flow graph is the intersystem data flow graph or the service line data flow graph and data auditing is performed on offline data, querying target data in the database matched with the business system according to the auditing script, generating an auditing result file, and generating a third data auditing result according to the auditing result file, wherein the third data auditing result is used for displaying abnormal data or analysis results in the data flow graph.
Optionally, the first data audit result includes a first data audit result or a second data audit result;
the first processing unit is specifically used for:
performing consistency judgment on the full metadata, and generating a first data audit result under the condition that the full metadata meets a first preset condition, wherein the first data audit result indicates that the full metadata does not contain abnormal data;
generating a second data audit result under the condition that the full metadata does not meet the first preset condition, wherein the second data audit result indicates that the full metadata contains abnormal data;
wherein the first preset condition indicates that a target attribute of the database table corresponding to the full metadata matches a preset target attribute, and the target attribute comprises at least one of the following: database table structure, field type, and field length.
Optionally, the apparatus further comprises:
the fourth acquisition module is used for acquiring third operation information, and the third operation information represents that a user modifies an auditing rule matched with the auditing node in the data flow graph;
and the updating module is used for updating the data flow graph according to the third operation information.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data auditing method of the first aspect.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium storing computer instructions, wherein the computer instructions are used for causing a computer to perform the data auditing method according to the first aspect.
In the embodiment of the invention, a data flow graph is first acquired from a data flow graph library; according to the service systems and service flows involved, the data flow graph can be classified as a system data flow graph, a system library table structure graph, or a system deployment graph. An audit target in at least one service system is then determined according to the data flow graph, together with a preset audit rule matched with the audit target. Finally, data auditing is performed on the audit target according to the matched preset audit rule to obtain a data auditing result.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a data auditing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data auditing portal provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of an intersystem data flow diagram according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a service line data flow diagram according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a system library table structure diagram according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a system deployment topology provided by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data auditing device according to an embodiment of the present invention;
FIG. 8 is a block diagram of an electronic device for implementing a data auditing method according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first," "second," and the like in embodiments of the present invention are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to FIG. 1, which is a schematic flow chart of a data auditing method according to an embodiment of the present invention, the method includes the following steps:
Step 101: acquiring a data flow graph in a data flow graph library, wherein the data flow graph comprises at least one of the following: a system data flow graph, a system library table structure graph and a system deployment graph, wherein the system data flow graph comprises an intersystem data flow graph or a service line data flow graph, the system library table structure graph represents the data tables of a database matched with at least one service system, and the system deployment graph represents the hardware and software deployment corresponding to at least one service system;
Step 102: determining an audit target in the at least one business system according to the data flow graph, and determining a preset audit rule matched with the audit target;
Step 103: performing data auditing on the audit target according to the preset audit rule to obtain a data auditing result, wherein the data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph.
Steps 101, 102, and 103 of the data auditing method may be executed by an electronic computer or a virtual machine, which is not limited in the embodiment of the present invention.
Likewise, the database involved in the data auditing method may be any database for storing data, such as MySQL, Oracle, SQL Server, or SQLite, which is not limited in the embodiment of the present invention.
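Steps 101 to 103 can be sketched as a small pipeline. The sketch assumes that the graph library is a dictionary keyed by graph name and that each preset auditing rule is a function returning the abnormal records it finds; `DataFlowGraph`, `run_audit`, and `negative_amounts` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]
Rule = Callable[[List[Record]], List[Record]]  # returns abnormal records


@dataclass
class DataFlowGraph:
    name: str
    graph_type: str
    # audit target name -> preset auditing rule configured by the user
    rules: Dict[str, Rule] = field(default_factory=dict)


def run_audit(library, graph_name, data):
    graph = library[graph_name]                      # step 101: acquire graph
    result = {}
    for target, rule in graph.rules.items():         # step 102: target + rule
        result[target] = rule(data.get(target, []))  # step 103: audit
    return result


def negative_amounts(rows):
    """Hypothetical rule: an invoice amount must not be negative."""
    return [row for row in rows if row["amount"] < 0]


library = {
    "billing": DataFlowGraph(
        "billing", "service line data flow graph",
        {"invoice": negative_amounts},
    )
}
data = {"invoice": [{"id": 1, "amount": 120}, {"id": 2, "amount": -5}]}
audit_result = run_audit(library, "billing", data)
```

The returned mapping lists the abnormal records per audit target, which is the kind of information the data auditing result is meant to show in the graph.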
In step 101, the data flow graph library may store a plurality of data flow graphs, and the data flow graphs may be created and edited by a user. That is, the method provided by the embodiment of the present invention may be configured with a graph visualization engine that provides graph visualization capabilities such as drawing, layout, analysis, interaction, and animation, and that helps the user create data flow graphs efficiently.
It should be noted that the constituent elements of the data flow graph may include nodes (Node), edges (Edge), and groups (Combo). The user may change the color scheme and combination of the nodes, edges, and groups according to the actual service scenario and service flow, so as to create a graph visual analysis application, that is, the data flow graph, that better fits the service scenario or service system.
In addition, a node is the basic unit of the graph and generally represents an entity; in the embodiment of the invention a node may represent a service node. An edge may represent the relationship between two nodes. A group may be understood as a node group used to manage a set of similar nodes: for example, nodes of the same type or nodes positioned close to one another may be placed in the same group, which effectively reduces visual interference for the user.
It should be appreciated that the user may also create data auditing nodes based on the data flow graph, i.e., display data auditing nodes in the data flow graph and select appropriate auditing rules to examine the data, and in some alternative embodiments, the data flow graph may also display the number of created auditing rules and the number of missing auditing rules, so that the user may obtain more information about intersystem data auditing.
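The graph elements and the rule counts described above can be modeled with a few plain data classes. This is a sketch under two assumptions that are not stated in the embodiment: an audit node is represented as a list of rules attached to the edge it sits on, and a "missing" rule means an edge with no rule configured. All class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    id: str
    label: str
    combo: Optional[str] = None  # group (Combo) the node belongs to, if any


@dataclass
class Edge:
    source: str
    target: str
    # Assumption: rules of the audit node placed between source and target.
    audit_rules: List[str] = field(default_factory=list)


@dataclass
class FlowGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.id] = node

    def add_edge(self, edge: Edge) -> None:
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(edge)

    def members_of(self, combo: str) -> List[str]:
        """Node ids that belong to one group."""
        return [n.id for n in self.nodes.values() if n.combo == combo]

    def rule_counts(self):
        """Return (created rules, edges with no rule configured)."""
        created = sum(len(e.audit_rules) for e in self.edges)
        missing = sum(1 for e in self.edges if not e.audit_rules)
        return created, missing


def build_example() -> FlowGraph:
    graph = FlowGraph()
    graph.add_node(Node("crm", "CRM", combo="sales"))
    graph.add_node(Node("erp", "ERP", combo="sales"))
    graph.add_node(Node("billing", "Billing"))
    graph.add_edge(
        Edge("crm", "erp", ["customer exists", "status consistent"])
    )
    graph.add_edge(Edge("erp", "billing"))
    return graph
```

In this example two rules have been created and one edge still lacks a rule, which is the pair of numbers a graph view could display next to the audit nodes.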
In some alternative embodiments, the user may draw and create the data flow graph interactively: the graph visualization engine may provide preconfigured interaction styles for direct use, such as clicking a node, clicking an edge, zooming the canvas, and dragging the canvas, and the interaction styles may be divided into node interaction styles, edge interaction styles, and group interaction styles.
Specifically, the user can browse the corresponding next-level data flow graph by clicking an entity or a relationship, or click the displayed number of auditing rules to view the corresponding auditing rules.
The system data flow graph, the system library table structure graph, and the system deployment graph may be understood as data flow graphs at different hierarchical levels: the system data flow graph represents the data flow among at least one service system, the system library table structure graph represents the data tables of a database matched with at least one service system, and the system deployment graph represents the hardware and software deployment corresponding to at least one service system. With the data flow graph as the driver, this hierarchical data auditing system runs vertically through the data flow graph, the database table structure graph, and the system deployment graph to achieve end-to-end data auditing.
In step 102, determining an audit target in the at least one service system according to the data flow graph, and determining a preset audit rule matching the audit target, first determining a type of the data flow graph, determining the audit target according to the type of the data flow graph, and also understanding as an audit object, wherein the preset audit rule is preset by a user, and after determining the audit target, performing data audit processing according to the preset audit rule set by the user.
It should be noted that, the preset auditing rule is added by the user when creating or editing the data flow graph, and the preset auditing rule is matched with the auditing target, so in the embodiment of the present invention, the specific rule of the preset auditing rule and the number of the preset auditing rules are not limited.
In step 103, the user may audit the data of the service system before, during, and after the event. Pre-event data auditing may be understood as auditing the data stored in the database, that is, auditing the metadata; in-event data auditing may be understood as auditing streaming data or real-time data; and post-event data auditing may be understood as auditing data after it has been warehoused.
The data auditing result may be used to display abnormal data and/or analysis results in the data flow graph. For example, after the user performs pre-event data auditing on the service system, the data auditing result can be expressed as a data auditing report, and the relevant management personnel can re-standardize and rectify the management flow according to the report, so that the data standard is effectively implemented. After the user performs in-event data auditing on the service system, the data auditing result can be fed back directly to the corresponding service system; the relevant management personnel can correct service data according to the result, and abnormal data can be intercepted, which improves the efficiency and accuracy of auditing work. After the user performs post-event data auditing on the service system, the data auditing result is likewise expressed as a data auditing report that presents the data quality situation in one place, and the relevant management personnel can improve the service system accordingly, which improves the data auditing effect of the service system.
In the embodiment, firstly, a data flow graph in a data flow graph library is acquired, the data flow graph can be divided into a system data flow graph, a system library table structure graph and a system deployment graph according to a related service system and a related service process, then an audit target in at least one service system is determined according to the data flow graph, a preset audit rule matched with the audit target is determined, finally data audit processing is carried out on the audit target according to the preset audit rule matched with the audit target, and a data audit result is obtained.
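The flow summarised above (acquire a data flow graph from the library, determine the audit target and its preset rules, then audit) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `AuditRule`, `DataFlowGraph` and `run_audit`, the dictionary used as the graph library, and the record layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AuditRule:
    """A preset auditing rule: a name plus a predicate over one record (hypothetical shape)."""
    name: str
    check: Callable[[dict], bool]


@dataclass
class DataFlowGraph:
    """A data flow graph carrying its audit target and the rules the user attached to it."""
    graph_type: str       # e.g. "system_data_flow", "table_structure", "deployment"
    audit_target: str     # key identifying the records to audit (assumption)
    rules: List[AuditRule]


def run_audit(library: Dict[str, DataFlowGraph], graph_id: str,
              records: Dict[str, List[dict]]) -> List[dict]:
    """Return one finding per (record, rule) pair that fails."""
    graph = library[graph_id]                     # acquire the graph from the library
    target_records = records[graph.audit_target]  # step 102: determine the audit target
    rules = graph.rules                           # step 102: rules matching the target
    findings = []
    for record in target_records:                 # step 103: audit processing
        for rule in rules:
            if not rule.check(record):
                findings.append({"rule": rule.name, "record": record})
    return findings
```

The findings list plays the role of the data auditing result; how it is rendered on the data flow graph is outside the scope of this sketch.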
In addition, in the embodiment of the present invention, a user-facing interaction module may be provided, for example a data auditing portal. Referring to fig. 2, fig. 2 is a schematic diagram of the data auditing portal. The data auditing portal may expose, in the form of middle-platform services, the data flow graph editing capability described above and the pre-event, in-event, and post-event data auditing capabilities, namely metadata auditing, real-time data auditing, and offline data auditing. Service invocation is implemented through program interfaces. The portal provides multi-level auditing rule management, task scheduling, auditing rule sharing, and the like for each level, offers different data auditing service views and capabilities, and thereby delivers a one-stop data auditing service.
The data auditing portal integrates the configuration and management of metadata auditing, real-time data auditing, offline data auditing, and data standards into a single portal, which facilitates unified management and maintenance; it covers system management, user management, role management, data security management, system parameter configuration, and the like. Through the data auditing portal, the pre-event, in-event, and post-event data auditing service capabilities are integrated, and metadata auditing, real-time data auditing, and offline data auditing are configured and managed in a unified way.
Optionally, before the acquiring the data flow graph matched with the at least one service system, the method further includes:
acquiring first operation information, wherein the first operation information represents the selection of a base element by a user and the selection of a set position of the base element;
acquiring second operation information based on the first operation information, wherein the second operation information is used for adding elements corresponding to auditing nodes between the basic elements, and the auditing nodes are used for configuring auditing rules;
creating a first data flow graph according to the first operation information and the second operation information, and adding the first data flow graph into the data flow graph library;
wherein the base element is at least one of: nodes representing entities in a business system, edges representing associations between multiple nodes, and groups representing sets of similar nodes.
In this embodiment, the first operation information of the user is obtained first. Because the first operation information selects base elements and sets their positions, it may be understood as the user's definition of the system block diagram. After the system block diagram is confirmed, the second operation information of the user is obtained; it indicates that the user adds preset auditing rules to the system block diagram. Finally, the first data flow graph is created from the first operation information and the second operation information and added to the data flow graph library.
It should be noted that, the constituent elements of the data flow graph may include nodes, edges and groups, and a user may change color matching and combination of the nodes, edges and groups according to an actual service scene and a service flow, so as to create a graph visual analysis application that is more fit to the service scene or the service system, i.e. the data flow graph.
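The creation of a data flow graph from the two kinds of operation information described above can be sketched as follows. The sketch is illustrative only: the `FlowGraph` structure, the `create_graph` function, and the dictionary layout of `first_op` and `second_op` are hypothetical and are not specified by the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FlowGraph:
    nodes: Dict[str, Tuple[int, int]] = field(default_factory=dict)   # node id -> canvas position
    edges: List[Tuple[str, str]] = field(default_factory=list)        # (source, destination)
    groups: Dict[str, List[str]] = field(default_factory=dict)        # group name -> member nodes
    audit_nodes: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)  # pair -> rule names


def create_graph(first_op: dict, second_op: dict) -> FlowGraph:
    """Build a graph from base-element selections, then attach auditing nodes."""
    graph = FlowGraph()
    # First operation information: base elements and their positions.
    for node_id, position in first_op.get("nodes", {}).items():
        graph.nodes[node_id] = tuple(position)
    for src, dst in first_op.get("edges", []):
        if src not in graph.nodes or dst not in graph.nodes:
            raise ValueError(f"edge ({src}, {dst}) references an unknown node")
        graph.edges.append((src, dst))
    for name, members in first_op.get("groups", {}).items():
        graph.groups[name] = list(members)
    # Second operation information: auditing nodes placed between base elements.
    for (src, dst), rules in second_op.get("audit_nodes", {}).items():
        if src not in graph.nodes or dst not in graph.nodes:
            raise ValueError(f"audit node ({src}, {dst}) references an unknown node")
        graph.audit_nodes[(src, dst)] = list(rules)
    return graph
```

Appending the returned graph to a list or dictionary would stand in for adding it to the data flow graph library.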
Optionally, the determining an audit target according to the data flow graph and a preset audit rule matching the audit target includes at least one of the following:
setting a first business entity in a first target business system as the auditing target under the condition that the data flow diagram is the intersystem data flow diagram; setting an auditing rule configured by a user at the first business entity as the preset auditing rule, wherein the preset auditing rule is used for judging the existence of the business entity;
Setting the attribute of a second service entity in a second target service system as the auditing target under the condition that the data flow graph is the service line data flow graph; setting an auditing rule configured by a user at the second service entity as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute value of the second service entity;
setting metadata information in a third target service system matching database as the auditing target under the condition that the data flow graph is the system library table structure graph; setting a database auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute of the metadata information;
setting a hardware and software framework corresponding to a fourth target service system as the auditing target under the condition that the data flow graph is the system deployment graph; setting a deployment auditing rule set by the user as the preset auditing rule, wherein the preset auditing rule is used for judging the software and hardware execution conditions;
the first target service system, the second target service system, the third target service system and the fourth target service system are service systems in the at least one service system, the first service entity is any service entity in the first target service system, and the second service entity is any service entity in the second target service system.
In this embodiment, the auditing target in the data flow graph and the preset auditing rule corresponding to it are determined according to the type of the data flow graph. It is understood that, from the perspective of data transmission and processing, the data flow graph graphically represents the logical functions of the service system and the logical flow and logical transformation of service data within it.
As can be seen from the foregoing embodiments, an audit target in the at least one service system and a preset audit rule matching it are determined according to the data flow graph. The audit target and the corresponding preset audit rule differ depending on whether the data flow graph is the inter-system data flow graph, the service line data flow graph, the system library table structure graph, or the system deployment graph; see Table 1.
TABLE 1
Hierarchy level | Auditing relationship | Auditing object
L1 | Intersystem | Existence of key entities is consistent across systems (e.g., station addresses, projects)
L2 | Inter-system / intra-system | Key attribute values of the same entity are consistent (e.g., amount, state)
L3 | System library table structure | Metadata information of the same attribute of the same entity is consistent (e.g., length, precision)
L4 | Global | Rationality of the software and hardware architecture and of system deployment (e.g., interfaces, copies)
The data flow graph can be divided into four layers of L1, L2, L3 and L4, and corresponds to an intersystem data flow graph, a business line data flow graph, a system library table structure graph and a system deployment topological graph respectively.
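The correspondence between the four levels, their auditing targets, and the kind of judgment each preset rule performs can be written as a simple lookup. The sketch below only restates Table 1 and the optional embodiment above in code form; the enum `GraphLevel`, the table `AUDIT_PLAN` and the function `plan_for` are hypothetical names.

```python
from enum import Enum
from typing import Tuple


class GraphLevel(Enum):
    L1_INTERSYSTEM = "intersystem data flow graph"
    L2_SERVICE_LINE = "service line data flow graph"
    L3_TABLE_STRUCTURE = "system library table structure graph"
    L4_DEPLOYMENT = "system deployment graph"


# level -> (auditing target, kind of judgment made by the preset rule)
AUDIT_PLAN = {
    GraphLevel.L1_INTERSYSTEM: ("business entity", "existence judgment"),
    GraphLevel.L2_SERVICE_LINE: ("business entity attribute", "attribute value consistency"),
    GraphLevel.L3_TABLE_STRUCTURE: ("metadata information", "metadata attribute consistency"),
    GraphLevel.L4_DEPLOYMENT: ("hardware and software framework", "software and hardware execution check"),
}


def plan_for(level: GraphLevel) -> Tuple[str, str]:
    """Return the auditing target and judgment kind for a graph level."""
    return AUDIT_PLAN[level]
```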
In the case that the data flow graph is an intersystem data flow graph, refer to fig. 3, which is a schematic diagram of an intersystem data flow graph provided in an embodiment of the present invention. As shown in the figure, rounded rectangles represent the service systems and right-angled rectangles represent entities in a service system. A circle on an entity represents data auditing rules that have been created, and the number in the circle is the count of created rules; a diamond on an entity represents missing data auditing rules, and the number in the diamond is the count of missing rules.
In this case the data flow graph expresses data auditing between business systems. At this level, the auditing objects are the key entities among the business systems, and the preset auditing rule is used to judge the existence of the business entities.
In the case that the data flow graph is a service line data flow graph, refer to fig. 4, which is a schematic diagram of a service line data flow graph provided in an embodiment of the present invention. As shown in the figure, closed rectangles represent service flow nodes, non-closed rectangles represent service entities, solid arrows represent the service flow, and dotted arrows represent the data flow. A circle connected between service entities represents created data auditing rules, with the number in the circle giving their count; a diamond connected between service entities represents missing data auditing rules, with the number in the diamond giving their count.
For example, the financial management system and the property electric charge management system interact in business, and the payment bill (a business entity) exists in both business systems. The user can therefore set a data auditing operation between the two payment bills to audit the corresponding data, for example checking whether the values of the payment bill's key attributes are consistent.
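The two checks described for the intersystem level (existence of an entity) and the service line level (consistency of key attribute values) can be sketched as follows for payment bills held by two systems. The functions `audit_existence` and `audit_attributes` and the dictionary layout (bill id mapped to an attribute dictionary) are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def audit_existence(system_a: Dict[str, dict], system_b: Dict[str, dict]) -> List[str]:
    """Intersystem level: ids of entities that exist in only one of the two systems."""
    return sorted(set(system_a) ^ set(system_b))


def audit_attributes(system_a: Dict[str, dict], system_b: Dict[str, dict],
                     key_attrs: Sequence[str]) -> List[Tuple[str, str, object, object]]:
    """Service line level: (id, attribute, value in A, value in B) for every mismatch."""
    mismatches = []
    for bill_id in sorted(set(system_a) & set(system_b)):
        for attr in key_attrs:
            value_a = system_a[bill_id].get(attr)
            value_b = system_b[bill_id].get(attr)
            if value_a != value_b:
                mismatches.append((bill_id, attr, value_a, value_b))
    return mismatches
```

For instance, calling `audit_attributes(finance, property_fees, ["amount", "state"])` would list every payment bill whose amount or state differs between the two systems.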
In the case that the data flow graph is a system library table structure graph, refer to fig. 5, which is a schematic diagram of a system library table structure graph provided in an embodiment of the present invention. As shown in the figure, fig. 5 includes information on the databases corresponding to the financial system and the property electric charge management system respectively.
In this case the data flow graph expresses data auditing of the databases corresponding to the financial system and the property electric charge management system, specifically checking whether the metadata of the payment bill in the two databases is consistent (such as length and precision).
In the case that the data flow graph is a system deployment topology graph, refer to fig. 6, which is a schematic diagram of a system deployment topology graph provided by the embodiment of the present invention and shows the global deployment of software and hardware. In this case the data flow graph expresses checking the deployment rationality of the software and hardware according to the preset auditing rule.
Optionally, the data auditing process is performed on the auditing target according to the preset auditing rule, and a data auditing result is generated, including at least one of the following:
under the condition that the data flow graph is the system library table structure graph and the metadata information is subjected to data auditing, collecting the full metadata in the business system matching database, and carrying out consistency judgment on the full metadata to obtain a first data auditing result, wherein the first data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
Under the condition that the data flow graph is the service line data flow graph and the real-time data corresponding to the second service entity is subjected to data auditing processing, incremental data among a plurality of service systems are collected, the incremental data are copied to a memory database, the incremental data in the memory database are subjected to data auditing processing by adopting a preset auditing model, so that a second data auditing result is obtained, and the second data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
and under the condition that the data flow graph is the intersystem data flow graph or the business line data flow graph and the offline data is subjected to data auditing, determining an auditing script according to the auditing rule, wherein the auditing script is matched and corresponds to the auditing rule, inquiring target data of the business system matching database according to the auditing script, generating an auditing result file, and generating a third data auditing result according to the auditing result file, wherein the third data auditing result is used for displaying abnormal data or analysis results in the data flow graph.
The embodiment of the invention mainly comprises three kinds of data auditing: metadata auditing, real-time data auditing, and offline data auditing. Metadata auditing corresponds to pre-event auditing, real-time data auditing to in-event auditing, and offline data auditing to post-event auditing. The three are described in turn below:
(1) Metadata auditing
Metadata auditing mainly uses the dictionary standard as its basis: it audits the consistency of the metadata information of each service system and the data warehouse, checks how well the data standards have been implemented, and finally generates the first data auditing result, namely a metadata auditing report, thereby improving the metadata quality of each service system and the data warehouse.
First, in a multi-source heterogeneous environment, the full metadata of each business system and the data warehouse is collected through a collection adapter, including the data entities involved (descriptions of systems, libraries, tables, and fields) and the logic used in processing those data entities. The collection adapter is compatible with data sources in different formats and converts the data dictionary information of different database types to meet the metadata collection requirements, supporting both collection from data dictionaries (Excel) of mainstream databases and direct online collection.
After the metadata is collected, two data auditing modes can be adopted: consistency auditing and model comparison. Consistency auditing audits the consistency of key attributes of the objects (database tables) maintained by a system, such as whether table structures, field types, and field lengths are consistent. Model comparison compares the metadata recorded in the repository with the metadata of the actual business database and reports differences such as a missing data dictionary, a missing instance object, or inconsistent fields.
Finally, the first data auditing result is obtained, and the related manager can normalize, rectify and track related problems based on the first data auditing result, so that the data standard can be effectively landed.
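The two metadata auditing modes described above can be sketched as follows. This is a minimal sketch, assuming column metadata has already been collected into dictionaries; the function names, the attribute keys `type`, `length` and `precision`, and the reading of "no data dictionary" (table exists only in the actual database) and "no instance object" (table exists only in the repository) are assumptions.

```python
from typing import Dict, List, Sequence, Set, Tuple


def consistency_audit(meta_a: Dict[str, dict], meta_b: Dict[str, dict],
                      attrs: Sequence[str] = ("type", "length", "precision")
                      ) -> List[Tuple[str, str, object, object]]:
    """Compare the column metadata of the same table as held by two systems."""
    issues = []
    for column in sorted(set(meta_a) | set(meta_b)):
        if column not in meta_a or column not in meta_b:
            # Table structure differs: the column exists on one side only.
            issues.append((column, "missing", column in meta_a, column in meta_b))
            continue
        for attr in attrs:
            value_a, value_b = meta_a[column].get(attr), meta_b[column].get(attr)
            if value_a != value_b:
                issues.append((column, attr, value_a, value_b))
    return issues


def model_comparison(repository_tables: Set[str], actual_tables: Set[str]) -> Dict[str, List[str]]:
    """Compare tables recorded in the metadata repository with tables in the actual database."""
    return {
        "no_instance_object": sorted(repository_tables - actual_tables),
        "no_data_dictionary": sorted(actual_tables - repository_tables),
    }
```

An empty result from both functions would correspond to a metadata auditing report without abnormal data.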
(2) Real-time data auditing
Real-time data auditing can be understood as follows: a collector copies information in real time into an independent memory database, a real-time auditing model compares the real-time data, and non-compliant data is intercepted in real time before it enters the system.
First, the data of multiple heterogeneous service systems is copied to the memory database in real time through collectors. When data in a service system changes, the incremental data is synchronized to the memory database in real time through the matching collector, and the real-time collection from the heterogeneous databases is monitored and maintained.
It should be noted that the manner of collecting the data may differ depending on the type of database, for example:
1) MySQL data acquisition
For a MySQL database, data is copied to the memory database in real time through the MySQL Binlog real-time data stream, completing real-time data synchronization. The collector obtains data in real time by disguising itself as a MySQL slave node (Slave) and obtaining the Binlog from the master node (Master).
The data synchronization server is responsible for connecting to the different MySQL instances and maintains an event message queue for each instance to realize real-time data synchronization.
For each instance, a standby MySQL is configured, and a heartbeat mechanism that periodically updates an auxiliary table is used to detect whether the main library is alive. Under the heartbeat mechanism, one piece of heartbeat data is inserted for each extracted table every minute (configurable) and its sending time is saved. The heartbeat table is itself extracted and in practice passes through the whole flow with the same logic as the synchronized tables, so receiving a heartbeat packet proves that the whole link is working even when no data has been added or deleted. In a dual-Master scenario, this mechanism allows the working data source to be selected automatically.
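The liveness decision that the heartbeat mechanism enables can be sketched as follows. The sketch only covers the receiving side (judging a link from the time of the last heartbeat and choosing a source in a dual-Master setup); the function names, the `tolerance` of three missed intervals, and the rule of preferring the most recent heartbeat are assumptions, not details given in the embodiment.

```python
from typing import Dict, Optional

HEARTBEAT_INTERVAL_S = 60.0  # "every minute (configurable)"


def link_alive(last_heartbeat_ts: float, now: float,
               interval: float = HEARTBEAT_INTERVAL_S, tolerance: int = 3) -> bool:
    """A link counts as alive if a heartbeat arrived within `tolerance` intervals (assumed policy)."""
    return (now - last_heartbeat_ts) <= interval * tolerance


def pick_source(last_seen: Dict[str, float], now: float) -> Optional[str]:
    """Dual-Master case: choose the live source with the most recent heartbeat, if any."""
    alive = [name for name, ts in last_seen.items() if link_alive(ts, now)]
    if not alive:
        return None
    return max(alive, key=lambda name: last_seen[name])
```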
2) Oracle data acquisition
For an Oracle database, data replication is performed through SLT: the data is obtained by creating, through a client, a log table and a trigger in the source system.
Then, according to the business audit scenario, solidifying the data audit model in the memory database, for example:
(1) automatic checking model for centralized auditing
The historical data problem affecting the campaigns, which is generated by cutting over the fixed assets of the operators when the integrated business is established, is uniformly managed by the corporate headquarters. Wherein, normal business behaviors such as renting, maintaining and deducting, operation of non-package oil dryer power generation fees and the like are operated by a branch company; the service is terminated and the account is uniformly and upwardly received to the headquarter for centralized operation; the business change (including tracing) operation carries out classified management according to the operation frequency degree and the influence degree on the camp, different data classification authorizes centralized operation after checking and controlling by different levels of business supporting parts and professional departments. The centralized auditing design is independent of a hierarchical auditing system of the production flow, and the information required by centralized auditing in each system is automatically audited in batches or singly in a real-time mode and the reasons are fed back, so that the manual auditing work is reduced, and the centralized operation auditing efficiency is improved.
(2) Sale order batch binding resource auditing model
When a sales order is prepared and the lease starts, it cannot be determined whether the product in the customer relationship management (CRM) system matches the specifications of the tower and machine room resources at the corresponding station address in the resource system, so fee disputes caused by a wrong sales type can arise. In actual operation, orders cannot be bound to resources in batches, nor can resource compliance be queried. By calling the real-time auditing service for verification, the user is assisted in batch operations and the consistency check is completed, which avoids the series of changes that would follow from discovering the problem later.
(3) During-transit business auditing model for lease-returning order
When a customer rents an order due to station disassembly caused by haircut or other reasons, due to the lack of data interaction between a property system and a sales system, operators cannot timely know whether the order has relevant service and whether the cost is adjusted, and once the customer rents unilaterally, the station address balance is not matched. And synchronizing property system and sales system data in real time by utilizing the auditing model, and checking the on-road service and cost information related to the corresponding station address of the order, so as to remind a customer manager to coordinate and initiate related business transaction or transfer in real time.
(4) Three types of key bill auditing models in purchasing-paying process
When materials are purchased and warehoused, the warehouse entry bill is not compared with the purchase order; when materials are paid for, the settlement bill is not compared with the purchase order; and when engineering projects are paid for, the project state, the payment proportion, and the operator are not verified. This easily creates physical management and payment risks. By calling the real-time auditing service, the consistency of the purchase order and the warehouse entry bill, of the purchase order and the settlement bill, and of the payment proportion is checked in real time, and the operator is reminded in real time to make timely corrections.
And checking the service data through a real-time data auditing model to generate the second data auditing result, feeding back the abnormal data and the auditing result to each service system in real time, and correcting the service data by related management personnel to improve the efficiency and accuracy of auditing work.
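A real-time check of this kind can be sketched with an in-memory database. Python's built-in `sqlite3` module with a `:memory:` database is swapped in here purely for illustration; the embodiment does not name the memory database, and the table names, columns, and functions below are hypothetical. The example audits whether a warehouse entry matches its purchase order.

```python
import sqlite3


def build_memory_db() -> sqlite3.Connection:
    """Create the in-memory tables that receive replicated increments (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchase_order(order_id TEXT PRIMARY KEY, quantity INTEGER)")
    conn.execute("CREATE TABLE warehouse_entry(order_id TEXT PRIMARY KEY, quantity INTEGER)")
    return conn


def apply_increment(conn: sqlite3.Connection, table: str, order_id: str, quantity: int) -> None:
    """Apply one incremental change from a source system to the memory database."""
    if table not in ("purchase_order", "warehouse_entry"):
        raise ValueError(f"unknown table: {table}")
    conn.execute(f"INSERT OR REPLACE INTO {table}(order_id, quantity) VALUES (?, ?)",
                 (order_id, quantity))


def audit_entry(conn: sqlite3.Connection, order_id: str) -> bool:
    """Return True when a warehouse entry exists and matches its purchase order."""
    row = conn.execute(
        "SELECT p.quantity, w.quantity FROM purchase_order p "
        "JOIN warehouse_entry w ON p.order_id = w.order_id WHERE p.order_id = ?",
        (order_id,),
    ).fetchone()
    return row is not None and row[0] == row[1]
```

A `False` result is the point at which a business system would intercept the record and prompt the operator.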
It should be understood that the real-time auditing capability based on stream computing is embedded into each business process. While prompting the interception of abnormal business data, it also counts the interception rate of abnormal business data, forming a warning light for each business point. When the interception rate reaches a set ratio, a warning is issued to the user, which helps business departments find and promptly handle the risks and problems in each link and minimizes the losses caused by abnormal data.
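The interception-rate warning can be sketched as a small counter. The class name `InterceptionMonitor` and the choice to warn when the rate is greater than or equal to the threshold are assumptions; the embodiment only states that a warning is given when the interception rate reaches the ratio.

```python
class InterceptionMonitor:
    """Counts checked and intercepted records for one business point."""

    def __init__(self, threshold: float):
        if not 0 < threshold <= 1:
            raise ValueError("threshold must be in (0, 1]")
        self.threshold = threshold
        self.total = 0
        self.intercepted = 0

    def record(self, is_abnormal: bool) -> bool:
        """Count one checked record; return True if it should be intercepted."""
        self.total += 1
        if is_abnormal:
            self.intercepted += 1
        return is_abnormal

    @property
    def rate(self) -> float:
        """Share of checked records that were intercepted (0.0 before any record)."""
        return self.intercepted / self.total if self.total else 0.0

    def warning(self) -> bool:
        """True when the interception rate has reached the configured ratio."""
        return self.rate >= self.threshold
```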
(3) Offline data auditing
Offline data auditing can be understood as follows: data auditing rules are solidified along six dimensions (integrity, uniqueness, timeliness, validity, accuracy, and consistency), data auditing is completed through timed tasks, and data quality problems in each business system and the data warehouse are discovered and rectified.
When the user creates an offline data auditing rule and starts data auditing, each check method is executed in turn; executing a check method actually means executing its check script. If the queried data exists, the corresponding check result file (DAT file) is generated in the received directory, and an XML file with the same name is generated at the same time as a ready mark indicating that the result file has arrived. All these files are stored in the received directory in a folder named after the check date of the task.
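Writing a check result file together with its ready mark can be sketched as follows. The embodiment does not specify the DAT or XML contents or the date format of the folder name, so the pipe-delimited layout, the `<ready .../>` element, and the `YYYYMMDD` folder name below are assumptions, as is the function name `write_check_result`.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Iterable, Optional


def write_check_result(received_dir: Path, check_name: str,
                       rows: Iterable[dict], check_date: date) -> Optional[Path]:
    """Write <check_name>.DAT plus a same-named XML ready mark; return None if nothing was found."""
    rows = list(rows)
    if not rows:
        return None  # the check script found no data, so no result file is produced
    day_dir = received_dir / check_date.strftime("%Y%m%d")  # folder named after the check date
    day_dir.mkdir(parents=True, exist_ok=True)
    dat_path = day_dir / f"{check_name}.DAT"
    with dat_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()), delimiter="|")
        writer.writeheader()
        writer.writerows(rows)
    # The ready mark is written last, so a collector never sees a half-written DAT file.
    marker = day_dir / f"{check_name}.xml"
    marker.write_text(f'<ready file="{dat_path.name}" rows="{len(rows)}"/>\n', encoding="utf-8")
    return dat_path
```

Writing the marker after the data file is one common way to give a "ready" signal its meaning; whether the embodiment orders the two writes this way is not stated.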
The audit result collection program polls for files as a Quartz job. Because the data volume of the files to be collected differs each time and other factors may interfere, each job run takes a different amount of time, and the previous run has not necessarily ended when a new run starts. To avoid resource contention between different job batches, the system adopts a single-job execution mode: if the previous job has not finished when a job starts, the current job ends automatically, and a new job runs only once the previous job has finished.
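The single-job execution mode can be illustrated independently of the scheduler. The source uses Quartz, a Java scheduler; the sketch below swaps in a plain Python lock to show only the "end immediately if the previous run is still active" behaviour, and the class name `SingleJobRunner` is hypothetical.

```python
import threading
from typing import Callable


class SingleJobRunner:
    """Runs a job on each trigger unless the previous run has not finished yet."""

    def __init__(self, job: Callable[[], None]):
        self._job = job
        self._lock = threading.Lock()

    def trigger(self) -> bool:
        """Return True if the job ran, False if this trigger ended immediately."""
        if not self._lock.acquire(blocking=False):
            return False  # previous run still in progress: this run ends automatically
        try:
            self._job()
            return True
        finally:
            self._lock.release()
```

A polling loop or timer would call `trigger()` at each interval; skipped triggers simply wait for the next tick.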
The audit result collection module collects and stores the audit result files. During collection, a simple summarizing operation is performed on the audit result data, and the detail data and summary data are stored in a result detail table and a summary table respectively. If an exception occurs during this process, a rollback operation is performed on the database tables to avoid incomplete data, and the file is then moved to the Error directory.
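The collect, summarize, and roll-back behaviour can be sketched with Python's built-in `sqlite3` module, whose connection context manager commits on success and rolls back when an exception is raised. The table names `result_detail` and `result_summary`, the two-field pipe-delimited line format, and the function name are assumptions made for this sketch.

```python
import shutil
import sqlite3
from pathlib import Path


def collect_result_file(conn: sqlite3.Connection, dat_path: Path, error_dir: Path) -> bool:
    """Load one result file into the detail and summary tables; on failure roll back and quarantine it."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            count = 0
            for line in dat_path.read_text(encoding="utf-8").splitlines():
                fields = line.split("|")
                if len(fields) != 2:
                    raise ValueError(f"malformed line: {line!r}")
                conn.execute(
                    "INSERT INTO result_detail(check_name, record_id) VALUES (?, ?)", fields)
                count += 1
            # Simple summarizing operation: one row per file with its issue count.
            conn.execute(
                "INSERT INTO result_summary(file_name, issue_count) VALUES (?, ?)",
                (dat_path.name, count))
        return True
    except (ValueError, sqlite3.Error):
        # Detail rows inserted before the failure were rolled back above.
        error_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(dat_path), str(error_dir / dat_path.name))
        return False
```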
It should be noted that, the user may define auditing rules according to auditing categories for different service entities, where each auditing rule simply describes quality problems from the perspective of the service entity, and includes one or more information items, where the information items are objects to be audited specifically by each service entity, so that a specific auditing method is defined according to different information items on the basis of each auditing rule.
In this embodiment, according to the different types of the data flow graphs, the corresponding auditing target in the data flow graphs and the preset auditing rule corresponding to the auditing target are determined.
Optionally, the first data audit result includes a first data audit result or a second data audit result;
under the condition that the data flow graph is the system library table structure graph and the metadata information is subjected to data auditing, collecting the total metadata in the business system matching database and carrying out consistency judgment on the total metadata to obtain a first data auditing result, wherein the method comprises the following steps of:
consistency judgment is carried out on the full-quantity metadata, and under the condition that the full-quantity metadata meets a first preset condition, a first data audit result is generated, wherein the first data audit result indicates that the full-quantity metadata does not contain abnormal data;
generating a second data audit result under the condition that the full-volume metadata does not meet a first preset condition, wherein the second data audit result represents that the full-volume metadata contains abnormal data;
the first preset condition indicates that a target attribute of the database table corresponding to the full-volume metadata is matched with a preset target attribute, and the target attribute comprises at least one of the following: database table structure, field type, and field length.
In this embodiment, consistency auditing is adopted to audit the consistency of key attributes of objects (database tables) maintained by the service system, such as whether the table structures are consistent, whether the field types are consistent, whether the field lengths are consistent, and other attribute judgment.
It should be noted that the relevant personnel can rectify the service system according to the second data audit result and track and correct the abnormal data, so as to ensure that the data standard is effectively implemented.
Optionally, after the data auditing process is performed on the auditing target according to the preset auditing rule, and the data auditing result is generated, the method further includes:
acquiring third operation information, wherein the third operation information represents that a user modifies auditing rules matched with auditing nodes in the data flow graph;
and updating the data flow diagram according to the third operation information.
In this embodiment, the third operation information of the user is obtained. The third operation information may be understood as a modification that the user makes to a data flow graph in the data flow graph library. When the operating environment or the data of the service system changes, the auditing rule needs to be replaced or modified; this method improves the stability of data auditing for the service system.
It should be noted that, if the service flow in the service system changes, the user may modify the basic element corresponding to the service entity in the data flow diagram through the third operation information, which is not limited in this embodiment of the present invention.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a data auditing device according to an embodiment of the present invention, and as shown in fig. 7, a data auditing device 700 includes:
a first obtaining module 701, configured to obtain a data flow graph in a data flow graph library, where the data flow graph includes at least one of the following: a system data flow graph, a system library table structure graph and a system deployment graph, wherein the system data flow graph comprises an intersystem data flow graph or a service line data flow graph, the system library table structure graph represents a data table of at least one service system matching database, and the system deployment graph represents a hardware and software deployment graph corresponding to at least one service system;
a determining module 702, configured to determine an audit target in the at least one service system according to the data flow graph, and determine a preset audit rule matching the audit target;
the processing module 703 is configured to perform data auditing processing on the auditing target according to the preset auditing rule, so as to obtain a data auditing result, where the data auditing result is used to display abnormal data and/or analysis results in the data flow graph.
Optionally, the apparatus 700 further comprises:
the second acquisition module is used for acquiring first operation information, wherein the first operation information represents the selection of a base element by a user and the selection of a set position of the base element;
the third acquisition module is used for acquiring second operation information based on the first operation information, wherein the second operation information is used for adding elements corresponding to auditing nodes between the basic elements, and the auditing nodes are used for configuring auditing rules;
the creation module is used for creating a first data flow graph according to the first operation information and the second operation information and adding the first data flow graph into the data flow graph library;
wherein the base element is at least one of: nodes representing entities in a business system, edges representing associations between multiple nodes, and groups representing sets of similar nodes.
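The graph creation described above can be pictured with a minimal Python sketch. All names here (`FlowGraph`, `insert_audit_node`, the `library` dictionary, the example systems and rule text) are assumptions introduced for illustration; the embodiment does not prescribe a concrete data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class FlowGraph:
    """Hypothetical in-memory form of a data flow graph."""
    nodes: Set[str] = field(default_factory=set)                # entities in a business system
    edges: List[Tuple[str, str]] = field(default_factory=list)  # associations between nodes
    groups: Dict[str, Set[str]] = field(default_factory=dict)   # sets of similar nodes
    audit_rules: Dict[str, str] = field(default_factory=dict)   # auditing node id -> rule

    def add_node(self, name: str, group: Optional[str] = None) -> None:
        """First operation information: place a base element (node, optionally grouped)."""
        self.nodes.add(name)
        if group is not None:
            self.groups.setdefault(group, set()).add(name)

    def add_edge(self, src: str, dst: str) -> None:
        """First operation information: associate two existing nodes."""
        if src not in self.nodes or dst not in self.nodes:
            raise ValueError("both endpoints must be existing nodes")
        self.edges.append((src, dst))

    def insert_audit_node(self, src: str, dst: str, audit_id: str, rule: str) -> None:
        """Second operation information: add an auditing node between two base elements."""
        index = self.edges.index((src, dst))  # raises ValueError if src and dst are not connected
        self.edges[index:index + 1] = [(src, audit_id), (audit_id, dst)]
        self.audit_rules[audit_id] = rule


library: Dict[str, FlowGraph] = {}  # stands in for the data flow graph library

graph = FlowGraph()
graph.add_node("order_system", group="business_systems")
graph.add_node("billing_system", group="business_systems")
graph.add_edge("order_system", "billing_system")
graph.insert_audit_node("order_system", "billing_system", "audit_1", "order exists in billing")
library["first_data_flow_graph"] = graph
```

Keeping auditing nodes in a separate mapping from business-entity nodes is a design choice of this sketch; it makes it easy to list every configured rule without walking the graph.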
Optionally, the determining module 702 includes:
the first determining unit is used for setting a first business entity in a first target business system as the auditing target under the condition that the data flow graph is the intersystem data flow graph; setting an auditing rule configured by a user at the first business entity as the preset auditing rule, wherein the preset auditing rule is used for judging the existence of the business entity;
the second determining unit is used for setting the attribute of a second service entity in a second target service system as the auditing target under the condition that the data flow graph is the service line data flow graph; setting an auditing rule configured by a user at the second service entity as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute value of the second service entity;
the third determining unit is used for setting metadata information in a third target service system matching database as the auditing target under the condition that the data flow graph is the system base table structure graph; setting a database auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute of the metadata information;
a fourth determining unit, configured to set a hardware and software architecture corresponding to a fourth target service system as the audit target when the data flow graph is the system deployment graph; setting a deployment auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for judging the execution conditions of the software and hardware;
the first target service system, the second target service system, the third target service system and the fourth target service system are service systems in the at least one service system, the first service entity is any service entity in the first target service system, and the second service entity is any service entity in the second target service system.
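The four determining units amount to a selection keyed on the type of the data flow graph. A minimal Python sketch of that selection follows; the enum, the table name `AUDIT_PLAN`, and the short descriptive strings are hypothetical labels that summarize the text above rather than identifiers defined by the embodiment.

```python
from enum import Enum
from typing import Dict, Tuple


class GraphType(Enum):
    INTERSYSTEM = "intersystem data flow graph"
    SERVICE_LINE = "service line data flow graph"
    LIBRARY_TABLE = "system library table structure graph"
    DEPLOYMENT = "system deployment graph"


# Graph type -> (auditing target, purpose of the preset auditing rule).
AUDIT_PLAN: Dict[GraphType, Tuple[str, str]] = {
    GraphType.INTERSYSTEM: ("business entity", "existence judgment"),
    GraphType.SERVICE_LINE: ("attribute of a business entity", "attribute value consistency judgment"),
    GraphType.LIBRARY_TABLE: ("database metadata information", "metadata attribute consistency judgment"),
    GraphType.DEPLOYMENT: ("hardware and software architecture", "software and hardware execution judgment"),
}


def determine_audit(graph_type: GraphType) -> Tuple[str, str]:
    """Return the auditing target and rule purpose for one graph type."""
    return AUDIT_PLAN[graph_type]


target, purpose = determine_audit(GraphType.LIBRARY_TABLE)
```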
Optionally, the processing module 703 includes:
the first processing unit is used for collecting the full metadata in the business system matching database and carrying out consistency judgment on the full metadata under the condition that the data flow graph is the system library table structure graph and the metadata information is subjected to data auditing processing, so as to obtain a first data auditing result, wherein the first data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
the second processing unit is used for, under the condition that the data flow graph is the service line data flow graph and data auditing processing is carried out on real-time data corresponding to the second service entity, collecting incremental data among a plurality of service systems, copying the incremental data to an in-memory database, and carrying out data auditing processing on the incremental data in the in-memory database by adopting a preset auditing model to obtain a second data auditing result, wherein the second data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
the third processing unit is used for, under the condition that the data flow graph is the intersystem data flow graph or the service line data flow graph and data auditing processing is carried out on offline data, determining an auditing script according to the auditing rule, wherein the auditing script corresponds to the auditing rule, querying target data in the database matched with the service system according to the auditing script, generating an auditing result file, and generating a third data auditing result according to the auditing result file, wherein the third data auditing result is used for displaying abnormal data or analysis results in the data flow graph.
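The real-time path handled by the second processing unit (incremental data copied to an in-memory database and audited there) can be sketched with Python's standard `sqlite3` module. The embodiment does not specify the preset auditing model; the SQL comparison below, the table names `sys_a` and `sys_b`, and the sample rows are assumptions chosen only to show an attribute value consistency check between two service systems.

```python
import sqlite3
from typing import List, Tuple

Row = Tuple[str, str]  # (entity id, attribute value); a hypothetical record layout


def audit_increment(rows_a: List[Row], rows_b: List[Row]) -> List[str]:
    """Copy two increments into an in-memory database and return inconsistent entity ids."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sys_a (entity_id TEXT PRIMARY KEY, value TEXT)")
        conn.execute("CREATE TABLE sys_b (entity_id TEXT PRIMARY KEY, value TEXT)")
        conn.executemany("INSERT INTO sys_a VALUES (?, ?)", rows_a)
        conn.executemany("INSERT INTO sys_b VALUES (?, ?)", rows_b)
        # Assumed auditing model: an entity is abnormal if it is missing
        # downstream or its attribute value differs between the two systems.
        cursor = conn.execute(
            "SELECT a.entity_id FROM sys_a AS a "
            "LEFT JOIN sys_b AS b ON a.entity_id = b.entity_id "
            "WHERE b.entity_id IS NULL OR a.value <> b.value "
            "ORDER BY a.entity_id"
        )
        return [row[0] for row in cursor]
    finally:
        conn.close()


abnormal = audit_increment(
    [("t1", "10"), ("t2", "20"), ("t3", "30")],
    [("t1", "10"), ("t2", "25")],
)
```

Here `t2` is reported because its value differs and `t3` because it never reached the second system.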
Optionally, the first data auditing result includes a first data audit result or a second data audit result;
the first processing unit includes:
consistency judgment is carried out on the full-quantity metadata, and under the condition that the full-quantity metadata meets a first preset condition, a first data audit result is generated, wherein the first data audit result indicates that the full-quantity metadata does not contain abnormal data;
generating a second data audit result under the condition that the full-volume metadata does not meet a first preset condition, wherein the second data audit result represents that the full-volume metadata contains abnormal data;
the first preset condition indicates that a target attribute of the database table corresponding to the full-volume metadata is matched with a preset target attribute, and the target attribute comprises at least one of the following: database table structure, field type, and field length.
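The consistency judgment on the full metadata can be sketched as a comparison between collected metadata and a preset expectation. The nested dictionary layout, the function name `audit_metadata`, and the sample table are hypothetical; an empty result corresponds to the first preset condition being met, and a non-empty result to abnormal data being present.

```python
from typing import Dict, List, Tuple

# table name -> field name -> (field type, field length); an assumed layout
Metadata = Dict[str, Dict[str, Tuple[str, int]]]


def audit_metadata(collected: Metadata, expected: Metadata) -> List[str]:
    """Return a list of anomalies; an empty list means the metadata is consistent."""
    anomalies: List[str] = []
    for table, expected_fields in expected.items():
        collected_fields = collected.get(table)
        if collected_fields is None:
            anomalies.append(f"{table}: table missing")
            continue
        # Database table structure: the set of fields must match.
        if set(collected_fields) != set(expected_fields):
            anomalies.append(f"{table}: table structure differs")
        for name, (exp_type, exp_len) in expected_fields.items():
            if name not in collected_fields:
                continue
            got_type, got_len = collected_fields[name]
            if got_type != exp_type:
                anomalies.append(f"{table}.{name}: field type {got_type} != {exp_type}")
            if got_len != exp_len:
                anomalies.append(f"{table}.{name}: field length {got_len} != {exp_len}")
    return anomalies


expected = {"site": {"id": ("varchar", 32), "name": ("varchar", 64)}}
collected = {"site": {"id": ("varchar", 32), "name": ("varchar", 128)}}
result = audit_metadata(collected, expected)
```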
Optionally, the apparatus 700 further comprises:
the fourth acquisition module is used for acquiring third operation information, and the third operation information represents that a user modifies an auditing rule matched with the auditing node in the data flow graph;
and the updating module is used for updating the data flow graph according to the third operation information.
According to an embodiment of the invention, the invention further provides an electronic device and a readable storage medium.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in FIG. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. Various programs and data required for the operation of the device 800 can also be stored in the RAM 803. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in device 800 are connected to I/O interface 805, including: an input unit 806 such as a keyboard, mouse, etc.; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, or the like. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital signal processors (DSPs), and any suitable processors, controllers, microcontrollers, etc. The computing unit 801 performs the various methods and processes described above, such as the data auditing method.
Various implementations of the systems and techniques described above can be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present invention may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: display means for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks, wide area networks, and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of auditing data, comprising:
acquiring a data flow graph in a data flow graph library, wherein the data flow graph comprises at least one of the following: a system data flow graph, a system library table structure graph and a system deployment graph, wherein the system data flow graph comprises an intersystem data flow graph or a service line data flow graph, the system library table structure graph represents a data table of at least one service system matching database, and the system deployment graph represents a hardware and software deployment graph corresponding to at least one service system;
determining an audit target in the at least one business system according to the data flow graph, and determining a preset audit rule matched with the audit target;
and carrying out data auditing treatment on the auditing target according to the preset auditing rule to obtain a data auditing result, wherein the data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph.
2. The data auditing method of claim 1, wherein prior to said obtaining at least one business system matched data flow graph, the method further comprises:
acquiring first operation information, wherein the first operation information represents the selection of a base element by a user and the selection of a set position of the base element;
acquiring second operation information based on the first operation information, wherein the second operation information is used for adding elements corresponding to auditing nodes between the basic elements, and the auditing nodes are used for configuring auditing rules;
creating a first data flow graph according to the first operation information and the second operation information, and adding the first data flow graph into the data flow graph library;
wherein the base element is at least one of: nodes representing entities in a business system, edges representing associations between multiple nodes, and groups representing sets of similar nodes.
3. The data auditing method according to claim 2, wherein the determining an auditing target according to the data flow graph and a preset auditing rule matching the auditing target include at least one of:
setting a first business entity in a first target business system as the auditing target under the condition that the data flow graph is the intersystem data flow graph; setting an auditing rule configured by a user at the first business entity as the preset auditing rule, wherein the preset auditing rule is used for judging the existence of the business entity;
setting the attribute of a second service entity in a second target service system as the auditing target under the condition that the data flow graph is the service line data flow graph; setting an auditing rule configured by a user at the second service entity as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute value of the second service entity;
setting metadata information in a third target service system matching database as the auditing target under the condition that the data flow graph is the system library table structure graph; setting a database auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for carrying out consistency judgment processing on the attribute of the metadata information;
setting a hardware and software architecture corresponding to a fourth target service system as the auditing target under the condition that the data flow graph is the system deployment graph; setting a deployment auditing rule set by a user as the preset auditing rule, wherein the preset auditing rule is used for judging the execution conditions of the software and hardware;
the first target service system, the second target service system, the third target service system and the fourth target service system are service systems in the at least one service system, the first service entity is any service entity in the first target service system, and the second service entity is any service entity in the second target service system.
4. The method for auditing data according to claim 3, wherein the data auditing process is performed on the auditing target according to the preset auditing rule to generate a data auditing result, and the data auditing result includes at least one of the following:
under the condition that the data flow graph is the system library table structure graph and the metadata information is subjected to data auditing, collecting the full metadata in the business system matching database, and carrying out consistency judgment on the full metadata to obtain a first data auditing result, wherein the first data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
under the condition that the data flow graph is the service line data flow graph and the real-time data corresponding to the second service entity is subjected to data auditing processing, incremental data among a plurality of service systems are collected, the incremental data are copied to a memory database, the incremental data in the memory database are subjected to data auditing processing by adopting a preset auditing model, so that a second data auditing result is obtained, and the second data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph;
and under the condition that the data flow graph is the intersystem data flow graph or the service line data flow graph and the offline data is subjected to data auditing, determining an auditing script according to the auditing rule, wherein the auditing script corresponds to the auditing rule, querying target data in the business system matching database according to the auditing script, generating an auditing result file, and generating a third data auditing result according to the auditing result file, wherein the third data auditing result is used for displaying abnormal data or analysis results in the data flow graph.
5. The data auditing method of claim 4, wherein the first data auditing result includes a first data audit result or a second data audit result;
under the condition that the data flow graph is the system library table structure graph and the metadata information is subjected to data auditing, collecting the total metadata in the business system matching database and carrying out consistency judgment on the total metadata to obtain a first data auditing result, wherein the method comprises the following steps of:
consistency judgment is carried out on the full-quantity metadata, and under the condition that the full-quantity metadata meets a first preset condition, a first data audit result is generated, wherein the first data audit result indicates that the full-quantity metadata does not contain abnormal data;
generating a second data audit result under the condition that the full-volume metadata does not meet a first preset condition, wherein the second data audit result represents that the full-volume metadata contains abnormal data;
the first preset condition indicates that a target attribute of the database table corresponding to the full-volume metadata is matched with a preset target attribute, and the target attribute comprises at least one of the following: database table structure, field type, and field length.
6. The method according to any one of claims 1 to 5, wherein after the data auditing process is performed on the auditing target according to the preset auditing rule, the method further comprises:
acquiring third operation information, wherein the third operation information represents that a user modifies auditing rules matched with auditing nodes in the data flow graph;
and updating the data flow graph according to the third operation information.
7. A data auditing apparatus, comprising:
the first acquisition module is used for acquiring a data flow graph in the data flow graph library, and the data flow graph comprises at least one of the following: a system data flow graph, a system library table structure graph and a system deployment graph, wherein the system data flow graph comprises an intersystem data flow graph or a service line data flow graph, the system library table structure graph represents a data table of at least one service system matching database, and the system deployment graph represents a hardware and software deployment graph corresponding to at least one service system;
the determining module is used for determining an auditing target in the at least one business system according to the data flow graph and determining a preset auditing rule matched with the auditing target;
the processing module is used for conducting data auditing processing on the auditing target according to the preset auditing rule to obtain a data auditing result, and the data auditing result is used for displaying abnormal data and/or analysis results in the data flow graph.
8. The data auditing device of claim 7, further comprising:
the second acquisition module is used for acquiring first operation information, wherein the first operation information represents the selection of a base element by a user and the selection of a set position of the base element;
the third acquisition module is used for acquiring second operation information based on the first operation information, wherein the second operation information is used for adding elements corresponding to auditing nodes between the basic elements, and the auditing nodes are used for configuring auditing rules;
the creation module is used for creating a first data flow graph according to the first operation information and the second operation information and adding the first data flow graph into the data flow graph library;
wherein the base element is at least one of: nodes representing entities in a business system, edges representing associations between multiple nodes, and groups representing sets of similar nodes.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data auditing method of any of claims 1 to 6.
10. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the data auditing method according to any one of claims 1 to 6.
CN202310140103.7A 2023-02-13 2023-02-13 Data auditing method and device, electronic equipment and storage medium Pending CN116303385A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310140103.7A CN116303385A (en) 2023-02-13 2023-02-13 Data auditing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116303385A true CN116303385A (en) 2023-06-23

Family

ID=86802318

Country Status (1)

Country Link
CN (1) CN116303385A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116701381A (en) * 2023-08-03 2023-09-05 南京莫愁智慧信息科技有限公司 Multistage verification system and method for distributed data acquisition and warehousing
CN116701381B (en) * 2023-08-03 2023-11-03 南京莫愁智慧信息科技有限公司 Multistage verification system and method for distributed data acquisition and warehousing
CN117312314A (en) * 2023-09-26 2023-12-29 广州加之科技有限公司 Comprehensive auditing management method, device, terminal and medium for hospital business data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination