CN112241443B - Data quality monitoring method, device, computing equipment and computer storage medium - Google Patents

Data quality monitoring method, device, computing equipment and computer storage medium

Info

Publication number
CN112241443B
Authority
CN
China
Prior art keywords
data
node
flow direction
association
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910642686.7A
Other languages
Chinese (zh)
Other versions
CN112241443A (en)
Inventor
金崇超
孙新华
刘坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201910642686.7A priority Critical patent/CN112241443B/en
Publication of CN112241443A publication Critical patent/CN112241443A/en
Application granted granted Critical
Publication of CN112241443B publication Critical patent/CN112241443B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • General Factory Administration (AREA)

Abstract

The embodiment of the invention relates to the technical field of quality monitoring and discloses a data quality monitoring method, a device, a computing device and a computer storage medium. The method comprises: acquiring a data flow map among the data nodes; acquiring real-time business data in a production system; acquiring the data flow characteristics of each data node in the business data according to the data flow map; and comparing the data flow characteristics of each data node in the business data with the data flow map for consistency, outputting an abnormal data detection result, and locating the node at which the abnormal data occurs. In this way, the embodiment of the invention uses the association relationships among the data of the production system to automatically sort out the data flow patterns under different scenarios and services, monitors data quality across all services, all scenarios and the whole process, identifies the link at which a data anomaly occurs, improves data quality monitoring efficiency, and reduces labor cost.

Description

Data quality monitoring method, device, computing equipment and computer storage medium
Technical Field
The embodiment of the invention relates to the technical field of quality monitoring, in particular to a data quality monitoring method, a data quality monitoring device, computing equipment and a computer storage medium.
Background
With the business development of the telecommunications market, strong market competition and technology have driven operators to shift gradually from extensive operation, which attracts users through price wars, to refined, customer-centered operation that meets diversified customer demands through business innovation. In this process, the variety of service types and the flexibility of tariff packages make the business support system increasingly complex, and the risk of data quality problems caused by data anomalies during production inevitably increases. Once a data quality problem occurs in the production system, it can affect the normal handling of business as well as the results of downstream data work and processing.
For data quality problems, the existing approach is to perform manual audit comparison at regular intervals, which can be roughly divided into three steps: 1) establishing data audit points, set up separately according to different service conditions and experience; 2) manually sorting out the audit flow, that is, using the system interaction panorama architecture to manually sort out the business scenario flow related to each audit point and the data tables it passes through; 3) periodic checking, in which personnel are assigned to perform data quality comparison audits on each audit point step by step according to the audit flow.
In implementing embodiments of the present invention, the inventors found the following. To meet diversified customer demands and adapt to a continuously changing market environment, operators need to develop innovative businesses in a timely manner. Under the traditional data quality audit comparison method, the audit points and audit flows must therefore be updated continuously and synchronously, and the sorting out of business flows, data flows and the like is done entirely by hand, which is time-consuming, labor-intensive and extremely inefficient. In addition, where businesses and scenarios interact, the production environment produces complex and changeable derived data; manually established audit points often cover the data quality of only some businesses and scenarios, so the audit coverage is limited. Given the complicated processes of different businesses and scenarios, the traditional manual audit comparison focuses only on end-to-end data quality problems; it cannot locate the specific intermediate link of the business at which a problem arises, and cannot monitor the link at which a data quality problem occurs.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention provide a data quality monitoring method, apparatus, computing device, and computer storage medium, which overcome or at least partially solve the foregoing problems.
According to an aspect of an embodiment of the present invention, there is provided a data quality monitoring method, the method including: acquiring a data flow map among all data nodes; acquiring real-time business data in a production system; acquiring the data flow characteristics of each data node in the service data according to the data flow map; and carrying out consistency comparison on the data flow characteristics of each data node in the business data and the data flow map, outputting an abnormal data detection result and positioning an abnormal occurrence node of the abnormal data.
In an optional manner, the acquiring a data flow map between the data nodes includes: carrying out data characteristic analysis on historical service data in a production system to obtain a dependency relationship among data nodes; and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In an optional manner, the performing data feature analysis on the historical service data to obtain a dependency relationship between data nodes includes: acquiring historical service data in a production system, and establishing a training data node table; collecting field characteristics from the data node table, and acquiring general field combinations for each data node; for any two data nodes, obtaining the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations; and judging the dependency relationship between the two data nodes in the optimal field combination according to the time field in the training data node table.
In an optional manner, the obtaining, for any two data nodes, the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations includes: extracting a preset number of business data records in the general field combination of one data node, matching them against the general field combination of the other data node, and counting the retention rate and the expansion rate; taking the field combination with the highest retention rate; when several field combinations share the highest retention rate, eliminating those with an abnormal expansion rate to form special field combinations; splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination; and iterating over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
In an optional manner, the obtaining the optimal flow direction between the data nodes according to the dependency relationship between the data nodes to form a data flow map includes: acquiring historical service data in a production system, and establishing a service data temporary table; selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes; judging a basic association flow direction according to the initial table and the time field of the association table; judging the internal association flow direction of association tables of any two data nodes in the association table set; and obtaining an optimal flow direction according to the basic association flow direction and the internal association flow direction, and forming a data flow map.
In an optional manner, the obtaining the optimal flow direction according to the basic association flow direction and the internal association flow direction to form a data flow map includes: if the first basic association flow direction can be realized through the second basic association flow direction and the internal association flow direction, reserving the second basic association flow direction and the internal association flow direction, and deleting the first basic association flow direction; traversing the basic association flow direction and the internal association flow direction, and finally reserving the basic association flow direction and the internal association flow direction as optimal flow directions; and forming a data flow map according to the optimal flow direction among the data nodes.
In an optional manner, the data flow characteristics include a start node data volume and an end node data volume; the comparing the data flow characteristics of each data node in the business data with the data flow map for consistency, outputting an abnormal data detection result and locating the node at which the abnormal data occurs includes: comparing, for each flow direction in the data flow map, the node data volumes of the data nodes in the business data, calculating the missing data volume, marking the data node position, and outputting a data missing level list and a data missing detail list; and outputting the abnormal data detection result according to the data missing level list and the data missing detail list, and locating the node at which the abnormal data occurs.
According to another aspect of an embodiment of the present invention, there is provided a data quality monitoring apparatus, the apparatus including: the flow acquisition unit is used for acquiring a data flow map among the data nodes; the data acquisition unit is used for acquiring real-time service data in the production system; the feature extraction unit is used for acquiring the data flow features of each data node in the service data according to the data flow map; and the anomaly detection unit is used for carrying out consistency comparison on the data flow characteristics of each data node in the business data and the data flow map, outputting an anomaly data detection result and positioning an anomaly occurrence node of the anomaly data.
According to another aspect of an embodiment of the present invention, there is provided a computing device including: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the steps of the data quality monitoring method.
According to yet another aspect of the embodiments of the present invention, there is provided a computer storage medium having stored therein at least one executable instruction for causing the processor to perform the steps of the above-described data quality monitoring method.
According to the embodiment of the invention, a data flow map among the data nodes is acquired; real-time business data in a production system is acquired; the data flow characteristics of each data node in the business data are acquired according to the data flow map; and the data flow characteristics of each data node in the business data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurs is located. By using the association relationships among the data of the production system, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services and the whole process, and the link at which a data anomaly occurs is identified, which improves data quality monitoring efficiency and reduces labor cost.
The foregoing description is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the specification, specific embodiments of the present invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
Fig. 1 shows a flow chart of a data quality monitoring method according to an embodiment of the present invention;
Fig. 2 shows a flow chart of another data quality monitoring method according to an embodiment of the present invention;
Fig. 3 shows a schematic flow chart of another data quality monitoring method according to an embodiment of the present invention;
Fig. 4 shows a schematic diagram of the retention rate and expansion rate of a data quality monitoring method according to an embodiment of the present invention;
Fig. 5 shows a schematic diagram of obtaining the optimal flow direction in a data quality monitoring method according to an embodiment of the present invention;
Fig. 6 shows a schematic structural diagram of a data quality monitoring device according to an embodiment of the present invention;
Fig. 7 shows a schematic structural diagram of a computing device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 shows a flow chart of a data quality monitoring method according to an embodiment of the present invention. As shown in fig. 1, the data quality monitoring method includes:
Step S11: acquiring a data flow map among the data nodes.
In the embodiment of the invention, a pre-trained data flow map among the data nodes can be acquired; the data flow map can also be obtained by training on historical business data in the production system.
When the data flow map among the data nodes is obtained by training on historical business data in the production system, as shown in Fig. 2, step S11 includes:
Step S111: performing data feature analysis on the historical business data in the production system to obtain the dependency relationships among the data nodes.
In the embodiment of the invention, the historical business data in the production system is used to create a training node set table, a feature framework including the data expansion rate and retention rate is constructed, the optimal field combination of each data node is obtained according to the expansion rate and retention rate, and the dependency relationships among the data nodes are obtained according to the explicit and implicit relationships among the node tables.
As shown in Fig. 3, step S111 includes the following steps:
Step S115: acquiring historical business data in the production system and establishing a training data node table.
Historical business data for the training node tables in the production system is acquired, and corresponding temporary training node tables are created for subsequent training, thereby establishing a training data set and creating the training data node table.
Step S116: collecting field characteristics from the data node table and acquiring a plurality of general field combinations for each data node.
Specifically, the fields of character type or numeric type of each data node in the training data node table are extracted and analyzed to obtain a field analysis result; tag fields are extracted from the field analysis result, and fields whose null values exceed a threshold are removed, forming a plurality of general field combinations for each data node.
For the created training data node table, the fields of character type or numeric type are extracted, and the field characteristics of the relevant fields of each data table are counted in turn, such as the record count, the deduplicated record count, the null record count, the value length that occurs most often, and the number of records with that length, forming the data field analysis result table shown in Table 1.
Table 1 Data field analysis result table
Field name | Logical name | Data type
NODE_ID | Node ID | NUMBER(10)
NODE_CODE | Node code | VARCHAR2(30)
COLUMN_NAME | Field code | VARCHAR2(30)
COLUMN_TYPE | Field type | VARCHAR2(30)
ALL_CNT | Node record count | NUMBER(20)
COL_CNT | Record count of the field after deduplication | NUMBER(20)
NULL_CNT | Null record count of the field | NUMBER(20)
MAX_LENGTH | Value length with the most records for the field | NUMBER(4)
MAX_LENGTH_CNT | Number of records with that most frequent length | NUMBER(20)
According to the counted field characteristics, the fields of the suspected tag set are extracted for analyzing field diversity, and the data feature field results are written. The field extraction rules of the suspected tag set are as follows:
Fields with excessive null values are then eliminated according to preset rules, and field feature screening is performed to form general field combinations that satisfy the conditions; preferably, fields whose null values exceed a preset value are removed to form the general field combinations of each data node.
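The field statistics of Table 1 and the null-value screening can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `profile_field` and `general_fields`, the in-memory list-of-dicts representation of a node table, the treatment of empty strings as null, counting only non-null values in COL_CNT, and the `max_null_ratio` threshold of 0.5 are all assumptions made for illustration.

```python
from collections import Counter


def profile_field(rows, column):
    """Compute Table 1 style statistics for one field of a node table.

    rows is a list of dicts (one dict per record); this layout is assumed.
    """
    values = [row.get(column) for row in rows]
    # Assumption: None and the empty string both count as null.
    non_null = [v for v in values if v is not None and v != ""]
    lengths = Counter(len(str(v)) for v in non_null)
    max_length, max_length_cnt = lengths.most_common(1)[0] if lengths else (0, 0)
    return {
        "COLUMN_NAME": column,
        "ALL_CNT": len(values),                  # node record count
        "COL_CNT": len(set(non_null)),           # distinct non-null values (assumed)
        "NULL_CNT": len(values) - len(non_null), # null record count
        "MAX_LENGTH": max_length,                # most frequent value length
        "MAX_LENGTH_CNT": max_length_cnt,        # records with that length
    }


def general_fields(rows, columns, max_null_ratio=0.5):
    """Keep only fields whose share of null values does not exceed the threshold."""
    kept = []
    for column in columns:
        stats = profile_field(rows, column)
        if stats["ALL_CNT"] and stats["NULL_CNT"] / stats["ALL_CNT"] <= max_null_ratio:
            kept.append(column)
    return kept


rows = [
    {"id": "a1", "tag": None},
    {"id": "a2", "tag": None},
    {"id": "b22", "tag": "x"},
]
print(profile_field(rows, "id"))
print(general_fields(rows, ["id", "tag"]))
```

In a production system these counts would more likely be computed by SQL aggregates over the node tables; the sketch only shows which quantities are involved.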
Step S117: for any two data nodes, obtaining the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations.
Specifically, a preset number of business data records in the general field combination of one data node are extracted and matched against the general field combination of another data node, and the retention rate and expansion rate are counted; the field combination with the highest retention rate is taken; when several field combinations share the highest retention rate, those with an abnormal expansion rate are removed to form special field combinations. In the embodiment of the invention, as shown in Fig. 4, taking any two associated data nodes A and B as an example, 10000 records in the general field combination of data node A are extracted to form table A, and the general field combination of data node B forms the full table B. The field combinations are matched against the associated full table B, and the data retention rate and expansion rate are counted in turn. The retention rate is the number of records in table A that intersect the full table B, divided by 10000; the expansion rate is the number of records in the full table B that intersect table A, divided by the number of records in table A that intersect the full table B. For example, referring to Fig. 4, table A includes an α field, table B includes a β field and a γ field, and the two tables are associated by the field combination α=β, resulting in q=2, p=3. If the retention rate of a field combination between the associated tables meets the threshold, the field combination with the highest retention rate is taken; otherwise the field combination is removed. If several field combinations meet the above conditions, those with an abnormal expansion rate, such as an expansion rate >100, are eliminated.
After a special field combination is obtained, it is spliced with the general field combination of the other data node to form a new inter-table field combination; the inter-table field combination and the special field combination are iterated repeatedly until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
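The retention rate and expansion rate defined above can be sketched in Python as follows. This is an illustrative sketch only: the function names, the list-of-dicts table layout, and the `min_retention` threshold of 0.9 are assumptions; the expansion-rate cutoff of 100 is the example value given in the text.

```python
from collections import Counter


def retention_and_expansion(sample_a, full_b, fields_a, fields_b):
    """Match a sample of table A against the full table B on a field combination.

    retention = records of A with a match in B / size of the sample
    expansion = records of B with a match in A / records of A with a match in B
    """
    keys_b = Counter(tuple(row[f] for f in fields_b) for row in full_b)
    keys_a = [tuple(row[f] for f in fields_a) for row in sample_a]
    matched_a = sum(1 for key in keys_a if key in keys_b)
    matched_b = sum(keys_b[key] for key in set(keys_a) if key in keys_b)
    retention = matched_a / len(sample_a) if sample_a else 0.0
    expansion = matched_b / matched_a if matched_a else 0.0
    return retention, expansion


def pick_special_combinations(candidates, min_retention=0.9, max_expansion=100):
    """candidates maps a field-combination name to (retention, expansion).

    Keep the combinations with the highest retention that meets the threshold;
    when several tie, drop those with an abnormal expansion rate.
    """
    eligible = {name: v for name, v in candidates.items() if v[0] >= min_retention}
    if not eligible:
        return []
    best = max(retention for retention, _ in eligible.values())
    top = [name for name, (retention, _) in eligible.items() if retention == best]
    if len(top) > 1:
        top = [name for name in top if eligible[name][1] <= max_expansion]
    return top


table_a = [{"alpha": 1}, {"alpha": 2}, {"alpha": 3}]
table_b = [
    {"beta": 1, "gamma": "x"},
    {"beta": 1, "gamma": "y"},
    {"beta": 2, "gamma": "z"},
]
retention, expansion = retention_and_expansion(table_a, table_b, ["alpha"], ["beta"])
print(retention, expansion)
print(pick_special_combinations({"c1": (0.95, 1.5), "c2": (0.95, 250.0), "c3": (0.5, 1.0)}))
```

The toy tables here are invented for the sketch and are not the data of Fig. 4. The iterative splicing of special and general field combinations is not shown; it would call these two functions repeatedly until no new special combination appears.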
Step S118: judging the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
Whether the relationship between associated tables is explicit or implicit is judged according to the time fields in the training data node tables: if a time-sequence relationship exists between the time fields of the tables, the relationship is explicit; otherwise it is implicit. The dependency relationships among the data nodes are output to form the association table of dependency relationships among the data nodes shown in Table 2.
Table 2 association table of dependency relationships between data nodes
Step S112: obtaining the optimal flow direction among the data nodes according to the dependency relationships among the data nodes to form a data flow map.
Basic and internal association flow directions are formed on the basis of the dependency relationships among the data nodes, the data feature dependency model analysis is completed, and the data flow map is finally established automatically.
In the embodiment of the invention, the historical business data in the production system is acquired and a business data temporary table is established. Specifically, the business data of the data nodes is acquired according to the configuration table, and a new related business data temporary table is created from the training data node table according to the business criteria of the training business entry. Each data node is then extracted, and according to the dependency relationships among the data nodes, an association table of any data node is selected from the association table set associated with the initial table of the initial data node; the basic association flow direction is judged according to the time fields of the initial table and the association table; and the internal association flow direction is judged for the association tables of any two data nodes in the association table set. For example, any association table b_i is selected from the association table set {B} of the initial table a, and the following characteristics are obtained: node a, the association field of node a, node b_i, the association field of node b_i, and the association type. Based on node a and node b_i, temporary association result tables are generated in turn, the time fields of the initial table a and of the association table b_i are spliced, and the data volumes of the initial table a and of the association table b_i are further counted, completing the judgment of the basic association flow direction: if the confidence of the rule a→b_i in the time-sequence association exceeds the threshold, the business is considered to flow through table b_i; otherwise it does not. Traversing the association table set {B} yields the basic association flow direction a→{B'}.
The association tables in the association table set {B} that have an association relationship are compared in pairs to judge the flow direction: the confidences of {B_i'}→{B_j'} (i ≠ j) and {B_j'}→{B_i'} (i ≠ j) are calculated respectively, and if the confidence exceeds the threshold, the internal association flow direction with the higher confidence is taken.
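The text does not define how the confidence of a time-sequence rule is computed. The following Python sketch therefore makes an explicit assumption: the confidence of a rule s→t is taken to be the share of joined record pairs whose time in s is not later than their time in t. The function names and the default threshold of 0.8 are likewise hypothetical.

```python
def direction_confidence(pairs):
    """pairs is a list of (time_in_source, time_in_target) for joined records.

    Assumed definition: the fraction of pairs in which the source time is
    not later than the target time.
    """
    if not pairs:
        return 0.0
    ordered = sum(1 for source_time, target_time in pairs if source_time <= target_time)
    return ordered / len(pairs)


def internal_flow(name_i, name_j, pairs_ij, threshold=0.8):
    """Compare both directions between two association tables.

    Returns the direction with the higher confidence as (start, end),
    or None when neither direction exceeds the threshold.
    """
    conf_ij = direction_confidence(pairs_ij)
    conf_ji = direction_confidence([(t, s) for s, t in pairs_ij])
    best_conf, edge = max((conf_ij, (name_i, name_j)), (conf_ji, (name_j, name_i)))
    return edge if best_conf > threshold else None


# Timestamps are plain integers here; datetime values compare the same way.
pairs = [(1, 2), (2, 5), (3, 4), (9, 8)]
print(direction_confidence(pairs))
print(internal_flow("b1", "b2", pairs, threshold=0.7))
```

The same `direction_confidence` function can serve for the basic association flow direction a→b_i by passing the spliced time fields of the initial table and the association table.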
In the embodiment of the invention, the optimal flow direction is further obtained according to the basic association flow direction and the internal association flow direction, and a data flow map is formed. Specifically: if a first basic association flow direction can be realized through a second basic association flow direction and an internal association flow direction, the second basic association flow direction and the internal association flow direction are retained and the first basic association flow direction is deleted; the basic and internal association flow directions are traversed, and those finally retained are the optimal flow directions; and the data flow map is formed according to the optimal flow directions among the data nodes. For example, referring to Fig. 5, if there is a basic association flow direction such as a→b2, and there are also basic and internal association flow directions such as a→b1 and b1→b2 through which the start node can reach the end node via other nodes, the original basic association flow direction a→b2 is removed, and the remaining basic and internal association flow directions are combined and retained as the optimal flow directions. A business process is formed according to the optimal flow directions among the data nodes, with the result shown in Table 3, and the data flow map is then generated.
Table 3 Business process result table
Field name | Logical name | Data type
JOB_ID | Training item ID | NUMBER(10)
TASK_ID | Training task ID | NUMBER(10)
FLOW_ID | Flow ID | NUMBER(10)
START_NODE_ID | Start node ID | NUMBER(10)
START_NODE_CODE | Start node code | VARCHAR2(30)
END_NODE_ID | End node ID | NUMBER(10)
END_NODE_CODE | End node code | VARCHAR2(30)
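The pruning rule for obtaining the optimal flow directions (drop a basic flow direction when its end node can still be reached from its start node through other retained flow directions) can be sketched in Python. The function name `prune_redundant_flows` and the representation of flow directions as (start, end) tuples are assumptions for illustration; the sketch also assumes the flow directions contain no cycles.

```python
def prune_redundant_flows(basic, internal):
    """Remove basic flow directions that are implied by a longer path.

    basic and internal are iterables of (start_node, end_node) tuples.
    Only basic flow directions are candidates for deletion, as in the text.
    """
    edges = set(basic) | set(internal)

    def reachable(src, dst, skip):
        # Depth-first search from src to dst that ignores the edge under test.
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            for start, end in edges:
                if (start, end) == skip or start != node or end in seen:
                    continue
                if end == dst:
                    return True
                seen.add(end)
                stack.append(end)
        return False

    for edge in sorted(basic):
        if reachable(edge[0], edge[1], skip=edge):
            edges.discard(edge)
    return sorted(edges)


basic_flows = [("a", "b1"), ("a", "b2")]
internal_flows = [("b1", "b2")]
print(prune_redundant_flows(basic_flows, internal_flows))
```

With the example of Fig. 5, a→b2 is removed because a→b1 and b1→b2 already connect the two nodes. Each retained pair would correspond to one START_NODE_CODE / END_NODE_CODE row of Table 3.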
According to the embodiment of the invention, a feature framework including the data expansion rate and retention rate is constructed to form the basic and internal association flow directions, and the data flow map is finally established automatically. The association relationships among the production system data can then be used in the subsequent steps, so that automatic monitoring replaces purely manual auditing, which improves data quality audit efficiency and reduces labor cost.
Step S12: acquiring real-time business data in the production system.
Specifically, the business data of the data nodes is acquired according to the configuration table, the real-time business data in the production system is acquired from the training data node table according to the business criteria of the business entry, and a new related business data temporary table is created.
Step S13: acquiring the data flow characteristics of each data node in the business data according to the data flow map.
Specifically, the data flow characteristics of each data node are counted according to the data flow map formed by training, where the data flow characteristics include a start node data volume and an end node data volume.
Step S14: comparing the data flow characteristics of each data node in the business data with the data flow map for consistency, outputting an abnormal data detection result, and locating the node at which the abnormal data occurs.
For each flow direction in the data flow map, the node data volumes of the data nodes in the business data are compared, the missing data volume is calculated, and the data node position is marked; a data missing level list and a data missing detail list are output, as shown in Tables 4 and 5 respectively.
Table 4: Data missing level list
Field name | Logical name | Data type
JOB_ID | Training item ID | NUMBER(10)
TM_INTRVL_ID | Training task ID | VARCHAR2(30)
FLOW_ID | Owning flow ID | NUMBER(10)
START_NODE_CODE | Start node code | VARCHAR2(30)
END_NODE_CODE | End node code | VARCHAR2(30)
V_START_CNT | Start node data volume | NUMBER(10)
V_END_CNT | End node data volume | NUMBER(10)
GRP_EXPR | Magnitude of difference | VARCHAR2(1000)
Table 5: Data missing detail list
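The comparison that produces the two lists can be sketched as below. This is a simplified illustration under stated assumptions: record keys are assumed unique within a node, and the names `missing_lists`, `MISSING_CNT` and `RECORD_KEY` are hypothetical and do not come from the tables above.

```python
def missing_lists(flow_map, node_records):
    """Build a summary ("level") list and a per-record ("detail") list of missing data.

    A record counts as missing on a flow direction when its key appears at the
    start node but not at the end node (an assumption of this sketch).
    """
    level, detail = [], []
    for start, end in flow_map:
        start_keys = set(node_records.get(start, ()))
        end_keys = set(node_records.get(end, ()))
        missing = sorted(start_keys - end_keys)
        if not missing:
            continue
        level.append({
            "START_NODE_CODE": start,
            "END_NODE_CODE": end,
            "V_START_CNT": len(start_keys),
            "V_END_CNT": len(end_keys),
            "MISSING_CNT": len(missing),  # hypothetical column
        })
        for key in missing:
            detail.append({
                "START_NODE_CODE": start,
                "END_NODE_CODE": end,
                "RECORD_KEY": key,  # hypothetical column
            })
    return level, detail


flow_map = [("a", "b1"), ("b1", "b2")]
node_records = {
    "a": ["u1", "u2", "u3"],
    "b1": ["u1", "u2", "u3"],
    "b2": ["u1", "u3"],
}
level, detail = missing_lists(flow_map, node_records)
```

In this toy input only the flow direction b1→b2 loses a record, so each list contains a single row.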
Further, an abnormal data detection result is output according to the data missing level list and the data missing detail list, and the node at which the abnormal data occurred is located. The recording format of the abnormal data detection result is shown in Table 6.
Table 6: Abnormal link detection results (schematic)
Field name | Logical name | Data type
USER_ID | User ID | VARCHAR2(30)
JOB_ID | Training item ID | NUMBER(10)
TASK_ID | Training task ID | NUMBER(10)
NODE_ID | Data node ID | NUMBER(10)
NODE_CODE | Data node code | VARCHAR2(30)
GRP_ROWNO | Number of user records | NUMBER(10)
GRP_EXPR | User record type | VARCHAR2(1000)
ERR_FLAG | Whether the user is abnormal at the node | NUMBER(1)
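One possible way to derive per-user rows in the spirit of Table 6 is sketched below. It assumes a single linear flow path and treats the first node at which a user's record is absent as the anomaly occurrence node; both are simplifying assumptions, and the function name `anomaly_rows` is hypothetical.

```python
def anomaly_rows(flow_path, node_users):
    """Emit one row per (user, node) along a linear flow path.

    flow_path:  ordered list of node codes, upstream first (assumed format)
    node_users: dict mapping node code -> collection of user IDs seen there
    ERR_FLAG is 1 at the first node where the user is missing, else 0.
    """
    rows = []
    for user in sorted(node_users.get(flow_path[0], ())):
        for node in flow_path:
            present = user in node_users.get(node, ())
            rows.append({
                "USER_ID": user,
                "NODE_CODE": node,
                "ERR_FLAG": 0 if present else 1,
            })
            if not present:
                break  # first missing node is taken as the anomaly occurrence node
    return rows


flow_path = ["a", "b1", "b2"]
node_users = {"a": ["u1", "u2"], "b1": ["u1", "u2"], "b2": ["u1"]}
rows = anomaly_rows(flow_path, node_users)
abnormal = [(r["USER_ID"], r["NODE_CODE"]) for r in rows if r["ERR_FLAG"] == 1]
```

In the toy data, user u2 reaches b1 but never arrives at b2, so b2 is reported as the abnormal node for that user.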
This embodiment of the invention overcomes the limitations of manual effort and complex procedures in the traditional approach. By automatically sorting out the data flow patterns under different scenarios and services, it monitors and compares data quality across all services, scenarios and flows, and is not limited to manually defined audit points. It automatically detects abnormal service operations in the production process from abnormal data flow conditions, quickly locates the abnormal link, and provides the reason the service operation completed abnormally, namely the point at which the data anomaly occurred.
In this embodiment of the invention, a data flow map among the data nodes is acquired; real-time business data in the production system is acquired; the data flow characteristics of each data node in the business data are acquired according to the data flow map; and the data flow characteristics of each data node in the business data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurred is located. By using the association relationships among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services and flows, and the node at which a data anomaly occurs is identified, which improves data quality monitoring efficiency and reduces labor cost.
Fig. 6 shows a schematic structural diagram of a data quality monitoring apparatus according to an embodiment of the present invention. As shown in Fig. 6, the data quality monitoring apparatus includes: a flow obtaining unit 601, a data acquisition unit 602, a feature extraction unit 603, and an abnormality detection unit 604. Wherein:
the flow obtaining unit 601 is configured to obtain a data flow map among the data nodes; the data acquisition unit 602 is configured to acquire real-time service data in the production system; the feature extraction unit 603 is configured to obtain the data flow characteristics of each data node in the service data according to the data flow map; the abnormality detection unit 604 is configured to compare the data flow characteristics of each data node in the business data with the data flow map for consistency, output an abnormal data detection result, and locate the node at which the abnormal data occurred.
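The division of work among the four units can be pictured with the following sketch, in which each unit is modelled as a plain callable. The class name `DataQualityMonitor`, the type aliases, and the toy lambdas are all assumptions introduced for illustration; the specification does not prescribe any particular software structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Flow = Tuple[str, str]            # (start node code, end node code)
Records = Dict[str, List[str]]    # node code -> record keys
Features = Dict[Flow, Tuple[int, int]]


@dataclass
class DataQualityMonitor:
    acquire_flow_map: Callable[[], List[Flow]]                  # role of unit 601
    acquire_data: Callable[[], Records]                         # role of unit 602
    extract_features: Callable[[List[Flow], Records], Features]  # role of unit 603
    detect_anomalies: Callable[[Features], List[Flow]]          # role of unit 604

    def run(self) -> List[Flow]:
        flow_map = self.acquire_flow_map()
        data = self.acquire_data()
        features = self.extract_features(flow_map, data)
        return self.detect_anomalies(features)


# Toy stand-ins for the four units.
monitor = DataQualityMonitor(
    acquire_flow_map=lambda: [("a", "b1"), ("b1", "b2")],
    acquire_data=lambda: {"a": ["u1", "u2"], "b1": ["u1", "u2"], "b2": ["u1"]},
    extract_features=lambda flows, data: {
        f: (len(data[f[0]]), len(data[f[1]])) for f in flows
    },
    detect_anomalies=lambda feats: [f for f, (s, e) in feats.items() if e < s],
)
result = monitor.run()
```

Keeping the units as injected callables mirrors the statement that the modules may be combined or split differently in other embodiments.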
In an alternative manner, the flow obtaining unit 601 is configured to: perform data characteristic analysis on historical service data in the production system to obtain the dependency relationships among the data nodes; and obtain the optimal flow direction among the data nodes according to those dependency relationships to form the data flow map.
In an alternative manner, the flow obtaining unit 601 is configured to: acquire historical service data in the production system and establish a training data node table; collect field characteristics for each data node according to the data node table and obtain the general field combinations of all data nodes; for any two data nodes, obtain the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations; and determine the dependency relationship between the two data nodes under the optimal field combination according to the time field in the training data node table.
In an alternative manner, the flow obtaining unit 601 is further configured to: extract a preset number of business data records under the general field combination of one data node, match them against the general field combination of another data node, and compute the retention rate and the expansion rate; take the field combination with the highest retention rate; when several field combinations share the highest retention rate, eliminate those with an abnormal expansion rate to form a special field combination; splice the special field combination with the general field combination of the other data node to form a new inter-table field combination; and iterate over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
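The retention rate and expansion rate used to rank candidate field combinations can be illustrated with the sketch below. The exact definitions are not restated in this passage, so the ones used here are assumptions: retention is taken as the share of sampled rows that find at least one match in the other table, and expansion as the average number of matches per retained row. The sample is assumed to be non-empty.

```python
def retention_and_expansion(sample_rows, other_rows, fields):
    """Score one candidate field combination between two tables.

    sample_rows, other_rows: lists of dicts (rows); fields: tuple of column names.
    Returns (retention_rate, expansion_rate) under the assumed definitions above.
    """
    def key(row):
        return tuple(row[f] for f in fields)

    other_counts = {}
    for row in other_rows:
        other_counts[key(row)] = other_counts.get(key(row), 0) + 1

    matches = [other_counts.get(key(row), 0) for row in sample_rows]
    kept = sum(1 for m in matches if m > 0)
    retention = kept / len(sample_rows)
    expansion = sum(matches) / kept if kept else 0.0
    return retention, expansion


sample = [
    {"user_id": 1, "region": "x"},
    {"user_id": 2, "region": "x"},
    {"user_id": 3, "region": "y"},
]
other = [
    {"user_id": 1, "region": "x"},
    {"user_id": 2, "region": "x"},
    {"user_id": 2, "region": "x"},
    {"user_id": 9, "region": "x"},
]
by_user = retention_and_expansion(sample, other, ("user_id",))
by_region = retention_and_expansion(sample, other, ("region",))
```

In this toy data both combinations retain two of three sampled rows, but matching on `region` expands each retained row to four matches, so it would be the one eliminated for an abnormal expansion rate.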
In an alternative manner, the flow obtaining unit 601 is further configured to: acquire historical service data in the production system and establish a service data temporary table; select, according to the dependency relationships among the data nodes, the association table of any data node from the set of association tables associated with the initial table of the initial data node; determine the basic association flow direction according to the time fields of the initial table and the association table; determine the internal association flow direction between the association tables of any two data nodes in the association table set; and obtain the optimal flow direction according to the basic association flow direction and the internal association flow direction to form the data flow map.
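How a flow direction might be judged from time fields is sketched below. The passage only states that the time fields of the two tables are used, so the majority-vote rule here (the table whose matched records usually carry the earlier timestamp is treated as upstream) is an assumption, as are the function name and return labels.

```python
def basic_flow_direction(initial_times, assoc_times):
    """Guess the flow direction between an initial table and an association table.

    Both arguments map a shared record key to a comparable timestamp.
    Returns "initial->association", "association->initial", or None when the
    tables share no keys or the vote is tied.
    """
    shared = initial_times.keys() & assoc_times.keys()
    forward = sum(1 for k in shared if initial_times[k] <= assoc_times[k])
    backward = len(shared) - forward
    if not shared or forward == backward:
        return None
    return "initial->association" if forward > backward else "association->initial"


initial = {"u1": 1, "u2": 5, "u3": 3}
assoc = {"u1": 2, "u2": 6, "u3": 2, "u4": 9}
direction = basic_flow_direction(initial, assoc)
```

The same comparison applied to two association tables would give a candidate internal association flow direction under the same assumption.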
In an alternative manner, the flow obtaining unit 601 is further configured to: if a first basic association flow direction can be realized through a second basic association flow direction and an internal association flow direction, retain the second basic association flow direction and the internal association flow direction and delete the first basic association flow direction; traverse the basic association flow directions and the internal association flow directions, and take those finally retained as the optimal flow directions; and form the data flow map according to the optimal flow directions among the data nodes.
In an alternative manner, the data flow characteristics include a start node data quantity and an end node data quantity; the abnormality detection unit 604 is configured to: compare the node data quantities of the data nodes along each flow direction in the business data against the data flow map, calculate the missing data quantity, mark the position of the data node, and output a data missing level list and a data missing detail list; and output an abnormal data detection result according to the data missing level list and the data missing detail list, and locate the node at which the abnormal data occurred.
In this embodiment of the invention, a data flow map among the data nodes is acquired; real-time business data in the production system is acquired; the data flow characteristics of each data node in the business data are acquired according to the data flow map; and the data flow characteristics of each data node in the business data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurred is located. By using the association relationships among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services and flows, and the node at which a data anomaly occurs is identified, which improves data quality monitoring efficiency and reduces labor cost.
Embodiments of the present invention provide a non-volatile computer storage medium having stored thereon at least one executable instruction for performing the data quality monitoring method of any of the method embodiments described above.
The executable instructions may specifically be configured to cause a processor to perform the following operations:
acquiring a data flow map among all data nodes;
acquiring real-time business data in a production system;
acquiring the data flow characteristics of each data node in the service data according to the data flow map;
and carrying out consistency comparison on the data flow characteristics of each data node in the business data and the data flow map, outputting an abnormal data detection result and positioning an abnormal occurrence node of the abnormal data.
In one alternative, the executable instructions cause the processor to:
carrying out data characteristic analysis on historical service data in a production system to obtain a dependency relationship among data nodes;
and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In one alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a training data node table;
collecting field characteristics according to the data node table respectively, and acquiring general field combinations of all data nodes;
aiming at any two data nodes, obtaining the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations;
and judging the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
In one alternative, the executable instructions cause the processor to:
extracting a preset number of business data in the general field combination of any data node to be matched with the general field combination of another data node, and counting the retention rate and the expansion rate;
taking the field combination with the highest retention rate;
when a plurality of field combinations with highest retention rate exist, eliminating the field combinations with abnormal expansion rate to form special field combinations;
splicing the special field combination and the general field combination of the other data node to form a new inter-table field combination;
and repeatedly iterating the inter-table field combination and the special field combination until no new special field combination is generated between tables, thereby obtaining the optimal field combination.
In one alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a service data temporary table;
selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes;
judging a basic association flow direction according to the initial table and the time field of the association table;
judging the internal association flow direction of association tables of any two data nodes in the association table set;
and obtaining an optimal flow direction according to the basic association flow direction and the internal association flow direction, and forming a data flow map.
In one alternative, the executable instructions cause the processor to:
if the first basic association flow direction can be realized through the second basic association flow direction and the internal association flow direction, reserving the second basic association flow direction and the internal association flow direction, and deleting the first basic association flow direction;
traversing the basic association flow direction and the internal association flow direction, and finally reserving the basic association flow direction and the internal association flow direction as optimal flow directions;
and forming a data flow map according to the optimal flow direction among the data nodes.
In an alternative manner, the data flow characteristics include a start node data amount and an end node data amount; the executable instructions cause the processor to:
comparing the node data quantity of the data nodes along each flow direction in the business data against the data flow map, calculating the missing data quantity, performing data node position marking, and outputting a data missing level list and a data missing detail list;
and outputting an abnormal data detection result according to the data missing level list and the data missing detail list, and positioning an abnormal occurrence node of the abnormal data.
In this embodiment of the invention, a data flow map among the data nodes is acquired; real-time business data in the production system is acquired; the data flow characteristics of each data node in the business data are acquired according to the data flow map; and the data flow characteristics of each data node in the business data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurred is located. By using the association relationships among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services and flows, and the node at which a data anomaly occurs is identified, which improves data quality monitoring efficiency and reduces labor cost.
An embodiment of the present invention provides a computer program product comprising a computer program stored on a computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the data quality monitoring method of any of the method embodiments described above.
The executable instructions may specifically be configured to cause a processor to perform the following operations:
acquiring a data flow map among all data nodes;
acquiring real-time business data in a production system;
acquiring the data flow characteristics of each data node in the service data according to the data flow map;
and carrying out consistency comparison on the data flow characteristics of each data node in the business data and the data flow map, outputting an abnormal data detection result and positioning an abnormal occurrence node of the abnormal data.
In one alternative, the executable instructions cause the processor to:
carrying out data characteristic analysis on historical service data in a production system to obtain a dependency relationship among data nodes;
and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In one alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a training data node table;
collecting field characteristics according to the data node table respectively, and acquiring general field combinations of all data nodes;
aiming at any two data nodes, obtaining the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations;
and judging the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
In one alternative, the executable instructions cause the processor to:
extracting a preset number of business data in the general field combination of any data node to be matched with the general field combination of another data node, and counting the retention rate and the expansion rate;
taking the field combination with the highest retention rate;
when a plurality of field combinations with highest retention rate exist, eliminating the field combinations with abnormal expansion rate to form special field combinations;
splicing the special field combination and the general field combination of the other data node to form a new inter-table field combination;
and repeatedly iterating the inter-table field combination and the special field combination until no new special field combination is generated between tables, thereby obtaining the optimal field combination.
In one alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a service data temporary table;
selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes;
judging a basic association flow direction according to the initial table and the time field of the association table;
judging the internal association flow direction of association tables of any two data nodes in the association table set;
and obtaining an optimal flow direction according to the basic association flow direction and the internal association flow direction, and forming a data flow map.
In one alternative, the executable instructions cause the processor to:
if the first basic association flow direction can be realized through the second basic association flow direction and the internal association flow direction, reserving the second basic association flow direction and the internal association flow direction, and deleting the first basic association flow direction;
traversing the basic association flow direction and the internal association flow direction, and finally reserving the basic association flow direction and the internal association flow direction as optimal flow directions;
and forming a data flow map according to the optimal flow direction among the data nodes.
In an alternative manner, the data flow characteristics include a start node data amount and an end node data amount; the executable instructions cause the processor to:
comparing the node data quantity of the data nodes along each flow direction in the business data against the data flow map, calculating the missing data quantity, performing data node position marking, and outputting a data missing level list and a data missing detail list;
and outputting an abnormal data detection result according to the data missing level list and the data missing detail list, and positioning an abnormal occurrence node of the abnormal data.
In this embodiment of the invention, a data flow map among the data nodes is acquired; real-time business data in the production system is acquired; the data flow characteristics of each data node in the business data are acquired according to the data flow map; and the data flow characteristics of each data node in the business data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurred is located. By using the association relationships among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services and flows, and the node at which a data anomaly occurs is identified, which improves data quality monitoring efficiency and reduces labor cost.
Fig. 7 shows a schematic structural diagram of a computing device according to an embodiment of the present invention; the embodiments of the invention do not limit the specific implementation of the computing device.
As shown in Fig. 7, the computing device may include: a processor 702, a communication interface (Communications Interface) 704, a memory 706, and a communication bus 708.
Wherein: the processor 702, the communication interface 704, and the memory 706 communicate with one another via the communication bus 708. The communication interface 704 is used for communicating with network elements of other devices, such as clients or other servers. The processor 702 is configured to execute the program 710, and may specifically perform the relevant steps of the data quality monitoring method embodiments described above.
In particular, the program 710 may include program code, and the program code includes computer operating instructions.
The processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 706 is used for storing the program 710. The memory 706 may comprise high-speed RAM and may further comprise non-volatile memory, such as at least one disk memory.
The program 710 may be specifically configured to cause the processor 702 to:
acquiring a data flow map among all data nodes;
acquiring real-time business data in a production system;
acquiring the data flow characteristics of each data node in the service data according to the data flow map;
and carrying out consistency comparison on the data flow characteristics of each data node in the business data and the data flow map, outputting an abnormal data detection result and positioning an abnormal occurrence node of the abnormal data.
In an alternative, the program 710 causes the processor to:
carrying out data characteristic analysis on historical service data in a production system to obtain a dependency relationship among data nodes;
and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In an alternative, the program 710 causes the processor to:
acquiring historical service data in a production system, and establishing a training data node table;
collecting field characteristics according to the data node table respectively, and acquiring general field combinations of all data nodes;
aiming at any two data nodes, obtaining the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations;
and judging the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
In an alternative, the program 710 causes the processor to:
extracting a preset number of business data in the general field combination of any data node to be matched with the general field combination of another data node, and counting the retention rate and the expansion rate;
taking the field combination with the highest retention rate;
when a plurality of field combinations with highest retention rate exist, eliminating the field combinations with abnormal expansion rate to form special field combinations;
splicing the special field combination and the general field combination of the other data node to form a new inter-table field combination;
and repeatedly iterating the inter-table field combination and the special field combination until no new special field combination is generated between tables, thereby obtaining the optimal field combination.
In an alternative, the program 710 causes the processor to:
acquiring historical service data in a production system, and establishing a service data temporary table;
selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes;
judging a basic association flow direction according to the initial table and the time field of the association table;
judging the internal association flow direction of association tables of any two data nodes in the association table set;
and obtaining an optimal flow direction according to the basic association flow direction and the internal association flow direction, and forming a data flow map.
In an alternative, the program 710 causes the processor to:
if the first basic association flow direction can be realized through the second basic association flow direction and the internal association flow direction, reserving the second basic association flow direction and the internal association flow direction, and deleting the first basic association flow direction;
traversing the basic association flow direction and the internal association flow direction, and finally reserving the basic association flow direction and the internal association flow direction as optimal flow directions;
and forming a data flow map according to the optimal flow direction among the data nodes.
In an alternative manner, the data flow characteristics include a start node data amount and an end node data amount; the program 710 causes the processor to:
comparing the node data quantity of the data nodes along each flow direction in the business data against the data flow map, calculating the missing data quantity, performing data node position marking, and outputting a data missing level list and a data missing detail list;
and outputting an abnormal data detection result according to the data missing level list and the data missing detail list, and positioning an abnormal occurrence node of the abnormal data.
In this embodiment of the invention, a data flow map among the data nodes is acquired; real-time business data in the production system is acquired; the data flow characteristics of each data node in the business data are acquired according to the data flow map; and the data flow characteristics of each data node in the business data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurred is located. By using the association relationships among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services and flows, and the node at which a data anomaly occurs is identified, which improves data quality monitoring efficiency and reduces labor cost.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.

Claims (7)

1. A method of data quality monitoring, the method comprising:
acquiring a data flow map among data nodes, including: performing data characteristic analysis on historical service data in a production system to obtain dependency relationships among the data nodes; and acquiring optimal flow directions among the data nodes according to the dependency relationships among the data nodes to form the data flow map;
wherein performing data characteristic analysis on the historical service data to obtain the dependency relationships among the data nodes comprises: acquiring historical service data in the production system and establishing a training data node table; collecting field characteristics from each data node table and acquiring general field combinations of all data nodes; for any two data nodes, obtaining an optimal field combination of the two data nodes by applying an expansion rate and a retention rate to the general field combinations; and judging the dependency relationship between the two data nodes in the optimal field combination according to a time field in the training data node table;
wherein acquiring the optimal flow directions among the data nodes according to the dependency relationships among the data nodes to form the data flow map comprises: acquiring historical service data in the production system and establishing a service data temporary table; selecting, according to the dependency relationships among the data nodes, an association table of any data node from an association table set associated with an initial table of an initial data node; judging a basic association flow direction according to the time fields of the initial table and the association table; judging an internal association flow direction between the association tables of any two data nodes in the association table set; and acquiring the optimal flow direction according to the basic association flow direction and the internal association flow direction to form the data flow map;
acquiring real-time service data in the production system;
acquiring data flow characteristics of each data node in the service data according to the data flow map;
and performing a consistency comparison between the data flow characteristics of each data node in the service data and the data flow map, outputting an abnormal data detection result, and locating the node at which the abnormal data occurred.
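By way of illustration only, and without limiting the claim, the step of judging the dependency relationship between two data nodes according to a time field might be sketched as follows. The function name `judge_direction`, the `min_share` threshold, the sample tables and the rule "the node whose matched records are earlier is upstream" are all assumptions made for this sketch; the claim itself does not specify them.

```python
from datetime import datetime


def judge_direction(rows_a, rows_b, key_fields, time_field, min_share=0.9):
    """Guess the dependency direction between two node tables.

    Assumption: if records of A are timestamped before their matching
    records in B in at least `min_share` of the matched pairs, A feeds B.
    Returns "a->b", "b->a", or None when the evidence is inconclusive.
    """
    def key(row):
        return tuple(row[f] for f in key_fields)

    # Earliest timestamp seen in B for each key of the field combination.
    first_seen_b = {}
    for row in rows_b:
        k, t = key(row), row[time_field]
        if k not in first_seen_b or t < first_seen_b[k]:
            first_seen_b[k] = t

    a_first = b_first = 0
    for row in rows_a:
        t_b = first_seen_b.get(key(row))
        if t_b is None:
            continue  # no matching record in B, so no evidence either way
        if row[time_field] < t_b:
            a_first += 1
        elif row[time_field] > t_b:
            b_first += 1

    total = a_first + b_first
    if total == 0:
        return None
    if a_first / total >= min_share:
        return "a->b"
    if b_first / total >= min_share:
        return "b->a"
    return None


# Hypothetical sample tables: orders are created before payments.
orders = [
    {"order_id": 1, "created_at": datetime(2019, 7, 16, 10, 0)},
    {"order_id": 2, "created_at": datetime(2019, 7, 16, 10, 5)},
]
payments = [
    {"order_id": 1, "created_at": datetime(2019, 7, 16, 10, 2)},
    {"order_id": 2, "created_at": datetime(2019, 7, 16, 10, 9)},
]
direction = judge_direction(orders, payments, ("order_id",), "created_at")
```

With the sample tables above, every order precedes its payment, so the sketch reports that the order node flows to the payment node.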
2. The method according to claim 1, wherein obtaining, for any two data nodes, the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations comprises:
extracting a preset number of service data records in the general field combination of one data node, matching them against the general field combination of the other data node, and counting the retention rate and the expansion rate;
selecting the field combination with the highest retention rate;
when a plurality of field combinations share the highest retention rate, eliminating the field combinations with an abnormal expansion rate to form a special field combination;
splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination;
and iterating over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
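By way of illustration only, one round of the selection described in claim 2 might be sketched as follows. This section does not define the two rates, so the sketch assumes that the retention rate is the share of sampled records that find at least one match in the other table, and that the expansion rate is the average number of matching rows per matched record. The names `match_rates`, `pick_special_combinations` and the `max_expansion` threshold are hypothetical.

```python
from collections import Counter


def match_rates(sample_rows, other_rows, fields):
    """Return (retention_rate, expansion_rate) for one field combination.

    Assumed definitions (not taken from the claims):
      retention = matched sample records / all sample records
      expansion = matching rows in the other table / matched sample records
    """
    def key(row):
        return tuple(row[f] for f in fields)

    if not sample_rows:
        return 0.0, 0.0
    other_counts = Counter(key(r) for r in other_rows)
    matched = [other_counts[key(r)] for r in sample_rows if other_counts[key(r)] > 0]
    retention = len(matched) / len(sample_rows)
    expansion = sum(matched) / len(matched) if matched else 0.0
    return retention, expansion


def pick_special_combinations(sample_rows, other_rows, candidates, max_expansion=1.5):
    """Keep the combinations with the highest retention rate; on a tie,
    drop those whose expansion rate exceeds the assumed threshold."""
    rates = {c: match_rates(sample_rows, other_rows, c) for c in candidates}
    best = max(retention for retention, _ in rates.values())
    top = [c for c in candidates if rates[c][0] == best]
    if len(top) > 1:
        top = [c for c in top if rates[c][1] <= max_expansion]
    return top


# Hypothetical sample: both candidates retain every record, but joining on
# user_id alone fans out, so only order_id survives the tie-break.
orders = [
    {"order_id": 1, "user_id": "u1"},
    {"order_id": 2, "user_id": "u1"},
    {"order_id": 3, "user_id": "u2"},
]
payments = [
    {"order_id": 1, "user_id": "u1"},
    {"order_id": 2, "user_id": "u1"},
    {"order_id": 3, "user_id": "u2"},
]
special = pick_special_combinations(orders, payments, [("order_id",), ("user_id",)])
```

The splicing and iteration steps of the claim would repeat this selection on the newly formed inter-table field combinations until no new special field combination appears; that loop is omitted here for brevity.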
3. The method of claim 1, wherein acquiring the optimal flow direction according to the basic association flow direction and the internal association flow direction to form the data flow map comprises:
if a first basic association flow direction can be realized through a second basic association flow direction together with an internal association flow direction, retaining the second basic association flow direction and the internal association flow direction, and deleting the first basic association flow direction;
traversing the basic association flow directions and the internal association flow directions, and taking the flow directions that finally remain as the optimal flow directions;
and forming the data flow map according to the optimal flow directions among the data nodes.
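By way of illustration only, the pruning rule of claim 3 resembles a transitive reduction restricted to the basic association flow directions. The sketch below assumes that flow directions are given as (source, target) pairs and form an acyclic graph, which is plausible when directions follow time fields but is not stated in the claim; the function name `reduce_flows` and the table names are hypothetical.

```python
def reduce_flows(base_edges, internal_edges):
    """Drop every basic flow direction that is already implied by another
    basic flow direction followed by internal flow directions.

    Assumes the edges form a directed acyclic graph.
    Returns the set of flow directions that remain.
    """
    kept_base = set(base_edges)
    internal = set(internal_edges)

    def reachable(src, dst, edges):
        # Depth-first search over the given edge set.
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            for a, b in edges:
                if a == node and b not in seen:
                    if b == dst:
                        return True
                    seen.add(b)
                    stack.append(b)
        return False

    for edge in sorted(base_edges):
        # Can the same source still reach the target without this edge?
        others = (kept_base - {edge}) | internal
        if reachable(edge[0], edge[1], others):
            kept_base.discard(edge)
    return kept_base | internal


# Hypothetical example: order -> shipment is implied by
# order -> payment (basic) plus payment -> shipment (internal).
base = [("order", "payment"), ("order", "shipment"), ("order", "invoice")]
internal = [("payment", "shipment")]
optimal = reduce_flows(base, internal)
```

In this example the direct order-to-shipment direction is deleted, and the three remaining directions form the data flow map.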
4. The method of claim 1, wherein the data flow characteristics include a start node data amount and an end node data amount;
and wherein performing the consistency comparison between the data flow characteristics of each data node in the service data and the data flow map, outputting the abnormal data detection result and locating the node at which the abnormal data occurred comprises:
comparing, for each flow direction in the data flow map, the node data amounts of the data nodes in the service data, calculating the missing data amount, marking the data node position, and outputting a data missing level list and a data missing detail list;
and outputting the abnormal data detection result according to the data missing level list and the data missing detail list, and locating the node at which the abnormal data occurred.
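By way of illustration only, the comparison of claim 4 might be sketched as follows. The sketch assumes that each data node exposes the set of record keys it has received, that a record is missing when it is present at the start node of a flow direction but absent at the end node, and that the two output lists have the shapes shown; `detect_missing` and all field names in the output are hypothetical.

```python
def detect_missing(flow_edges, node_keys):
    """Compare start-node and end-node data amounts along each flow direction.

    flow_edges: iterable of (start_node, end_node) pairs from the flow map.
    node_keys:  dict mapping node name to the set of record keys seen there.
    Returns (level_list, detail_list):
      level_list  - one summary entry per flow direction that lost records
      detail_list - one entry per missing record, marked with the node
                    at which it failed to arrive
    """
    level_list, detail_list = [], []
    for start, end in flow_edges:
        missing = sorted(node_keys[start] - node_keys[end])
        if not missing:
            continue
        level_list.append({
            "from": start,
            "to": end,
            "start_amount": len(node_keys[start]),
            "end_amount": len(node_keys[end]),
            "missing_amount": len(missing),
        })
        detail_list.extend({"key": k, "lost_at": end} for k in missing)
    return level_list, detail_list


# Hypothetical example: record 4 reaches the order node but never
# arrives at the payment node.
edges = [("order", "payment"), ("payment", "shipment")]
keys = {"order": {1, 2, 3, 4}, "payment": {1, 2, 3}, "shipment": {1, 2, 3}}
levels, details = detect_missing(edges, keys)
```

Here the level list contains a single entry for the order-to-payment direction with a missing amount of 1, and the detail list marks record 4 as lost at the payment node.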
5. A data quality monitoring device, the device comprising:
a flow acquisition unit, configured to acquire a data flow map among data nodes, including: performing data characteristic analysis on historical service data in a production system to obtain dependency relationships among the data nodes; and acquiring optimal flow directions among the data nodes according to the dependency relationships among the data nodes to form the data flow map;
wherein performing data characteristic analysis on the historical service data to obtain the dependency relationships among the data nodes comprises: acquiring historical service data in the production system and establishing a training data node table; collecting field characteristics from each data node table and acquiring general field combinations of all data nodes; for any two data nodes, obtaining an optimal field combination of the two data nodes by applying an expansion rate and a retention rate to the general field combinations; and judging the dependency relationship between the two data nodes in the optimal field combination according to a time field in the training data node table;
wherein acquiring the optimal flow directions among the data nodes according to the dependency relationships among the data nodes to form the data flow map comprises: acquiring historical service data in the production system and establishing a service data temporary table; selecting, according to the dependency relationships among the data nodes, an association table of any data node from an association table set associated with an initial table of an initial data node; judging a basic association flow direction according to the time fields of the initial table and the association table; judging an internal association flow direction between the association tables of any two data nodes in the association table set; and acquiring the optimal flow direction according to the basic association flow direction and the internal association flow direction to form the data flow map;
a data acquisition unit, configured to acquire real-time service data in the production system;
a feature extraction unit, configured to acquire data flow characteristics of each data node in the service data according to the data flow map;
and an anomaly detection unit, configured to perform a consistency comparison between the data flow characteristics of each data node in the service data and the data flow map, output an abnormal data detection result, and locate the node at which the abnormal data occurred.
6. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the steps of the data quality monitoring method according to any one of claims 1-4.
7. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform the steps of the data quality monitoring method according to any one of claims 1-4.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910642686.7A CN112241443B (en) 2019-07-16 2019-07-16 Data quality monitoring method, device, computing equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN112241443A CN112241443A (en) 2021-01-19
CN112241443B CN112241443B (en) 2023-11-21

Family

ID=74167104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910642686.7A Active CN112241443B (en) 2019-07-16 2019-07-16 Data quality monitoring method, device, computing equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN112241443B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant