CN112241443A - Data quality monitoring method and device, computing equipment and computer storage medium - Google Patents

Data quality monitoring method and device, computing equipment and computer storage medium

Info

Publication number
CN112241443A
Authority
CN
China
Prior art keywords
data
node
flow direction
nodes
flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910642686.7A
Other languages
Chinese (zh)
Other versions
CN112241443B (en)
Inventor
金崇超
孙新华
刘坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd
Priority to CN201910642686.7A
Publication of CN112241443A
Application granted
Publication of CN112241443B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • General Factory Administration (AREA)

Abstract

Embodiments of the invention relate to the technical field of quality monitoring and disclose a data quality monitoring method, a data quality monitoring apparatus, a computing device and a computer storage medium. The method comprises: acquiring a data flow graph among data nodes; acquiring real-time service data in a production system; acquiring data flow characteristics of each data node in the service data according to the data flow graph; and comparing the data flow characteristics of the data nodes in the service data against the data flow graph for consistency, outputting an abnormal data detection result, and locating the node at which the abnormal data occurred. In this way, embodiments of the invention use the association relationships among production system data to automatically sort out how data circulates under different scenarios and services, monitor and compare data quality across all services, all scenarios and the whole process, identify the link at which a data anomaly occurs, improve data quality monitoring efficiency, and reduce labor cost.

Description

Data quality monitoring method and device, computing equipment and computer storage medium
Technical Field
The embodiment of the invention relates to the technical field of quality monitoring, in particular to a data quality monitoring method, a data quality monitoring device, computing equipment and a computer storage medium.
Background
With the business development of the telecommunications market, and driven by intense market competition and technology, operators have gradually shifted from extensive operation, in which price wars attract users, to refined, customer-centered operation that meets diversified customer requirements through business innovation. In this process, the wide variety of service types and flexible tariff packages make the business support system increasingly complex, and the risk of data quality problems caused by anomalies in the production process inevitably increases. Once a data quality problem occurs in the production system, it can affect normal service handling on the one hand and the results of later data-related work on the other.
The conventional way of handling data quality problems is to perform manual audit comparison periodically, which can be roughly divided into three steps: 1) establishing data audit points, set up separately according to different service conditions and experience; 2) manually sorting out the audit flows, that is, using a panoramic view of system interactions to manually sort out the business scenario flows related to each audit point and the data tables that flow through it; 3) regular audit verification, in which personnel are assigned to compare data quality step by step for each audit point periodically according to the audit flow.
In the process of implementing the embodiments of the present invention, the inventors found the following. To meet diversified customer requirements and adapt to a continuously changing market environment, operators need to launch innovative services in a timely manner. In the traditional data quality audit comparison mode, the audit points and audit flows must therefore be continuously updated in step, and the sorting out of service flows, data flows and the like is done entirely by hand, which is time-consuming, labor-intensive and extremely inefficient. In addition, when services and scenarios interact in the production environment, the derived data is complex and variable. Manually established audit points often cover the data quality audit of only some services and scenarios, so the audit coverage is limited. Moreover, the specific flows differ across services and scenarios and are complicated, and the traditional manual audit comparison mode focuses only on end-to-end data quality problems: it can neither pinpoint the intermediate link of the service at which a problem arises nor monitor the link at which a data quality problem occurs.
Disclosure of Invention
In view of the above problems, embodiments of the present invention provide a data quality monitoring method, apparatus, computing device and computer storage medium, which overcome or at least partially solve the above problems.
According to an aspect of an embodiment of the present invention, there is provided a data quality monitoring method, including: acquiring a data flow map among data nodes; acquiring real-time service data in a production system; acquiring data flow characteristics of each data node in the service data according to the data flow graph; and comparing the consistency of the data flow characteristics of the data nodes in the service data and the data flow maps, outputting an abnormal data detection result and positioning an abnormal occurrence node of the abnormal data.
In an optional manner, the obtaining a data flow graph between data nodes includes: performing data characteristic analysis on historical service data in the production system to obtain the dependency relationship among data nodes; and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In an optional manner, the performing data feature analysis on the historical service data to obtain a dependency relationship between data nodes includes: acquiring historical service data in a production system, and establishing a training data node table; collecting field features for each data node according to the data node table, and acquiring the general field combinations of each data node; for any two data nodes, obtaining the optimal field combination of the two data nodes by applying an expansion rate and a retention rate to the general field combinations; and judging the dependency relationship between the two data nodes in the optimal field combination according to the time field in the training data node table.
In an optional manner, the obtaining, for any two of the data nodes, an optimal field combination of the two data nodes by applying an expansion rate and a retention rate to the general field combinations includes: extracting a preset amount of service data from the general field combination of one data node, matching it against the general field combination of the other data node, and counting the retention rate and expansion rate; selecting the field combination with the highest retention rate; when several field combinations share the highest retention rate, removing those with abnormal expansion rates to form special field combinations; splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination; and iterating over the inter-table field combinations and the special field combinations until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
In an optional manner, the obtaining an optimal flow direction between data nodes according to a dependency relationship between the data nodes to form a data flow graph includes: acquiring historical service data in a production system, and establishing a temporary service data table; selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes; judging a basic association flow direction according to the time fields of the initial table and the association table; judging the internal association flow direction of the association tables of any two data nodes in the association table set; and obtaining the optimal flow direction according to the basic associated flow direction and the internal associated flow direction to form a data flow map.
In an optional manner, the obtaining an optimal flow direction according to the basic associated flow direction and the internal associated flow direction to form a data flow graph includes: if a first basic associated flow direction can be realized through a second basic associated flow direction together with an internal associated flow direction, retaining the second basic associated flow direction and the internal associated flow direction, and deleting the first basic associated flow direction; traversing the basic associated flow directions and the internal associated flow directions, and taking those finally retained as the optimal flow directions; and forming a data flow graph according to the optimal flow directions among the data nodes.
In an optional manner, the data flow characteristics include a start node data volume and an end node data volume; and the comparing the data flow characteristics of the data nodes in the service data against the data flow graph for consistency, outputting abnormal data, and locating the node at which the abnormal data occurred includes: comparing the node data volumes of the data nodes along the flows of the service data and the data flow graph, calculating the missing data volume, marking the positions of the data nodes, and outputting a data missing magnitude list and a data missing detail list; and outputting an abnormal data detection result according to the data missing magnitude list and the data missing detail list, and locating the node at which the abnormal data occurred.
According to another aspect of the embodiments of the present invention, there is provided a data quality monitoring apparatus, including: the flow acquiring unit is used for acquiring a data flow map among the data nodes; the data acquisition unit is used for acquiring real-time service data in the production system; the characteristic extraction unit is used for acquiring the data flow characteristics of each data node in the service data according to the data flow map; and the abnormal detection unit is used for comparing the consistency of the data flow characteristics of the data nodes in the service data and the data flow maps, outputting an abnormal data detection result and positioning an abnormal occurrence node of the abnormal data.
According to another aspect of embodiments of the present invention, there is provided a computing device including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the steps of the data quality monitoring method.
According to a further aspect of the embodiments of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing the processor to execute the steps of the data quality monitoring method described above.
Embodiments of the invention acquire a data flow graph among data nodes; acquire real-time service data in a production system; acquire data flow characteristics of each data node in the service data according to the data flow graph; and compare the data flow characteristics of the data nodes in the service data against the data flow graph for consistency, output an abnormal data detection result, and locate the node at which the abnormal data occurred. By using the association relationships among production system data, they automatically sort out how data circulates under different scenarios and services, monitor and compare data quality across all services, all scenarios and the whole process, identify the link at which a data anomaly occurs, improve data quality monitoring efficiency, and reduce labor cost.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the description, and so that the foregoing and other objects, features and advantages of the embodiments are more readily apparent, specific embodiments of the present invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic flow chart of a data quality monitoring method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of another data quality monitoring method provided by an embodiment of the invention;
FIG. 3 is a flow chart of another data quality monitoring method provided by the embodiment of the invention;
FIG. 4 is a graph illustrating retention and expansion rates of a data quality monitoring method provided by an embodiment of the invention;
FIG. 5 is a schematic diagram illustrating an optimal flow direction acquisition of a data quality monitoring method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data quality monitoring apparatus provided in an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a computing device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 is a schematic flow chart illustrating a data quality monitoring method according to an embodiment of the present invention. As shown in fig. 1, the data quality monitoring method includes:
Step S11: acquire a data flow graph among the data nodes.
In the embodiment of the invention, a pre-trained data flow graph among the data nodes may be obtained, or historical service data in the production system may be used for training to obtain the data flow graph among the data nodes.
When the historical service data in the production system is used for training to obtain the data flow graph among the data nodes, as shown in fig. 2, step S11 includes:
Step S111: perform data feature analysis on historical service data in the production system to obtain the dependency relationships among the data nodes.
In the embodiment of the invention, historical service data in the production system is used to create a training node set table, a feature framework including the data expansion rate and retention rate is constructed, the optimal field combination of each data node is obtained according to the expansion rate and the retention rate, and the dependency relationships among the data nodes are obtained according to the explicit and implicit relationships among the node tables.
As shown in fig. 3, step S111 includes the following steps:
Step S115: acquire historical service data in the production system, and establish a training data node table.
Historical service data for the training node tables in the production system is collected, and corresponding temporary training node tables are created for subsequent training, thereby establishing the training data set and the training data node table.
Step S116: collect field features for each data node according to the data node table, and acquire a plurality of general field combinations for each data node.
Specifically, for each data node in the training data node table, the fields of character or numeric type are extracted and analyzed to obtain a field analysis result; label fields are then extracted from the field analysis result, and fields whose null values exceed a threshold are eliminated, forming a plurality of general field combinations for each data node.
For the created training data node table, the fields of character or numeric type are extracted, and field features of the relevant fields of each data table are counted in turn, such as the number of records, the number of deduplicated records, the number of null-value records, the value length that occurs in the most records, and the number of records with that length, forming the data field analysis result table shown in table 1.
Table 1 Data field analysis results table
Field name        Logical name                                         Data type
NODE_ID           Node ID                                              NUMBER(10)
NODE_CODE         Node encoding                                        VARCHAR2(30)
COLUMN_NAME       Field encoding                                       VARCHAR2(30)
COLUMN_TYPE       Field type                                           VARCHAR2(30)
ALL_CNT           Number of records in the node                        NUMBER(20)
COL_CNT           Number of records of the field after deduplication   NUMBER(20)
NULL_CNT          Number of null-value records of the field            NUMBER(20)
MAX_LENGTH        Value length that occurs in the most records         NUMBER(4)
MAX_LENGTH_CNT    Number of records with the most frequent length      NUMBER(20)
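The statistics in table 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the helper names `profile_field` and `keep_field`, the treatment of empty strings as null, the exclusion of nulls from the deduplicated count, and the 0.5 null-ratio threshold are all assumptions made for illustration, since the source does not give the preset rule.

```python
from collections import Counter

def profile_field(node_code, column_name, values):
    """Compute table-1-style statistics for one field of one data node."""
    # Assumption: None and the empty string both count as null values.
    non_null = [v for v in values if v is not None and str(v) != ""]
    length_counts = Counter(len(str(v)) for v in non_null)
    # Most frequent value length and its record count; (0, 0) if all values are null.
    max_length, max_length_cnt = (
        length_counts.most_common(1)[0] if length_counts else (0, 0)
    )
    return {
        "NODE_CODE": node_code,
        "COLUMN_NAME": column_name,
        "ALL_CNT": len(values),            # number of records in the node
        "COL_CNT": len(set(non_null)),     # deduplicated record count (nulls excluded)
        "NULL_CNT": len(values) - len(non_null),
        "MAX_LENGTH": max_length,
        "MAX_LENGTH_CNT": max_length_cnt,
    }

def keep_field(stats, max_null_ratio=0.5):
    """Reject fields with too many nulls; the 0.5 threshold is an assumed value."""
    if stats["ALL_CNT"] == 0:
        return False
    return stats["NULL_CNT"] / stats["ALL_CNT"] <= max_null_ratio

# Hypothetical node and field names, used only as sample input.
stats = profile_field("ORDER", "USER_ID", ["u001", "u002", "u002", None, "u10"])
kept = keep_field(stats)
```

Fields that pass `keep_field` would then be candidates for the general field combinations of the node.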
Fields of a suspected label set are extracted according to the collected field statistics, in order to analyze field diversity, and are written to the data feature field results. The extraction rule for suspected label set fields is as follows:
[Figure BDA0002132423520000061: extraction rule for suspected label set fields; image not reproduced in this text]
Fields with excessive null values are then eliminated according to a preset rule, and the field features are screened to form the general field combinations that meet the conditions.
Step S117: for any two data nodes, obtain the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations.
Specifically, a preset number of service data records are extracted from the general field combination of one data node and matched against the general field combination of the other data node, and the retention rate and expansion rate are counted; the field combination with the highest retention rate is selected; and when several field combinations share the highest retention rate, those with abnormal expansion rates are rejected, forming the special field combinations. In the embodiment of the present invention, as shown in fig. 4, taking any two associated data nodes A and B as an example, 10000 records are extracted from the general field combination of data node A to form table A, and the general field combination of data node B forms the full table B. Each field combination of table A is matched against the associated full table B, and the data retention rate and expansion rate are counted in turn. The retention rate = (number of records of table A that intersect full table B) / 10000, and the expansion rate = (number of records of full table B that intersect table A) / (number of records of table A that intersect full table B). For example, referring to fig. 4, table A includes an α field, table B includes a β field and a γ field, and associating the two tables on the field combination α = β gives Q = 2 and P = 3. If the retention rate of a field combination between the associated tables meets a threshold, the field combination with the highest retention rate is selected; otherwise the field combination is rejected. If several field combinations meet the conditions, those with abnormal expansion rates, that is, expansion rates greater than 100, are rejected.
After a special field combination is obtained, it is spliced with the general field combination of the other data node to form a new inter-table field combination; the iteration over the inter-table field combinations and the special field combinations is repeated until no new special field combination is generated between the tables, yielding the optimal field combination.
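The retention rate and expansion rate defined above can be sketched in Python as follows. This is only an illustration under stated assumptions: rows are represented as dictionaries, the function names `retention_and_expansion` and `select_special_combinations` are hypothetical, and the retention threshold of 0.9 is an assumed value (the source states only that a threshold exists and that an expansion rate greater than 100 is abnormal). The iterative splicing of field combinations is not shown.

```python
from collections import Counter

def retention_and_expansion(sample_a, full_b, fields_a, fields_b):
    """Match sampled rows of table A against full table B on a field combination."""
    def key(row, fields):
        return tuple(row[f] for f in fields)

    b_key_counts = Counter(key(r, fields_b) for r in full_b)
    a_keys = [key(r, fields_a) for r in sample_a]
    # Records of table A that intersect full table B.
    matched_a = sum(1 for k in a_keys if k in b_key_counts)
    # Records of full table B that intersect table A.
    matched_b = sum(b_key_counts[k] for k in set(a_keys) if k in b_key_counts)
    retention = matched_a / len(sample_a) if sample_a else 0.0
    expansion = matched_b / matched_a if matched_a else 0.0
    return retention, expansion

def select_special_combinations(candidates, min_retention=0.9, max_expansion=100):
    """candidates: list of (field_combination, retention, expansion) tuples.
    min_retention is an assumed threshold; max_expansion follows the text."""
    kept = [c for c in candidates if c[1] >= min_retention]
    if not kept:
        return []
    best = max(c[1] for c in kept)
    top = [c for c in kept if c[1] == best]
    if len(top) > 1:
        # Several combinations share the highest retention: drop abnormal expansion.
        top = [c for c in top if c[2] <= max_expansion]
    return [c[0] for c in top]

# Hypothetical sample data: A has field alpha, B has fields beta and gamma.
table_a = [{"alpha": 1}, {"alpha": 2}, {"alpha": 3}, {"alpha": 4}]
table_b = [
    {"beta": 1, "gamma": "x"},
    {"beta": 1, "gamma": "y"},
    {"beta": 2, "gamma": "z"},
    {"beta": 9, "gamma": "w"},
]
retention, expansion = retention_and_expansion(table_a, table_b, ["alpha"], ["beta"])
chosen = select_special_combinations(
    [("alpha=beta", 0.95, 2.0), ("alpha=delta", 0.95, 150.0), ("alpha=gamma", 0.5, 1.0)]
)
```

In the sample data, two of the four rows of A find a partner in B and those partners account for three rows of B, so the retention rate is 0.5 and the expansion rate is 1.5.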
Step S118: judge the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
Whether the associated tables are in an explicit or an implicit relationship is judged according to the time fields in the training data node tables. If a chronological relationship exists between the time fields of the tables, the relationship is explicit; otherwise, it is implicit. The dependency relationships among the data nodes are output, forming the association table of dependencies between data nodes shown in table 2.
Table 2 Association table of dependencies between data nodes
[Figures BDA0002132423520000071 and BDA0002132423520000081: table 2, provided as images in the original; not reproduced in this text]
Step S112: obtain the optimal flow directions among the data nodes according to the dependency relationships among the data nodes to form a data flow graph.
The basic and internal associated flow directions are formed on the basis of the dependency relationships among the data nodes, and the data flow graph is finally established automatically, completing the analysis of the data feature dependency model.
In the embodiment of the invention, historical service data in the production system is obtained, and a temporary service data table is established. Specifically, the service data of the entry data node is fetched according to the configuration table, and a new temporary table of the relevant service data is created from the training data node table according to the business criteria of the training service entry. Each data node is then extracted, and an association table of any data node is selected from the association table set associated with the initial table of the initial data node, according to the dependency relationships among the data nodes; the basic associated flow direction is judged according to the time fields of the initial table and the association table; and the internal associated flow direction between the association tables of any two data nodes in the association table set is judged. For example, any association table b_i is selected from the association table set {B} of the initial table a and sorted out, giving the following features: node a, the association field of node a, node b_i, the association field of node b_i, and the association type. Based on node a and node b_i, a temporary table of association results is generated in turn: the time fields of the initial table a and of the association table b_i are spliced together, and the initial data volume of the initial table a and the data volume of its association set with the association table b_i are counted. The basic associated flow direction is then judged: if, in the timing relationship, the confidence of the rule a → b_i exceeds the threshold, the business flow is considered to pass through table b_i; otherwise, it does not. The association table set {B} is traversed to obtain the basic associated flow direction a → {B'}.
The association tables in the association table set {B} that have association relationships are compared pairwise to judge the flow direction: the confidences of {B_i'} → {B_j'} (i ≠ j) and {B_j'} → {B_i'} (i ≠ j) are calculated respectively, and a flow direction whose confidence exceeds the threshold is taken as a high-confidence internal associated flow direction.
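The source does not spell out how the confidence of a rule such as a → b_i is computed. The Python sketch below assumes one plausible reading: the share of records in the source table that have a matching record in the target table with a timestamp that is not earlier. The function names, the dictionary representation (association key mapped to a comparable timestamp) and the 0.8 threshold are all hypothetical.

```python
def flow_confidence(source_times, target_times):
    """Assumed confidence of the rule source -> target.

    Both arguments map an association key to a comparable timestamp.
    """
    if not source_times:
        return 0.0
    followed = sum(
        1
        for k, ts in source_times.items()
        if k in target_times and ts <= target_times[k]
    )
    return followed / len(source_times)

def association_flows(initial_name, initial_times, associated, threshold=0.8):
    """Return (basic flows, internal flows) as lists of (start, end) pairs.

    associated maps each association table name to its key -> timestamp dict.
    The threshold value is an assumption.
    """
    # Basic associated flows: initial table a -> each association table b_i.
    base = [
        (initial_name, name)
        for name, times in associated.items()
        if flow_confidence(initial_times, times) > threshold
    ]
    reached = [end for _, end in base]  # the set {B'} the flow passes through
    # Internal associated flows: pairwise b_i -> b_j with i != j.
    internal = [
        (i, j)
        for i in reached
        for j in reached
        if i != j and flow_confidence(associated[i], associated[j]) > threshold
    ]
    return base, internal

# Hypothetical sample data; integers stand in for timestamps.
a = {"u1": 1, "u2": 1, "u3": 2}
tables = {
    "b1": {"u1": 2, "u2": 3, "u3": 3},
    "b2": {"u1": 5, "u2": 6, "u3": 7},
    "b3": {"u1": 0},
}
base_flows, internal_flows = association_flows("a", a, tables)
```

With this sample data the flow passes through b1 and b2 but not b3, and the only high-confidence internal flow is b1 → b2.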
In the embodiment of the invention, the optimal flow directions are further obtained from the basic associated flow directions and the internal associated flow directions, forming the data flow graph. Specifically: if a first basic associated flow direction can be realized through a second basic associated flow direction together with an internal associated flow direction, the second basic associated flow direction and the internal associated flow direction are retained, and the first basic associated flow direction is deleted; the basic and internal associated flow directions are traversed, and those finally retained are taken as the optimal flow directions; and the data flow graph is formed according to the optimal flow directions among the data nodes. For example, referring to fig. 5, if there is a basic associated flow direction such as a → b2, and the start node can also reach the end node through other nodes via basic and internal associated flow directions, such as a → b1 and b1 → b2, then the original basic associated flow direction a → b2 is eliminated, and the other basic and internal associated flow directions are merged and retained as the optimal flow directions. A business process is formed according to the optimal flow directions among the data nodes, with the result shown in table 3, and the data flow graph is then generated.
Table 3 Business process results table
Field name         Logical name           Data type
JOB_ID             Training item ID       NUMBER(10)
TASK_ID            Training task ID       NUMBER(10)
FLOW_ID            Process ID             NUMBER(10)
START_NODE_ID      Start node ID          NUMBER(10)
START_NODE_CODE    Start node encoding    VARCHAR2(30)
END_NODE_ID        End node ID            NUMBER(10)
END_NODE_CODE      End node encoding      VARCHAR2(30)
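The pruning rule for obtaining the optimal flow directions (drop a basic flow such as a → b2 when a → b1 and b1 → b2 already connect the same nodes) can be sketched in Python as a reachability check. The function names are hypothetical, and generalizing the rule from a two-step detour to any alternative path is an assumption of this sketch rather than something the source states.

```python
def reachable(edges, start, goal):
    """Depth-first search: can goal be reached from start along the given edges?"""
    adjacency = {}
    for s, t in edges:
        adjacency.setdefault(s, []).append(t)
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def optimal_flows(base, internal):
    """Delete each basic flow that the remaining flows can already realize."""
    kept_base = list(base)
    for edge in base:
        # All flows except the direct edge under test.
        others = [e for e in kept_base if e != edge] + list(internal)
        if reachable(others, edge[0], edge[1]):
            kept_base.remove(edge)
    return kept_base + list(internal)

# The example from the text: a -> b2 is redundant given a -> b1 and b1 -> b2.
flows = optimal_flows(
    base=[("a", "b1"), ("a", "b2")],
    internal=[("b1", "b2")],
)
```

Each retained pair corresponds to one start node and end node row of the business process results in table 3.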
According to the embodiment of the invention, a feature framework including the data expansion rate and retention rate is constructed to form the basic and internal associated flow directions, and the data flow graph is finally established automatically. The association relationships among production system data can thus be used subsequently, so that automatic monitoring replaces a purely manual audit mode, improving data quality audit efficiency and reducing labor cost.
Step S12: acquire real-time service data in the production system.
Specifically, the service data of the entry data node is fetched according to the configuration table, the real-time service data in the production system is obtained from the training data node table according to the business criteria of the service entry, and a new temporary table of the relevant service data is created.
Step S13: acquire the data flow characteristics of each data node in the service data according to the data flow graph.
Specifically, the data flow characteristics of each data node are counted according to the data flow graph formed by training, where the data flow characteristics include the start node data volume and the end node data volume.
Step S14: compare the data flow characteristics of the data nodes in the service data against the data flow graph for consistency, output an abnormal data detection result, and locate the node at which the abnormal data occurred.
The node data volumes of the data nodes along the flows of the service data and the data flow graph are compared, the missing data volume is calculated, the positions of the data nodes are marked, and a data missing magnitude list and a data missing detail list are output, as shown in table 4 and table 5 respectively.
Table 4: Data missing magnitude list
| Field name | Logical name | Data type |
| --- | --- | --- |
| JOB_ID | Training item ID | NUMBER(10) |
| TM_INTRVL_ID | Training task ID | VARCHAR2(30) |
| FLOW_ID | Process ID | NUMBER(10) |
| START_NODE_CODE | Start node encoding | VARCHAR2(30) |
| END_NODE_CODE | End node encoding | VARCHAR2(30) |
| V_START_CNT | Start node data volume | NUMBER(10) |
| V_END_CNT | End node data volume | NUMBER(10) |
| GRP_EXPR | Magnitude of difference | VARCHAR2(1000) |
Table 5: Data missing detail list
[Table 5 is reproduced as an image in the original publication; its contents are not available in this text.]
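As a non-limiting illustration, the comparison of start node and end node data volumes in step S14 can be sketched in Python. The sketch assumes that each record is identified by a key (for example a user ID) and that the records seen at each node are available as a set of keys. The function name `compare_flow_volumes`, the `MISSING_CNT` column and the columns of the detail rows are hypothetical, since the layout of Table 5 is not available in this text.

```python
def compare_flow_volumes(flows, node_records):
    """Build a missing-magnitude list and a missing-detail list for each flow.

    flows: iterable of (flow_id, start_node_code, end_node_code)
    node_records: dict mapping node code -> set of record keys seen at that node
    """
    magnitude_rows = []
    detail_rows = []
    for flow_id, start_code, end_code in flows:
        start_keys = node_records.get(start_code, set())
        end_keys = node_records.get(end_code, set())
        # Records that reached the start node but never arrived at the end node.
        missing = sorted(start_keys - end_keys)
        magnitude_rows.append({
            "FLOW_ID": flow_id,
            "START_NODE_CODE": start_code,
            "END_NODE_CODE": end_code,
            "V_START_CNT": len(start_keys),
            "V_END_CNT": len(end_keys),
            "MISSING_CNT": len(missing),  # hypothetical column
        })
        detail_rows.extend(
            {"FLOW_ID": flow_id, "NODE_CODE": end_code, "RECORD_KEY": key}
            for key in missing
        )
    return magnitude_rows, detail_rows


flows = [(1, "a", "b1"), (2, "b1", "b2")]
node_records = {
    "a": {"u1", "u2", "u3"},
    "b1": {"u1", "u2", "u3"},
    "b2": {"u1", "u3"},
}
magnitude, detail = compare_flow_volumes(flows, node_records)
```

Here record `u2` reaches node b1 but not node b2, so the second flow reports one missing record and the detail list names it.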
Further, an abnormal data detection result is output according to the data missing magnitude list and the data missing detail list, and the node at which the abnormal data occurs is located. The abnormal data detection results are recorded in the form shown in Table 6.
Table 6: Abnormal link detection results
| Field name | Logical name | Data type |
| --- | --- | --- |
| USER_ID | User ID | VARCHAR2(30) |
| JOB_ID | Training item ID | NUMBER(10) |
| TASK_ID | Training task ID | NUMBER(10) |
| NODE_ID | Data node ID | NUMBER(10) |
| NODE_CODE | Data node encoding | VARCHAR2(30) |
| GRP_ROWNO | Number of user records | NUMBER(10) |
| GRP_EXPR | Record type of user | VARCHAR2(1000) |
| ERR_FLAG | Whether the user is abnormal at the node | NUMBER(1) |
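As a non-limiting illustration, locating the node at which a record first goes missing can be sketched in Python. The sketch assumes that the flow directions are listed upstream first and that the records seen at each node are available as sets of user IDs. The function name `locate_anomaly_nodes` and the reduced row layout are hypothetical; only the column names USER_ID, NODE_CODE and ERR_FLAG are taken from Table 6.

```python
def locate_anomaly_nodes(ordered_flows, node_records):
    """Return {user_id: node_code} for the first node at which each user goes missing.

    ordered_flows: list of (start_node_code, end_node_code), upstream flows first
    node_records: dict mapping node code -> set of user IDs seen at that node
    """
    first_missing = {}
    for start_code, end_code in ordered_flows:
        start_keys = node_records.get(start_code, set())
        end_keys = node_records.get(end_code, set())
        for user_id in start_keys - end_keys:
            # Keep only the earliest node; later flows never overwrite it.
            first_missing.setdefault(user_id, end_code)
    return first_missing


ordered_flows = [("a", "b1"), ("b1", "b2"), ("b2", "c")]
node_records = {
    "a": {"u1", "u2"},
    "b1": {"u1", "u2"},
    "b2": {"u1"},
    "c": {"u1"},
}
located = locate_anomaly_nodes(ordered_flows, node_records)
rows = [
    {"USER_ID": user_id, "NODE_CODE": node_code, "ERR_FLAG": 1}
    for user_id, node_code in sorted(located.items())
]
```

Because user `u2` is already absent from node b2, the flow b2 → c does not report it again, so the anomaly is attributed to b2 rather than to every downstream node.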
The embodiment of the invention breaks through the limitations that manpower and complex processes impose on the traditional approach. By automatically sorting out the data flow patterns under different scenarios and services, it monitors and compares data quality across all services, all scenarios and the whole process, without being limited to manually established check points. It automatically detects abnormal service operation in the production process from anomalies in the data flow, can quickly locate the abnormal link, and provides the cause of the abnormal service operation, namely the point at which the abnormal data link occurs.
In the embodiment of the invention, a data flow map among data nodes is acquired; real-time service data in the production system is acquired; the data flow characteristics of each data node in the service data are acquired according to the data flow map; and the data flow characteristics of the data nodes in the service data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurs is located. By using the association relations among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services, all scenarios and the whole process, and the point at which an abnormal data link occurs is provided, which improves data quality monitoring efficiency and reduces labor cost.
Fig. 6 shows a schematic structural diagram of a data quality monitoring device according to an embodiment of the present invention. As shown in fig. 6, the data quality monitoring apparatus includes: a flow acquisition unit 601, a data acquisition unit 602, a feature extraction unit 603, and an abnormality detection unit 604. Wherein:
the flow acquiring unit 601 is configured to acquire a data flow map among data nodes; the data obtaining unit 602 is configured to obtain real-time service data in the production system; the feature extraction unit 603 is configured to obtain the data flow characteristics of each data node in the service data according to the data flow map; the anomaly detection unit 604 is configured to compare the data flow characteristics of the data nodes in the service data with the data flow map for consistency, output an abnormal data detection result, and locate the node at which the abnormal data occurs.
In an optional manner, the flow acquiring unit 601 is configured to: performing data characteristic analysis on historical service data in the production system to obtain the dependency relationship among data nodes; and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In an optional manner, the flow acquiring unit 601 is configured to: acquiring historical service data in the production system, and establishing a training data node table; collecting field characteristics for each data node according to the training data node table, and acquiring a general field combination of each data node; for any two data nodes, acquiring the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations; and judging the dependency relationship between the two data nodes under the optimal field combination according to the time field in the training data node table.
In an optional manner, the flow acquiring unit 601 is further configured to: extracting a preset amount of service data under the general field combination of any data node, matching it against the general field combination of another data node, and counting the retention rate and the expansion rate; selecting the field combination with the highest retention rate; when several field combinations share the highest retention rate, removing the field combinations with abnormal expansion rates to form a special field combination; splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination; and iterating over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
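As a non-limiting illustration, the counting of the retention rate and the expansion rate for candidate field combinations can be sketched in Python. The definitions used here are assumptions made for the sketch: the retention rate is taken as the share of sampled rows that find at least one match in the other table, and the expansion rate as the average number of matching rows per matched sample row. The function name `match_rates`, the sample tables and the threshold `MAX_EXPANSION` are hypothetical.

```python
from collections import Counter


def match_rates(sample_rows, other_rows, fields):
    """Return (retention_rate, expansion_rate) for joining the sample on `fields`."""
    def key(row):
        return tuple(row[f] for f in fields)

    other_counts = Counter(key(r) for r in other_rows)
    matched = 0  # sample rows with at least one match in the other table
    joined = 0   # total rows the join would produce
    for row in sample_rows:
        n = other_counts.get(key(row), 0)
        if n:
            matched += 1
            joined += n
    retention = matched / len(sample_rows) if sample_rows else 0.0
    expansion = joined / matched if matched else 0.0
    return retention, expansion


orders = [
    {"user_id": 1, "region": "A"}, {"user_id": 2, "region": "A"},
    {"user_id": 3, "region": "B"}, {"user_id": 4, "region": "B"},
]
bills = [
    {"user_id": 1, "region": "A"}, {"user_id": 2, "region": "A"},
    {"user_id": 3, "region": "B"}, {"user_id": 4, "region": "B"},
]

candidates = [("user_id",), ("region",)]
rates = {c: match_rates(orders, bills, c) for c in candidates}
best_retention = max(r for r, _ in rates.values())
MAX_EXPANSION = 1.2  # hypothetical threshold for an "abnormal" expansion rate
selected = [
    c for c, (r, e) in rates.items()
    if r == best_retention and e <= MAX_EXPANSION
]
```

Both candidates retain every sampled row, but matching on `region` doubles the number of rows, so under the assumed threshold only `user_id` is kept. The subsequent splicing and iteration over inter-table field combinations is not shown.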
In an optional manner, the flow acquiring unit 601 is further configured to: acquiring historical service data in a production system, and establishing a temporary service data table; selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes; judging a basic association flow direction according to the time fields of the initial table and the association table; judging the internal association flow direction of the association tables of any two data nodes in the association table set; and obtaining the optimal flow direction according to the basic associated flow direction and the internal associated flow direction to form a data flow map.
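As a non-limiting illustration, judging a flow direction from the time fields of two tables can be sketched in Python. The sketch assumes that the same record key appears in both tables and that the table whose time field is earlier for most shared records is the upstream one; this majority rule and the function name `infer_flow_direction` are assumptions of the sketch, not a statement of the rule used in the embodiment.

```python
from datetime import datetime


def infer_flow_direction(times_a, times_b, node_a, node_b):
    """Return (upstream, downstream) by majority vote over shared record keys.

    times_a, times_b: dict mapping record key -> timestamp in each table.
    Returns None when there is no overlap or the vote is tied.
    """
    a_first = b_first = 0
    for key in times_a.keys() & times_b.keys():
        if times_a[key] < times_b[key]:
            a_first += 1
        elif times_b[key] < times_a[key]:
            b_first += 1
    if a_first == b_first:
        return None
    return (node_a, node_b) if a_first > b_first else (node_b, node_a)


initial_times = {
    "u1": datetime(2019, 7, 1, 9, 0),
    "u2": datetime(2019, 7, 1, 9, 5),
}
associated_times = {
    "u1": datetime(2019, 7, 1, 9, 30),
    "u2": datetime(2019, 7, 1, 9, 45),
}
direction = infer_flow_direction(initial_times, associated_times, "a", "b1")
```

The same comparison could be applied between any two association tables to obtain an internal associated flow direction.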
In an optional manner, the flow acquiring unit 601 is further configured to: if a first basic associated flow direction can be realized through a second basic associated flow direction and an internal associated flow direction, retaining the second basic associated flow direction and the internal associated flow direction, and deleting the first basic associated flow direction; traversing the basic associated flow directions and the internal associated flow directions, and taking those finally retained as the optimal flow direction; and forming the data flow map according to the optimal flow direction among the data nodes.
In an optional manner, the data flow characteristics include a start node data volume and an end node data volume; the anomaly detection unit 604 is configured to: comparing the node data volumes of the data nodes through which the service data flows against the data flow map, calculating the missing data volume, marking the positions of the data nodes, and outputting a data missing magnitude list and a data missing detail list; and outputting an abnormal data detection result according to the data missing magnitude list and the data missing detail list, and locating the node at which the abnormal data occurs.
In the embodiment of the invention, a data flow map among data nodes is acquired; real-time service data in the production system is acquired; the data flow characteristics of each data node in the service data are acquired according to the data flow map; and the data flow characteristics of the data nodes in the service data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurs is located. By using the association relations among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services, all scenarios and the whole process, and the point at which an abnormal data link occurs is provided, which improves data quality monitoring efficiency and reduces labor cost.
The embodiment of the invention provides a non-volatile computer storage medium storing at least one executable instruction, and the executable instruction can perform the data quality monitoring method in any of the above method embodiments.
The executable instructions may be specifically configured to cause the processor to:
acquiring a data flow map among data nodes;
acquiring real-time service data in a production system;
acquiring data flow characteristics of each data node in the service data according to the data flow map;
and comparing the data flow characteristics of the data nodes in the service data with the data flow map for consistency, outputting an abnormal data detection result, and locating the node at which the abnormal data occurs.
In an alternative, the executable instructions cause the processor to:
performing data characteristic analysis on historical service data in the production system to obtain the dependency relationship among data nodes;
and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In an alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a training data node table;
respectively collecting field characteristics according to the data node table, and acquiring a general field combination of each data node;
for any two data nodes, acquiring the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations;
and judging the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
In an alternative, the executable instructions cause the processor to:
extracting a preset amount of service data under the general field combination of any data node, matching it against the general field combination of another data node, and counting the retention rate and the expansion rate;
selecting the field combination with the highest retention rate;
when several field combinations share the highest retention rate, removing the field combinations with abnormal expansion rates to form a special field combination;
splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination;
and iterating over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
In an alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a temporary service data table;
selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes;
judging a basic association flow direction according to the time fields of the initial table and the association table;
judging the internal association flow direction of the association tables of any two data nodes in the association table set;
and obtaining the optimal flow direction according to the basic associated flow direction and the internal associated flow direction to form a data flow map.
In an alternative, the executable instructions cause the processor to:
if a first basic associated flow direction can be realized through a second basic associated flow direction and an internal associated flow direction, retaining the second basic associated flow direction and the internal associated flow direction, and deleting the first basic associated flow direction;
traversing the basic associated flow directions and the internal associated flow directions, and taking those finally retained as the optimal flow direction;
and forming the data flow map according to the optimal flow direction among the data nodes.
In an optional manner, the data flow characteristics include a start node data volume and an end node data volume; the executable instructions cause the processor to:
comparing the node data volumes of the data nodes through which the service data flows against the data flow map, calculating the missing data volume, marking the positions of the data nodes, and outputting a data missing magnitude list and a data missing detail list;
and outputting an abnormal data detection result according to the data missing magnitude list and the data missing detail list, and locating the node at which the abnormal data occurs.
In the embodiment of the invention, a data flow map among data nodes is acquired; real-time service data in the production system is acquired; the data flow characteristics of each data node in the service data are acquired according to the data flow map; and the data flow characteristics of the data nodes in the service data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurs is located. By using the association relations among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services, all scenarios and the whole process, and the point at which an abnormal data link occurs is provided, which improves data quality monitoring efficiency and reduces labor cost.
Embodiments of the present invention provide a computer program product comprising a computer program stored on a computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the data quality monitoring method of any of the above-mentioned method embodiments.
The executable instructions may be specifically configured to cause the processor to:
acquiring a data flow map among data nodes;
acquiring real-time service data in a production system;
acquiring data flow characteristics of each data node in the service data according to the data flow map;
and comparing the data flow characteristics of the data nodes in the service data with the data flow map for consistency, outputting an abnormal data detection result, and locating the node at which the abnormal data occurs.
In an alternative, the executable instructions cause the processor to:
performing data characteristic analysis on historical service data in the production system to obtain the dependency relationship among data nodes;
and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In an alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a training data node table;
respectively collecting field characteristics according to the data node table, and acquiring a general field combination of each data node;
for any two data nodes, acquiring the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations;
and judging the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
In an alternative, the executable instructions cause the processor to:
extracting a preset amount of service data under the general field combination of any data node, matching it against the general field combination of another data node, and counting the retention rate and the expansion rate;
selecting the field combination with the highest retention rate;
when several field combinations share the highest retention rate, removing the field combinations with abnormal expansion rates to form a special field combination;
splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination;
and iterating over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
In an alternative, the executable instructions cause the processor to:
acquiring historical service data in a production system, and establishing a temporary service data table;
selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes;
judging a basic association flow direction according to the time fields of the initial table and the association table;
judging the internal association flow direction of the association tables of any two data nodes in the association table set;
and obtaining the optimal flow direction according to the basic associated flow direction and the internal associated flow direction to form a data flow map.
In an alternative, the executable instructions cause the processor to:
if a first basic associated flow direction can be realized through a second basic associated flow direction and an internal associated flow direction, retaining the second basic associated flow direction and the internal associated flow direction, and deleting the first basic associated flow direction;
traversing the basic associated flow directions and the internal associated flow directions, and taking those finally retained as the optimal flow direction;
and forming the data flow map according to the optimal flow direction among the data nodes.
In an optional manner, the data flow characteristics include a start node data volume and an end node data volume; the executable instructions cause the processor to:
comparing the node data volumes of the data nodes through which the service data flows against the data flow map, calculating the missing data volume, marking the positions of the data nodes, and outputting a data missing magnitude list and a data missing detail list;
and outputting an abnormal data detection result according to the data missing magnitude list and the data missing detail list, and locating the node at which the abnormal data occurs.
In the embodiment of the invention, a data flow map among data nodes is acquired; real-time service data in the production system is acquired; the data flow characteristics of each data node in the service data are acquired according to the data flow map; and the data flow characteristics of the data nodes in the service data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurs is located. By using the association relations among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services, all scenarios and the whole process, and the point at which an abnormal data link occurs is provided, which improves data quality monitoring efficiency and reduces labor cost.
Fig. 7 is a schematic structural diagram of a computing device according to an embodiment of the present invention, and a specific embodiment of the present invention does not limit a specific implementation of the device.
As shown in fig. 7, the computing device may include: a processor 702, a communication interface 704, a memory 706, and a communication bus 708.
Wherein: the processor 702, the communication interface 704, and the memory 706 communicate with each other via the communication bus 708. The communication interface 704 is used for communicating with network elements of other devices, such as clients or other servers. The processor 702 is configured to execute the program 710, and may specifically execute the relevant steps in the above-described data quality monitoring method embodiment.
In particular, the program 710 may include program code that includes computer operating instructions.
The processor 702 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 706 stores the program 710. The memory 706 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 710 may specifically be used to cause the processor 702 to perform the following operations:
acquiring a data flow map among data nodes;
acquiring real-time service data in a production system;
acquiring data flow characteristics of each data node in the service data according to the data flow map;
and comparing the data flow characteristics of the data nodes in the service data with the data flow map for consistency, outputting an abnormal data detection result, and locating the node at which the abnormal data occurs.
In an alternative, the program 710 causes the processor to:
performing data characteristic analysis on historical service data in the production system to obtain the dependency relationship among data nodes;
and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
In an alternative, the program 710 causes the processor to:
acquiring historical service data in a production system, and establishing a training data node table;
respectively collecting field characteristics according to the data node table, and acquiring a general field combination of each data node;
for any two data nodes, acquiring the optimal field combination of the two data nodes by applying the expansion rate and the retention rate to the general field combinations;
and judging the dependency relationship between any two data nodes in the optimal field combination according to the time field in the training data node table.
In an alternative, the program 710 causes the processor to:
extracting a preset amount of service data under the general field combination of any data node, matching it against the general field combination of another data node, and counting the retention rate and the expansion rate;
selecting the field combination with the highest retention rate;
when several field combinations share the highest retention rate, removing the field combinations with abnormal expansion rates to form a special field combination;
splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination;
and iterating over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
In an alternative, the program 710 causes the processor to:
acquiring historical service data in a production system, and establishing a temporary service data table;
selecting an association table of any data node from an association table set associated with an initial table of the initial data node according to the dependency relationship among the data nodes;
judging a basic association flow direction according to the time fields of the initial table and the association table;
judging the internal association flow direction of the association tables of any two data nodes in the association table set;
and obtaining the optimal flow direction according to the basic associated flow direction and the internal associated flow direction to form a data flow map.
In an alternative, the program 710 causes the processor to:
if a first basic associated flow direction can be realized through a second basic associated flow direction and an internal associated flow direction, retaining the second basic associated flow direction and the internal associated flow direction, and deleting the first basic associated flow direction;
traversing the basic associated flow directions and the internal associated flow directions, and taking those finally retained as the optimal flow direction;
and forming the data flow map according to the optimal flow direction among the data nodes.
In an optional manner, the data flow characteristics include a start node data volume and an end node data volume; the program 710 causes the processor to:
comparing the node data volumes of the data nodes through which the service data flows against the data flow map, calculating the missing data volume, marking the positions of the data nodes, and outputting a data missing magnitude list and a data missing detail list;
and outputting an abnormal data detection result according to the data missing magnitude list and the data missing detail list, and locating the node at which the abnormal data occurs.
In the embodiment of the invention, a data flow map among data nodes is acquired; real-time service data in the production system is acquired; the data flow characteristics of each data node in the service data are acquired according to the data flow map; and the data flow characteristics of the data nodes in the service data are compared with the data flow map for consistency, an abnormal data detection result is output, and the node at which the abnormal data occurs is located. By using the association relations among production system data, the data flow patterns under different scenarios and services are sorted out automatically, data quality is monitored and compared across all services, all scenarios and the whole process, and the point at which an abnormal data link occurs is provided, which improves data quality monitoring efficiency and reduces labor cost.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.

Claims (10)

1. A method for monitoring data quality, the method comprising:
acquiring a data flow map among data nodes;
acquiring real-time service data in a production system;
acquiring data flow characteristics of each data node in the service data according to the data flow map;
and comparing the data flow characteristics of the data nodes in the service data with the data flow map for consistency, outputting an abnormal data detection result, and locating the node at which the abnormal data occurs.
2. The method of claim 1, wherein acquiring the data flow map among data nodes comprises:
performing data characteristic analysis on historical service data in the production system to obtain the dependency relationship among data nodes;
and obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form a data flow map.
3. The method according to claim 2, wherein performing data characteristic analysis on the historical service data to obtain the dependency relationship among the data nodes comprises:
acquiring historical service data in the production system, and establishing a training data node table;
collecting field characteristics according to the training data node table, and acquiring a general field combination of each data node;
for any two data nodes, acquiring the optimal field combination of the two data nodes by applying an expansion rate and a retention rate to the general field combinations;
and judging the dependency relationship between the two data nodes in the optimal field combination according to the time field in the training data node table.
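The last step of claim 3 can be illustrated with a minimal sketch. It assumes that records of two data nodes have already been matched on the optimal field combination and that each matched pair carries one time-field value per node. The function name `judge_dependency`, the majority-vote rule, and the sample timestamps are hypothetical and are not taken from the claims.

```python
from datetime import datetime


def judge_dependency(matched_times):
    """Infer which of two data nodes is upstream from paired time-field values.

    matched_times: list of (time_in_node_a, time_in_node_b) tuples, one per
    record pair matched on the optimal field combination.
    Assumption of this sketch: the node whose time field is earlier in the
    majority of pairs is treated as the upstream node.
    """
    a_first = sum(1 for ta, tb in matched_times if ta < tb)
    b_first = sum(1 for ta, tb in matched_times if tb < ta)
    if a_first > b_first:
        return "a->b"
    if b_first > a_first:
        return "b->a"
    return None  # no evidence either way


# Hypothetical matched pairs: node a is earlier in two of three pairs.
pairs = [
    (datetime(2019, 7, 16, 9, 0), datetime(2019, 7, 16, 9, 5)),
    (datetime(2019, 7, 16, 10, 0), datetime(2019, 7, 16, 10, 2)),
    (datetime(2019, 7, 16, 11, 0), datetime(2019, 7, 16, 10, 59)),
]
direction = judge_dependency(pairs)
print(direction)
```

A vote over many pairs is used here so that a few out-of-order timestamps do not flip the inferred direction; other decision rules would fit the claim wording equally well.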
4. The method according to claim 3, wherein acquiring the optimal field combination of any two data nodes by applying the expansion rate and the retention rate to the general field combinations comprises:
extracting a preset amount of service data from the general field combination of one data node, matching it against the general field combination of another data node, and counting the retention rate and the expansion rate;
combining fields with the highest retention rate;
when a plurality of field combinations with the highest retention rate exist, removing the field combinations with abnormal expansion rates to form special field combinations;
splicing the special field combination with the general field combination of the other data node to form a new inter-table field combination;
and iterating over the inter-table field combination and the special field combination until no new special field combination is generated between the tables, thereby obtaining the optimal field combination.
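The scoring and filtering steps of claim 4 can be sketched as follows. The claims do not define the two rates at this point, so the sketch assumes that the retention rate is the share of sampled rows that find at least one match in the other table, and that the expansion rate is the average number of matches per matched row. The function names, the `max_expansion` threshold, and the sample tables are hypothetical, and the splicing and iteration steps are left out.

```python
from collections import Counter


def match_rates(sample_rows, other_rows, fields):
    """Return (retention_rate, expansion_rate) for matching on `fields`.

    Assumed definitions (not taken from the claims):
      retention rate = matched sample rows / all sample rows
      expansion rate = matching rows in the other table / matched sample rows
    """
    def key(row):
        return tuple(row[f] for f in fields)

    other_counts = Counter(key(r) for r in other_rows)
    matched = [other_counts[key(r)] for r in sample_rows if other_counts[key(r)] > 0]
    retention = len(matched) / len(sample_rows) if sample_rows else 0.0
    expansion = sum(matched) / len(matched) if matched else 0.0
    return retention, expansion


def best_field_combinations(sample_rows, other_rows, candidates, max_expansion=1.5):
    """Keep the candidates with the highest retention rate; on a tie, drop
    those whose expansion rate exceeds the (hypothetical) threshold."""
    scored = {c: match_rates(sample_rows, other_rows, c) for c in candidates}
    top = max(retention for retention, _ in scored.values())
    best = [c for c, (retention, _) in scored.items() if retention == top]
    if len(best) > 1:
        filtered = [c for c in best if scored[c][1] <= max_expansion]
        best = filtered or best  # fall back if every candidate is filtered out
    return best


# Hypothetical tables of two data nodes.
orders = [{"user_id": 1, "region": "A"}, {"user_id": 2, "region": "A"},
          {"user_id": 3, "region": "B"}, {"user_id": 4, "region": "B"}]
bills = [{"user_id": 1, "region": "A"}, {"user_id": 2, "region": "A"},
         {"user_id": 3, "region": "B"}, {"user_id": 4, "region": "B"}]

chosen = best_field_combinations(orders, bills, [("user_id",), ("region",)])
print(chosen)
```

In this sample both candidates retain every row, but matching on `region` doubles the row count, so only `user_id` survives the expansion filter.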
5. The method according to claim 2, wherein obtaining the optimal flow direction among the data nodes according to the dependency relationship among the data nodes to form the data flow graph comprises:
acquiring historical service data in the production system, and establishing a temporary service data table;
selecting an association table of any data node from an association table set associated with an initial table of an initial data node according to the dependency relationship among the data nodes;
judging a basic association flow direction according to the time fields of the initial table and the association table;
judging an internal association flow direction between the association tables of any two data nodes in the association table set;
and obtaining the optimal flow direction according to the basic association flow direction and the internal association flow direction to form the data flow graph.
6. The method of claim 5, wherein obtaining the optimal flow direction according to the basic association flow direction and the internal association flow direction to form the data flow graph comprises:
if a first basic association flow direction can be realized through a second basic association flow direction and the internal association flow direction, retaining the second basic association flow direction and the internal association flow direction, and deleting the first basic association flow direction;
traversing the basic association flow directions and the internal association flow directions, and taking the basic association flow directions and the internal association flow directions that finally remain as the optimal flow direction;
and forming the data flow graph according to the optimal flow direction among the data nodes.
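The pruning described in claim 6 resembles removing redundant edges from a directed graph. The sketch below assumes the flow directions form an acyclic graph and generalizes the claim slightly: a basic flow from the initial node to a table is dropped whenever that table can be reached from another basic target through one or more internal flows. The function name `optimal_flow` and the table names are hypothetical.

```python
def optimal_flow(start, base_targets, internal_edges):
    """Drop basic flows start->t that are already implied by another basic
    flow start->a followed by internal flows a->...->t.

    Assumes the internal flows contain no cycles (an assumption of this
    sketch, plausible when directions are derived from time fields).
    """
    successors = {}
    for src, dst in internal_edges:
        successors.setdefault(src, set()).add(dst)

    def reachable(src):
        # Depth-first search over internal flows only.
        seen, stack = set(), [src]
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    targets = set(base_targets)
    redundant = set()
    for a in base_targets:
        redundant |= (reachable(a) & targets) - {a}

    kept_base = [(start, t) for t in base_targets if t not in redundant]
    return kept_base + list(internal_edges)


# Hypothetical tables: orders feeds billing, shipping and invoice directly,
# but invoice is also reachable through billing.
edges = optimal_flow(
    "orders",
    ["billing", "shipping", "invoice"],
    [("billing", "invoice")],
)
print(edges)
```

The direct flow from `orders` to `invoice` is removed because the path through `billing` already realizes it, which keeps each remaining edge attributable to a single processing step.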
7. The method of claim 1, wherein the data flow characteristics include a starting node data volume and an ending node data volume;
and wherein comparing the data flow characteristics of the data nodes in the service data against the data flow graph for consistency, outputting the abnormal data detection result, and locating the abnormal occurrence node of the abnormal data comprises:
comparing the node data volumes of the data nodes along the flows in the service data and the data flow graph, calculating the missing data volume, marking the positions of the data nodes, and outputting a data missing magnitude list and a data missing detail list;
and outputting the abnormal data detection result according to the data missing magnitude list and the data missing detail list, and locating the abnormal occurrence node of the abnormal data.
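The comparison in claim 7 can be sketched under a simple assumption: every record present at the starting node of a flow is expected to appear at its ending node, and records are identified by an ID. The function name `detect_missing`, the record IDs, and the node names are hypothetical.

```python
def detect_missing(flow_edges, records_by_node):
    """Compare record sets along each flow of the data flow graph.

    Returns two lists:
      magnitude: (start_node, end_node, missing_count) per affected flow
      detail:    (start_node, end_node, record_id) per missing record
    Assumption of this sketch: a record seen at the starting node of a flow
    should also be present at the ending node.
    """
    magnitude, detail = [], []
    for start, end in flow_edges:
        upstream = records_by_node.get(start, set())
        downstream = records_by_node.get(end, set())
        missing = sorted(upstream - downstream)
        if missing:
            magnitude.append((start, end, len(missing)))
            detail.extend((start, end, record_id) for record_id in missing)
    return magnitude, detail


# Hypothetical real-time snapshot: record 2 never reaches the invoice node.
flows = [("order", "billing"), ("billing", "invoice")]
snapshot = {
    "order": {1, 2, 3},
    "billing": {1, 2, 3},
    "invoice": {1, 3},
}
magnitude_list, detail_list = detect_missing(flows, snapshot)
print(magnitude_list)
print(detail_list)
```

Because the loss first shows up on the flow from `billing` to `invoice`, that flow's ending node is where this sketch would locate the abnormality.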
8. A data quality monitoring apparatus, the apparatus comprising:
a flow acquiring unit, configured to acquire a data flow graph among data nodes;
a data acquisition unit, configured to acquire real-time service data in a production system;
a characteristic extraction unit, configured to acquire the data flow characteristics of each data node in the service data according to the data flow graph;
and an abnormality detection unit, configured to compare the data flow characteristics of the data nodes in the service data against the data flow graph for consistency, output an abnormal data detection result, and locate an abnormal occurrence node of the abnormal data.
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the steps of the data quality monitoring method according to any one of claims 1-7.
10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform the steps of the data quality monitoring method according to any one of claims 1-7.
CN201910642686.7A 2019-07-16 2019-07-16 Data quality monitoring method, device, computing equipment and computer storage medium Active CN112241443B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910642686.7A CN112241443B (en) 2019-07-16 2019-07-16 Data quality monitoring method, device, computing equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910642686.7A CN112241443B (en) 2019-07-16 2019-07-16 Data quality monitoring method, device, computing equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN112241443A true CN112241443A (en) 2021-01-19
CN112241443B CN112241443B (en) 2023-11-21

Family

ID=74167104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910642686.7A Active CN112241443B (en) 2019-07-16 2019-07-16 Data quality monitoring method, device, computing equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN112241443B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342861A (en) * 2021-07-06 2021-09-03 云南中烟工业有限责任公司 Data management method and device in business scene

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101232538A (en) * 2007-12-28 2008-07-30 华为技术有限公司 Apparatus and method for merging business data
CN104135395A (en) * 2014-03-10 2014-11-05 腾讯科技(深圳)有限公司 Method and system of monitoring data transmission quality in IDC (Internet Data Center) network
CN105045832A (en) * 2015-06-30 2015-11-11 北京奇艺世纪科技有限公司 Data acquisition method and apparatus
CN107426265A (en) * 2016-03-11 2017-12-01 阿里巴巴集团控股有限公司 The synchronous method and apparatus of data consistency
CN107577717A (en) * 2017-08-09 2018-01-12 阿里巴巴集团控股有限公司 A kind of processing method, device and server for ensureing data consistency
CN107707482A (en) * 2017-09-29 2018-02-16 新华三技术有限公司 A kind of data smoothing method and apparatus
CN107784088A (en) * 2017-09-30 2018-03-09 杭州博世数据网络有限公司 The knowledge mapping construction method of knowledge based point annexation
CN108243046A (en) * 2016-12-27 2018-07-03 中国移动通信集团浙江有限公司 A kind of evaluation the quality method and device based on data auditing
CN108833184A (en) * 2018-06-29 2018-11-16 腾讯科技(深圳)有限公司 Service fault localization method, device, computer equipment and storage medium
CN108874907A (en) * 2018-05-25 2018-11-23 北京明略软件系统有限公司 A kind of data query method and apparatus, computer readable storage medium
CN108959564A (en) * 2018-07-04 2018-12-07 玖富金科控股集团有限责任公司 Data warehouse metadata management method, readable storage medium storing program for executing and computer equipment
CN108984284A (en) * 2018-06-26 2018-12-11 杭州比智科技有限公司 DAG method for scheduling task and device based on off-line calculation platform
CN109213747A (en) * 2018-08-08 2019-01-15 麒麟合盛网络技术股份有限公司 A kind of data managing method and device
CN109308602A (en) * 2018-08-15 2019-02-05 平安科技(深圳)有限公司 Operation flow data processing method, device, computer equipment and storage medium
CN109408535A (en) * 2018-09-28 2019-03-01 中国平安财产保险股份有限公司 Big data quantity matching process, device, computer equipment and storage medium
CN109766334A (en) * 2019-01-07 2019-05-17 国网湖南省电力有限公司 Processing method and system for electrical equipment online supervision abnormal data
CN109816397A (en) * 2018-12-03 2019-05-28 北京奇艺世纪科技有限公司 A kind of fraud method of discrimination, device and storage medium
CN109845320A (en) * 2018-02-09 2019-06-04 Oppo广东移动通信有限公司 The method and apparatus carried out data transmission based on quality of service
CN109951547A (en) * 2019-03-15 2019-06-28 百度在线网络技术(北京)有限公司 Transactions requests method for parallel processing, device, equipment and medium

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101232538A (en) * 2007-12-28 2008-07-30 华为技术有限公司 Apparatus and method for merging business data
CN104135395A (en) * 2014-03-10 2014-11-05 腾讯科技(深圳)有限公司 Method and system of monitoring data transmission quality in IDC (Internet Data Center) network
CN105045832A (en) * 2015-06-30 2015-11-11 北京奇艺世纪科技有限公司 Data acquisition method and apparatus
CN107426265A (en) * 2016-03-11 2017-12-01 阿里巴巴集团控股有限公司 The synchronous method and apparatus of data consistency
CN108243046A (en) * 2016-12-27 2018-07-03 中国移动通信集团浙江有限公司 A kind of evaluation the quality method and device based on data auditing
CN107577717A (en) * 2017-08-09 2018-01-12 阿里巴巴集团控股有限公司 A kind of processing method, device and server for ensureing data consistency
CN107707482A (en) * 2017-09-29 2018-02-16 新华三技术有限公司 A kind of data smoothing method and apparatus
CN107784088A (en) * 2017-09-30 2018-03-09 杭州博世数据网络有限公司 The knowledge mapping construction method of knowledge based point annexation
CN109845320A (en) * 2018-02-09 2019-06-04 Oppo广东移动通信有限公司 The method and apparatus carried out data transmission based on quality of service
CN108874907A (en) * 2018-05-25 2018-11-23 北京明略软件系统有限公司 A kind of data query method and apparatus, computer readable storage medium
CN108984284A (en) * 2018-06-26 2018-12-11 杭州比智科技有限公司 DAG method for scheduling task and device based on off-line calculation platform
CN108833184A (en) * 2018-06-29 2018-11-16 腾讯科技(深圳)有限公司 Service fault localization method, device, computer equipment and storage medium
CN108959564A (en) * 2018-07-04 2018-12-07 玖富金科控股集团有限责任公司 Data warehouse metadata management method, readable storage medium storing program for executing and computer equipment
CN109213747A (en) * 2018-08-08 2019-01-15 麒麟合盛网络技术股份有限公司 A kind of data managing method and device
CN109308602A (en) * 2018-08-15 2019-02-05 平安科技(深圳)有限公司 Operation flow data processing method, device, computer equipment and storage medium
CN109408535A (en) * 2018-09-28 2019-03-01 中国平安财产保险股份有限公司 Big data quantity matching process, device, computer equipment and storage medium
CN109816397A (en) * 2018-12-03 2019-05-28 北京奇艺世纪科技有限公司 A kind of fraud method of discrimination, device and storage medium
CN109766334A (en) * 2019-01-07 2019-05-17 国网湖南省电力有限公司 Processing method and system for electrical equipment online supervision abnormal data
CN109951547A (en) * 2019-03-15 2019-06-28 百度在线网络技术(北京)有限公司 Transactions requests method for parallel processing, device, equipment and medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342861A (en) * 2021-07-06 2021-09-03 云南中烟工业有限责任公司 Data management method and device in business scene
CN113342861B (en) * 2021-07-06 2022-11-11 云南中烟工业有限责任公司 Data management method and device in service scene

Also Published As

Publication number Publication date
CN112241443B (en) 2023-11-21

Similar Documents

Publication Publication Date Title
CN108197532B (en) The method, apparatus and computer installation of recognition of face
CN110992167B (en) Bank customer business intention recognition method and device
CN111291816B (en) Method and device for carrying out feature processing aiming at user classification model
CN107423613B (en) Method and device for determining device fingerprint according to similarity and server
CN110222171A (en) A kind of application of disaggregated model, disaggregated model training method and device
CN110647447B (en) Abnormal instance detection method, device, equipment and medium for distributed system
CN109284369B (en) Method, system, device and medium for judging importance of securities news information
CN111815169A (en) Business approval parameter configuration method and device
CN113888299A (en) Wind control decision method and device, computer equipment and storage medium
US8140444B2 (en) Method of measuring a large population of web pages for compliance to content standards that require human judgement to evaluate
CN109063433A (en) Recognition methods, device and the readable storage medium storing program for executing of fictitious users
CN112434178A (en) Image classification method and device, electronic equipment and storage medium
CN112241443A (en) Data quality monitoring method and device, computing equipment and computer storage medium
CN114841789A (en) Block chain-based auditing and auditing pricing fault data online editing method and system
CN113190623B (en) Data processing method, device, server and storage medium
CN112241820A (en) Risk identification method and device for key nodes in fund flow and computing equipment
CN108830302B (en) Image classification method, training method, classification prediction method and related device
CN111190817A (en) Method and device for processing software defects
CN114697127B (en) Service session risk processing method based on cloud computing and server
EP3828712A1 (en) Data parsing method and device
CN109344299A (en) Object search method, apparatus, electronic equipment and computer readable storage medium
CN110730342B (en) Video quality analysis method and device, server and terminal
CN113962216A (en) Text processing method and device, electronic equipment and readable storage medium
CN113282686A (en) Method and device for determining association rule of unbalanced sample
CN113569879A (en) Training method of abnormal recognition model, abnormal account recognition method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant