CN112800149A - Data governance method and system based on data lineage analysis - Google Patents

Data governance method and system based on data lineage analysis

Info

Publication number
CN112800149A
Authority
CN
China
Prior art keywords
data
node
analysis
blood
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110187130.0A
Other languages
Chinese (zh)
Other versions
CN112800149B (en)
Inventor
王泽宇
宋海涛
尹曦萌
于春蕾
张正奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd filed Critical Inspur Cloud Information Technology Co Ltd
Priority to CN202110187130.0A priority Critical patent/CN112800149B/en
Publication of CN112800149A publication Critical patent/CN112800149A/en
Application granted granted Critical
Publication of CN112800149B publication Critical patent/CN112800149B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data governance method and system based on data lineage analysis, belonging to the technical field of data processing. It addresses the technical problem of how to overcome the difficulty of tracing, verifying, and correlating data during data governance. The technical scheme is as follows: a data family relationship network map is constructed by analyzing data lineage, and the data of the nodes in the map are mutually verified and supplemented, which helps data governance personnel trace, verify, supplement, and standardize data and improves data governance efficiency. The method comprises the following steps: scheduling and storing the big data; performing lineage analysis on the data to form a data family map; and constructing a data map through an algorithm model. The system comprises a big data scheduling and storage module, a data lineage analysis module, and an algorithm model module.

Description

Data governance method and system based on data lineage analysis
Technical Field
The invention relates to the technical field of data processing, and in particular to a data governance method and system based on data lineage analysis.
Background
In the big data era, data grows explosively, and massive data of many types is generated rapidly. This large and complex body of data is in turn reused, converted, transformed, and circulated, generating new data that converges into an ocean of data.
Among people, blood relationship refers to the relationships that arise from marriage or birth, such as the parent-child relationship, the sibling relationship, and the other kinship relations derived from them. As data is generated, processed, circulated, and retired, relationships naturally form among the data as well. By analogy with kinship in human society, this relationship between data is called the blood relationship of data, or data lineage.
Data lineage has the following characteristics:
① attribution: the data is owned by a specific organization or person, and the owner has the right to use the data;
② multi-source: the same data can have several sources (that is, several parents); data may be produced by processing several pieces of data, or through several processing methods or processing steps;
③ traceability: the lineage of data reflects its full life cycle, so the whole process from generation to retirement can be traced;
④ hierarchy: data lineage is hierarchical; descriptive information such as the classification, induction, and summarization of data forms new data, and descriptions at different levels of detail form the hierarchy of the data.
Given disordered data, how to use the four characteristics of data lineage above to sort out the lineage relationships of the data, and so help data governance personnel better complete governance work such as tracing, verification, supplementation, and standardization, is a difficult problem.
To illustrate the definition of data lineage with an everyday example: on a shopping website, after a customer purchases an item, the purchase data is stored in a back-end database table A. To count which items were sold in a given month, the raw data in the database must be processed and summarized into an intermediate table B that stores the result of that stage; if the logic is complicated, processing continues through further intermediate tables until the final table used for front-end presentation is produced, say table C. Table A is then the original source of the data in table C, that is, its ancestor. The chain from the data in table A, through table B, to table C is the data lineage of table C.
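The A, B, C example can be sketched in a few lines of Python. This is only an illustration: the `PARENTS` mapping and the `ancestors` helper are hypothetical names introduced here, not part of the patent.

```python
from collections import deque

# Hypothetical lineage edges: each table maps to the tables it is derived from.
PARENTS = {
    "B": ["A"],  # intermediate table B is summarized from raw table A
    "C": ["B"],  # presentation table C is built from B
}

def ancestors(table, parents=PARENTS):
    """Return every upstream table of `table`, nearest first (breadth-first)."""
    seen, order, queue = set(), [], deque([table])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order

print(ancestors("C"))  # ['B', 'A']
```

Walking the chain backwards from C yields B and then A, which is the lineage of table C described in the example.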
During data processing, every link between the data source and the final data can introduce a data quality problem. For example, if the quality of the source data is poor and this is not detected and handled in later processing, the poor-quality data is eventually passed on to the target table. Improper processing in one link can likewise degrade the data quality of all subsequent links.
Therefore, how to overcome the difficulty of tracing, verifying, and correlating data during data governance is a problem that urgently needs to be solved.
Disclosure of Invention
The technical task of the invention is to provide a data governance method and system based on data lineage analysis, so as to solve the problem that data is difficult to trace, verify, and correlate during data governance.
The technical task of the invention is achieved as follows. In a data governance method based on data lineage analysis, a data family relationship network map is constructed by analyzing data lineage, and the data of the nodes in the map are mutually verified and supplemented, which helps data governance personnel trace, verify, supplement, and standardize data and improves data governance efficiency; the method comprises the following steps:
scheduling and storing the big data;
performing lineage analysis on the data to form a data family map;
and constructing a data map through an algorithm model.
Preferably, the big data is scheduled and stored as follows:
the relevant data resources are scheduled into an HBase database through a NiFi data scheduler;
during scheduling, the field names are standardized and the data of the key fields is cleaned, which facilitates the lineage analysis.
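The standardization and cleaning performed during scheduling might look like the following Python sketch. It does not call NiFi or HBase; the snake_case naming convention, the blank-to-`None` rule, and the function names are assumptions made for illustration only.

```python
import re

def standardize_field_name(name):
    """Normalize a field name to lower snake_case (an assumed convention)."""
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)
    return name.strip("_").lower()

def clean_record(record, key_fields):
    """Standardize all field names; trim key-field values and map blanks to None."""
    cleaned = {}
    for raw_name, value in record.items():
        name = standardize_field_name(raw_name)
        if name in key_fields and isinstance(value, str):
            value = value.strip() or None
        cleaned[name] = value
    return cleaned

row = {" OrderID ": " 1001 ", "customer-name": "  ", "Amount": 25.0}
cleaned_row = clean_record(row, key_fields={"order_id", "customer_name"})
print(cleaned_row)
# {'order_id': '1001', 'customer_name': None, 'amount': 25.0}
```

Giving every source the same field names before storage is what later allows key fields of different nodes to be compared directly.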
Preferably, lineage analysis is performed on the data to form the data family map as follows:
finding the most basic data resource according to the data characteristics and taking it as the main information node, and finding the nodes to which data flows out of the main information node as child nodes; when searching for parent and child nodes, marking the important field information in the main information node so that the data of the nodes can be mutually verified;
finding the data inflow nodes and data outflow nodes of the child nodes, and associating them with one another to form a family data grid;
identifying the basic node, inflow nodes, and outflow nodes with circles, identifying data inflow and outflow with arrowed line segments, and starting the data lineage analysis with the basic node as the main node;
during the data lineage analysis, marking the table name and the key fields of the table prominently inside each circle, marking the association fields between tables clearly on the connecting lines, and connecting the data inflow nodes and data outflow nodes in sequence to form the data family map.
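One way to picture these steps is a breadth-first walk outward from the main node over table-level edges. The sketch below is an assumption-laden illustration: the edge list, the table names, and `build_family_map` are hypothetical, and the patent does not prescribe this traversal.

```python
from collections import deque

# Hypothetical table-level edges: (source table, target table, association field).
EDGES = [
    ("orders", "monthly_sales", "item_id"),
    ("items", "monthly_sales", "item_id"),
    ("monthly_sales", "sales_report", "month"),
    ("users", "user_profile", "user_id"),  # not related to the main node
]

def build_family_map(main, edges):
    """Return (nodes, edges) reachable upstream or downstream of `main`."""
    nodes, kept = {main}, []
    # Pass (1, 0) follows edges backwards toward sources; pass (0, 1) follows
    # them forwards toward destinations. Walking the two directions separately
    # keeps out tables that merely share a neighbor with the main node.
    for here, there in ((1, 0), (0, 1)):
        queue, seen = deque([main]), {main}
        while queue:
            current = queue.popleft()
            for edge in edges:
                if edge[here] != current:
                    continue
                if edge not in kept:
                    kept.append(edge)
                if edge[there] not in seen:
                    seen.add(edge[there])
                    queue.append(edge[there])
        nodes |= seen
    return nodes, kept

nodes, kept = build_family_map("monthly_sales", EDGES)
print(sorted(nodes))  # ['items', 'monthly_sales', 'orders', 'sales_report']
```

The unrelated `users` edge is left out, and each kept edge carries the association field that would be written on the connecting line.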
More preferably, the data family map comprises the following elements:
① main node: there is exactly one main node, located in the middle of the map, and it is the core node of the visual graph; the lineage displayed by the map is the lineage of the main node, and lineage unrelated to this node is not shown, which keeps the graph simple and clear;
② data outflow node: a data outflow node is a parent node of the main node and represents a source of the data;
③ data inflow node: a data inflow node is a child node of the main node and represents a destination of the data; the data inflow nodes also include a special kind of node, the terminal node, which indicates that the data does not flow any further downstream;
④ data flow line: a data flow line shows the path along which data flows, from left to right; the lines converge from the data outflow nodes onto the main node and spread from the main node to the data inflow nodes.
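The visual elements listed above (circles, arrowed lines, a left-to-right flow, key fields inside the circle, association fields on the line) can be expressed as Graphviz DOT text. The patent does not name a drawing tool, so the use of DOT, the `to_dot` function, and the sample tables are assumptions for illustration.

```python
def to_dot(main, edges, key_fields):
    """Render a lineage map as Graphviz DOT text, drawn left to right."""
    lines = ["digraph lineage {", "  rankdir=LR;", "  node [shape=circle];"]
    names = sorted({n for src, dst, _ in edges for n in (src, dst)} | {main})
    for name in names:
        # Table name plus its key fields inside the circle ("\n" is a DOT line break).
        label = name + "\\n" + ", ".join(key_fields.get(name, []))
        style = ", style=bold" if name == main else ""  # emphasize the main node
        lines.append(f'  "{name}" [label="{label}"{style}];')
    for src, dst, field in edges:
        # Arrow from source to destination, labelled with the association field.
        lines.append(f'  "{src}" -> "{dst}" [label="{field}"];')
    lines.append("}")
    return "\n".join(lines)

edges = [
    ("orders", "monthly_sales", "item_id"),
    ("monthly_sales", "sales_report", "month"),
]
dot = to_dot("monthly_sales", edges, {"orders": ["order_id", "item_id"]})
print(dot)
```

The resulting text can be passed to any DOT renderer; `rankdir=LR` gives the left-to-right flow, with sources on the left of the main node and destinations on the right.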
More preferably, the lineage analysis methods are as follows:
① static analysis method: based on compiler principles, the source code is scanned and parsed, and the paths involved in the program logic are statically analyzed and enumerated, so that the flow of data is reflected objectively;
② contact-infection analysis method: the program statements related to data transmission and mapping are screened out to obtain the key information for in-depth analysis;
③ logical time-sequence analysis method: to avoid interference from redundant information, the indirect transmission and mapping steps and the program's intermediate variables, which have no direct relation to the data fields of databases, files, and communication interfaces, are converted, following the program's processing flow, into direct transmission and mapping among the data fields of databases, files, and communication interfaces.
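As a rough illustration of the third method, the sketch below replays a list of assignments in program order and substitutes each intermediate variable with the persistent fields it was derived from. The `STEPS` format and the rule that a name containing a dot is a persistent field are assumptions introduced here, not details from the patent.

```python
# Hypothetical statement-level mappings in program order: (target, sources).
# Names containing a dot are persistent fields (table.column); the rest are
# intermediate program variables.
STEPS = [
    ("tmp_price", ["orders.price"]),
    ("tmp_qty", ["orders.quantity"]),
    ("total", ["tmp_price", "tmp_qty"]),
    ("report.amount", ["total"]),
]

def is_field(name):
    return "." in name

def direct_mappings(steps):
    """Replace intermediate variables by the persistent fields they derive from."""
    origin = {}  # variable -> persistent fields it currently depends on
    result = {}  # persistent target field -> persistent source fields
    for target, sources in steps:  # processed in program (time) order
        fields = set()
        for source in sources:
            fields |= {source} if is_field(source) else origin.get(source, set())
        if is_field(target):
            result.setdefault(target, set()).update(fields)
        else:
            origin[target] = fields
    return result

mappings = direct_mappings(STEPS)
print(mappings)
```

After the pass, `report.amount` maps directly to `orders.price` and `orders.quantity`; the variables `tmp_price`, `tmp_qty`, and `total` no longer appear.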
Preferably, the data family map is constructed through the algorithm model as follows:
abstracting each data table into an object, abstracting the fields of the data table into object attributes, and abstracting table-to-table relationships into object relationships; establishing a unified ontology data model whose elements are the objects, attribute sets, and relationships; and establishing a mapping from the physical tables to the ontology data model;
and analyzing the relationships among the tables of the data family through the algorithm model to form a data map and extract valuable data information.
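The abstraction of tables, fields, and table relationships into objects, attributes, and object relationships could be modelled as below. The class names, the CamelCase naming rule, and the sample tables are hypothetical; the patent does not specify a concrete data structure.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectType:
    """An ontology object abstracted from one physical table."""
    name: str
    attributes: list
    physical_table: str  # mapping back to the physical table

@dataclass
class Relation:
    """A table-to-table association abstracted as an object relationship."""
    source: str
    target: str
    join_field: str

@dataclass
class OntologyModel:
    objects: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_table(self, table, columns):
        # Assumed naming rule: the object name is the table name in CamelCase.
        name = "".join(part.capitalize() for part in table.split("_"))
        self.objects[name] = ObjectType(name, list(columns), table)
        return name

    def relate(self, source, target, join_field):
        for name in (source, target):
            if name not in self.objects:
                raise KeyError(f"unknown object: {name}")
        self.relations.append(Relation(source, target, join_field))

model = OntologyModel()
order = model.add_table("order_detail", ["order_id", "item_id", "amount"])
item = model.add_table("item", ["item_id", "name"])
model.relate(order, item, "item_id")
print(model.objects[order].physical_table)  # order_detail
```

Keeping `physical_table` on each object is the mapping from the physical table to the model, so an analysis run on the model can always be traced back to stored data.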
A data governance system based on data lineage analysis comprises:
a big data scheduling and storage module, used for scheduling and storing the data to be analyzed;
a data lineage analysis module, used for analyzing the relationships among the data to generate a data family map;
and an algorithm model module, used for automatically analyzing the data associations through the key field indexes of the nodes to form a data map, governing data quality, analyzing data relationships, and extracting data value.
Preferably, the big data scheduling and storage module comprises:
a warehousing submodule, used for scheduling the data and loading it into storage;
a standardization submodule, used for standardizing the data fields during scheduling;
and a cleaning submodule, used for cleaning the key fields.
Preferably, the data lineage analysis module comprises:
a first query submodule, used for querying the main data node;
a second query submodule, used for querying the data inflow nodes;
a third query submodule, used for querying the data outflow nodes;
a construction submodule, used for constructing the data family map;
and an identification submodule, used for identifying the important data fields of the nodes.
A computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the data governance method based on data lineage analysis described above.
The data governance method and system based on data lineage analysis have the following advantages:
The invention solves the problems that, against the background of big data, data is difficult to trace, verify, standardize, and analyze during data governance. By performing lineage analysis on large-scale associated data to form a family network graph, and by comparing and analyzing the key data items of associated nodes, it enables data tracing and quality verification and supports data analysis.
The invention improves data governance efficiency mainly through data lineage analysis. A data lineage map is built for given data, and the data in the map is analyzed and mutually verified according to the attribution, multi-source, traceability, and hierarchy characteristics of data lineage. The data can thus be traced effectively, its quality verified, and the associations among the data analyzed, which ultimately improves governance efficiency and facilitates later data analysis and use.
The invention analyzes the whole process of data generation, circulation, and retirement, finds the data lineage, outlines the complete data family relationship network graph, and mutually verifies and supplements the data of the nodes in the graph, which helps data governance personnel or data governance algorithms trace, verify, supplement, and standardize data and improves data governance efficiency.
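The comparison of key data items between associated nodes can be illustrated with a small check over two nodes' rows. This is a sketch under assumptions: the row format, the `cross_check` function, and the issue tuples are invented for illustration and are not taken from the patent.

```python
def cross_check(parent_rows, child_rows, key, fields):
    """Compare key data items of two associated nodes and report disagreements."""
    parent_by_key = {row[key]: row for row in parent_rows}
    issues = []
    for row in child_rows:
        source = parent_by_key.get(row[key])
        if source is None:
            # The child holds a record that cannot be traced to the parent node.
            issues.append((row[key], "missing in parent", None, None))
            continue
        for name in fields:
            if row.get(name) != source.get(name):
                issues.append((row[key], name, source.get(name), row.get(name)))
    return issues

parent = [
    {"id": 1, "name": "Ann", "city": "Jinan"},
    {"id": 2, "name": "Bo", "city": "Qingdao"},
]
child = [
    {"id": 1, "name": "Ann", "city": "Jinan"},
    {"id": 2, "name": "Bo", "city": "Beijing"},
    {"id": 3, "name": "Cy", "city": "Yantai"},
]
issues = cross_check(parent, child, key="id", fields=["name", "city"])
print(issues)
# [(2, 'city', 'Qingdao', 'Beijing'), (3, 'missing in parent', None, None)]
```

Each reported tuple points a reviewer at a record whose key fields disagree between a node and its source, or that has no source at all.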
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of the data governance method based on data lineage analysis;
FIG. 2 is a block diagram of the data governance system based on data lineage analysis.
Detailed Description
The data governance method and system based on data lineage analysis according to the invention are described in detail below with reference to the drawings and specific embodiments.
Example 1:
As shown in FIG. 1, in the data governance method based on data lineage analysis of the invention, a data family relationship network map is constructed by analyzing data lineage, and the data of the nodes in the map are mutually verified and supplemented, which helps data governance personnel trace, verify, supplement, and standardize data and improves data governance efficiency; the method comprises the following steps:
S1: scheduling and storing the big data;
S2: performing lineage analysis on the data to form a data family map;
S3: constructing a data map through the algorithm model.
In this embodiment, the scheduling and storing of the big data in step S1 is as follows:
S101: the relevant data resources are scheduled into an HBase database through a NiFi data scheduler;
S102: during scheduling, the field names are standardized and the data of the key fields is cleaned, which facilitates the lineage analysis.
In this embodiment, the lineage analysis of the data in step S2 to form the data family map is as follows:
S201: finding the most basic data resource according to the data characteristics and taking it as the main information node, and finding the nodes to which data flows out of the main information node as child nodes; when searching for parent and child nodes, marking the important field information in the main information node so that the data of the nodes can be mutually verified;
S202: finding the data inflow nodes and data outflow nodes of the child nodes, and associating them with one another to form a family data grid;
S203: identifying the basic node, inflow nodes, and outflow nodes with circles, identifying data inflow and outflow with arrowed line segments, and starting the data lineage analysis with the basic node as the main node;
S204: during the data lineage analysis, marking the table name and the key fields of the table prominently inside each circle, marking the association fields between tables clearly on the connecting lines, and connecting the data inflow nodes and data outflow nodes in sequence to form the data family map.
The data family map in this embodiment comprises the following elements:
① main node: there is exactly one main node, located in the middle of the map, and it is the core node of the visual graph; the lineage displayed by the map is the lineage of the main node, and lineage unrelated to this node is not shown, which keeps the graph simple and clear;
② data outflow node: a data outflow node is a parent node of the main node and represents a source of the data;
③ data inflow node: a data inflow node is a child node of the main node and represents a destination of the data; the data inflow nodes also include a special kind of node, the terminal node, which indicates that the data does not flow any further downstream;
④ data flow line: a data flow line shows the path along which data flows, from left to right; the lines converge from the data outflow nodes onto the main node and spread from the main node to the data inflow nodes.
The lineage analysis methods in this embodiment are as follows:
① static analysis method: based on compiler principles, the source code is scanned and parsed, and the paths involved in the program logic are statically analyzed and enumerated, so that the flow of data is reflected objectively;
② contact-infection analysis method: the program statements related to data transmission and mapping are screened out to obtain the key information for in-depth analysis;
③ logical time-sequence analysis method: to avoid interference from redundant information, the indirect transmission and mapping steps and the program's intermediate variables, which have no direct relation to the data fields of databases, files, and communication interfaces, are converted, following the program's processing flow, into direct transmission and mapping among the data fields of databases, files, and communication interfaces.
In this embodiment, the data family map is constructed through the algorithm model in step S3 as follows:
S301: abstracting each data table into an object, abstracting the fields of the data table into object attributes, and abstracting table-to-table relationships into object relationships; establishing a unified ontology data model whose elements are the objects, attribute sets, and relationships; and establishing a mapping from the physical tables to the ontology data model;
S302: analyzing the relationships among the tables of the data family through the algorithm model to form a data map and extract valuable data information. The following is an example of an algorithm that obtains the lineage relationships between tables by parsing the SQL syntax tree with Druid:
Figure BDA0002943375300000061
Figure BDA0002943375300000071
The lineage logic between tables is parsed with tools such as Druid, as in the example, or Spark's logical plan; the table-to-table associations are extracted by an algorithm, and a data model is built to form the data family map. This helps data analysts trace data sources and control data quality, and assists data relationship analysis.
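The listing referenced above survives only as figure placeholders, so it is not reproduced here. As a stand-in, the following Python sketch shows the general idea of table-level lineage extraction on one simple statement. It uses regular expressions rather than Druid's SQL parser or a Spark logical plan; the patterns and `table_lineage` are assumptions, and they would misread constructs such as `EXTRACT(YEAR FROM d)`, which is why a real syntax-tree parser is preferable.

```python
import re

# Simplified patterns: they cover only plain INSERT ... SELECT statements.
TARGET = re.compile(r"\binsert\s+(?:into|overwrite\s+table)\s+([\w.]+)", re.I)
SOURCE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.I)

def table_lineage(sql):
    """Return (target_table, sorted source tables) for one simple INSERT ... SELECT."""
    target = TARGET.search(sql)
    if target is None:
        raise ValueError("no INSERT target found")
    sources = {name.lower() for name in SOURCE.findall(sql)}
    return target.group(1).lower(), sorted(sources)

sql = """
INSERT INTO monthly_sales
SELECT o.item_id, i.name, SUM(o.amount)
FROM orders o JOIN items i ON o.item_id = i.item_id
GROUP BY o.item_id, i.name
"""
lineage = table_lineage(sql)
print(lineage)  # ('monthly_sales', ['items', 'orders'])
```

Each returned pair gives one target table and the tables it reads from, which is the kind of table-to-table association that would feed the data family map.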
Example 2:
As shown in FIG. 2, the data governance system based on data lineage analysis of the invention comprises:
a big data scheduling and storage module, used for scheduling and storing the data to be analyzed;
a data lineage analysis module, used for analyzing the relationships among the data to generate a data family map;
and an algorithm model module, used for automatically analyzing the data associations through the key field indexes of the nodes to form a data map, governing data quality, analyzing data relationships, and extracting data value.
The big data scheduling and storage module in this embodiment comprises:
a warehousing submodule, used for scheduling the data and loading it into storage;
a standardization submodule, used for standardizing the data fields during scheduling;
and a cleaning submodule, used for cleaning the key fields.
The data lineage analysis module in this embodiment comprises:
a first query submodule, used for querying the main data node;
a second query submodule, used for querying the data inflow nodes;
a third query submodule, used for querying the data outflow nodes;
a construction submodule, used for constructing the data family map;
and an identification submodule, used for identifying the important data fields of the nodes.
Example 3:
An embodiment of the invention also provides a computer-readable storage medium storing a plurality of instructions which are loaded by a processor so that the processor executes the data governance method based on data lineage analysis of any embodiment of the invention. Specifically, a system or apparatus may be provided with a storage medium on which software program code realizing the functions of any of the above embodiments is stored, and a computer (or a CPU or MPU) of the system or apparatus is caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a flexible disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform a part or all of the actual operations based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium is written to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer, and then causes a CPU or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A data governance method based on data lineage analysis, characterized in that a data family relationship network map is constructed by analyzing data lineage, and the data of the nodes in the map are mutually verified and supplemented, which helps data governance personnel trace, verify, supplement, and standardize data and improves data governance efficiency; the method comprises the following steps:
scheduling and storing the big data;
performing lineage analysis on the data to form a data family map;
and constructing a data map through an algorithm model.
2. The data governance method based on data lineage analysis according to claim 1, wherein the big data is scheduled and stored as follows:
the relevant data resources are scheduled into an HBase database through a NiFi data scheduler;
during scheduling, the field names are standardized and the data of the key fields is cleaned, which facilitates the lineage analysis.
3. The data governance method based on data lineage analysis according to claim 1, wherein lineage analysis is performed on the data to form the data family map as follows:
finding the most basic data resource according to the data characteristics and taking it as the main information node, and finding the nodes to which data flows out of the main information node as child nodes; when searching for parent and child nodes, marking the important field information in the main information node;
finding the data inflow nodes and data outflow nodes of the child nodes, and associating them with one another to form a family data grid;
identifying the basic node, inflow nodes, and outflow nodes with circles, identifying data inflow and outflow with arrowed line segments, and starting the data lineage analysis with the basic node as the main node;
during the data lineage analysis, marking the table name and the key fields of the table prominently inside each circle, marking the association fields between tables clearly on the connecting lines, and connecting the data inflow nodes and data outflow nodes in sequence to form the data family map.
4. The data governance method based on data lineage analysis according to claim 3, wherein the data family map comprises the following elements:
① main node: there is exactly one main node, located in the middle of the map, and it is the core node of the visual graph; the lineage displayed by the map is the lineage of the main node;
② data outflow node: a data outflow node is a parent node of the main node and represents a source of the data;
③ data inflow node: a data inflow node is a child node of the main node and represents a destination of the data; the data inflow nodes also include a terminal node, which indicates that the data does not flow any further downstream;
④ data flow line: a data flow line shows the path along which data flows, from left to right; the lines converge from the data outflow nodes onto the main node and spread from the main node to the data inflow nodes.
5. The data governance method based on data lineage analysis according to claim 3 or 4, wherein the lineage analysis methods are as follows:
① static analysis method: based on compiler principles, the source code is scanned and parsed, and the paths involved in the program logic are statically analyzed and enumerated, so that the flow of data is reflected objectively;
② contact-infection analysis method: the program statements related to data transmission and mapping are screened out to obtain the key information for in-depth analysis;
③ logical time-sequence analysis method: following the program's processing flow, the indirect transmission and mapping steps and the program's intermediate variables, which have no direct relation to the data fields of databases, files, and communication interfaces, are converted into direct transmission and mapping among the data fields of databases, files, and communication interfaces.
6. The data governance method based on data blood margin analysis according to claim 1, wherein the data family map is constructed by an algorithm model as follows:
abstracting each data table into an object, abstracting the fields of the data table into object attributes, abstracting the relationships between data tables into object relationships, establishing a unified ontology data model with objects, attribute sets and relationships as its elements, and establishing a mapping from the physical tables to the ontology data model;
and analyzing the relationships between the data tables of the data family through the algorithm model to form a data map and extract valuable data information.
7. A data governance system based on data blood margin analysis, characterized by comprising:
a big data scheduling and storage module, used for scheduling and storing the analysis data;
a data blood relationship analysis module, used for analyzing data relationships to generate a data family map;
and an algorithm model module, used for automatically analyzing data association relationships through the key field indexes of all nodes to form a data map, governing data quality, analyzing data relationships and extracting data value.
8. The data governance system based on data blood margin analysis according to claim 7, wherein the big data scheduling and storage module comprises:
a warehousing submodule, used for scheduling the data and loading it into the warehouse;
a standardization submodule, used for standardizing the data fields during scheduling;
and a cleaning submodule, used for cleaning the key fields.
9. The data governance system based on data blood margin analysis according to claim 7 or 8, wherein the data blood relationship analysis module comprises:
a first query submodule, used for querying the main data node;
a second query submodule, used for querying the data inflow nodes;
a third query submodule, used for querying the data outflow nodes;
a construction submodule, used for constructing the data family map;
and an identification submodule, used for identifying the important data fields of each node.
10. A computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, perform the data governance method based on data blood margin analysis as recited in any one of claims 1 to 6.
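
As an illustration only, the map elements recited in claim 4 (one main node, data outflow nodes as parents, data inflow nodes as children, terminal nodes, and connecting lines labelled with association fields) can be sketched as a small in-memory structure. The class names, method names and table names below are hypothetical and do not come from the patent, and the sketch assumes Python 3.9 or later; it is a minimal model under those assumptions, not the claimed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class LineageEdge:
    """One data circulation line, labelled with the association fields."""
    source: str                      # table the data comes from
    target: str                      # table the data goes to
    association_fields: tuple[str, ...]


@dataclass
class DataFamilyMap:
    """Hypothetical model of a data family map centred on a single main node."""
    main_node: str
    edges: list[LineageEdge] = field(default_factory=list)

    def add_flow(self, source: str, target: str, *association_fields: str) -> None:
        """Record that data flows from `source` to `target` via the given fields."""
        self.edges.append(LineageEdge(source, target, tuple(association_fields)))

    def outflow_nodes(self) -> list[str]:
        """Parent nodes of the main node, i.e. the data sources."""
        return sorted({e.source for e in self.edges if e.target == self.main_node})

    def inflow_nodes(self) -> list[str]:
        """Child nodes of the main node, i.e. the data destinations."""
        return sorted({e.target for e in self.edges if e.source == self.main_node})

    def terminal_nodes(self) -> list[str]:
        """Inflow nodes from which the data does not flow any further."""
        has_downstream = {e.source for e in self.edges}
        return [n for n in self.inflow_nodes() if n not in has_downstream]


def build_example_map() -> DataFamilyMap:
    """Build a toy map; every table and field name here is invented."""
    fmap = DataFamilyMap(main_node="orders")
    fmap.add_flow("customers", "orders", "customer_id")
    fmap.add_flow("products", "orders", "product_id")
    fmap.add_flow("orders", "sales_report", "order_id")
    fmap.add_flow("orders", "invoices", "order_id")
    fmap.add_flow("invoices", "ledger", "invoice_id")
    return fmap
```

In this toy map, `customers` and `products` are data outflow nodes of the main node `orders`, while `sales_report` and `invoices` are data inflow nodes; only `sales_report` is a terminal node, because `invoices` passes data on to `ledger`.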
CN202110187130.0A 2021-02-18 2021-02-18 Data treatment method and system based on data blood edge analysis Active CN112800149B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110187130.0A CN112800149B (en) 2021-02-18 2021-02-18 Data treatment method and system based on data blood edge analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110187130.0A CN112800149B (en) 2021-02-18 2021-02-18 Data treatment method and system based on data blood edge analysis

Publications (2)

Publication Number Publication Date
CN112800149A (en) 2021-05-14
CN112800149B (en) 2023-08-08

Family

ID=75815229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110187130.0A Active CN112800149B (en) 2021-02-18 2021-02-18 Data treatment method and system based on data blood edge analysis

Country Status (1)

Country Link
CN (1) CN112800149B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6618733B1 (en) * 2000-04-11 2003-09-09 Revelink Inc. View navigation for creation, update and querying of data objects and textual annotations of relations between data objects
CN104537129A (en) * 2015-01-30 2015-04-22 中国地质大学(武汉) Web based database virtual storage processing method
CN106649457A (en) * 2016-09-26 2017-05-10 天津海量信息技术股份有限公司 Data processing frame based on object relation mapping technology
CN106844693A (en) * 2017-01-24 2017-06-13 浙江大学 Method for converting openEHR Templates to a relational database
CN110297872A (en) * 2019-06-28 2019-10-01 浪潮软件集团有限公司 Method and system for constructing and querying a knowledge graph in the science and technology field
CN110399448A (en) * 2019-07-31 2019-11-01 浪潮软件集团有限公司 Chinese place name and address search matching method, terminal, and computer-readable storage medium
CN110471949A (en) * 2019-07-11 2019-11-19 阿里巴巴集团控股有限公司 Data consanguinity analysis method, apparatus, system, server and storage medium
CN110866123A (en) * 2019-11-06 2020-03-06 浪潮软件集团有限公司 Method for constructing data map based on data model and system for constructing data map
CN111324781A (en) * 2020-03-03 2020-06-23 南京领行科技股份有限公司 Data analysis method, device and equipment
CN111694858A (en) * 2020-04-28 2020-09-22 平安科技(深圳)有限公司 Data blood margin analysis method, device, equipment and computer readable storage medium
CN112115315A (en) * 2020-09-25 2020-12-22 平安国际智慧城市科技股份有限公司 Blood relationship data query method and device, computer equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
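Li Yazhou; Chen Jian: "On the Innovation of the Data Governance System of Public Security Organs", Journal of Jiangsu Police Institute, no. 02 *
Zhihu Member ID: "Data Governance: Data Blood Relationship Analysis", CSDN *
Zhihu Member ID: "Data Governance: Data Blood Relationship Analysis", CSDN, 16 July 2020 (2020-07-16), pages 1 - 7 *
Cheng Changxiu, Zhou Chenghu, Lu Feng: "Implementation of an Improved Base State with Amendments Spatio-temporal Data Model in Object-Relational GIS", Journal of Image and Graphics, no. 06 *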

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191879A (en) * 2021-05-21 2021-07-30 中国工商银行股份有限公司 Data transmission method, device, system and medium based on complex network
CN117131477A (en) * 2023-08-14 2023-11-28 南昌大学 Full-link data tracing method based on local data blood-edge digital watermark
CN117131477B (en) * 2023-08-14 2024-03-29 南昌大学 Full-link data tracing method based on local data blood-edge digital watermark

Also Published As

Publication number Publication date
CN112800149B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
Motahari Nezhad et al. Protocol-aware matching of web service interfaces for adapter development
US8972460B2 (en) Data model optimization using multi-level entity dependencies
Aftab et al. Big data augmentation with data warehouse: A survey
US20070156736A1 (en) Method and apparatus for automatically detecting a latent referential integrity relationship between different tables of a database
CN106664224A (en) System and method for metadata enhanced inventory management of a communications system
CN106293891B (en) Multidimensional investment index monitoring method
CN112800149A (en) Data blood margin analysis-based data management method and system
CN109871470B (en) Power grid equipment data labeling management system and implementation method
CN109977175B (en) Data configuration query method and device
CN112035508A (en) SQL (structured query language) -based online metadata analysis method, system and equipment
Amiri et al. Data‐driven business process similarity
CN114461644A (en) Data acquisition method and device, electronic equipment and storage medium
Song et al. Matching heterogeneous events with patterns
US20110055373A1 (en) Service identification for resources in a computing environment
Benedetti et al. Exposing the underlying schema of LOD sources
Zou et al. Lachesis: automatic partitioning for UDF-centric analytics
CN105719072A (en) System and method for associating multistage assembly transactions
CN111506554B (en) Data labeling method and related device
CN112527796A (en) Data table processing method and device and computer readable storage medium
JP2010170287A (en) Data extraction system
CN116136843A (en) Multi-source heterogeneous mass data fusion sharing method under complex service scene
US20150286700A1 (en) Recording medium having stored thereon database access control program, method for controlling database access, and information processing apparatus
CN114443699A (en) Information query method and device, computer equipment and computer readable storage medium
Wang et al. A survey on data cleaning methods in cyberspace
Farooq et al. A layered approach for similarity measurement between ontologies

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant