WO2021218021A1

WO2021218021A1 - Data-based blood relationship analysis method, apparatus, and device and computer-readable storage medium

Info

Publication number: WO2021218021A1
Application number: PCT/CN2020/118135
Authority: WO
Inventors: 黄祥铮; 李钊; 万书武; 李均; 赵素群
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-04-28
Filing date: 2020-09-27
Publication date: 2021-11-04
Also published as: CN111694858A

Abstract

The present application relates to the technical field of big data, and discloses a data-based blood relationship analysis method, apparatus, and device and a computer-readable storage medium, used to meet the data-based blood relationship analysis requirements of different types of databases during production practices. The method comprises: acquiring an input table and an output table of structured query language (SQL) statements currently being executed on a big data platform, and a blood relationship between the input table and the output table; converting each of the input table and the output table into an entity object in a system having a pre-configured type, and storing the entity objects in a pre-configured graph database; constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship; receiving a mapping relationship between a service source table and a big data table sent by a data access platform; determining, in the blood relationship graph and according to the mapping relationship, a target entity object node to which an ancestor node is to be added; and adding the corresponding ancestor node to the target entity object node so as to acquire a target blood relationship graph.

Description

Data blood relationship analysis method, device, equipment and computer readable storage medium

This application claims priority to Chinese patent application No. 202010350107.4, filed with the Chinese Patent Office on April 28, 2020 and entitled "Data blood relationship analysis method, device, equipment, and computer readable storage medium", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the technical field of knowledge relationship analysis, in particular to data blood relationship analysis methods, devices, equipment, and computer-readable storage media.

Background technique

With the rapid development of Internet technology, massive amounts of business data are generated every day. Faced with the ever-increasing massive amounts of data, the governance of data has increasingly become an important focus of major companies, especially when big data enters the daily operations of major companies. In the current analysis of the decision-making system, when certain data changes, how to accurately trace the source of the data and how to analyze the impact of the data has become an important topic.

Blood relationship analysis (data lineage analysis) is a common method in the field of data governance. Starting from a given data object, it comprehensively tracks the data processing flow to find all related metadata objects and the relationships between them, which makes data fusion processing traceable. The inventors found that the data blood relationship analysis tools currently on the market are based either on relational databases or on big data platforms. Such tools can only analyze the blood relationship of data in a single type of database, and therefore cannot meet the need, arising in production practice, to analyze data blood relationships across different types of databases.

Summary of the invention

The main purpose of this application is to propose a data blood relationship analysis method, device, equipment, and computer readable storage medium, which are designed to meet the data blood relationship analysis needs of different types of databases in production practice.

The first aspect of the present application provides a data blood relationship analysis method. The data blood relationship analysis method includes the following steps:

Acquiring the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

Converting the input table and the output table into entity objects under a preset type system, respectively, and storing the entity objects in a preset graph database;

Constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship;

Receiving the mapping relationship between the business source table and the big data table sent by the data access platform, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

According to the mapping relationship, determine the target entity object node to which the ancestor node is to be added in the blood relationship graph;

A corresponding ancestor node is added to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object.

A second aspect of the present application provides a data blood relationship analysis device, and the data blood relationship analysis device includes:

The obtaining module is used to obtain the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

A conversion module, configured to convert the input table and the output table into entity objects under a preset type system, respectively, and store the entity objects in a preset graph database;

A construction module, configured to construct a blood relationship graph between the entity objects in the graph database according to the blood relationship;

A receiving module, configured to receive the mapping relationship between the business source table and the big data table sent by the data access platform, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

A determining module, configured to determine a target entity object node to which an ancestor node is to be added in the blood relationship graph according to the mapping relationship;

The adding module is used to add a corresponding ancestor node to the target entity object node to obtain the target blood relationship graph, wherein the ancestor node is used to represent the entity object transformed from the business source table of the target entity object.

A third aspect of the present application provides a data blood relationship analysis device. The data blood relationship analysis device includes a memory and at least one processor, the memory stores instructions, and the memory and the at least one processor are interconnected through a line; the at least one processor calls the instructions in the memory, so that the data blood relationship analysis device executes the steps of the data blood relationship analysis method as described below:

The fourth aspect of the present application provides a computer-readable storage medium having instructions stored in the computer-readable storage medium, which when run on a computer, cause the computer to execute the steps of the data blood relationship analysis method as described below:

In this application, the input table and the output table of the structured query language (SQL) statement currently executed on the big data platform, together with the blood relationship between the input table and the output table, are obtained; the input table and the output table are each converted into entity objects under a preset type system, and the entity objects are stored in a preset graph database; a blood relationship graph between the entity objects is constructed in the graph database according to the blood relationship; the mapping relationship between the business source table and the big data table sent by the data access platform is received, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and to record the mapping relationship between the business source table and the big data table during extraction; according to the mapping relationship, the target entity object node to which an ancestor node is to be added is determined in the blood relationship graph; and a corresponding ancestor node is added to the target entity object node to obtain the target blood relationship graph, where the ancestor node represents the entity object converted from the business source table of the target entity object. In this way, the target blood relationship graph is generated by combining the business source tables of the relational business database, the big data tables of the big data platform, and the blood relationships between them. This integrates the metadata governance of relational data and big data, and meets the need to analyze data blood relationships across different types of databases in production practice.

Description of the drawings

FIG. 1 is a schematic flowchart of an embodiment of a method for analyzing data blood relationship of this application;

FIG. 2 is a schematic diagram of the communication architecture between the data blood relationship analysis platform and other business platforms in an embodiment of the application;

FIG. 3 is a schematic diagram of the blood relationship graph of the big data tables in the embodiment of the application;

FIG. 4 is a schematic diagram of updating the blood relationship graph in FIG. 3;

FIG. 5 is a schematic diagram of modules of an embodiment of the data blood relationship analysis device of this application;

FIG. 6 is a schematic structural diagram of a data blood relationship analysis device provided by an embodiment of the application.

Detailed description

The embodiments of the application provide a data blood relationship analysis method, device, equipment, and computer-readable storage medium. The target blood relationship graph is generated by combining the business source tables of the relational business database, the big data tables of the big data platform, and the blood relationships between them. This integrates the metadata governance of relational data and big data, and meets the need to analyze data blood relationships across different types of databases in production practice.

The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of this application and the above-mentioned drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable under appropriate circumstances, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "including" or "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units that are clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.

For ease of understanding, the specific process of the embodiment of the data blood relationship analysis method of the present application will be described below.

Referring to Fig. 1, Fig. 1 is a schematic flow chart of an embodiment of a method for data blood relationship analysis according to the present application, and the method includes:

Step 101: Obtain the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

In this embodiment, the data blood relationship analysis method is applied to a server on which a data blood relationship analysis platform is deployed. Referring to FIG. 2, FIG. 2 is a schematic diagram of the communication architecture between the data blood relationship analysis platform and other business platforms in an embodiment of the application. The communication architecture includes a data blood relationship analysis platform, a data access platform, a big data platform, and a relational business database. Specifically:

The data access platform is responsible for extracting business data from the relational business database and transferring it to the big data platform. At the same time, it records the mapping relationship between the business source table and the big data table, stores the mapping relationship in the supporting database of the data access platform, and synchronizes it to the data blood relationship analysis platform at regular intervals;

The big data platform is responsible for obtaining the flow relationship between the big data tables in the big data platform through the structured query language (SQL) statements currently executed on the big data platform and sending it to the data blood relationship analysis platform;

The data blood relationship analysis platform is responsible for generating a blood relationship graph based on the mapping relationship between the business source table and the big data table, and the circulation relationship between the big data tables in the big data platform, so as to display the data blood relationship in a visual way.

It should be noted that relational databases are widely used in enterprise production practice. The relational database and the big data platform in this embodiment are chosen according to actual business needs. For example, the relational database can be MySQL, Oracle, SQL Server, PostgreSQL, or another relational database, and the big data platform can be Hadoop, Spark, Storm, or another big data platform.

First, the server obtains the input table and output table of the SQL statement executed on the big data platform (for example, Hadoop), as well as the blood relationship between the input table and the output table. The input table is the source table read when the SQL statement is executed, and the output table is the target table written when the SQL statement is executed. The blood relationship between the input table and the output table can be obtained by parsing the SQL statement.

In one embodiment, the above step 101 may include: monitoring, through a preset hook program, the structured query language (SQL) statement currently executed on the big data platform; and parsing the monitored SQL statement through a preset syntax parser and lexical parser to obtain the input table and output table of the SQL statement and the blood relationship between the input table and the output table.

Specifically, a hook program can be set up on the server in advance to monitor the SQL statement currently executed on the big data platform. The server then parses the SQL statement into two data sets, "Input" and "Output", and obtains from them the input table and output table of the SQL statement and the blood relationship between the input table and the output table.

For example, if the hook program detects that the SQL statement "insert overwrite table T1 select * from T2" (overwrite table T1 with the data in table T2) is currently being executed on the big data platform, the preset syntax parser and lexical parser can parse the statement into: input table T2, output table T1, with T2 being the source table of T1.
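The parsing step in the example above can be illustrated with a minimal Python sketch. It is not the syntax parser and lexical parser of the embodiment, which the application does not specify; it is a simplified regular-expression stand-in that only recognizes the "insert overwrite/into table ... select ... from/join ..." shape. The function name `parse_lineage` and the returned dictionary keys are hypothetical.

```python
import re

# Assumption: a simplified stand-in for the preset syntax/lexical parser.
# It only handles "insert overwrite|into table <target> select ... from|join <source>".
_OUTPUT_RE = re.compile(r"\binsert\s+(?:overwrite|into)\s+table\s+([\w.]+)", re.IGNORECASE)
_INPUT_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def parse_lineage(sql):
    """Return the input tables, output tables and (source, target) pairs of one statement."""
    outputs = _OUTPUT_RE.findall(sql)
    inputs = _INPUT_RE.findall(sql)
    # Every input table is treated as a source of every output table.
    edges = [(src, dst) for dst in outputs for src in inputs]
    return {"inputs": inputs, "outputs": outputs, "edges": edges}


result = parse_lineage("insert overwrite table T1 select * from T2")
print(result)
```

A production parser would need to handle subqueries, common table expressions, and comments, which this pattern-based sketch deliberately ignores.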

Step 102: Convert the input table and the output table into entity objects under a preset type system respectively, and store the entity objects in a preset graph database;

In computer science, the Type System is used to define how to classify values and expressions in programming languages into many different types, how to manipulate these types, and how these types interact with each other. The graph database is a non-relational database, which uses graph theory to store the relational information between entities.

In this step, the server converts the input table and output table into entity objects under the preset type system, and stores the entity objects in the preset graph database. Taking the graph database JanusGraph as an example, JanusGraph in this embodiment mainly relies on two components:

1. HBase. HBase is a distributed, column-oriented, high-performance, non-relational database that supports real-time reading and writing. HBase stores, in real time, the specific entity objects generated by the type system as well as the blood relationships between the entity objects;

2. ElasticSearch. ElasticSearch is a distributed, scalable, real-time search and analysis engine. ElasticSearch builds an index over the entity objects in HBase, so that the entity objects and their blood relationships can be retrieved quickly in real time.

In this embodiment, the server can store the entity objects in HBase.
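As a rough illustration of converting tables into entity objects, the following Python sketch models an entity as a small typed record and uses an in-memory store in place of the graph database. The class names `TableEntity` and `InMemoryEntityStore`, the type name "bigdata_table", and the use of the table name as the unique key are all assumptions made for this sketch; the application does not define the concrete type system, and no real HBase or JanusGraph client is used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableEntity:
    """Hypothetical entity object: one table described under a preset type."""
    type_name: str       # assumed type names, e.g. "bigdata_table" or "rdbms_table"
    qualified_name: str  # unique key; here simply the table name


class InMemoryEntityStore:
    """Stand-in for the storage layer of the graph database (not a real HBase client)."""

    def __init__(self):
        self._entities = {}  # qualified_name -> TableEntity

    def put(self, entity):
        # Storing the same table twice keeps a single entity (idempotent upsert).
        self._entities[entity.qualified_name] = entity

    def get(self, qualified_name):
        return self._entities.get(qualified_name)

    def __len__(self):
        return len(self._entities)


def to_entity(table_name, type_name="bigdata_table"):
    """Convert a table name into an entity object of the given (assumed) type."""
    return TableEntity(type_name=type_name, qualified_name=table_name)


store = InMemoryEntityStore()
for name in ["T2", "T1", "T2"]:  # T2 appears twice but is stored once
    store.put(to_entity(name))
```

Keying entities by a unique name means that a table appearing in many SQL statements still maps to a single node, which is what allows separate statements to be joined into one graph.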

Step 103: Construct a blood relationship graph between the entity objects in the graph database according to the blood relationship;

In this step, the server constructs a blood relationship graph between the entity objects in the graph database according to the blood relationship between the input table and the output table.

Further, this step 103 may include: invoking a preset graph processing engine, and creating, through the graph processing engine, entity object nodes in the graph database that correspond one-to-one to the input table and the output table; and adding directed edges between the created entity object nodes according to the blood relationship, so as to generate a blood relationship graph between the entity objects.

In this embodiment, the graph processing engine can be Graph Engine, a memory-based distributed engine for processing large-scale graph data. Through Graph Engine, entity object nodes corresponding to the input tables and output tables can be created in the graph database, and directed edges can then be added between the created entity object nodes according to the blood relationship between the tables, producing a visualized blood relationship graph of the big data tables.

For example, currently the following two SQL statements have been executed on the big data platform:

1. insert overwrite table test_org_info select * from tmp1_org_info (overwrite the table "test_org_info" with the data in the table "tmp1_org_info");

2. insert overwrite table tmp1_org_info select * from tmp2_org_info (overwrite the table "tmp1_org_info" with the data in the table "tmp2_org_info").

The constructed blood relationship graph is shown in FIG. 3, which is a schematic diagram of the blood relationship graph of the big data tables in the embodiment of the application. In the figure, the ancestor node of the table "tmp1_org_info" is the table "tmp2_org_info", and its descendant node is the table "test_org_info", so the whole blood relationship is clear at a glance.
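The construction of the graph from the two statements above can be sketched as follows. This is a plain in-memory directed graph in Python, assumed here purely for illustration; it does not use Graph Engine or JanusGraph, and the class name `LineageGraph` and its methods are hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Minimal directed graph: an edge source -> target means the target is derived from the source."""

    def __init__(self):
        self.children = defaultdict(set)  # node -> direct descendants
        self.parents = defaultdict(set)   # node -> direct ancestors
        self.nodes = set()

    def add_edge(self, source, target):
        self.nodes.update((source, target))
        self.children[source].add(target)
        self.parents[target].add(source)

    def _walk(self, start, neighbours):
        # Iterative depth-first traversal collecting every reachable node.
        seen, stack = set(), [start]
        while stack:
            for nxt in neighbours.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def ancestors(self, node):
        return self._walk(node, self.parents)

    def descendants(self, node):
        return self._walk(node, self.children)


graph = LineageGraph()
graph.add_edge("tmp1_org_info", "test_org_info")  # statement 1
graph.add_edge("tmp2_org_info", "tmp1_org_info")  # statement 2

print(graph.ancestors("tmp1_org_info"))
print(graph.descendants("tmp1_org_info"))
```

Keeping both a parent index and a child index lets traceability queries (ancestors) and impact queries (descendants) run with the same traversal.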

Step 104: Receive the mapping relationship between the business source table and the big data table sent by the data access platform, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

In this step, when the data access platform extracts business data from the business source table of the relational business database and transfers it to the big data table of the big data platform, it records the mapping relationship between the business source table and the big data table and synchronizes the mapping relationship to the data blood relationship analysis platform at regular intervals. The data blood relationship analysis platform receives the mapping relationship sent by the data access platform, which provides the precondition for subsequently generating the target blood relationship graph.

Step 105: According to the mapping relationship, determine the target entity object node to which the ancestor node is to be added in the blood relationship graph;

In this step, the server determines the target entity object node of the ancestor node to be added in the blood relationship graph generated above according to the mapping relationship between the business source table and the big data table.

Further, this step 105 may include: obtaining the table name of the big data table in the mapping relationship; judging whether there is an entity object node corresponding to the table name in the blood relationship graph; and, if there is an entity object node corresponding to the table name in the blood relationship graph, determining the entity object node corresponding to the table name as the target entity object node to which the ancestor node is to be added.

In this embodiment, the server obtains the table name of the big data table in the mapping relationship between the business source table and the big data table, and then determines whether there is an entity object node corresponding to the table name in the blood relationship graph. If such a node exists, the table data of that entity object node comes from a business source table of the relational business database, and the entity object node is determined as the target entity object node to which the ancestor node is to be added. If no such node exists, the blood relationship graph is directly taken as the target blood relationship graph.
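The lookup described in step 105 can be sketched in Python as follows. The shape of a mapping record (a dictionary with the keys "source_table" and "big_data_table") and the sample table names are assumptions made for illustration; the application does not specify how the data access platform serializes the mapping relationship.

```python
def find_target_nodes(graph_nodes, mappings):
    """Return {big data table: business source table} for tables that have a node in the graph.

    graph_nodes: set of table names that already exist as entity object nodes.
    mappings: iterable of records with the assumed keys "source_table" and "big_data_table".
    """
    targets = {}
    for record in mappings:
        big_table = record["big_data_table"]
        if big_table in graph_nodes:  # a node exists, so it should receive an ancestor
            targets[big_table] = record["source_table"]
    return targets


graph_nodes = {"tmp2_org_info", "tmp1_org_info", "test_org_info"}
mappings = [
    {"source_table": "mysql.crm.org_info", "big_data_table": "tmp2_org_info"},
    {"source_table": "mysql.crm.orders", "big_data_table": "ods_orders"},  # no node in the graph
]

targets = find_target_nodes(graph_nodes, mappings)
print(targets)
```

An empty result corresponds to the case in which the existing blood relationship graph is used directly as the target blood relationship graph.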

Step 106: Add a corresponding ancestor node to the target entity object node to obtain the target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object.

In this step, the server obtains the business source table corresponding to the target entity object node, converts the business source table into an entity object under the preset type system, and adds the entity object to the blood relationship graph as the ancestor node of the target entity object node. The target blood relationship graph is thus obtained, completing the full blood relationship chain from the relational business database table to the big data table.
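A minimal Python sketch of this step is given below. The graph is represented as a dictionary from each node to the set of its direct ancestors, and the function names `add_source_ancestors` and `trace_to_roots`, as well as the source table name "mysql.crm.org_info", are hypothetical choices for this illustration rather than part of the application.

```python
def add_source_ancestors(parents, targets):
    """Attach each business source table as a direct ancestor of its target node.

    parents: dict mapping node -> set of direct ancestor nodes (mutated in place).
    targets: dict mapping big data table -> business source table.
    """
    for big_table, source_table in targets.items():
        parents.setdefault(source_table, set())  # new ancestor node with no parents of its own
        parents.setdefault(big_table, set()).add(source_table)
    return parents


def trace_to_roots(parents, node):
    """Return every ancestor of `node` by following edges upstream."""
    seen, stack = set(), [node]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


parents = {
    "test_org_info": {"tmp1_org_info"},
    "tmp1_org_info": {"tmp2_org_info"},
    "tmp2_org_info": set(),
}
add_source_ancestors(parents, {"tmp2_org_info": "mysql.crm.org_info"})
full_chain = trace_to_roots(parents, "test_org_info")
print(sorted(full_chain))
```

After the ancestor is added, tracing upstream from "test_org_info" reaches the relational source table, which is the complete chain the step aims to produce.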

In this embodiment, the target blood relationship graph is generated by combining the business source tables of the relational business database, the big data tables of the big data platform, and the blood relationships between them. This integrates the metadata governance of relational data and big data, and meets the need to analyze data blood relationships across different types of databases in production practice.

Further, based on the first embodiment of the data blood relationship analysis method in this application, a second embodiment of the data blood relationship analysis method in this application is proposed.

In this embodiment, after the above step 106, it may further include:

Determine the entity object node to be analyzed in the target blood relationship graph;

In this step, the server can receive a selection instruction triggered by the user to select the entity object node to be analyzed in the target blood relationship graph. Alternatively, the server can use a preset entity object node as the entity object node to be analyzed. Here, analysis refers to analyzing the business involved in the entity object node.

Obtain the business associated with the entity object node to be analyzed, and count the number of chains containing the blood relationship chain of the entity object node to be analyzed;

In this step, the server can read a preset business configuration file to obtain the business associated with the entity object node to be analyzed. In addition, since an entity object node may lie on multiple blood relationship chains, the server can periodically count the number of blood relationship chains that contain the entity object node to be analyzed. The number of chains indicates how often the entity object node to be analyzed is referenced: the larger the number of chains, the more popular the business involved in the entity object node.

The number of chains is compared with a first preset threshold and a second preset threshold respectively, where the first preset threshold is greater than the second preset threshold; when the number of chains is greater than or equal to the first preset threshold, the business associated with the entity object node to be analyzed is marked as a hot business; when the number of chains is less than or equal to the second preset threshold, the business associated with the entity object node to be analyzed is marked as an unpopular business.

In this step, the server compares the obtained number of chains with the first preset threshold and the second preset threshold respectively. When the number of chains is greater than or equal to the first preset threshold, the entity object node to be analyzed is referenced frequently and its associated business is relatively popular, so the server marks the business associated with the entity object node to be analyzed as a hot business. Conversely, when the number of chains is less than or equal to the second preset threshold, the entity object node to be analyzed is referenced infrequently and its associated business is relatively unpopular, so the server marks the business associated with the entity object node to be analyzed as an unpopular business. In addition, the server can send the marked hot and unpopular businesses to the front-end page for display. For hot businesses, maintenance and attention can be strengthened in production; for unpopular businesses, improvements may be needed.
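The counting and marking logic can be sketched in Python as follows. The application does not define precisely what counts as one blood relationship chain, so this sketch assumes that a chain is a complete path from a root table to a leaf table and that the graph is acyclic; under that assumption, the number of chains through a node is the number of upstream paths multiplied by the number of downstream paths. The label "normal" for counts between the two thresholds is also an assumption.

```python
def count_chains(edges, node):
    """Count root-to-leaf chains passing through `node` (assumes an acyclic graph)."""
    children, parents = {}, {}
    for src, dst in edges:
        children.setdefault(src, set()).add(dst)
        parents.setdefault(dst, set()).add(src)

    def paths(current, neighbours):
        nxt = neighbours.get(current)
        if not nxt:  # reached a root (upstream) or a leaf (downstream)
            return 1
        return sum(paths(n, neighbours) for n in nxt)

    return paths(node, parents) * paths(node, children)


def mark_business(chain_count, first_threshold, second_threshold):
    """Apply the two-threshold rule; the first threshold must exceed the second."""
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must be greater than second threshold")
    if chain_count >= first_threshold:
        return "hot"
    if chain_count <= second_threshold:
        return "unpopular"
    return "normal"  # assumed label for counts between the thresholds


edges = [
    ("tmp2_org_info", "tmp1_org_info"),
    ("tmp1_org_info", "test_org_info"),
    ("delta1_org_info", "test_org_info"),
]
chains = count_chains(edges, "test_org_info")
print(chains, mark_business(chains, first_threshold=2, second_threshold=1))
```

The threshold values used in the call are arbitrary; in practice they would be tuned to the size of the graph.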

Through the above method, the popularity analysis of the business related to the entity object node in the target blood relationship graph is realized, which is convenient for managers to understand the hot and cold conditions of the business department and adjust the production plan of the business department in time.

Further, based on the first embodiment of the data blood relationship analysis method in this application, a third embodiment of the data blood relationship analysis method in this application is proposed.

In this embodiment, after step 106, the method may further include: receiving, through a preset user interaction page, a query instruction based on the target blood relationship graph; and, according to the query instruction, sending the target blood relationship graph to the user interaction page for visual display.

In this embodiment, the data blood relationship analysis platform can provide a user interaction page and an open application programming interface, so as to offer real-time query and search services to managers or other external systems. Specifically, the server may receive a query instruction based on the target blood relationship graph through a preset user interaction page, and then, according to the query instruction, send the target blood relationship graph to the user interaction page for visual display.

Through the visual display of the data blood relationship, the ancestral data of the business data can be clearly understood. When an unexpected production incident occurs, it can be quickly traced back to its exact source, so that the cause of the incident can be analyzed in time and production measures improved.
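As an illustration of what a query response for the user interaction page might contain, the following Python sketch extracts the upstream and downstream chains of one table and serializes them as JSON. The response layout ("nodes" and "edges" with "from" and "to" fields) and the function names are assumptions for this sketch; the application does not define the interface format.

```python
import json


def reachable(edges, start, forward):
    """Collect `start` plus every node reachable downstream (forward) or upstream."""
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for src, dst in edges:
            here, there = (src, dst) if forward else (dst, src)
            if here == current and there not in seen:
                seen.add(there)
                stack.append(there)
    return seen


def lineage_response(edges, table):
    """Build a JSON payload with the ancestors and descendants of `table`."""
    upstream = reachable(edges, table, forward=False)
    downstream = reachable(edges, table, forward=True)
    selected = sorted(
        (s, d) for s, d in edges
        if (s in upstream and d in upstream) or (s in downstream and d in downstream)
    )
    return json.dumps({
        "nodes": sorted(upstream | downstream),
        "edges": [{"from": s, "to": d} for s, d in selected],
    })


edges = [
    ("tmp2_org_info", "tmp1_org_info"),
    ("tmp1_org_info", "test_org_info"),
    ("other_src", "other_dst"),  # unrelated chain, excluded from the response
]
payload = lineage_response(edges, "tmp1_org_info")
print(payload)
```

Returning only the chains that pass through the queried table keeps the displayed graph readable when the full graph is large.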

Further, after the step of sending the target blood relationship graph to the user interaction page for visual display according to the query instruction, the method may further include:

According to a preset receiving frequency, receiving the mapping relationship between the business source table and the big data table sent by the data access platform; determining whether the mapping relationship has been updated, and detecting whether a new SQL statement has been executed on the big data platform; and, if the mapping relationship has been updated or a new SQL statement has been executed on the big data platform, updating the target blood relationship graph accordingly.

In this embodiment, the server can receive the mapping relationship between the business source table and the big data table sent by the data access platform according to the preset receiving frequency and determine whether the mapping relationship has been updated. At the same time, it detects whether a new SQL statement has been executed on the big data platform. If the mapping relationship has been updated, or a new SQL statement has been executed on the big data platform, the target blood relationship graph is updated accordingly.

Referring to FIG. 4, FIG. 4 is a schematic diagram of updating the blood relationship graph in FIG. 3. When it is detected that a new SQL statement, "insert overwrite table test_org_info select * from delta1_org_info" (overwrite the table "test_org_info" with the data in the table "delta1_org_info"), has been executed on the big data platform, the table "test_org_info" now has two blood relationship chains, which converge at the "test_org_info" node.

Through the above method, the real-time update of the target blood relationship map is realized, which provides a guarantee for accurate business traceability and impact analysis.
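One polling cycle of the update described above can be sketched in Python as follows. The graph is held as a set of (source, target) edges, and mappings as (business source table, big data table) pairs; the function name `refresh_lineage` and these representations are assumptions for illustration, and the scheduling of the cycle at the preset receiving frequency is left out.

```python
def refresh_lineage(edges, mappings_seen, mappings_received, sql_edges_detected):
    """Apply one update cycle in place and report whether the graph changed.

    edges: set of (source, target) pairs forming the current graph.
    mappings_seen: set of (business source table, big data table) pairs already applied.
    mappings_received: set of mapping pairs received in this cycle.
    sql_edges_detected: set of (input table, output table) pairs from newly executed SQL.
    """
    changed = False
    # Apply new SQL lineage first so that newly created tables can receive ancestors.
    for edge in sql_edges_detected:
        if edge not in edges:
            edges.add(edge)
            changed = True
    nodes = {table for edge in edges for table in edge}
    for source_table, big_table in mappings_received - mappings_seen:
        if big_table in nodes:  # only tables with a node get an ancestor
            edges.add((source_table, big_table))
            mappings_seen.add((source_table, big_table))
            changed = True
    return changed


edges = {("tmp2_org_info", "tmp1_org_info"), ("tmp1_org_info", "test_org_info")}
seen = set()
new_sql = {("delta1_org_info", "test_org_info")}

first = refresh_lineage(edges, seen, set(), new_sql)
second = refresh_lineage(edges, seen, set(), new_sql)  # nothing new the second time
sources_of_test = {src for src, dst in edges if dst == "test_org_info"}
print(first, second, sorted(sources_of_test))
```

Returning a change flag lets the caller decide whether the displayed graph needs to be redrawn.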

The embodiment of the present application also provides a data blood relationship analysis device.

Referring to FIG. 5, FIG. 5 is a schematic diagram of modules of an embodiment of the data blood relationship analysis device of the present application. In this embodiment, the data blood relationship analysis device includes:

The obtaining module 501 is configured to obtain the input table and the output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

The conversion module 502 is configured to convert the input table and the output table into entity objects under a preset type system, respectively, and store the entity objects in a preset graph database;

The construction module 503 is configured to construct a blood relationship graph between the entity objects in the graph database according to the blood relationship;

The receiving module 504 is configured to receive the mapping relationship between the business source table and the big data table sent by the data access platform, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

The determining module 505 is configured to determine the target entity object node to which the ancestor node is to be added in the blood relationship graph according to the mapping relationship;

The adding module 506 is configured to add a corresponding ancestor node to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object transformed from the business source table of the target entity object.

Optionally, the obtaining module 501 is further configured to:

Through the preset hook program, monitor the structured query language SQL statement currently executed on the big data platform;

Through the preset syntax parser and lexical parser, the monitored SQL statement is parsed to obtain the input table and output table of the SQL statement, and the blood relationship between the input table and the output table.
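The parsing step above may be sketched as follows. This is a deliberately simplified Python example: it does not install a hook, and it substitutes two regular expressions for the preset syntax parser and lexical parser, so it only handles plain "insert ... select ... from/join" statements and will misread subqueries, common table expressions, and comments. The function name `extract_lineage` and the shape of its return value are assumptions made for illustration.

```python
import re

# Output table: the target of "insert overwrite [table] X" or "insert into [table] X".
_OUTPUT = re.compile(
    r"\binsert\s+(?:overwrite|into)\s+(?:table\s+)?([\w.]+)", re.IGNORECASE
)
# Input tables: any identifier that directly follows "from" or "join".
_INPUT = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_lineage(sql):
    """Return (input_tables, output_table, edges) for one monitored statement.

    `edges` is a list of (input_table, output_table) pairs, i.e. the blood
    relationship between the inputs and the output.
    """
    match = _OUTPUT.search(sql)
    if match is None:
        # The statement writes to no table, so it contributes no lineage.
        return [], None, []
    output_table = match.group(1)
    input_tables = sorted(set(_INPUT.findall(sql)))
    edges = [(table, output_table) for table in input_tables]
    return input_tables, output_table, edges


inputs, output, edges = extract_lineage(
    "insert overwrite table test_org_info select * from delta1_org_info"
)
```

A production implementation would walk the syntax tree produced by a real SQL parser rather than rely on pattern matching.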

Optionally, the construction module 503 is further configured to:

Calling a preset graph processing engine, and creating entity object nodes corresponding to the input table and the output table one-to-one in the graph database through the graph processing engine;

According to the blood relationship, a directed edge is added between the created entity object nodes to generate a blood relationship graph between the entity objects.
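The two construction steps above can be sketched in Python as follows. The class `LineageGraph` is a hypothetical in-memory stand-in for the graph database and graph processing engine, which the application does not name; the function `build_graph` and its parameters are likewise assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class LineageGraph:
    """In-memory stand-in for the graph database (illustrative only)."""

    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, destination) pairs

    def add_table(self, name):
        # One entity object node per table; adding the same table twice is a no-op.
        self.nodes.add(name)

    def add_edge(self, src, dst):
        # A directed edge may only connect nodes that were already created.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError(f"unknown node in edge {src!r} -> {dst!r}")
        self.edges.add((src, dst))


def build_graph(input_tables, output_table):
    """Create one node per table, then one directed edge per input table."""
    graph = LineageGraph()
    for table in [*input_tables, output_table]:
        graph.add_table(table)
    for table in input_tables:
        graph.add_edge(table, output_table)
    return graph


graph = build_graph(["delta1_org_info"], "test_org_info")
```

Each edge points from an input table to the output table, so following edges forward traces where data goes and following them backward traces where data came from.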

Optionally, the determining module 505 is further configured to:

Acquiring the table name of the big data table in the mapping relationship;

Judging whether there is an entity object node corresponding to the table name in the blood relationship graph;

If there is an entity object node corresponding to the table name in the blood relationship graph, the entity object node corresponding to the table name is determined as the target entity object node to which the ancestor node is to be added.
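A minimal Python sketch of this determination, followed by the addition of the ancestor node performed by the adding module 506, is given below. It assumes the mapping relationship arrives as a dictionary from business source table name to big data table name, and that the graph is held as plain sets of nodes and edges. The function name `attach_ancestors` and the table names prefixed with "crm." are hypothetical.

```python
def attach_ancestors(nodes, edges, mapping):
    """Add a business source table as ancestor of each matching big data table.

    nodes:   set of table names already present in the blood relationship graph
    edges:   set of (source, destination) pairs
    mapping: dict of business source table -> big data table
    Returns the list of target nodes that received an ancestor.
    """
    targets = []
    for source_table, big_data_table in mapping.items():
        # Only tables that already appear in the graph qualify as targets.
        if big_data_table not in nodes:
            continue
        targets.append(big_data_table)
        nodes.add(source_table)  # ancestor node for the business source table
        edges.add((source_table, big_data_table))
    return targets


nodes = {"delta1_org_info", "test_org_info"}
edges = {("delta1_org_info", "test_org_info")}
mapping = {
    "crm.org_info": "delta1_org_info",  # hypothetical source table names
    "crm.unused": "table_not_in_graph",
}
targets = attach_ancestors(nodes, edges, mapping)
```

In this example only "delta1_org_info" is selected as a target, because the other big data table named in the mapping has no node in the graph.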

Optionally, the data blood relationship analysis device further includes a service marking module, and the service marking module is configured to:

Acquiring the business associated with the entity object node to be analyzed, and counting the number of blood relationship chains that contain the entity object node to be analyzed;

Comparing the number of chains with a first preset threshold and a second preset threshold respectively, where the first preset threshold is greater than the second preset threshold;

When the number of chains is greater than or equal to the first preset threshold, mark the business associated with the entity object node to be analyzed as a hot business;

When the number of chains is less than or equal to the second preset threshold, the business associated with the entity object node to be analyzed is marked as an unpopular business.
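The threshold comparison above may be illustrated with the following Python sketch. The application does not define precisely how chains are counted; this example assumes that a chain is a complete path from a table with no upstream table to a table with no downstream table, so that the number of chains through a node is the number of upstream paths multiplied by the number of downstream paths. The function names, the adjacency dictionaries, and the threshold values are assumptions, and the graph is assumed to be acyclic.

```python
def count_paths(node, neighbours):
    """Count paths from `node` to the end of the graph along `neighbours`."""
    following = neighbours.get(node, ())
    if not following:
        return 1
    return sum(count_paths(n, neighbours) for n in following)


def chains_through(node, parents, children):
    """Number of root-to-leaf chains that contain `node` (assumed definition)."""
    return count_paths(node, parents) * count_paths(node, children)


def mark_business(chain_count, first_threshold, second_threshold):
    """Return 'hot', 'unpopular', or None when the count lies between thresholds."""
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must exceed the second threshold")
    if chain_count >= first_threshold:
        return "hot"
    if chain_count <= second_threshold:
        return "unpopular"
    return None


# Node "c" has two upstream tables and two downstream tables.
parents = {"c": ["a", "b"]}
children = {"a": ["c"], "b": ["c"], "c": ["d", "e"]}
count = chains_through("c", parents, children)
label = mark_business(count, first_threshold=3, second_threshold=1)
```

A count that falls strictly between the two thresholds leaves the business unmarked, which is why `mark_business` can return `None`.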

Optionally, the data blood relationship analysis device further includes a query module, and the query module is configured to:

Receiving a query instruction based on the target blood relationship graph through a preset user interaction page;

According to the query instruction, the blood relationship analysis graph of the target data is sent to the user interaction page for visual display.
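One possible way to assemble the graph returned for a query is sketched below in Python. It assumes the query instruction names a single target table and that the response consists of that table together with everything upstream and downstream of it. The function name `lineage_view` and the dictionary layout of the result are assumptions; the application does not specify the format sent to the user interaction page.

```python
from collections import deque


def _reachable(start, adjacency):
    """Breadth-first search: every node reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lineage_view(target, edges):
    """Return the sub-graph of ancestors and descendants of `target`."""
    parents, children = {}, {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
        children.setdefault(src, set()).add(dst)
    keep = {target} | _reachable(target, parents) | _reachable(target, children)
    return {
        "nodes": sorted(keep),
        "edges": sorted((s, d) for s, d in edges if s in keep and d in keep),
    }


edges = {("a", "b"), ("b", "c"), ("x", "y")}
view = lineage_view("b", edges)
```

Tables unrelated to the target, such as "x" and "y" here, are left out of the view so that the page displays only the relevant lineage.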

Optionally, the data blood relationship analysis device further includes an update module, and the update module is configured to:

Receiving the mapping relationship between the business source table and the big data table sent by the data access platform according to a preset receiving frequency;

Judging whether the mapping relationship has been updated, and detecting whether a new SQL statement is executed on the big data platform;

If there is an update of the mapping relationship, or a new SQL statement is executed on the big data platform, the target blood relationship graph is updated correspondingly.
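The update decision above can be sketched in Python as a comparison between the previously received mapping relationship and the latest one, combined with the list of newly detected SQL statements. The function name `detect_changes`, the dictionary representation of the mapping, and the keys of the returned dictionary are assumptions made for illustration.

```python
def detect_changes(previous_mapping, current_mapping, new_sql_statements):
    """Decide whether the target blood relationship graph must be updated.

    Both mappings are dicts of business source table -> big data table.
    """
    # Entries that are new or now point at a different big data table.
    changed = {
        src: table
        for src, table in current_mapping.items()
        if previous_mapping.get(src) != table
    }
    # Source tables that no longer appear in the mapping.
    removed = sorted(src for src in previous_mapping if src not in current_mapping)
    return {
        "changed_mappings": changed,
        "removed_sources": removed,
        "new_sql": list(new_sql_statements),
        "update_needed": bool(changed or removed or new_sql_statements),
    }


previous = {"crm.org_info": "delta1_org_info"}  # hypothetical table names
current = {"crm.org_info": "delta1_org_info", "crm.user_info": "delta1_user_info"}
result = detect_changes(previous, current, [])
```

Either condition alone is sufficient to trigger an update: a changed mapping with no new SQL statement, or a new SQL statement with an unchanged mapping.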

The functional realization and beneficial effects of each module in the data blood relationship analysis device correspond to the steps in the embodiment of the data blood relationship analysis method, and will not be repeated here.

The data blood relationship analysis device in the embodiment of the present application is described in detail above from the perspective of a modular functional entity, and the data blood relationship analysis device in the embodiment of the present application is described in detail below from the perspective of hardware processing.

Referring to FIG. 6, FIG. 6 is a schematic structural diagram of a data blood relationship analysis device provided by an embodiment of the application. The data blood relationship analysis device 600 may vary considerably depending on its configuration or performance, and may include one or more processors (central processing units, CPU) 610, a memory 620, and one or more storage media 630 (for example, one or more mass storage devices) storing application programs 633 or data 632. The memory 620 and the storage medium 630 may provide transient storage or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the data blood relationship analysis device 600. Further, the processor 610 may be configured to communicate with the storage medium 630 and to execute, on the data blood relationship analysis device 600, the series of instruction operations stored in the storage medium 630.

The data blood relationship analysis device 600 may also include one or more power supplies 640, one or more wired or wireless network interfaces 650, one or more input and output interfaces 660, and/or one or more operating systems 631, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc. Those skilled in the art can understand that the structure of the data blood relationship analysis device shown in FIG. 6 does not constitute a limitation on the data blood relationship analysis device, which may include more or fewer components than shown in the figure, a combination of certain components, or a different arrangement of components.

The present application also provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium. A data blood relationship analysis program is stored on the computer-readable storage medium, and when the data blood relationship analysis program is executed by a processor, the steps of the data blood relationship analysis method described above are realized.

For the method implemented and the beneficial effects achieved when the data blood relationship analysis program is executed by the processor, reference may be made to the embodiments of the data blood relationship analysis method of the present application, which will not be repeated here.

Those skilled in the art can understand that if the aforementioned integrated modules or units are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments, or equivalently replace some of the technical features; and these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

A data blood relationship analysis method, wherein the data blood relationship analysis method includes the following steps:

Acquiring the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

Converting the input table and the output table into entity objects under a preset type system, respectively, and storing the entity objects in a preset graph database;

Constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship;

Receiving the mapping relationship between the business source table and the big data table sent by the data access platform, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

According to the mapping relationship, determine the target entity object node to which the ancestor node is to be added in the blood relationship graph;

A corresponding ancestor node is added to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object.
The data blood relationship analysis method according to claim 1, wherein the step of acquiring the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table, comprises:

Through the preset hook program, monitor the structured query language SQL statement currently executed on the big data platform;

Through the preset syntax parser and lexical parser, the monitored SQL statement is parsed to obtain the input table and output table of the SQL statement, and the blood relationship between the input table and the output table.
The data blood relationship analysis method of claim 1, wherein the step of constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship comprises:

Calling a preset graph processing engine, and creating entity object nodes corresponding to the input table and the output table one-to-one in the graph database through the graph processing engine;

According to the blood relationship, a directed edge is added between the created entity object nodes to generate a blood relationship graph between the entity objects.
The data blood relationship analysis method according to claim 1, wherein the step of determining, in the blood relationship graph according to the mapping relationship, the target entity object node to which the ancestor node is to be added comprises:

Acquiring the table name of the big data table in the mapping relationship;

Judging whether there is an entity object node corresponding to the table name in the blood relationship graph;

If there is an entity object node corresponding to the table name in the blood relationship graph, the entity object node corresponding to the table name is determined as the target entity object node to which the ancestor node is to be added.
The data blood relationship analysis method according to any one of claims 1 to 4, wherein after the step of adding the corresponding ancestor node to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object, the method further comprises:

Determine the entity object node to be analyzed in the target blood relationship graph;

Acquiring the business associated with the entity object node to be analyzed, and counting the number of blood relationship chains that contain the entity object node to be analyzed;

Comparing the number of chains with a first preset threshold and a second preset threshold respectively, where the first preset threshold is greater than the second preset threshold;

When the number of chains is greater than or equal to the first preset threshold, mark the business associated with the entity object node to be analyzed as a hot business;

When the number of chains is less than or equal to the second preset threshold, the business associated with the entity object node to be analyzed is marked as an unpopular business.
The data blood relationship analysis method according to any one of claims 1 to 4, wherein after the step of adding the corresponding ancestor node to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object, the method further comprises:

Receiving a query instruction based on the target blood relationship graph through a preset user interaction page;

According to the query instruction, the blood relationship analysis graph of the target data is sent to the user interaction page for visual display.
The data blood relationship analysis method according to claim 6, wherein after the step of sending the target data blood relationship analysis graph to the user interaction page for visual display according to the query instruction, the method further comprises:

Receiving the mapping relationship between the business source table and the big data table sent by the data access platform according to a preset receiving frequency;

Judging whether the mapping relationship has been updated, and detecting whether a new SQL statement is executed on the big data platform;

If there is an update of the mapping relationship, or a new SQL statement is executed on the big data platform, the target blood relationship graph is updated correspondingly.
A data blood relationship analysis device, wherein the data blood relationship analysis device includes:

The obtaining module is used to obtain the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

A conversion module, configured to convert the input table and the output table into entity objects under a preset type system, respectively, and store the entity objects in a preset graph database;

A construction module, configured to construct a blood relationship graph between the entity objects in the graph database according to the blood relationship;

The receiving module is used to receive the mapping relationship between the business source table and the big data table sent by the data access platform, wherein the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

The determining module is configured to determine, in the blood relationship graph according to the mapping relationship, the target entity object node to which the ancestor node is to be added;

The adding module is used to add a corresponding ancestor node to the target entity object node to obtain the target blood relationship graph, wherein the ancestor node is used to represent the entity object transformed from the business source table of the target entity object.
A data blood relationship analysis device, wherein the data blood relationship analysis device includes a memory and at least one processor, the memory stores instructions, and the memory and the at least one processor are interconnected by wires;

The at least one processor invokes the instructions in the memory, so that the data blood relationship analysis device executes the steps of the data blood relationship analysis method as described below:

Acquiring the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

Converting the input table and the output table into entity objects under a preset type system, respectively, and storing the entity objects in a preset graph database;

Constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship;

Receiving the mapping relationship between the business source table and the big data table sent by the data access platform, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

According to the mapping relationship, determine the target entity object node to which the ancestor node is to be added in the blood relationship graph;

A corresponding ancestor node is added to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object.
The data blood relationship analysis device according to claim 9, wherein said acquiring the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table, includes the following steps:

Through the preset hook program, monitor the structured query language SQL statement currently executed on the big data platform;

Through the preset syntax parser and lexical parser, the monitored SQL statement is parsed to obtain the input table and output table of the SQL statement, and the blood relationship between the input table and the output table.
The data blood relationship analysis device according to claim 9, wherein said constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship includes the following steps:

Calling a preset graph processing engine, and creating entity object nodes corresponding to the input table and the output table one-to-one in the graph database through the graph processing engine;

According to the blood relationship, a directed edge is added between the created entity object nodes to generate a blood relationship graph between the entity objects.
The data blood relationship analysis device according to claim 9, wherein said determining, in the blood relationship graph according to the mapping relationship, the target entity object node to which the ancestor node is to be added comprises the following steps:

Acquiring the table name of the big data table in the mapping relationship;

Judging whether there is an entity object node corresponding to the table name in the blood relationship graph;

If there is an entity object node corresponding to the table name in the blood relationship graph, the entity object node corresponding to the table name is determined as the target entity object node to which the ancestor node is to be added.
The data blood relationship analysis device according to any one of claims 9-12, wherein after the step of adding the corresponding ancestor node to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object, the following steps are further included:

Determine the entity object node to be analyzed in the target blood relationship graph;

Acquiring the business associated with the entity object node to be analyzed, and counting the number of blood relationship chains that contain the entity object node to be analyzed;

Comparing the number of chains with a first preset threshold and a second preset threshold respectively, where the first preset threshold is greater than the second preset threshold;

When the number of chains is greater than or equal to the first preset threshold, mark the business associated with the entity object node to be analyzed as a hot business;

When the number of chains is less than or equal to the second preset threshold, the business associated with the entity object node to be analyzed is marked as an unpopular business.
The data blood relationship analysis device according to any one of claims 9-12, wherein after the step of adding the corresponding ancestor node to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object, the following steps are further included:

Receiving a query instruction based on the target blood relationship graph through a preset user interaction page;

According to the query instruction, the blood relationship analysis graph of the target data is sent to the user interaction page for visual display.
The data blood relationship analysis device according to claim 14, wherein after the step of sending the target data blood relationship analysis graph to the user interaction page for visual display according to the query instruction, the method further comprises the following steps:

Receiving the mapping relationship between the business source table and the big data table sent by the data access platform according to a preset receiving frequency;

Judging whether the mapping relationship has been updated, and detecting whether a new SQL statement is executed on the big data platform;

If there is an update of the mapping relationship, or a new SQL statement is executed on the big data platform, the target blood relationship graph is updated correspondingly.
A computer-readable storage medium having a computer program stored on the computer-readable storage medium, wherein when the computer program is executed by a processor, the steps of the data blood relationship analysis method described below are executed:

Acquiring the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table;

Converting the input table and the output table into entity objects under a preset type system, respectively, and storing the entity objects in a preset graph database;

Constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship;

Receiving the mapping relationship between the business source table and the big data table sent by the data access platform, where the data access platform is used to extract business data from the business source table of the relational business database and transfer it to the big data table of the big data platform, and, in the process of extracting the business data, to record the mapping relationship between the business source table and the big data table;

According to the mapping relationship, determine the target entity object node to which the ancestor node is to be added in the blood relationship graph;

A corresponding ancestor node is added to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object.
The computer-readable storage medium according to claim 16, wherein when the computer program of the data blood relationship analysis method is executed by the processor, the step of acquiring the input table and output table of the structured query language SQL statement currently executed on the big data platform, and the blood relationship between the input table and the output table, includes the following steps:

Through the preset hook program, monitor the structured query language SQL statement currently executed on the big data platform;

Through the preset syntax parser and lexical parser, the monitored SQL statement is parsed to obtain the input table and output table of the SQL statement, and the blood relationship between the input table and the output table.
The computer-readable storage medium according to claim 16, wherein when the computer program of the data blood relationship analysis method is executed by the processor, the step of constructing a blood relationship graph between the entity objects in the graph database according to the blood relationship includes the following steps:

Calling a preset graph processing engine, and creating entity object nodes corresponding to the input table and the output table one-to-one in the graph database through the graph processing engine;

According to the blood relationship, a directed edge is added between the created entity object nodes to generate a blood relationship graph between the entity objects.
The computer-readable storage medium according to claim 16, wherein when the computer program of the data blood relationship analysis method is executed by the processor, the step of determining, in the blood relationship graph according to the mapping relationship, the target entity object node to which the ancestor node is to be added includes the following steps:

Acquiring the table name of the big data table in the mapping relationship;

Judging whether there is an entity object node corresponding to the table name in the blood relationship graph;

If there is an entity object node corresponding to the table name in the blood relationship graph, the entity object node corresponding to the table name is determined as the target entity object node to which the ancestor node is to be added.
The computer-readable storage medium according to any one of claims 16-19, wherein when the computer program of the data blood relationship analysis method is executed by the processor, after the step of adding the corresponding ancestor node to the target entity object node to obtain a target blood relationship graph, where the ancestor node is used to represent the entity object converted from the business source table of the target entity object, the following steps are further executed:

Determine the entity object node to be analyzed in the target blood relationship graph;

Acquiring the business associated with the entity object node to be analyzed, and counting the number of blood relationship chains that contain the entity object node to be analyzed;

Comparing the number of chains with a first preset threshold and a second preset threshold respectively, where the first preset threshold is greater than the second preset threshold;

When the number of chains is greater than or equal to the first preset threshold, mark the business associated with the entity object node to be analyzed as a hot business;

When the number of chains is less than or equal to the second preset threshold, the business associated with the entity object node to be analyzed is marked as an unpopular business.