CN117056369A - Data lineage processing method, device, equipment and medium

Publication number
CN117056369A
Authority
CN
China
Prior art keywords
data
hook
blood
data table
related information
Legal status
Pending
Application number
CN202311015269.2A
Other languages
Chinese (zh)
Inventor
丁锐
Current Assignee
Bank of China Ltd
Original Assignee
Bank of China Ltd
Application filed by Bank of China Ltd
Priority to CN202311015269.2A
Publication of CN117056369A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data lineage processing method, device, equipment and medium, which can be applied in the field of big data or the field of finance. The method comprises: intercepting a database operation request based on a hook plug-in and parsing it to obtain related information of a data table, where the related information of the data table characterizes the lineage relationships between data tables and between fields within the data tables; asynchronously sending the related information of the data table to a message queue; and consuming the related information of the data table output by the message queue and storing it in a target database, where the target database is used to query and display the lineage relationships between data tables and between fields within the data tables. Because the hook plug-in intercepts the database operation request directly and parses the lineage relationships it involves, data lineage analysis can be performed quickly, which improves both processing efficiency and accuracy.

Description

Data lineage processing method, device, equipment and medium
Technical Field
The present application relates to the field of big data technologies, and in particular to a data lineage processing method, device, equipment and medium.
Background
Data lineage, also called the lineage relationship of data, refers to the relationships that form naturally between data over its whole life cycle: generation, processing, fusion, circulation and eventual retirement. Because data across its full life cycle must ultimately be fed back to support actual business, analyzing and processing data lineage is of great significance to that business.
Taking a financial institution as an example, a financial institution has many business systems and complex data structures. Conventional data lineage processing relies mainly on parsing the data's extract, transform and load (ETL) jobs. However, ETL jobs are large in scale and their processing is complicated, so errors occur easily; as a result, lineage analysis based on ETL jobs is inefficient, and the lineage relationships it produces are not accurate enough.
Disclosure of Invention
The embodiments of the application provide a data lineage processing method, device, equipment and medium, which are used to improve the efficiency of data lineage processing and the accuracy of the resulting lineage relationships.
In a first aspect, an embodiment of the present application provides a data lineage processing method, including:
intercepting a database operation request based on a hook plug-in and parsing it to obtain related information of a data table; the related information of the data table characterizes the lineage relationships between data tables and between fields within the data tables;
asynchronously sending the related information of the data table to a message queue;
consuming the related information of the data table output by the message queue, and storing the related information of the data table in a target database; the target database is used to query and display the lineage relationships between data tables and between fields within the data tables.
Optionally, the message queue is a Kafka message queue, and consuming the related information of the data table output by the message queue includes:
registering a Kafka topic and consuming the related information of the data table based on the topic.
Optionally, the method further includes:
integrating fields of the same data table in the same database to obtain an integrated field;
updating the lineage relationships between fields within the data table based on the integrated field.
Optionally, the hook plug-in is obtained by the following steps:
determining the currently running data engine, where each data engine corresponds to one hook plug-in;
determining the corresponding hook plug-in based on the data engine.
Optionally, the data engine includes the data warehouse tool Hive; the hook plug-in corresponding to Hive is a Hive hook, that is, a hook plug-in mounted in Hive;
intercepting the database operation request based on the hook plug-in and parsing it to obtain the related information of the data table includes:
configuring a Hive hook interface based on the Hive hook; the Hive hook interface is used to intercept the database operation request and parse it to obtain the related information of the data table.
Optionally, the data engine includes the big data computing engine Spark; the hook plug-in corresponding to Spark is a Spark hook, that is, a hook plug-in mounted in Spark;
intercepting the database operation request based on the hook plug-in and parsing it to obtain the related information of the data table includes:
configuring a Spark hook interface based on the Spark hook; the Spark hook interface is used to identify and intercept the database operation request and parse it to obtain the related information of the data table.
Optionally, the data engine includes the data query engine Presto; the hook plug-in corresponding to Presto is a Presto hook, that is, a hook plug-in mounted in Presto;
intercepting the database operation request based on the hook plug-in and parsing it to obtain the related information of the data table includes:
configuring a Presto hook interface based on the Presto hook; the Presto hook interface is used to identify and intercept the database operation request and parse it to obtain the related information of the data table.
In a second aspect, an embodiment of the present application provides a data lineage processing device, including:
a parsing module, configured to intercept a database operation request based on a hook plug-in and parse it to obtain related information of a data table; the related information of the data table characterizes the lineage relationships between data tables and between fields within the data tables;
a sending module, configured to asynchronously send the related information of the data table to a message queue;
a consumption module, configured to consume the related information of the data table output by the message queue and store the related information of the data table in a target database; the target database is used to query and display the lineage relationships between data tables and between fields within the data tables.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a system bus;
the processor and the memory are connected through the system bus;
the memory is used to store one or more programs, the one or more programs including instructions which, when executed by the processor, cause the processor to perform any implementation of the data lineage processing method described above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing instructions which, when run on an electronic device, cause the electronic device to perform any implementation of the data lineage processing method described above.
From the above technical solutions, the embodiment of the present application has the following advantages:
In the embodiments of the application, a database operation request is intercepted and parsed based on a hook plug-in, and once the related information of the data table has been obtained it can be sent asynchronously to a message queue. Because the related information of the data table characterizes the lineage relationships between data tables and between fields within the data tables, the information output by the message queue can be consumed and stored in a target database, and the target database can then be used to query and display those lineage relationships. The hook plug-in thus intercepts the database operation request directly and parses the lineage relationships it involves, namely those between data tables and those between fields within the data tables, so data lineage analysis can be performed quickly and both processing efficiency and accuracy are improved. In addition, buffering the information in the message queue before consuming it makes the lineage relationships easier to integrate. Once stored in the target database, the lineage relationships can be queried and displayed, giving a clear and intuitive view of the data's lineage and supporting the smooth development of the financial institution's subsequent business.
Drawings
FIG. 1 is a flowchart of a data lineage processing method according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a data lineage processing device according to an embodiment of the present application.
Detailed Description
As described above, taking a financial institution as an example, a financial institution has many business systems and complex data structures. Conventional data lineage processing relies mainly on parsing the data's ETL jobs. However, ETL jobs are large in scale and their processing is complicated, so errors occur easily; as a result, lineage analysis based on ETL jobs is inefficient, and the lineage relationships it produces are not accurate enough.
To solve the above problems, an embodiment of the present application provides a data lineage processing method, which may include: intercepting and parsing a database operation request based on a hook plug-in and, once the related information of the data table has been obtained, asynchronously sending it to a message queue. Because the related information of the data table characterizes the lineage relationships between data tables and between fields within the data tables, the information output by the message queue can be consumed and stored in a target database, and the target database can then be used to query and display those lineage relationships.
The hook plug-in thus intercepts the database operation request directly and parses the lineage relationships it involves, namely those between data tables and those between fields within the data tables, so data lineage analysis can be performed quickly and both processing efficiency and accuracy are improved. In addition, buffering the information in the message queue before consuming it makes the lineage relationships easier to integrate. Once stored in the target database, the lineage relationships can be queried and displayed, giving a clear and intuitive view of the data's lineage and supporting the smooth development of the financial institution's subsequent business.
It should be noted that the data lineage processing method, device, equipment and medium provided by the embodiments of the application can be used in the big data field or the financial field. This is merely an example and does not limit the fields in which they may be applied. Nor do the embodiments of the application limit the entity that executes the data lineage processing method; for example, the method can be applied to data processing equipment such as a terminal device or a server. The terminal device may be an electronic device such as a smartphone, a computer, a personal digital assistant (PDA) or a tablet computer. The server may be a standalone server, a cluster server or a cloud server.
For ease of understanding, the terms used in the embodiments of the present application are described first.
A hook plug-in (hook) modifies or extends the original behavior of an operating system, application or other software component by intercepting function calls, messages or events passed between software modules. Specifically, a hook plug-in allows an arbitrary program to be installed in an original program in advance; when execution of the original program reaches the position of the hook plug-in, the original program is intercepted and the installed program runs first. In the data lineage processing scenario of the embodiments of the application, the hook plug-in can be used to intercept and parse database operation requests and thereby obtain the related information of the data table, that is, the lineage relationships between data tables and between fields within the data tables.
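The interception behavior described above can be illustrated with a minimal Python sketch. It is not taken from the application: `install_hook` and `execute_sql` are hypothetical names, and `execute_sql` merely stands in for a data engine's execution entry point.

```python
from typing import Callable, List


def install_hook(original: Callable[[str], str],
                 hook: Callable[[str], None]) -> Callable[[str], str]:
    """Return a wrapper that runs `hook` first and then the original function."""
    def wrapped(request: str) -> str:
        hook(request)              # the installed program runs first
        return original(request)   # the original behavior is preserved
    return wrapped


def execute_sql(sql: str) -> str:
    # Hypothetical stand-in for a data engine's execution entry point.
    return f"executed: {sql}"


captured: List[str] = []
execute_sql = install_hook(execute_sql, captured.append)
result = execute_sql("INSERT INTO dw.orders SELECT * FROM ods.orders")
```

After the call, `captured` holds the intercepted request while the caller still receives the original result, which is what lets a hook observe database operations without changing them.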
The Kafka message queue is a high-throughput distributed publish-subscribe messaging system, used mainly for buffering in data processing systems, especially in real-time streaming data processing. The topic is a central concept in Kafka: messages are classified by topic, and both producing and consuming messages are done against a topic.
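The role of the topic can be sketched with a toy in-memory broker. This is an illustrative assumption only: `InMemoryBroker` is a hypothetical class and does not reproduce the Kafka client API, partitions or persistence.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Toy topic-based broker; a stand-in, not the Kafka client API."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # Only handlers subscribed to this topic receive the message.
        for handler in self._subscribers[topic]:
            handler(message)


broker = InMemoryBroker()
lineage_messages: List[str] = []
audit_messages: List[str] = []
broker.subscribe("lineage", lineage_messages.append)
broker.subscribe("audit", audit_messages.append)
broker.publish("lineage", "dw.orders <- ods.orders")
```

The message published to the `lineage` topic reaches only the lineage subscriber; the `audit` subscriber receives nothing.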
The data warehouse tool Hive can be used to extract, transform and load data; it is a mechanism for handling large-scale data and is suited to statistical analysis over a data warehouse. Accordingly, a Hive hook is a hook plug-in mounted in Hive.
The big data computing engine Spark is a fast, general-purpose computing engine designed for large-scale data processing. Accordingly, a Spark hook is a hook plug-in mounted in Spark.
The data query engine Presto supports fast interactive analysis of data. Accordingly, a Presto hook is a hook plug-in mounted in Presto.
In order to make the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
FIG. 1 is a flowchart of a data lineage processing method according to an embodiment of the present application. Referring to FIG. 1, the data lineage processing method provided in the embodiment of the present application may include:
S101: based on the hook plug-in, intercept the database operation request and parse it to obtain the related information of the data table.
The related information of the data table characterizes the lineage relationships between data tables and between fields within the data tables. Specifically, it may indicate which data table of which database a given piece of data is stored in, which field the data corresponds to, and the attributes of that field. From this, the lineage relationships between data tables and between fields within the data tables can be characterized.
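As a rough illustration of what such related information might look like, the following Python sketch models one field-level lineage edge. The application does not specify a record layout, so `FieldRef`, `FieldLineage` and all example values are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FieldRef:
    database: str   # which database the data is stored in
    table: str      # which data table
    field: str      # which field the data corresponds to
    data_type: str  # one attribute of the field (hypothetical choice)


@dataclass(frozen=True)
class FieldLineage:
    source: FieldRef
    target: FieldRef


edge = FieldLineage(
    source=FieldRef("ods", "orders", "amount", "decimal(18,2)"),
    target=FieldRef("dw", "orders_daily", "total_amount", "decimal(18,2)"),
)

# The table-level relationship follows directly from the field-level edge.
table_edge = (
    f"{edge.source.database}.{edge.source.table}",
    f"{edge.target.database}.{edge.target.table}",
)

# A serialized form that could be sent as a message.
payload = json.dumps(asdict(edge))
```

Recording lineage at field level and deriving the table-level view from it keeps the two kinds of relationship consistent with each other.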
In an embodiment of the present application, each data engine may correspond to one hook plug-in. Accordingly, the hook plug-in may be obtained by: determining the currently running data engine; and determining the corresponding hook plug-in based on the data engine. The hook plug-in can thus be obtained quickly, which makes it convenient to intercept and parse database operation requests and so obtain the lineage relationships between data tables and between fields within the data tables. For example, the big data platform of a commercial bank may include three data engines: Hive, Spark and Presto. Correspondingly, the hook plug-in for Hive is a Hive hook mounted in Hive, the hook plug-in for Spark is a Spark hook mounted in Spark, and the hook plug-in for Presto is a Presto hook mounted in Presto.
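The one-engine-to-one-hook correspondence can be sketched as a simple lookup table. The registry, the `select_hook` function and the placeholder hook functions below are hypothetical and only illustrate the selection step, not any real hook implementation.

```python
from typing import Callable, Dict


def hive_hook(request: str) -> str:
    return f"hive hook parsed: {request}"


def spark_hook(request: str) -> str:
    return f"spark hook parsed: {request}"


def presto_hook(request: str) -> str:
    return f"presto hook parsed: {request}"


# One data engine corresponds to one hook plug-in.
HOOKS: Dict[str, Callable[[str], str]] = {
    "hive": hive_hook,
    "spark": spark_hook,
    "presto": presto_hook,
}


def select_hook(engine: str) -> Callable[[str], str]:
    """Return the hook plug-in registered for the currently running engine."""
    try:
        return HOOKS[engine.lower()]
    except KeyError:
        raise ValueError(f"no hook plug-in registered for engine {engine!r}") from None


hook = select_hook("Spark")
parsed = hook("SELECT 1")
```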
On this basis, for the three data engines and their respective hook plug-ins, the embodiments of the present application provide different ways of implementing the interception and parsing of the database operation request, that is, S101. They are described in turn below.
As one possible implementation, for the Hive hook, S101 may specifically include: configuring a Hive hook interface based on the Hive hook; the Hive hook interface is used to intercept the database operation request and parse it to obtain the related information of the data table. Specifically, in Hive, the Hive hook interface may be the ExecuteWithHookContext interface, which can be used to obtain the related information of the data table from the HookContext.
As another possible implementation, for the Spark hook, S101 may specifically include: configuring a Spark hook interface based on the Spark hook; the Spark hook interface is used to identify and intercept the database operation request and parse it to obtain the related information of the data table. Specifically, in Spark, the Spark hook interface may be the QueryExecutionListener interface. This interface can be used to obtain the currently executed QueryExecution through the onSuccess callback function, obtain all Attribute objects through the output method of the LogicalPlan, and traverse and parse those Attribute objects using the exprId mapping of NamedExpression to obtain the related information of the data table.
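The idea of tracing output attributes back to source columns through expression identifiers can be simulated in plain Python. This sketch does not use the Spark API (which is Scala/Java); `Attribute`, `resolve_sources` and the example query are hypothetical, and the `derived_from` mapping stands in for what a real listener would collect while walking the logical plan.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Attribute:
    name: str
    expr_id: int
    table: str = ""  # set only for attributes read directly from a table


def resolve_sources(expr_id: int,
                    derived_from: Dict[int, List[int]],
                    base: Dict[int, Attribute]) -> Set[str]:
    """Follow expression-id references until base-table columns are reached."""
    if expr_id in base:
        attr = base[expr_id]
        return {f"{attr.table}.{attr.name}"}
    sources: Set[str] = set()
    for child_id in derived_from.get(expr_id, []):
        sources |= resolve_sources(child_id, derived_from, base)
    return sources


# Hypothetical query: SELECT price * qty AS total, id FROM ods.orders
base = {
    1: Attribute("id", 1, "ods.orders"),
    2: Attribute("price", 2, "ods.orders"),
    3: Attribute("qty", 3, "ods.orders"),
}
derived_from = {10: [2, 3]}  # alias `total` (id 10) references price and qty
output = [Attribute("total", 10), Attribute("id", 1)]

lineage = {a.name: sorted(resolve_sources(a.expr_id, derived_from, base))
           for a in output}
```

Here `total` resolves to both `price` and `qty`, while `id` passes straight through from the source table, which is the field-level lineage the text describes.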
As yet another possible implementation, for the Presto hook, S101 may specifically include: configuring a Presto hook interface based on the Presto hook; the Presto hook interface is used to identify and intercept the database operation request and parse it to obtain the related information of the data table. Specifically, in Presto, the Presto hook interface may be the EventListener interface, which can be used to parse the related information of the data table from the QueryCompletedEvent passed to the queryCompleted callback function.
S102: asynchronously send the related information of the data table to a message queue.
In the embodiments of the application, the information is sent to the message queue asynchronously, so there is no need to wait for a response from the message queue. This improves the flexibility and usability of the data engine and thereby its performance.
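Asynchronous sending of this kind can be sketched with the Python standard library. The `AsyncSender` class is a hypothetical illustration: the `deliver` callable stands in for a real message-queue producer, and a local buffer plus a worker thread is only one assumed way to avoid blocking the caller.

```python
import queue
import threading
from typing import Callable, List, Optional


class AsyncSender:
    """Buffers messages locally and delivers them on a background thread."""

    def __init__(self, deliver: Callable[[str], None]) -> None:
        self._buffer: "queue.Queue[Optional[str]]" = queue.Queue()
        self._deliver = deliver
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send(self, message: str) -> None:
        # Returns immediately; the caller does not wait for delivery.
        self._buffer.put(message)

    def _run(self) -> None:
        while True:
            message = self._buffer.get()
            if message is None:  # sentinel: stop the worker
                break
            self._deliver(message)

    def close(self) -> None:
        # Flush everything queued so far, then stop the worker.
        self._buffer.put(None)
        self._worker.join()


delivered: List[str] = []
sender = AsyncSender(delivered.append)
sender.send("dw.orders <- ods.orders")
sender.send("dm.report <- dw.orders")
sender.close()
```

Because `send` only enqueues, the query path in the data engine is not held up by the message queue; `close` is called here only so the example can inspect what was delivered.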
S103: consume the related information of the data table output by the message queue, and store the related information of the data table in a target database.
In the embodiments of the present application, consuming the related information of the data table output by the message queue, that is, S103, may specifically include: registering a Kafka topic and consuming the related information of the data table based on the topic. In Kafka, each topic is consumed by the consumers subscribed to it, so registering a topic and consuming the related information of the data table based on that topic makes the lineage relationships easier to integrate. Once stored in the target database, the lineage relationships can be queried and displayed, giving a clear and intuitive view of the data's lineage and supporting the smooth development of the financial institution's subsequent business.
In addition, the target database may be Elasticsearch (a search server that can provide database services). Correspondingly, the target database can be used to query and display the lineage relationships between data tables and between fields within the data tables.
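Storing consumed lineage records and querying them afterwards can be sketched with an in-memory store. `LineageStore` is a hypothetical stand-in for the target database and does not use the Elasticsearch API; the table names are invented for illustration.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Set


class LineageStore:
    """In-memory stand-in for the target database (not Elasticsearch)."""

    def __init__(self) -> None:
        self._upstream: DefaultDict[str, Set[str]] = defaultdict(set)

    def save(self, source: str, target: str) -> None:
        self._upstream[target].add(source)

    def upstream_of(self, table: str) -> Set[str]:
        """Return every table that `table` depends on, directly or indirectly."""
        seen: Set[str] = set()
        pending = deque([table])
        while pending:
            current = pending.popleft()
            for parent in self._upstream.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    pending.append(parent)
        return seen


store = LineageStore()
consumed = [
    ("ods.orders", "dw.orders"),
    ("dw.orders", "dm.report"),
    ("ods.users", "dm.report"),
]
for source, target in consumed:
    store.save(source, target)

ancestors = store.upstream_of("dm.report")
```

The breadth-first walk returns both direct and indirect upstream tables, which is the kind of query a lineage display would typically need.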
Further, during data processing and analysis it may sometimes be necessary to integrate fields of the same data table in the same database in order to support better data lineage processing. Take a data table containing user information as an example: the user information may include the user's name, and the data table may store the user's last name and first name in two separate fields. Combining the two fields yields the user's full name, which simplifies subsequent data lineage processing. On this basis, in the embodiments of the present application, fields of the same data table in the same database may be integrated to obtain an integrated field, and the lineage relationships between fields within the data table are then updated based on the integrated field.
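One way to picture the integration step is to rewrite existing lineage edges so that the merged fields are replaced by the integrated field. The application does not say how the update is carried out, so the `integrate_fields` function, the edge representation and the field names below are all assumptions.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[str, str]  # (source field, target field), fully qualified


def integrate_fields(edges: Sequence[Edge], table: str,
                     fields: Sequence[str], merged: str) -> List[Edge]:
    """Replace `fields` of `table` with one integrated field in every edge."""
    members = {f"{table}.{name}" for name in fields}
    merged_ref = f"{table}.{merged}"
    updated: List[Edge] = []
    for source, target in edges:
        source = merged_ref if source in members else source
        target = merged_ref if target in members else target
        if (source, target) not in updated:  # drop edges that became duplicates
            updated.append((source, target))
    return updated


edges = [
    ("crm.users.last_name", "dw.customers.full_name"),
    ("crm.users.first_name", "dw.customers.full_name"),
    ("crm.users.email", "dw.customers.email"),
]
updated = integrate_fields(edges, "crm.users", ["last_name", "first_name"], "name")
```

After integration, the two name edges collapse into a single edge from `crm.users.name`, and the unrelated `email` edge is left unchanged.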
Based on the above description of S101 to S103, in the embodiments of the present application a database operation request is intercepted and parsed based on a hook plug-in, and once the related information of the data table has been obtained it can be sent asynchronously to a message queue. Because the related information of the data table characterizes the lineage relationships between data tables and between fields within the data tables, the information output by the message queue can be consumed and stored in a target database, and the target database can then be used to query and display those lineage relationships. The hook plug-in thus intercepts the database operation request directly and parses the lineage relationships it involves, namely those between data tables and those between fields within the data tables, so data lineage analysis can be performed quickly and both processing efficiency and accuracy are improved. In addition, buffering the information in the message queue before consuming it makes the lineage relationships easier to integrate. Once stored in the target database, the lineage relationships can be queried and displayed, giving a clear and intuitive view of the data's lineage and supporting the smooth development of the financial institution's subsequent business.
Based on the data blood edge processing method provided by the embodiment, the embodiment of the application can also provide a data blood edge processing device. The data blood-edge processing device is described below with reference to the examples and drawings, respectively.
Fig. 2 is a schematic structural diagram of a data blood edge processing device according to an embodiment of the present application. Referring to fig. 2, a data blood edge processing apparatus 200 according to an embodiment of the present application includes:
the parsing module 201 is configured to intercept the database operation request and parse the database operation request based on the hook plug-in, so as to obtain relevant information of the data table; the related information of the data table is used for representing the blood margin relation among the data tables and the blood margin relation among the fields in the data tables;
a sending module 202, configured to asynchronously send related information of the data table to a message queue;
the consumption module 203 is configured to consume related information of the data table output by the message queue, and store the related information of the data table in a target database; the target database is used for inquiring and displaying the blood-edge relation among the data tables and the blood-edge relation among the fields in the data tables.
As one embodiment, the message queue is a Kafka message queue; the consumption module 203 includes:
and the consumption sub-module is used for registering the theme of the Kafka and consuming the related information of the data table based on the theme.
As an embodiment, the apparatus 200 for processing data blood edges further includes:
the integration module is used for integrating the fields of the same data table in the same database to obtain integrated fields;
and the updating module is used for updating the blood-margin relation among the fields in the data table based on the integrated fields.
As an embodiment, the hook plug-in is obtained by the following modules:
the first determining module is used for determining a data engine which is currently running; one such data engine corresponds to a hook plug-in;
and the second determining module is used for determining the corresponding hook plugin based on the data engine.
As one embodiment, the data engine includes the data warehouse tool Hive; the hook plug-in corresponding to Hive is a Hive hook, that is, a hook plug-in mounted in Hive;
the parsing module 201 includes:
a first configuration module, configured to configure a Hive hook interface based on the Hive hook; the Hive hook interface is used to intercept and parse the database operation request to obtain the related information of the data table.
As one embodiment, the data engine includes the big data computing engine Spark; the hook plug-in corresponding to Spark is a Spark hook, that is, a hook plug-in mounted in Spark;
the parsing module 201 includes:
a second configuration module, configured to configure a Spark hook interface based on the Spark hook; the Spark hook interface is used to determine and intercept the database operation request and parse it to obtain the related information of the data table.
As one embodiment, the data engine includes the data query engine Presto; the hook plug-in corresponding to Presto is a Presto hook, that is, a hook plug-in mounted in Presto;
the parsing module 201 includes:
a third configuration module, configured to configure a Presto hook interface based on the Presto hook; the Presto hook interface is used to determine and intercept the database operation request and parse it to obtain the related information of the data table.
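Whichever engine supplies the hook interface, the parsing step must turn an intercepted request into field-level relationships. The sketch below shows that step for one narrow statement shape, `INSERT INTO t (cols) SELECT cols FROM s`, using a regular expression. It is a simplified assumption for illustration: the function name `field_lineage` is hypothetical, and a hook running inside a real engine would more likely work from the engine's own parsed query plan than from raw SQL text.

```python
import re

# Matches only: INSERT INTO <target> (<cols>) SELECT <cols> FROM <source>
_INSERT_SELECT = re.compile(
    r"insert\s+into\s+(?P<target>[\w.]+)\s*\((?P<tcols>[^)]*)\)\s*"
    r"select\s+(?P<scols>.+?)\s+from\s+(?P<source>[\w.]+)",
    re.IGNORECASE | re.DOTALL,
)


def field_lineage(sql):
    """Return (source_field, target_field) pairs, or [] if the statement is not understood."""
    match = _INSERT_SELECT.search(sql)
    if match is None:
        return []
    target_cols = [c.strip() for c in match.group("tcols").split(",")]
    source_cols = [c.strip() for c in match.group("scols").split(",")]
    if len(target_cols) != len(source_cols):
        return []  # column lists do not line up; refuse to guess
    target, source = match.group("target"), match.group("source")
    return [
        (f"{source}.{s}", f"{target}.{t}")
        for s, t in zip(source_cols, target_cols)
    ]
```

Each returned pair is one field-level edge, and the table-level edge from the source table to the target table follows directly from the same match.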
Further, an embodiment of the present application provides an electronic device, including: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs including instructions which, when executed by the processor, cause the processor to perform any implementation of the data blood-edge processing method described above.
Further, an embodiment of the present application also provides a computer-readable storage medium storing instructions which, when executed on an electronic device, cause the electronic device to perform any implementation of the data blood-edge processing method described above.
From the above description of the embodiments, it will be apparent to those skilled in the art that all or part of the steps of the example methods described above may be implemented in software together with a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present application. It should be noted that the embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
It is further noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A data blood-edge processing method, comprising:
intercepting and parsing a database operation request based on a hook plug-in to obtain related information of a data table; the related information of the data table represents the blood-edge relationships among data tables and the blood-edge relationships among fields in the data tables;
asynchronously sending the related information of the data table to a message queue;
consuming the related information of the data table output by the message queue, and storing the related information of the data table in a target database; the target database is used for querying and displaying the blood-edge relationships among data tables and the blood-edge relationships among fields in the data tables.
2. The processing method according to claim 1, wherein the message queue is a Kafka message queue; and the consuming of the related information of the data table output by the message queue comprises:
registering a topic of the Kafka message queue and consuming the related information of the data table based on the topic.
3. The processing method according to claim 1, wherein the method further comprises:
integrating the fields of the same data table in the same database to obtain integrated fields;
updating the blood-edge relationships among the fields in the data table based on the integrated fields.
4. The processing method according to claim 1, wherein the hook plug-in is obtained by:
determining the currently running data engine, where each data engine corresponds to one hook plug-in;
determining the corresponding hook plug-in based on the data engine.
5. The processing method according to claim 4, wherein the data engine comprises the data warehouse tool Hive; the hook plug-in corresponding to Hive is a Hive hook, that is, a hook plug-in mounted in Hive;
the intercepting and parsing of a database operation request based on the hook plug-in to obtain related information of a data table comprises:
configuring a Hive hook interface based on the Hive hook; the Hive hook interface is used to intercept and parse the database operation request to obtain the related information of the data table.
6. The processing method according to claim 4, wherein the data engine comprises the big data computing engine Spark; the hook plug-in corresponding to Spark is a Spark hook, that is, a hook plug-in mounted in Spark;
the intercepting and parsing of a database operation request based on the hook plug-in to obtain related information of a data table comprises:
configuring a Spark hook interface based on the Spark hook; the Spark hook interface is used to determine and intercept the database operation request and parse it to obtain the related information of the data table.
7. The processing method according to claim 4, wherein the data engine comprises the data query engine Presto; the hook plug-in corresponding to Presto is a Presto hook, that is, a hook plug-in mounted in Presto;
the intercepting and parsing of a database operation request based on the hook plug-in to obtain related information of a data table comprises:
configuring a Presto hook interface based on the Presto hook; the Presto hook interface is used to determine and intercept the database operation request and parse it to obtain the related information of the data table.
8. A data blood-edge processing apparatus, comprising:
a parsing module, configured to intercept and parse a database operation request based on a hook plug-in to obtain related information of a data table; the related information of the data table represents the blood-edge relationships among data tables and the blood-edge relationships among fields in the data tables;
a sending module, configured to asynchronously send the related information of the data table to a message queue;
a consumption module, configured to consume the related information of the data table output by the message queue and store the related information of the data table in a target database; the target database is used for querying and displaying the blood-edge relationships among data tables and the blood-edge relationships among fields in the data tables.
9. An electronic device, the device comprising: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the data blood-edge processing method of any one of claims 1 to 7.
10. A computer-readable storage medium having instructions stored therein which, when executed on an electronic device, cause the electronic device to perform the data blood-edge processing method of any one of claims 1 to 7.
CN202311015269.2A 2023-08-11 2023-08-11 Data blood edge processing method, device, equipment and medium Pending CN117056369A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311015269.2A CN117056369A (en) 2023-08-11 2023-08-11 Data blood edge processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311015269.2A CN117056369A (en) 2023-08-11 2023-08-11 Data blood edge processing method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN117056369A true CN117056369A (en) 2023-11-14

Family

ID=88663906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311015269.2A Pending CN117056369A (en) 2023-08-11 2023-08-11 Data blood edge processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117056369A (en)

Similar Documents

Publication Publication Date Title
JP6266630B2 (en) Managing continuous queries with archived relations
KR101863981B1 (en) Using text messages to interact with spreadsheets
US8849854B2 (en) Method and system for providing detailed information in an interactive manner in a short message service (SMS) environment
CN108073625B (en) System and method for metadata information management
CN112162965B (en) Log data processing method, device, computer equipment and storage medium
JP5031819B2 (en) Declarations for transformations in service sequences
CN112416991A (en) Data processing method and device and storage medium
CN113568938A (en) Data stream processing method and device, electronic equipment and storage medium
CN113220710B (en) Data query method, device, electronic equipment and storage medium
CN110888672A (en) Metadata architecture-based expression engine implementation method and system
CN108319604B (en) Optimization method for association of large and small tables in hive
CN112883088B (en) Data processing method, device, equipment and storage medium
EP4216076A1 (en) Method and apparatus of processing an observation information, electronic device and storage medium
CN116955856A (en) Information display method, device, electronic equipment and storage medium
US20230146245A1 (en) Systems and methods for handling macro compatibility for documents at a storage system
CN116594709A (en) Method, apparatus and computer program product for acquiring data
CN116383207A (en) Data tag management method and device, electronic equipment and storage medium
CN117056369A (en) Data blood edge processing method, device, equipment and medium
CN115630122A (en) Data synchronization method and device, storage medium and computer equipment
US10114864B1 (en) List element query support and processing
CN114254014A (en) Business data display method, device, equipment and storage medium
CN113297516A (en) Customer interaction interface generation method and device and electronic equipment
WO2021143199A1 (en) Method and apparatus for searching log, computer device, and storage medium
CN113779018A (en) Data processing method and device
CN111143328A (en) Agile business intelligent data construction method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination