CN109656963B - Metadata acquisition method, apparatus, device and computer readable storage medium - Google Patents

Info

Publication number
CN109656963B
CN109656963B CN201811551965.4A CN201811551965A CN109656963B CN 109656963 B CN109656963 B CN 109656963B CN 201811551965 A CN201811551965 A CN 201811551965A CN 109656963 B CN109656963 B CN 109656963B
Authority
CN
China
Prior art keywords
metadata
acquisition
data
task
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811551965.4A
Other languages
Chinese (zh)
Other versions
CN109656963A (en)
Inventor
周可
邸帅
汪亚男
兰冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN201811551965.4A
Publication of CN109656963A
Application granted
Publication of CN109656963B
Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a metadata acquisition method, apparatus, device and computer readable storage medium. The method comprises the following steps: after detecting an acquisition instruction for acquiring metadata, detecting whether the current time is within the execution time range of the acquisition task corresponding to the acquisition instruction; if the current time is within the execution time range, determining, among the executors connected to the scheduler, the target executor with the minimum resource utilization rate; and sending the acquisition task to the target executor so that the target executor acquires the metadata from the corresponding data system using the corresponding acquisition tool according to the acquisition task. By adapting acquisition tools to different data types, the method acquires metadata of different data types, so that metadata stored in data components other than relational databases can also be acquired.

Description

Metadata acquisition method, apparatus, device and computer readable storage medium
Technical Field
The present invention relates to the field of big data technologies, and in particular, to a metadata acquisition method, apparatus, device, and computer readable storage medium.
Background
Metadata management provides reliable and convenient tool support for an enterprise to establish a metadata management system. It helps the enterprise map data lineage, unify data definitions, mark data locations, analyze data relationships, manage model changes and the like, which makes it easier for the enterprise to discover and use the value of its information assets, achieve accurate and efficient analysis and decision making, support system change management, and reduce project risk. One of the core functions of metadata management is to acquire metadata information from each data system and service system, build a data model for the metadata information, store the data model in a relational database or a distributed file system, and manage versions of the metadata.
Some metadata management and data governance tools exist today, such as Transwarp Governor and Primeton MetaCube. Governor is the metadata management and data governance tool in Transwarp Data Hub (TDH); users can use it to manage metadata for data lineage analysis and impact analysis. Primeton MetaCube is an enterprise-level metadata management product built on the CWM (Common Warehouse Metamodel) specification. It provides reliable and convenient tool support for enterprises to establish a metadata management system, can collect metadata from the enterprise's data warehouse domain, and provides end-to-end metadata services, so that the value of information assets can be discovered and used more effectively and accurate and efficient analysis and decision making can be achieved.
In the metadata acquisition process, current metadata management and data governance tools do not adequately support the full range of data components. Today's mainstream big data platforms integrate various open-source and commercial data components, mainly including a streaming component (Kafka), a distributed non-relational database (HBase), a data warehouse tool (Hive), relational databases (Oracle/Teradata Database/MySQL), business intelligence (BI) tools (SAS) and the like. Current metadata management and data governance tools support only relational databases and cannot acquire metadata stored in data components other than relational databases.
Disclosure of Invention
The main object of the invention is to provide a metadata acquisition method, apparatus, device and computer readable storage medium, aiming to solve the technical problem that the prior art cannot acquire metadata stored in data components other than relational databases.
In order to achieve the above object, the present invention provides a metadata acquisition method applied to a scheduler, the metadata acquisition method comprising the steps of:
after detecting an acquisition instruction for acquiring metadata, detecting whether the current time is within the execution time range of an acquisition task corresponding to the acquisition instruction;
if the current time is within the execution time range, determining, among the executors connected to the scheduler, a target executor with the minimum resource utilization rate;
and sending the acquisition task to the target executor so that the target executor acquires the metadata from the corresponding data system using the corresponding acquisition tool according to the acquisition task.
Preferably, the step of determining, if the current time is within the execution time range, a target executor with the minimum resource utilization rate among the executors connected to the scheduler includes:
if the current time is within the execution time range, acquiring the current CPU resource utilization rate of each executor connected to the scheduler;
and determining the executor with the minimum CPU resource utilization rate as the target executor.
In order to achieve the above object, the present invention also provides a metadata acquisition method applied to an executor, the metadata acquisition method including the steps of:
after receiving an acquisition task sent by a scheduler, acquiring the execution parameters corresponding to the acquisition task;
connecting, according to the execution parameters, to the data system that stores the metadata corresponding to the acquisition task, and determining, according to the execution parameters, an acquisition tool for acquiring the metadata in the data system;
and acquiring the metadata in the connected data system using the acquisition tool.
Preferably, after the step of acquiring the metadata in the connected data system according to the acquiring tool, the method further includes:
storing the metadata into a big data platform so that the big data platform can convert the data format of the metadata into a data format corresponding to a metadata storage model, obtaining the metadata after format conversion, and returning a success message of successful format conversion;
and triggering a storage instruction for storing metadata into a metadata management database after receiving the success message, and sending the storage instruction to the big data platform so that the big data platform stores the metadata after format conversion into the metadata management database according to the storage instruction.
Preferably, after the step of triggering, upon receiving the success message, a storage instruction for storing metadata in a metadata management database and sending the storage instruction to the big data platform so that the big data platform stores the format-converted metadata in the metadata management database according to the storage instruction, the method further includes:
detecting whether prompt information indicating successful metadata storage, sent by the big data platform, has been received;
and if the prompt information is received, sending the prompt information to the scheduler, so that the scheduler records the task state corresponding to the acquisition task according to the prompt information.
Preferably, when the metadata acquired by the acquisition task is process metadata, after the step of acquiring the metadata in the connected data system according to the acquisition tool, the method further includes:
acquiring an execution record corresponding to the process metadata, and analyzing the execution record corresponding to the process metadata to obtain a storage path of the execution log corresponding to the process metadata;
acquiring the execution log according to the storage path, and determining a task identifier associated with the process metadata according to the execution log;
determining input data and output data corresponding to the process metadata according to the task identifier, and associating the input data names and the output data names with the data names of the corresponding technical metadata to obtain name association information between the input data and the output data and the technical metadata;
and sending the name association information to a big data platform so that the big data platform stores the name association information in a metadata management system database.
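The name association step for process metadata can be sketched as follows. This is a minimal illustration only: the function `associate_names`, the record fields `task_id`, `role` and `data_name`, and the assumption that association is an exact match on data names are all hypothetical and are not specified in the patent.

```python
from typing import Dict, Iterable, List


def associate_names(
    task_id: str,
    input_names: Iterable[str],
    output_names: Iterable[str],
    technical_names: Iterable[str],
) -> List[Dict[str, str]]:
    """Link a task's input/output data names to known technical metadata names.

    Assumption: a name is associated only if it exactly matches the data
    name of an already collected technical metadata entry.
    """
    known = set(technical_names)
    links: List[Dict[str, str]] = []
    for role, names in (("input", input_names), ("output", output_names)):
        for name in names:
            if name in known:
                links.append({"task_id": task_id, "role": role, "data_name": name})
    return links
```

In this sketch, names that have no matching technical metadata are simply skipped; a real system might instead record them for later reconciliation.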
Preferably, when the metadata acquired by the acquisition task is service metadata, after the step of acquiring the metadata in the connected data system according to the acquisition tool, the method further includes:
determining first association information of the service metadata and corresponding technical metadata and second association information of the service metadata and corresponding service product information;
and sending the first association information and the second association information to a big data platform so that the big data platform can store the first association information and the second association information to a metadata management system database.
Preferably, the step of connecting a data system storing metadata corresponding to the acquisition task according to the execution parameters, and determining an acquisition tool for acquiring metadata in the data system according to the execution parameters includes:
determining an interface path and a storage type of a data system for storing metadata corresponding to the acquisition task according to the execution parameters;
and connecting to the data system according to the interface path, and determining, according to the storage type, an acquisition tool for acquiring data in the data system.
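The selection of an acquisition tool from the execution parameters can be sketched as a lookup keyed by storage type. This is an illustrative sketch under stated assumptions: the parameter keys `interface_path` and `storage_type`, the registry `TOOLS_BY_STORAGE_TYPE`, and the stub tools are hypothetical; the stubs only report what a real tool would do and do not connect to any system.

```python
from typing import Callable, Dict

AcquisitionTool = Callable[[str], str]


def _make_stub(tool_name: str) -> AcquisitionTool:
    """Build a stand-in acquisition tool that only describes its action."""
    def tool(interface_path: str) -> str:
        return f"{tool_name} would collect metadata from {interface_path}"
    return tool


# Hypothetical registry: one acquisition tool per storage type.
TOOLS_BY_STORAGE_TYPE: Dict[str, AcquisitionTool] = {
    "hive": _make_stub("hive-collector"),
    "hbase": _make_stub("hbase-collector"),
    "kafka": _make_stub("kafka-collector"),
    "mysql": _make_stub("mysql-collector"),
}


def run_acquisition(execution_params: Dict[str, str]) -> str:
    """Pick the tool by storage type and run it against the interface path."""
    interface_path = execution_params["interface_path"]
    storage_type = execution_params["storage_type"].lower()
    try:
        tool = TOOLS_BY_STORAGE_TYPE[storage_type]
    except KeyError:
        raise ValueError(
            f"no acquisition tool registered for {storage_type!r}"
        ) from None
    return tool(interface_path)
```

Keeping the mapping in a registry means that supporting a new data component only requires registering one more tool, which matches the stated aim of adapting acquisition tools to different data types.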
In addition, in order to achieve the above object, the present invention also provides a metadata acquisition apparatus applied to a scheduler, the metadata acquisition apparatus comprising:
the first detection module is used for detecting whether the current time is in the execution time range of the acquisition task corresponding to the acquisition instruction after the acquisition instruction for acquiring the metadata is detected;
the first determining module is used for determining, among the executors connected to the scheduler, a target executor with the minimum resource utilization rate if the current time is within the execution time range;
and the first sending module is used for sending the acquisition task to the target executor so that the target executor can acquire the metadata in a corresponding data system by adopting a corresponding acquisition tool according to the acquisition task.
In addition, in order to achieve the above object, the present invention also provides a metadata acquisition apparatus applied to an executor, the metadata acquisition apparatus including:
the acquisition module is used for acquiring the execution parameters corresponding to the acquisition task after receiving the acquisition task sent by the scheduler;
the connection module is used for connecting a data system for storing metadata corresponding to the acquisition task according to the execution parameters;
a second determining module, configured to determine an acquiring tool for acquiring metadata in the data system according to the execution parameter;
the acquisition module is further used for acquiring the metadata in the connected data system according to the acquisition tool.
In addition, in order to achieve the above object, the present invention also provides a metadata acquisition apparatus including a memory, a processor, and a metadata acquisition program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the metadata acquisition method as described above.
In addition, in order to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a metadata acquisition program which, when executed by a processor, implements the steps of the metadata acquisition method as described above.
According to the invention, when an acquisition instruction for acquiring metadata is detected and the current time is found to be within the execution time range of the acquisition task corresponding to the acquisition instruction, the scheduler determines a target executor among the executors connected to it and sends the acquisition task to the target executor. The target executor then acquires the metadata from the corresponding data system using the corresponding acquisition tool according to the acquisition task. Because acquisition tools are adapted to different data types, metadata of different data types can be acquired, including metadata stored in data components other than relational databases.
Drawings
FIG. 1 is a schematic diagram of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a metadata acquisition method according to a first embodiment of the present invention;
FIG. 3 is a flowchart of a metadata acquisition method according to a second embodiment of the present invention;
FIG. 4 is a flowchart of a metadata acquisition method according to a third embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
FIG. 1 is a schematic structural diagram of a hardware running environment according to an embodiment of the present invention.
It should be noted that FIG. 1 shows the hardware running environment of the metadata acquisition device. The metadata acquisition device in the embodiment of the invention may be a terminal device such as a PC or a portable computer.
As shown in FIG. 1, the metadata acquisition device may include: a processor 1001 (such as a CPU), a user interface 1003, a network interface 1004, a memory 1005, and a communication bus 1002. The communication bus 1002 enables communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the structure shown in FIG. 1 does not limit the metadata acquisition device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and a metadata acquisition program may be included in the memory 1005 as one type of computer storage medium. The operating system is a program for managing and controlling hardware and software resources of the metadata acquisition device, and supports the running of the metadata acquisition program and other software or programs.
In the metadata acquisition apparatus shown in fig. 1, the user interface 1003 may be used for detecting an acquisition instruction or the like, and the network interface 1004 is mainly used for connecting to a background server, and performing data communication with the background server; when the metadata acquisition device is a scheduler, the processor 1001 may be configured to call a metadata acquisition program stored in the memory 1005 and perform the following operations:
after detecting an acquisition instruction for acquiring metadata, detecting whether the current time is within the execution time range of an acquisition task corresponding to the acquisition instruction;
if the current time is within the execution time range, determining, among the executors connected to the scheduler, a target executor with the minimum resource utilization rate;
and sending the acquisition task to the target executor so that the target executor acquires the metadata from the corresponding data system using the corresponding acquisition tool according to the acquisition task.
Further, the step of determining, if the current time is within the execution time range, a target executor with the minimum resource utilization rate among the executors connected to the scheduler includes:
if the current time is within the execution time range, acquiring the current CPU resource utilization rate of each executor connected to the scheduler;
and determining the executor with the minimum CPU resource utilization rate as the target executor.
When the metadata acquisition device is an executor, the processor 1001 may be configured to call the metadata acquisition program stored in the memory 1005 and perform the following operations:
after receiving an acquisition task sent by a scheduler, acquiring an execution parameter corresponding to the acquisition task;
connecting a data system storing metadata corresponding to the acquisition task according to the execution parameters, and determining an acquisition tool for acquiring the metadata in the data system according to the execution parameters;
and acquiring the metadata in the connected data system according to the acquisition tool.
Further, after the step of acquiring the metadata in the connected data system according to the acquisition tool, the processor 1001 may be further configured to invoke the metadata acquisition program stored in the memory 1005, and perform the following steps:
storing the metadata into a big data platform so that the big data platform converts the data format of the metadata into the data format corresponding to a metadata storage model, obtains the format-converted metadata, and returns a success message indicating that the format conversion succeeded;
and triggering a storage instruction for storing metadata into a metadata management database after receiving the success message, and sending the storage instruction to the big data platform so that the big data platform stores the metadata after format conversion into the metadata management database according to the storage instruction.
Further, after the step of triggering, upon receiving the success message, a storage instruction for storing metadata in a metadata management database and sending the storage instruction to the big data platform so that the big data platform stores the format-converted metadata in the metadata management database according to the storage instruction, the processor 1001 may be further configured to invoke the metadata acquisition program stored in the memory 1005, and perform the following steps:
detecting whether prompt information indicating successful metadata storage, sent by the big data platform, has been received;
and if the prompt information is received, sending the prompt information to the scheduler, so that the scheduler records the task state corresponding to the acquisition task according to the prompt information.
Further, when the metadata acquired by the acquisition task is process metadata, after the step of acquiring the metadata in the connected data system according to the acquisition tool, the processor 1001 may be further configured to invoke the metadata acquisition program stored in the memory 1005, and perform the following steps:
acquiring an execution record corresponding to the process metadata, and analyzing the execution record corresponding to the process metadata to obtain a storage path of the execution log corresponding to the process metadata;
acquiring the execution log according to the storage path, and determining a task identifier associated with the process metadata according to the execution log;
determining input data and output data corresponding to the process metadata according to the task identifier, and associating the input data names and the output data names with the data names of the corresponding technical metadata to obtain name association information between the input data and the output data and the technical metadata;
and sending the name association information to a big data platform so that the big data platform stores the name association information in a metadata management system database.
Further, when the metadata acquired by the acquisition task is service metadata, after the step of acquiring the metadata in the connected data system according to the acquisition tool, the processor 1001 may be further configured to invoke the metadata acquisition program stored in the memory 1005, and perform the following steps:
determining first association information of the service metadata and corresponding technical metadata and second association information of the service metadata and corresponding service product information;
and sending the first association information and the second association information to a big data platform so that the big data platform can store the first association information and the second association information to a metadata management system database.
Further, the step of connecting the data system storing the metadata corresponding to the acquisition task according to the execution parameters, and determining an acquisition tool for acquiring the metadata in the data system according to the execution parameters includes:
determining an interface path and a storage type of a data system for storing metadata corresponding to the acquisition task according to the execution parameters;
and connecting the data system according to the interface path, and determining an acquisition tool for acquiring data in the data system according to the storage type.
Based on the above-described structure, various embodiments of a metadata acquisition method are presented.
Referring to fig. 2, fig. 2 is a flowchart illustrating a metadata acquisition method according to a first embodiment of the present invention.
This embodiment provides a metadata acquisition method. Note that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
First, terms used in the embodiments of the present invention will be explained.
(1) Hadoop is a software framework for distributed processing of large amounts of data. Hadoop includes four modules: Common, HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator) and MapReduce. Common is a set of common utilities that support the other modules; HDFS is a distributed file system that provides high-throughput access; YARN is a framework that provides job scheduling and cluster resource management; MapReduce, MR for short, is a parallel data computing framework.
(2) Hive is a data warehouse tool based on Hadoop. It can map a structured data file to a database table, provides an SQL (Structured Query Language) query function, and converts SQL statements into MapReduce tasks for execution. Hive's management of the data warehouse has two aspects: management of metadata and management of data. Metadata: Hive stores metadata in a relational database such as MySQL (a relational database management system); the metadata includes the names of tables, the columns and partitions of tables, the attributes of tables (for example, whether a table is an external table) and the HDFS storage directory where a table's data is located. Data: Hive's data is stored in HDFS, and most queries are computed by MapReduce tasks.
(3) Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the action stream data of a consumer-scale website.
(4) HBase is a distributed, column-oriented, highly scalable key/value big data store that, based on HDFS, provides efficient random read-write access to massive structured data.
(5) Oracle Database, also called Oracle RDBMS or simply Oracle, is a relational database management system from Oracle Corporation.
(6) Teradata Database is a relational database management system from Teradata Corporation. It uses the standard SQL query language and is suited to data warehouse applications that process complex queries.
(7) CMDB (Configuration Management Database) stores and manages the configuration information of devices in the enterprise IT architecture. It is tightly associated with all service support and service delivery processes: it supports the operation of these processes and realizes the value of the configuration information, while relying on the related processes to guarantee the accuracy of its data.
(8) SAS (Statistical Analysis System) is a modular, integrated, large-scale application software system. The system consists of dozens of specialized modules whose functions include data access, data storage and management, application development, graphics processing, data analysis, report preparation, operations research, econometrics and forecasting, and the like. It supports operating on data with the standard SQL language and can produce anything from simple lists to relatively complex statistical reports.
The metadata acquisition method comprises the following steps:
Step S10, after detecting an acquisition instruction for acquiring metadata, detecting whether the current time is within the execution time range of the acquisition task corresponding to the acquisition instruction.
When the scheduler detects an acquisition instruction for acquiring metadata, the scheduler obtains the current time and detects whether it is within the execution time range of the acquisition task corresponding to the acquisition instruction. The acquisition instruction may be triggered by a user as needed, or by a timed task preset in the scheduler. The execution time range may be set according to specific needs and is not particularly limited in this embodiment. The metadata includes technical metadata, business metadata, and process metadata. Technical metadata describes the storage structure of data in a database system, namely the field types, names, annotations, data field lengths and the like of the data stored in the database system. Business metadata describes the business context of the data, i.e., which business product the data belongs to, which business products use it, and so on. Process metadata describes how data is processed from a source-layer table to a data mart table, i.e., which systems call the data, and so on.
Further, when the acquisition instruction is an instruction to acquire technical metadata, after detecting it the scheduler judges whether the structure change frequency of the technical metadata corresponding to the acquisition instruction is greater than a preset frequency (optionally, 1 day). If the structure change frequency is greater than the preset frequency, the scheduler acquires the technical metadata in batch acquisition mode. If the structure change frequency is less than or equal to the preset frequency, the scheduler imports the full set of technical metadata in the data system once to initialize it, and the data system then updates the technical metadata in real time by means of real-time event messages. A change in the field type, name, annotation, data field length or the like of the technical metadata indicates that the technical metadata structure has changed.
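The choice between the two acquisition modes can be sketched as a simple threshold comparison. This is a minimal sketch: the names `AcquisitionMode` and `choose_acquisition_mode` are hypothetical, and expressing both frequencies as "structure changes per day" with a default threshold of 1.0 is an assumed reading of the "1 day" preset, which the text does not define precisely.

```python
from enum import Enum


class AcquisitionMode(Enum):
    BATCH = "batch"
    FULL_IMPORT_THEN_REALTIME = "full_import_then_realtime"


def choose_acquisition_mode(
    changes_per_day: float,
    preset_changes_per_day: float = 1.0,  # assumed unit and default
) -> AcquisitionMode:
    """Choose batch acquisition only when changes exceed the preset frequency."""
    if changes_per_day > preset_changes_per_day:
        return AcquisitionMode.BATCH
    # At or below the threshold: one full import, then real-time event updates.
    return AcquisitionMode.FULL_IMPORT_THEN_REALTIME
```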
The data system updates technical metadata in real time through real-time event messages as follows. The data system first initializes the technical metadata in the metadata management system database by defining an init.sh script that calls an ETL (Extract-Transform-Load) task for batch technical metadata update, i.e., the data system writes the technical metadata it stores into the metadata management system database. The data system then captures technical metadata whose structure has changed through a Hook, encapsulates it into an event message and sends the event message to the message system, which sends it on to the stream analysis system. After receiving the event message, the stream analysis system parses the structurally changed technical metadata out of the event message and writes it into the metadata management system database. Note that the Hook implements Hive's post-execution hook: it analyzes the operation type of each SQL statement, and when the operation type is a DDL operation (Update/Alter Table/Partition, etc.), it triggers an event that captures the structurally changed technical metadata. It will be appreciated that structural changes to technical metadata may arise when the data system executes an SQL statement. In the embodiment of the invention, the message system is Kafka. Acquiring technical metadata in real time makes it convenient for users to query the technical metadata in real time.
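The hook's behavior, classifying a statement by operation type and packaging a change event, can be illustrated with the following simplified sketch. It does not use the Hive hook API and does not publish to Kafka; the function `build_change_event`, the event fields, and the keyword set `DDL_KEYWORDS` are hypothetical, and detecting DDL by the leading keyword is an assumed simplification.

```python
import json
from typing import Optional

# Assumed set of leading keywords treated as structure-changing operations.
DDL_KEYWORDS = ("CREATE", "ALTER", "DROP")


def build_change_event(sql: str, table: str) -> Optional[str]:
    """Return a JSON event message for a DDL statement, or None otherwise."""
    stripped = sql.strip()
    operation = stripped.split(None, 1)[0].upper() if stripped else ""
    if operation not in DDL_KEYWORDS:
        return None  # not a structure change: no event is emitted
    event = {"operation": operation, "table": table, "sql": stripped}
    return json.dumps(event, sort_keys=True)
```

In a deployment along the lines described, the returned message would be handed to the message system and later parsed by the stream analysis system; both of those steps are outside this sketch.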
Step S20, if the current time is within the execution time range, determining a target executor with the minimum resource utilization rate from the executors connected with the scheduler.
If the current time is detected to be within the execution time range, the scheduler determines, from the executors connected with it, the executor with the minimum resource utilization rate and takes that executor as the target executor. It should be noted that associated ETL tasks, that is, associated acquisition tasks, are set for the metadata of different data systems, and each acquisition task is given a corresponding execution time range. When the current time is detected to be within the corresponding execution time range, the scheduler triggers an execution instruction and adds the acquisition task to an execution queue.
Further, if it is detected that the current time is not within the execution time range, the scheduler continues to detect whether the current time is within the execution time range.
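The time-range check and the execution queue can be sketched as follows. This is a minimal illustration only: the task dictionary layout, the function names, and the handling of ranges that cross midnight are assumptions not stated in the source.

```python
from collections import deque
from datetime import datetime, time


def in_execution_range(now: datetime, start: time, end: time) -> bool:
    """Return True if the time of day of `now` lies inside [start, end].

    Assumption: a range whose start is later than its end crosses midnight.
    """
    t = now.time()
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end


def maybe_enqueue(task: dict, now: datetime, queue: deque) -> bool:
    """Add the acquisition task to the execution queue when its window is open."""
    if in_execution_range(now, task["start"], task["end"]):
        queue.append(task["name"])
        return True
    # Outside the window: the scheduler would simply check again later.
    return False
```

A scheduler built this way would call `maybe_enqueue` periodically for every configured acquisition task.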
Further, step S20 includes:
Step a, acquiring the current CPU resource utilization rate of each executor connected with the scheduler if the current time is within the execution time range.
Step b, determining the executor with the minimum CPU resource utilization rate as the target executor.
Specifically, the scheduler determines the target executor as follows: if the current time is detected to be within the execution time range, the scheduler acquires the current CPU (Central Processing Unit) resource utilization rate of each executor connected with it, and determines the executor with the smallest CPU resource utilization rate among all the executors as the target executor.
Further, the scheduler may also obtain the current available memory space (i.e., the space resource utilization rate) of each connected executor, and determine the executor with the largest available memory space (i.e., the smallest space resource utilization rate) among all the executors as the target executor. Further, the target executor can also be determined by combining the available memory space and the CPU resource utilization rate. Specifically, weights can be set for the space resource utilization rate and the CPU resource utilization rate; for each executor, the space resource utilization rate and the CPU resource utilization rate are multiplied by their respective weights and the products are added, and the executor with the smallest resulting sum is determined as the target executor. The weights can be set according to specific needs, and their values are not particularly limited in the embodiment of the invention.
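The selection rule above can be sketched in Python as a weighted minimum. The class name `ExecutorLoad`, the field names, and the default weights are hypothetical; with the default weights the function reduces to choosing the executor with the smallest CPU resource utilization rate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExecutorLoad:
    name: str
    cpu_utilization: float     # fraction of CPU in use, 0.0 to 1.0
    memory_utilization: float  # fraction of memory in use, 0.0 to 1.0


def pick_target_executor(
    executors: List[ExecutorLoad],
    cpu_weight: float = 1.0,
    memory_weight: float = 0.0,
) -> ExecutorLoad:
    """Return the executor with the smallest weighted utilization score."""
    if not executors:
        raise ValueError("no executor is connected to the scheduler")
    return min(
        executors,
        key=lambda e: cpu_weight * e.cpu_utilization
        + memory_weight * e.memory_utilization,
    )
```

Setting `cpu_weight=0.0` and `memory_weight=1.0` gives the memory-only variant, and intermediate weights give the combined variant.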
Step S30, sending the acquisition task to the target executor, so that the target executor can acquire the metadata in the corresponding data system by adopting the corresponding acquisition tool according to the acquisition task.
After determining the target executor, the scheduler extracts the acquisition task in the execution queue and sends the acquisition task to the target executor so that the target executor can acquire metadata in a data system corresponding to the acquisition tool by adopting the corresponding acquisition tool according to the acquisition task.
In this embodiment, metadata sets associated acquisition tasks for different data systems, each acquisition task has its own timing scheduling policy, and the scheduler triggers task execution according to the timing scheduling policy of the acquisition task. Once the triggering condition of the scheduling policy is met, the scheduler adds the task to the execution queue, and takes the task out of the execution queue according to the load balancing policy of the resource to send to the target executor.
It should be noted that, in the embodiment of the present invention, the executor provides different metadata acquisition tools (crawlers) for different data systems. Each acquisition tool is designed according to the metadata structure of its data system, so that it is adapted to that data system, can connect to it, and can read the metadata in it. In an embodiment of the present invention, the acquisition tools support relational databases, non-relational databases, data warehouse systems (Hive), messaging systems (Kafka), and business BI software (SAS), where the relational databases include, but are not limited to, MySQL, Oracle, and Teradata Database.
In this embodiment, after an acquisition instruction for acquiring metadata is detected and the current time is detected to be within the execution time range of the acquisition task corresponding to the instruction, the target executor is determined among the executors connected with the scheduler and the acquisition task is sent to it, so that the target executor acquires the metadata in the corresponding data system by using the corresponding acquisition tool. Because acquisition tools are adapted to different data types, metadata of different data types can be acquired, including metadata stored in data components other than relational databases.
It should be noted that, in addition to supporting only a single type of component, the existing technical solutions have the following defects in metadata acquisition, which the invention can solve:
Metadata type support is limited to a single type; the invention supports the acquisition of multiple kinds of metadata, such as technical metadata, business metadata, and process metadata.
The existing mode of updating and acquiring metadata is single; in the invention, for systems that change infrequently, the metadata can be updated incrementally once a day, and for systems that change frequently, real-time updating through the distributed message system is combined to satisfy users' real-time metadata queries.
Further, a second embodiment of the metadata acquisition method of the present invention is presented.
The second embodiment of the metadata acquisition method differs from the first embodiment in that the metadata acquisition method is applied to an executor. Referring to fig. 3, the metadata acquisition method further includes:
Step S40, after receiving the acquisition task sent by the scheduler, acquiring the execution parameters corresponding to the acquisition task.
It should be noted that the executor in this embodiment may be the target executor of the first embodiment or any executor connected with the scheduler. After receiving the acquisition task sent by the scheduler, the executor acquires the execution parameters corresponding to the acquisition task. The execution parameters include, but are not limited to, the address, name, and interface path of the data system and the data type of the metadata stored by the data system. The interface path is the location of the interface through which the executor connects to the data system.
Step S50, connecting a data system storing metadata corresponding to the acquisition task according to the execution parameters, and determining an acquisition tool for acquiring the metadata in the data system according to the execution parameters.
Step S60, acquiring the metadata in the connected data system according to the acquisition tool.
After the executor acquires the execution parameters corresponding to the acquisition tasks, the executor is connected with a data system for storing metadata corresponding to the acquisition tasks according to the execution parameters, and determines an acquisition tool for acquiring the metadata in the data system according to the execution parameters. When the acquisition tool is determined, the executor acquires metadata in the connected data system according to the acquisition tool. Specifically, the executor acquires corresponding metadata in the data system according to the data grabbing stage (Extract) of the acquisition task.
Further, step S50 includes:
Step c, determining, according to the execution parameters, an interface path and a storage type of the data system that stores the metadata corresponding to the acquisition task.
Step d, connecting the data system according to the interface path, and determining an acquisition tool for acquiring data in the data system according to the storage type.
Specifically, after obtaining the execution parameters corresponding to the acquisition task, the executor determines, according to the execution parameters, the interface path of the data system that stores the metadata corresponding to the acquisition task and the storage type of the data stored in that data system, connects to the data system according to the interface path, and determines the acquisition tool for acquiring the data in the data system according to the storage type. It should be noted that the interface paths of different data systems are different.
Further, if the executor cannot determine the data system to be connected according to the interface path, the executor may connect to the corresponding data system according to the interface path, the address of the data system and the name of the data system in the execution parameters.
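A minimal Python sketch of steps c and d is given below. The `ExecutionParams` fields mirror the execution parameters listed in the text, but the field names, the `CRAWLERS` registry, and the two placeholder crawler functions are hypothetical; real acquisition tools would open an actual connection to the data system.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ExecutionParams:
    address: str         # address of the data system
    name: str            # name of the data system
    interface_path: str  # where the executor connects to the data system
    storage_type: str    # e.g. "mysql", "hive", "kafka"


def crawl_mysql(params: ExecutionParams) -> str:
    # Placeholder: a real crawler would read the relational catalog.
    return f"mysql crawler -> {params.interface_path}"


def crawl_hive(params: ExecutionParams) -> str:
    # Placeholder: a real crawler would read the Hive metastore.
    return f"hive crawler -> {params.interface_path}"


# Hypothetical registry mapping a storage type to its acquisition tool.
CRAWLERS: Dict[str, Callable[[ExecutionParams], str]] = {
    "mysql": crawl_mysql,
    "hive": crawl_hive,
}


def select_crawler(params: ExecutionParams) -> Callable[[ExecutionParams], str]:
    """Pick the acquisition tool that matches the storage type."""
    try:
        return CRAWLERS[params.storage_type.lower()]
    except KeyError:
        raise ValueError(f"no acquisition tool for {params.storage_type!r}")
```

Supporting a further data system then amounts to registering one more entry in the registry.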
According to the embodiment, after the acquisition task sent by the scheduler is received, the execution parameters corresponding to the task are acquired, the data system for storing metadata corresponding to the acquisition task is connected according to the execution parameters, the acquisition tool for acquiring the metadata in the data system is determined according to the execution parameters, the metadata is acquired in the connected data system according to the acquisition tool, and the metadata of different data types are acquired through the adaptation of the acquisition tools of different data types, so that the metadata stored by other data components except the relational database are acquired.
Further, a third embodiment of the metadata acquisition method of the present invention is presented.
The third embodiment of the metadata acquisition method is different from the first or second embodiment of the metadata acquisition method in that, referring to fig. 4, the metadata acquisition method further includes:
Step S70, storing the metadata into a big data platform so that the big data platform can convert the data format of the metadata into the data format corresponding to a metadata storage model, obtain the metadata after format conversion, and return a success message indicating that the format conversion succeeded.
After acquiring the metadata, the executor, in the data conversion stage (Transform) of the acquisition task, connects to the big data platform through a bridge plug-in and sends the metadata to the big data platform so as to store it there, specifically in the HDFS of the big data platform. After the metadata is stored in the big data platform, the big data platform converts the data format of the metadata into the data format corresponding to the metadata storage model to obtain the metadata after format conversion. Once the format conversion succeeds, the big data platform generates a success message and sends it to the executor. The metadata storage model is preset, and the data format corresponding to the metadata storage model is also preset; for example, the data format corresponding to the metadata storage model can be set to MySQL or HBase, etc.
Step S80, after receiving the success message, triggering a storage instruction for storing metadata into a metadata management database, and sending the storage instruction to the big data platform so that the big data platform stores the metadata after format conversion into the metadata management database according to the storage instruction.
When the executor receives the success message sent by the big data platform, the executor automatically triggers a storage instruction for storing the metadata into the metadata management database, and sends the storage instruction to the big data platform. And after the big data platform receives the storage instruction, the big data platform calls the import (Load) logic of the acquisition task according to the storage instruction to store the metadata after format conversion into a metadata management database.
Further, after the big data platform successfully stores the metadata after format conversion into the metadata management database, the big data platform automatically generates prompt information of successful metadata storage and sends the prompt information to the executor.
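The exchange between the executor and the big data platform in steps S70 and S80 can be sketched as follows. The class `FakeBigDataPlatform` is an in-memory stand-in, and the message strings and the "unified format" (lower-casing the type field) are assumptions chosen only to make the handshake visible.

```python
from typing import Dict, List, Optional


class FakeBigDataPlatform:
    """In-memory stand-in for the big data platform (illustration only)."""

    def __init__(self) -> None:
        self.staged: List[Dict[str, str]] = []
        self.management_db: List[Dict[str, str]] = []

    def store(self, metadata: List[Dict[str, str]]) -> str:
        # Transform: convert to the (hypothetical) format of the storage model.
        self.staged = [
            {"name": m["name"], "type": m["type"].lower()} for m in metadata
        ]
        return "format_conversion_succeeded"

    def handle_store_instruction(self) -> str:
        # Load: write the converted metadata into the management database.
        self.management_db.extend(self.staged)
        self.staged = []
        return "metadata_stored"


def run_transform_and_load(
    platform: FakeBigDataPlatform, metadata: List[Dict[str, str]]
) -> Optional[str]:
    """Executor side: store, wait for success, then trigger the storage instruction."""
    if platform.store(metadata) != "format_conversion_succeeded":
        return None
    return platform.handle_store_instruction()
```

The value returned by `run_transform_and_load` plays the role of the prompt information that the platform sends back to the executor.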
Further, the metadata acquisition method further includes:
Step e, detecting whether prompt information of successful metadata storage sent by the big data platform is received.
Step f, if the prompt information is received, sending the prompt information to the scheduler so that the scheduler can record the task state corresponding to the acquisition task according to the prompt information.
After sending the storage instruction to the big data platform, the executor detects whether the prompt information of successful metadata storage sent by the big data platform is received. If the executor receives the prompt information, it records the task state corresponding to the acquisition task according to the prompt information and sends the prompt information to the scheduler. Specifically, the executor changes the state of the acquisition task to the executed state according to the prompt information. It should be noted that, if the acquisition task has only an unexecuted state and an executed state, the executor changes the acquisition task from the unexecuted state to the executed state; if the acquisition task has an unexecuted state, an executing state, and an executed state, the executor changes the acquisition task from the executing state to the executed state. Further, the executor may record the execution duration of the acquisition task according to the prompt information: the executor records a first receiving time when the acquisition task is received and a second receiving time when the prompt information is received, and calculates the time difference between the two, which is the execution duration of the acquisition task.
After receiving the prompt information, the scheduler records the task state corresponding to the acquisition task according to the prompt information. It should be noted that the process by which the scheduler records the task state according to the prompt information is consistent with the process by which the executor does so, and is not described in detail herein.
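The state change and the execution duration calculation can be sketched as below. The three-state enumeration follows the text; the names `TaskState`, `TaskRecord`, and `on_prompt_received` are hypothetical, and the sketch assumes the three-state variant in which a received task is already executing.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class TaskState(Enum):
    NOT_EXECUTED = "not_executed"
    EXECUTING = "executing"
    EXECUTED = "executed"


@dataclass
class TaskRecord:
    received_at: datetime  # first receiving time: when the task was received
    state: TaskState = TaskState.EXECUTING
    duration_seconds: Optional[float] = None


def on_prompt_received(record: TaskRecord, prompt_at: datetime) -> TaskRecord:
    """Mark the task executed and compute its execution duration.

    `prompt_at` is the second receiving time: when the prompt information
    of successful metadata storage arrived.
    """
    record.state = TaskState.EXECUTED
    record.duration_seconds = (prompt_at - record.received_at).total_seconds()
    return record
```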
According to the embodiment of the invention, the obtained metadata is stored in the big data platform, so that the big data platform can convert the data format of the metadata into the data format corresponding to the metadata storage model, after receiving the success message returned by the big data platform, the storage instruction for storing the metadata into the metadata management database is triggered, and the storage instruction is sent to the big data platform, so that the big data platform can store the metadata with the converted format into the metadata management database according to the storage instruction, and the metadata with different data types can be conveniently converted into the unified data format and then stored into the metadata management database, so that the metadata management database can manage the metadata conveniently.
Further, a fourth embodiment of the metadata acquisition method of the present invention is presented.
The fourth embodiment of the metadata acquisition method is different from the first, second, or third embodiments of the metadata acquisition method in that, when the metadata acquired by the acquisition task is process metadata, the metadata acquisition method further includes:
Step g, acquiring an execution record corresponding to the process metadata, and analyzing the execution record to obtain a storage path of the execution log corresponding to the process metadata.
When the metadata acquired by the acquisition task is process metadata, after the executor acquires the process metadata in batches through a metadata acquisition tool Crawler, the executor acquires an execution record corresponding to the process metadata through a log interface of an execution log acquisition system of the scheduler, or acquires the execution record corresponding to the process metadata in a database of the scheduler. After the execution record corresponding to the process metadata is obtained, the executor analyzes the execution record to obtain a storage path of the execution log corresponding to the process metadata. If the execution record is stored in the database of the scheduler, the executor analyzes the execution record to obtain a log identifier of the execution log corresponding to the process metadata in the database. It will be appreciated that in some particular field in the execution record, the storage path of the execution log or the log identification is stored or written in a particular form in the execution record.
It should be noted that the metadata acquisition tool Crawler in the embodiment of the present invention can be adapted to big data batch scheduling systems (Azkaban, Airflow, Oozie) as well as to other self-developed or commercial batch scheduling systems.
Step h, acquiring the execution log according to the storage path, and determining a task identifier associated with the process metadata according to the execution log.
After obtaining the storage path of the execution log, the executor obtains the execution log from the scheduler according to the storage path and parses it to obtain the task identifier associated with the acquisition task. The task identifier is a Hadoop Job ID (MapReduce Job ID), that is, the identifier assigned to the acquisition task when it is executed in Hadoop.
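Steps g and h can be sketched in Python as follows. The execution record is assumed to be a dictionary with a hypothetical `log_path` or `log_id` field, and the regular expression assumes the usual `job_<digits>_<digits>` shape of MapReduce job IDs; the actual record layout and log format depend on the batch scheduling system and are not specified in the source.

```python
import re
from typing import Dict, List, Tuple

# Assumption: MapReduce job IDs appear in the log as job_<digits>_<digits>.
JOB_ID_PATTERN = re.compile(r"\bjob_\d+_\d+\b")


def locate_execution_log(record: Dict[str, str]) -> Tuple[str, str]:
    """Return ("path", value) or ("id", value) from an execution record."""
    if "log_path" in record:
        return ("path", record["log_path"])
    if "log_id" in record:
        return ("id", record["log_id"])
    raise KeyError("execution record has neither a log path nor a log identifier")


def extract_job_ids(log_text: str) -> List[str]:
    """Return the distinct Hadoop job IDs in the log, in order of first appearance."""
    seen: List[str] = []
    for job_id in JOB_ID_PATTERN.findall(log_text):
        if job_id not in seen:
            seen.append(job_id)
    return seen
```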
Step i, determining input data and output data corresponding to the process metadata according to the task identifier, and associating the input data names and the output data names with the data names of the corresponding technical metadata to obtain name association information between the input data and the output data and the technical metadata.
Step j, transmitting the name association information to a big data platform so that the big data platform can store the name association information into a metadata management system database.
After determining the task identifier, the executor determines a corresponding Hadoop task according to the task identifier, analyzes the Hadoop task to obtain input data and output data corresponding to the Hadoop task, and obtains the input data and the output data corresponding to the process metadata. Each Hadoop task corresponds to a unique task identifier. And after the input data and the output data corresponding to the process metadata are obtained, the executor associates the names of the input data and the output data with the data names of the corresponding technical metadata to obtain name association information between the input data and the output data and the technical metadata, and sends the name association information to the big data platform. And after the big data platform receives the name association information, the big data platform writes the name association information into a metadata management system database. It should be noted that, each of the input data and the output data has corresponding technical metadata, business metadata, and process metadata.
Further, the executor may send the name association information to the big data platform together with the process metadata.
In this embodiment, by associating the input data names and output data names of the process metadata with the data names of the corresponding technical metadata, an association between the acquisition task and the Hadoop task is established, and in turn a relationship between the Hadoop task and the technical metadata is established, so that a user can quickly determine, from the name association information in the metadata management database, which tasks use a piece of data and which tasks produce it.
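The name association described in step i can be sketched as a simple match between the data names of a Hadoop task and the data names known from the technical metadata. The function name, the argument layout, and the shape of each association entry are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def build_name_associations(
    job_id: str,
    input_names: Iterable[str],
    output_names: Iterable[str],
    technical_names: Set[str],
) -> List[Dict[str, str]]:
    """Link each input/output data name to the matching technical metadata name.

    Names without matching technical metadata are skipped in this sketch.
    """
    associations: List[Dict[str, str]] = []
    for direction, names in (("input", input_names), ("output", output_names)):
        for name in names:
            if name in technical_names:
                associations.append(
                    {"job_id": job_id, "direction": direction, "data_name": name}
                )
    return associations
```

Entries with direction "input" answer which tasks use a piece of data, and entries with direction "output" answer which tasks produce it.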
Further, a fifth embodiment of the metadata acquisition method of the present invention is presented.
The fifth embodiment of the metadata acquisition method differs from the first, second, third, or fourth embodiment in that, when the metadata acquired by the acquisition task is service metadata, the metadata acquisition method further includes:
Step k, determining first association information of the service metadata and the corresponding technical metadata, and second association information of the service metadata and the corresponding service product information.
When the acquisition instruction is to acquire the service metadata, that is, when the metadata acquired by the acquisition task is the service metadata, after the executor acquires the service metadata, the executor determines first association information of the acquired service metadata and the corresponding technical metadata and second association information of the service metadata and the corresponding service product information. The service metadata information mainly comprises service attributes of a data table, wherein the service attributes comprise: the data system to which the data table belongs, the business product to which the data table belongs and the association information between the data table and the data system. It should be noted that, the data table corresponds to a data system, where the data system has multiple levels of sub-data systems, and in the sub-data system at the bottom layer, a data table corresponding to service metadata is stored, where the field type, the table field name, the comment (including the field comment and the table comment), the field length, and the like of each field in the data table are technical metadata. Therefore, the first association information is an association relationship between the data system and the data table, that is, the data system where the data table is located can be determined through the first association information, specifically, the first association information can be a name of the data system (including a name of each subsystem) and an associated name of the data table, and the like; the association information between the data table and the data systems is the association relation between the data table and each data system, and the storage position of the data table corresponding to the service metadata can be determined through the association information. 
The second association information is an association relationship between the service product information and the data system, that is, the service product corresponding to the data system can be determined through the second association information, specifically, the service product information can include a service product name, a product code and the like, and the second association information can be the service product name and an associated data system name.
Specifically, when acquiring business metadata, the data systems of the executor that acquire the business metadata are CMDB and product catalog. In the process of acquiring service metadata, an executor acquires the service metadata in a CMDB through a metadata acquisition tool, and then associates the service metadata and the technical metadata belonging to the same data to obtain first association information of the service metadata and the corresponding technical metadata. The executor reads the service product information from the product catalog through the metadata acquisition tool, and the service metadata is associated with the corresponding service product information to obtain second association information.
Step l, sending the first association information and the second association information to a big data platform, so that the big data platform stores the first association information and the second association information into a metadata management system database.
After obtaining the first association information and the second association information, the executor sends them to the big data platform. After receiving the first association information and the second association information sent by the executor, the big data platform stores them into the metadata management system database, that is, writes them into the metadata management system database.
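The construction of the two kinds of association information can be sketched as below. The row layout of the service metadata (as it might be read from the CMDB), the product catalog mapping, and all field names are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_associations(
    service_rows: List[Dict[str, str]],
    technical_tables: Set[str],
    product_catalog: Dict[str, str],
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Return (first association information, second association information).

    first:  data system name <-> data table name, for tables that also have
            technical metadata.
    second: service product name <-> data system name, for rows whose product
            code appears in the product catalog.
    """
    first = [
        {"data_system": row["data_system"], "table": row["table"]}
        for row in service_rows
        if row["table"] in technical_tables
    ]
    second = [
        {
            "product_name": product_catalog[row["product_code"]],
            "data_system": row["data_system"],
        }
        for row in service_rows
        if row["product_code"] in product_catalog
    ]
    return first, second
```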
In the embodiment, after the service metadata and the technical metadata are acquired, the association between the service metadata and the technical metadata is established, the association between the service product and the service system is established through the association between the service product and the service metadata, and the corresponding association information is sent to the metadata management system database, so that the association relationship between the service metadata and the technical metadata is quickly found in the metadata management system database, and the data system corresponding to the service product is quickly determined.
In addition, an embodiment of the present invention further provides a metadata acquisition apparatus, where the metadata acquisition apparatus is applied to a scheduler, and the metadata acquisition apparatus includes:
the first detection module is used for detecting whether the current time is in the execution time range of the acquisition task corresponding to the acquisition instruction after the acquisition instruction for acquiring the metadata is detected;
the first determining module is used for determining a target executor with the minimum resource utilization rate among the executors connected with the scheduler if the current time is in the execution time range;
And the first sending module is used for sending the acquisition task to the target executor so that the target executor can acquire the metadata in a corresponding data system by adopting a corresponding acquisition tool according to the acquisition task.
Further, the first determining module includes:
the first acquisition unit is used for acquiring the current CPU resource utilization rate of each executor connected with the scheduler if the current time is in the execution time range;
and the first determining unit is used for determining the executor with the minimum CPU resource utilization rate as the target executor.
In addition, an embodiment of the present invention further provides a metadata acquisition apparatus, where the metadata acquisition apparatus is applied to an executor, and the metadata acquisition apparatus includes:
the acquisition module is used for acquiring execution parameters corresponding to the acquisition task after receiving the acquisition task sent by the scheduler;
the connection module is used for connecting a data system for storing metadata corresponding to the acquisition task according to the execution parameters;
a second determining module, configured to determine an acquiring tool for acquiring metadata in the data system according to the execution parameter;
The acquisition module is further used for acquiring the metadata in the connected data system according to the acquisition tool.
Further, the metadata acquisition apparatus further includes:
the storage module is used for storing the metadata into a big data platform so that the big data platform can convert the data format of the metadata into a data format corresponding to a metadata storage model, obtain the metadata after format conversion and return a success message of successful format conversion;
the triggering module is used for triggering a storage instruction for storing metadata into a metadata management database after receiving the success message;
and the second sending module is used for sending the storage instruction to the big data platform so that the big data platform can store the metadata after format conversion to the metadata management database according to the storage instruction.
Further, the metadata acquisition apparatus further includes:
the second detection module is used for detecting whether prompt information of successful metadata storage sent by the big data platform is received or not;
and the second sending module is further used for sending the prompt information to the scheduler if the prompt information is received, so that the scheduler records the task state corresponding to the acquisition task according to the prompt information.
Further, the metadata acquisition apparatus further includes:
the acquisition module is also used for acquiring an execution record corresponding to the process metadata;
the analysis module is used for analyzing the execution record corresponding to the process metadata to obtain a storage path of the execution log corresponding to the process metadata;
the acquisition module is also used for acquiring the execution log according to the storage path;
the second determining module is further configured to determine a task identifier associated with the process metadata according to the execution log; determining input data and output data corresponding to the process metadata according to the task identifier;
the association module is used for associating the input data name and the output data name with the data name of the corresponding technical metadata so as to obtain name association information between the input data and the output data and the technical metadata;
and the third sending module is used for sending the name association information to a big data platform so that the big data platform can store the association information into a metadata management system database.
Further, the metadata acquisition apparatus further includes:
the second determining module is further configured to determine first association information of the service metadata and corresponding technical metadata, and second association information of the service metadata and corresponding service product information;
And the fourth sending module is used for sending the first association information and the second association information to a big data platform so that the big data platform can store the first association information and the second association information to a metadata management system database.
Further, the acquisition module is also used for determining, according to the execution parameters, an interface path and a storage type of the data system that stores the metadata corresponding to the acquisition task;
the connection module is also used for connecting to the data system according to the interface path;
the second determining module is further configured to determine, according to the storage type, an acquisition tool for acquiring data in the data system.
The specific implementation manner of the metadata acquisition apparatus of the present invention is substantially the same as that of each embodiment of the metadata acquisition method described above, and will not be described herein again.
In addition, the embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a metadata acquisition program, and the metadata acquisition program realizes the steps of the metadata acquisition method when being executed by a processor.
The specific implementation manner of the computer readable storage medium of the present invention is basically the same as the above embodiments of the metadata acquisition method, and will not be described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The serial numbers of the foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the scope of the invention. Any equivalent structure or equivalent process based on the content disclosed herein, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the invention.

Claims (16)

1. A metadata acquisition method, wherein the metadata acquisition method is applied to a scheduler, the metadata acquisition method comprising the steps of:
after detecting an acquisition instruction for acquiring metadata, detecting whether the current time is within the execution time range of an acquisition task corresponding to the acquisition instruction;
if the current time is within the execution time range, determining, among the executors connected to the scheduler, a target executor with the minimum resource utilization rate;
sending the acquisition task to the target executor, so that, when the metadata to be acquired by the acquisition task is process metadata, the target executor: acquires the metadata in the corresponding data system using the corresponding acquisition tool according to the acquisition task; acquires an execution record corresponding to the process metadata and analyzes the execution record to obtain a storage path of an execution log corresponding to the process metadata; acquires the execution log according to the storage path and determines a task identifier associated with the process metadata according to the execution log; determines input data and output data corresponding to the process metadata according to the task identifier, and associates the names of the input data and the output data with the data names of the corresponding technical metadata to obtain name association information between the input and output data and the technical metadata; and sends the name association information to a big data platform so that the big data platform can store the name association information into a metadata management system database.
2. The metadata acquisition method according to claim 1, wherein the step of determining, if the current time is within the execution time range, a target executor with the minimum resource utilization rate among the executors connected to the scheduler comprises:
if the current time is within the execution time range, acquiring the current CPU resource utilization rate of each executor connected to the scheduler;
and determining the executor with the minimum CPU resource utilization rate as the target executor.
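For illustration only, the scheduler-side selection described in claims 1 and 2 could be sketched as follows. The `Executor` class, its `cpu_utilization` field, and the helper names are hypothetical; the claims do not specify how utilization is reported, how the execution time range is encoded (a daily window is assumed here), or how ties are broken.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional, Sequence


@dataclass
class Executor:
    name: str
    cpu_utilization: float  # assumed: fraction in [0, 1] reported by the executor


def in_execution_window(now: datetime, start: time, end: time) -> bool:
    """Return True if `now` falls inside the task's execution time range.

    Assumption: the range is a daily window; if `end` is earlier than
    `start`, the window wraps past midnight.
    """
    t = now.time()
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end


def pick_target_executor(
    now: datetime, start: time, end: time, executors: Sequence[Executor]
) -> Optional[Executor]:
    """Pick the connected executor with the lowest CPU utilization.

    Returns None when the current time is outside the window or when
    no executor is connected.
    """
    if not executors or not in_execution_window(now, start, end):
        return None
    return min(executors, key=lambda e: e.cpu_utilization)
```

In this sketch the scheduler would send the acquisition task to the returned executor and would skip dispatch when `None` is returned.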
3. A metadata acquisition method, wherein the metadata acquisition method is applied to an executor, the metadata acquisition method comprising the steps of:
after receiving an acquisition task sent by a scheduler, acquiring an execution parameter corresponding to the acquisition task;
connecting a data system storing metadata corresponding to the acquisition task according to the execution parameters, and determining an acquisition tool for acquiring the metadata in the data system according to the execution parameters;
when the metadata acquired by the acquisition task is process metadata, acquiring the metadata in the connected data system according to the acquisition tool, acquiring an execution record corresponding to the process metadata, and analyzing the execution record to obtain a storage path of an execution log corresponding to the process metadata;
acquiring the execution log according to the storage path, and determining a task identifier associated with the process metadata according to the execution log;
determining input data and output data corresponding to the process metadata according to the task identifier, and associating the names of the input data and the output data with the data names of the corresponding technical metadata to obtain name association information between the input and output data and the technical metadata;
and sending the name association information to a big data platform so that the big data platform can store the name association information into a metadata management system database.
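As a minimal sketch of the process-metadata steps in claim 3, the chain from execution record to name association information might look like the following. Everything concrete here is an assumption made for illustration: the execution record is taken to be JSON with a `log_path` field, the log is taken to contain a `task_id=...` token, and the task-to-data mapping and the set of technical metadata names are passed in as plain Python objects. The claim itself does not fix any of these formats.

```python
import json
import re
from typing import Callable, Dict, List, Sequence, Set, Tuple

TASK_ID_PATTERN = re.compile(r"task_id=(\w+)")  # assumed log format


def log_path_from_record(execution_record: str) -> str:
    """Analyze the execution record (assumed JSON) to get the log storage path."""
    return json.loads(execution_record)["log_path"]


def task_id_from_log(log_text: str) -> str:
    """Determine the task identifier from the execution log text."""
    match = TASK_ID_PATTERN.search(log_text)
    if match is None:
        raise ValueError("no task identifier found in execution log")
    return match.group(1)


def build_name_associations(
    execution_record: str,
    read_log: Callable[[str], str],  # hypothetical loader: storage path -> log text
    task_io: Dict[str, Tuple[Sequence[str], Sequence[str]]],  # task id -> (inputs, outputs)
    technical_names: Set[str],  # data names known from technical metadata
) -> List[Dict[str, str]]:
    """Link a task's input and output data names to matching technical metadata names."""
    log_text = read_log(log_path_from_record(execution_record))
    task_id = task_id_from_log(log_text)
    inputs, outputs = task_io[task_id]
    links: List[Dict[str, str]] = []
    for role, names in (("input", inputs), ("output", outputs)):
        for name in names:
            # Only names that also exist as technical metadata are associated.
            if name in technical_names:
                links.append({"task_id": task_id, "role": role, "data_name": name})
    return links
```

The returned list stands in for the name association information that would then be sent to the big data platform.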
4. The metadata acquisition method according to claim 3, wherein, after the step of acquiring the metadata in the connected data system according to the acquisition tool, the method further comprises:
storing the metadata into a big data platform so that the big data platform can convert the data format of the metadata into a data format corresponding to a metadata storage model, obtaining the metadata after format conversion, and returning a success message of successful format conversion;
and triggering a storage instruction for storing metadata into a metadata management database after receiving the success message, and sending the storage instruction to the big data platform so that the big data platform stores the metadata after format conversion into the metadata management database according to the storage instruction.
5. The metadata acquisition method according to claim 4, wherein, after the step of triggering a storage instruction for storing metadata into a metadata management database and sending the storage instruction to the big data platform so that the big data platform stores the format-converted metadata into the metadata management database according to the storage instruction, the method further comprises:
detecting whether prompt information indicating successful metadata storage, sent by the big data platform, has been received;
and if the prompt information is received, sending the prompt information to the scheduler, so that the scheduler records the task state corresponding to the acquisition task according to the prompt information.
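The two-phase storage exchange of claims 4 and 5 can be sketched roughly as below. The `InMemoryPlatform` and `Scheduler` classes are hypothetical stand-ins, and the "format conversion" is reduced to lower-casing keys purely as a placeholder; the claims do not describe the actual storage model or message formats.

```python
from typing import Any, Dict, List, Optional


class InMemoryPlatform:
    """Hypothetical stand-in for the big data platform."""

    def __init__(self) -> None:
        self.staged: Optional[List[Dict[str, Any]]] = None
        self.database: List[Dict[str, Any]] = []

    def stage(self, metadata: List[Dict[str, Any]]) -> bool:
        # Placeholder conversion to the "storage model" format: lower-case keys.
        self.staged = [{k.lower(): v for k, v in row.items()} for row in metadata]
        return True  # plays the role of the format-conversion success message

    def store(self) -> bool:
        # Executes the storage instruction on the converted metadata.
        if self.staged is None:
            return False
        self.database.extend(self.staged)
        self.staged = None
        return True  # plays the role of the storage-success prompt


class Scheduler:
    """Hypothetical scheduler that only records task states."""

    def __init__(self) -> None:
        self.task_states: Dict[str, str] = {}

    def record_state(self, task_id: str, state: str) -> None:
        self.task_states[task_id] = state


def persist_metadata(
    task_id: str,
    metadata: List[Dict[str, Any]],
    platform: InMemoryPlatform,
    scheduler: Scheduler,
) -> bool:
    """Stage metadata, issue the storage instruction, then report to the scheduler."""
    if not platform.stage(metadata):  # wait for conversion success
        return False
    if not platform.store():  # storage instruction sent only after success
        return False
    scheduler.record_state(task_id, "SUCCESS")  # forward the prompt information
    return True
```

The point of the sketch is the ordering: the storage instruction is issued only after the conversion success message, and the scheduler is notified only after the storage prompt is received.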
6. The metadata acquisition method according to claim 3, wherein, when the metadata acquired by the acquisition task is service metadata, the method further comprises, after the step of acquiring the metadata in the connected data system according to the acquisition tool:
determining first association information between the service metadata and the corresponding technical metadata, and second association information between the service metadata and the corresponding service product information;
and sending the first association information and the second association information to a big data platform so that the big data platform can store the first association information and the second association information to a metadata management system database.
7. The metadata acquisition method according to any one of claims 3 to 6, wherein the step of connecting a data system storing metadata corresponding to the acquisition task according to the execution parameters and determining an acquisition tool for acquiring metadata in the data system according to the execution parameters includes:
determining an interface path and a storage type of a data system for storing metadata corresponding to the acquisition task according to the execution parameters;
and connecting the data system according to the interface path, and determining an acquisition tool for acquiring data in the data system according to the storage type.
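A small sketch of the lookup described in claim 7 follows. The parameter keys `interface_path` and `storage_type`, the storage type labels, and the tool names in the registry are all hypothetical; the claim states only that the interface path is used to connect and the storage type is used to choose the acquisition tool.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical registry: storage type -> name of the acquisition tool to use.
TOOLS_BY_STORAGE_TYPE: Dict[str, str] = {
    "relational": "sql_catalog_reader",
    "file": "file_scanner",
}


def resolve_connection(execution_params: Mapping[str, str]) -> Tuple[str, str]:
    """Return (interface_path, acquisition_tool) for an acquisition task.

    Raises ValueError when no tool is registered for the storage type.
    """
    interface_path = execution_params["interface_path"]
    storage_type = execution_params["storage_type"]
    try:
        tool = TOOLS_BY_STORAGE_TYPE[storage_type]
    except KeyError:
        raise ValueError(
            f"no acquisition tool registered for storage type {storage_type!r}"
        ) from None
    return interface_path, tool
```

Keeping the mapping in a registry means that supporting a new kind of data system only requires adding an entry, which matches the claim's separation of "where to connect" from "which tool to use".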
8. A metadata acquisition apparatus, the metadata acquisition apparatus being applied to a scheduler, the metadata acquisition apparatus comprising:
the first detection module is used for detecting whether the current time is in the execution time range of the acquisition task corresponding to the acquisition instruction after the acquisition instruction for acquiring the metadata is detected;
the first determining module is used for determining, among the executors connected to the scheduler, a target executor with the minimum resource utilization rate if the current time is within the execution time range;
the first sending module is used for sending the acquisition task to the target executor, so that, when the metadata to be acquired by the acquisition task is process metadata, the target executor: acquires the metadata in the corresponding data system using the corresponding acquisition tool according to the acquisition task; acquires an execution record corresponding to the process metadata and analyzes the execution record to obtain a storage path of an execution log corresponding to the process metadata; acquires the execution log according to the storage path and determines a task identifier associated with the process metadata according to the execution log; determines input data and output data corresponding to the process metadata according to the task identifier, and associates the names of the input data and the output data with the data names of the corresponding technical metadata to obtain name association information between the input and output data and the technical metadata; and sends the name association information to a big data platform so that the big data platform can store the name association information into a metadata management system database.
9. A metadata acquisition apparatus, the metadata acquisition apparatus being applied to an executor, the metadata acquisition apparatus comprising:
the acquisition module is used for acquiring execution parameters corresponding to the acquisition task after receiving the acquisition task sent by the scheduler;
the connection module is used for connecting a data system for storing metadata corresponding to the acquisition task according to the execution parameters;
a second determining module, configured to determine an acquiring tool for acquiring metadata in the data system according to the execution parameter;
when the metadata acquired by the acquisition task is process metadata, the acquisition module is further used for acquiring the metadata in the connected data system according to the acquisition tool, and acquiring an execution record corresponding to the process metadata;
the analysis module is used for analyzing the execution record corresponding to the process metadata to obtain a storage path of the execution log corresponding to the process metadata;
the acquisition module is also used for acquiring the execution log according to the storage path;
the second determining module is further configured to determine a task identifier associated with the process metadata according to the execution log, and to determine input data and output data corresponding to the process metadata according to the task identifier;
the association module is used for associating the names of the input data and the output data with the data names of the corresponding technical metadata, so as to obtain name association information between the input and output data and the technical metadata;
and the third sending module is used for sending the name association information to a big data platform so that the big data platform can store the name association information into a metadata management system database.
10. The metadata acquisition apparatus according to claim 9, wherein the metadata acquisition apparatus further comprises:
the storage module is used for storing the metadata into a big data platform so that the big data platform can convert the data format of the metadata into a data format corresponding to a metadata storage model, obtain the metadata after format conversion and return a success message of successful format conversion;
the triggering module is used for triggering a storage instruction for storing metadata into a metadata management database after receiving the success message;
and the second sending module is used for sending the storage instruction to the big data platform so that the big data platform can store the metadata after format conversion to the metadata management database according to the storage instruction.
11. The metadata acquisition apparatus according to claim 10, wherein the metadata acquisition apparatus further comprises:
the second detection module is used for detecting whether prompt information of successful metadata storage sent by the big data platform is received or not;
and the second sending module is further used for sending the prompt information to the scheduler if the prompt information is received, so that the scheduler records the task state corresponding to the acquisition task according to the prompt information.
12. The metadata acquisition apparatus according to claim 9, wherein the metadata acquisition apparatus further comprises:
the second determining module is further configured to determine first association information of service metadata and corresponding technical metadata, and second association information of the service metadata and corresponding service product information;
and the fourth sending module is used for sending the first association information and the second association information to a big data platform so that the big data platform can store the first association information and the second association information to a metadata management system database.
13. A metadata acquisition device comprising a memory, a processor and a metadata acquisition program stored on the memory and executable on the processor, wherein the metadata acquisition program, when executed by the processor, implements the steps of the metadata acquisition method according to claim 1 or 2.
14. A metadata acquisition device comprising a memory, a processor and a metadata acquisition program stored on the memory and executable on the processor, wherein the metadata acquisition program, when executed by the processor, implements the steps of the metadata acquisition method according to any one of claims 3 to 7.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a metadata acquisition program which, when executed by a processor, implements the steps of the metadata acquisition method according to claim 1 or 2.
16. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a metadata acquisition program which, when executed by a processor, implements the steps of the metadata acquisition method according to any one of claims 3 to 7.
CN201811551965.4A 2018-12-18 2018-12-18 Metadata acquisition method, apparatus, device and computer readable storage medium Active CN109656963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811551965.4A CN109656963B (en) 2018-12-18 2018-12-18 Metadata acquisition method, apparatus, device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811551965.4A CN109656963B (en) 2018-12-18 2018-12-18 Metadata acquisition method, apparatus, device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109656963A CN109656963A (en) 2019-04-19
CN109656963B CN109656963B (en) 2023-06-09

Family

ID=66114533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811551965.4A Active CN109656963B (en) 2018-12-18 2018-12-18 Metadata acquisition method, apparatus, device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109656963B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377568A (en) * 2019-07-26 2019-10-25 北京明略软件系统有限公司 A kind of metadata acquisition method and device
CN110516130A (en) * 2019-08-28 2019-11-29 北京明略软件系统有限公司 Metadata processing method and device, storage medium, electronic device
CN111182026B (en) * 2019-11-27 2023-05-12 广西金慕联行数字科技有限公司 Intelligent cloud box
CN110968592B (en) * 2019-12-06 2023-11-21 深圳前海环融联易信息科技服务有限公司 Metadata acquisition method, metadata acquisition device, computer equipment and computer readable storage medium
CN111381951B (en) * 2020-03-06 2023-06-30 北京思特奇信息技术股份有限公司 Dirty data processing method, device and storage medium in system architecture
CN111367984B (en) * 2020-03-11 2023-03-21 中国工商银行股份有限公司 Method and system for loading high-timeliness data into data lake
CN111651531B (en) * 2020-06-05 2024-04-09 深圳前海微众银行股份有限公司 Data importing method, device, equipment and computer storage medium
CN111782886A (en) * 2020-06-28 2020-10-16 杭州海康威视数字技术股份有限公司 Method and device for managing metadata
CN111984380A (en) * 2020-08-21 2020-11-24 北京金山云网络技术有限公司 Stream computing service system and control method and device thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101566981A (en) * 2008-04-24 2009-10-28 长沙创智天马财务软件有限公司 Method for establishing dynamic virtual data base in analyzing and processing system
CN102270225B (en) * 2011-06-28 2015-10-21 用友网络科技股份有限公司 Data change daily record method for supervising and data change daily record supervising device
CN103618652B (en) * 2013-12-17 2018-03-20 沈阳觉醒软件有限公司 A kind of audit of business datum and depth analysis system and method
CN105045656B (en) * 2015-06-30 2018-11-30 深圳清华大学研究院 Big data storage and management method based on virtual container
US10394637B2 (en) * 2015-09-04 2019-08-27 American Express Travel Related Services Company, Inc. Systems and methods for data validation and processing using metadata
CN108846076A (en) * 2018-06-08 2018-11-20 山大地纬软件股份有限公司 The massive multi-source ETL process method and system of supporting interface adaptation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Metadata integration architecture in enterprise data warehouse system;Yu QianCheng;《The 2nd International Conference on information Science and Engineering》;340-343 *
Research and Application of Data Modelling and Integration Based on Metadata;Jiecai Zheng,Xueqing Li;《2015 7th International Conference Based on Information Technology in Medicine and Education(ITME)》;525-528 *

Also Published As

Publication number Publication date
CN109656963A (en) 2019-04-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant