CN117076518A - Metadata query method, device, system and related equipment - Google Patents

Metadata query method, device, system and related equipment

Info

Publication number
CN117076518A
Authority
CN
China
Prior art keywords
metadata
query
distributed
knowledge graph
service system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310854529.9A
Other languages
Chinese (zh)
Inventor
刘康
杨明川
夏晓晴
郭枝虾
闫汇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Technology Innovation Center
China Telecom Corp Ltd
Original Assignee
China Telecom Technology Innovation Center
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Technology Innovation Center, China Telecom Corp Ltd
Priority to CN202310854529.9A
Publication of CN117076518A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471: Distributed queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G06F16/285: Clustering or classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G06F16/288: Entity relationship models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a metadata query method, device, and system and related equipment, and relates to the technical field of big data management. The method comprises: acquiring metadata from a plurality of databases deployed in a distributed manner; classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets deployed in a distributed manner; and processing each of the metadata sets to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to respond to data query instructions sent by each service system and return the metadata to be queried. The disclosure can, to a certain extent, overcome the problem in the related art that metadata query efficiency is low in ultra-large-scale distributed data management.

Description

Metadata query method, device, system and related equipment
Technical Field
The disclosure relates to the technical field of big data management, in particular to a metadata query method, device and system and related equipment.
Background
With the advent of the big data age, data management has become increasingly important. Data of many kinds plays an important role in business management, and data volumes keep growing. In the related art, ultra-large-scale distributed data management requires substantial computing power for storing, querying, and updating data, so the related art suffers from technical problems such as excessive data storage and low query efficiency when managing data.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The disclosure provides a metadata query method, a device, a system and related equipment, which at least overcome the problem that the metadata query efficiency is lower in the ultra-large scale distributed data management process of the related technology to a certain extent.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to one aspect of the present disclosure, there is provided a metadata query method including: acquiring metadata from a plurality of databases deployed in a distributed manner; classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets deployed in a distributed manner; and processing each of the metadata sets to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to respond to data query instructions sent by each service system and return the metadata to be queried.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, processing each of the metadata sets to obtain a multi-dimensional metadata knowledge graph includes: performing association analysis on the metadata in each metadata set and determining the association relations among the plurality of metadata in each set; and obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata in each metadata set and the association relations among them.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, after obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata and the association relations among them, the method further includes: storing the multi-dimensional metadata knowledge graph in an in-memory database.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, after processing each of the metadata sets to obtain the multi-dimensional metadata knowledge graph, the method further includes: receiving the metadata query instructions sent by each service system, querying the metadata indicated by each metadata query instruction through the metadata knowledge graph, and sending that metadata to the corresponding service system.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, acquiring metadata from a plurality of databases deployed in a distributed manner includes: receiving, in a passive ingestion mode, application programming interface call information initiated by a plurality of service systems; and acquiring the metadata passed in from the databases of the plurality of service systems deployed in a distributed manner.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, when classifying and storing the metadata according to preset rules to obtain the plurality of metadata sets, the method further includes: marking the source and attribute information of the metadata.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, the method further includes: opening application programming interfaces to a plurality of service systems, through which the service systems ingest the metadata.
According to another aspect of the present disclosure, there is also provided a metadata query apparatus including: a metadata acquisition module for acquiring metadata from a plurality of databases deployed in a distributed manner; a metadata storage module for classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets deployed in a distributed manner; and a metadata knowledge graph generation module for processing each of the metadata sets to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to query, according to the metadata query instruction sent by each service system, the metadata indicated by that instruction.
According to another aspect of the present disclosure, there is also provided a metadata query system including: a metadata management unit, connected to each service system, for acquiring metadata from the databases of a plurality of service systems deployed in a distributed manner; and an intermediate information processing unit, connected to the metadata management unit, for classifying and storing the acquired metadata according to preset rules to obtain a plurality of metadata sets deployed in a distributed manner, processing each of the metadata sets to obtain a multi-dimensional metadata knowledge graph, and querying, according to the metadata query instruction sent by each service system, the metadata indicated by that instruction.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, the system further comprises: a distributed database, connected to the metadata management unit, for accessing each service system database and transmitting the metadata in each service system database to the metadata management unit.
According to still another aspect of the present disclosure, there is also provided an electronic apparatus including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the metadata query methods described above via execution of the executable instructions.
According to yet another aspect of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any one of the metadata query methods described above.
The embodiments of the present disclosure provide a metadata query method, device, and system and related equipment. First, metadata is acquired from a plurality of databases deployed in a distributed manner; then, the acquired metadata is classified and stored according to a preset rule to obtain a plurality of metadata sets deployed in a distributed manner; finally, each of the metadata sets is processed to obtain a multi-dimensional metadata knowledge graph. According to the data query instruction sent by each service system, the metadata knowledge graph can be used to query the metadata indicated by that instruction and return it.
Compared with the prior art, in which data query efficiency is low during data management, the embodiments of the present disclosure classify and store the acquired metadata and process it to obtain a multi-dimensional metadata knowledge graph. When a service system sends a data query instruction, the metadata knowledge graph of the matching dimension can be invoked according to the dimension of the query instruction, thereby achieving the technical effect of querying metadata quickly.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1 is a schematic diagram of a system architecture for applying a metadata query method in an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a metadata query method in an embodiment of the disclosure;
FIG. 3 is a schematic diagram illustrating a method for obtaining a multi-dimensional metadata knowledge graph in an embodiment of the disclosure;
FIG. 4 is a schematic diagram of a metadata query mechanism in an embodiment of the present disclosure;
FIG. 5 illustrates a metadata query system diagram in accordance with an embodiment of the present disclosure;
FIG. 6 is a schematic diagram showing an intermediate information processing unit in an embodiment of the present disclosure;
FIG. 7 shows a schematic diagram of an electronic device to which a metadata query method is applied in an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the disclosed aspects may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
FIG. 1 illustrates an exemplary application system architecture to which the metadata query method of embodiments of the present disclosure may be applied. As shown in FIG. 1, the system architecture may include a terminal device 101, a network 102, and a server 103.
The network 102 is the medium that provides a communication link between the terminal device 101 and the server 103, and may be a wired network or a wireless network.
Optionally, the wireless network or wired network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network including, but not limited to, a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile, wired, or wireless network, a private network, or a virtual private network, or any combination thereof. In some embodiments, data exchanged over the network is represented using techniques and/or formats including HyperText Markup Language (HTML), Extensible Markup Language (XML), and the like. All or some of the links may also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), virtual private network (VPN), Internet Protocol Security (IPsec), etc. In other embodiments, custom and/or dedicated data communication techniques may also be used in place of or in addition to the data communication techniques described above.
The terminal device 101 may be a variety of electronic devices including, but not limited to, smart phones, tablet computers, laptop portable computers, desktop computers, wearable devices, augmented reality devices, virtual reality devices, and the like.
Optionally, the application clients installed on different terminal devices 101 are the same client, or are clients of the same type of application built for different operating systems. The specific form of the application client may also differ across terminal platforms; for example, the application client may be a mobile phone client, a PC client, etc.
The server 103 may be a server providing various services, such as a background management server providing support for devices operated by the user with the terminal apparatus 101. The background management server can analyze and process the received data such as the request and the like, and feed back the processing result to the terminal equipment.
Optionally, the server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the present disclosure is not limited in this regard.
Those skilled in the art will appreciate that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative, and that any number of terminal devices, networks, and servers may be provided as desired. The embodiments of the present disclosure are not limited in this regard.
Under the system architecture described above, the embodiments of the present disclosure provide a metadata query method, which may be performed by any electronic device with computing processing capabilities.
In some embodiments, the metadata query method provided in the embodiments of the present disclosure may be performed by a terminal device of the above system architecture; in other embodiments, the metadata query method provided in the embodiments of the present disclosure may be performed by a server in the system architecture described above; in other embodiments, the metadata query method provided in the embodiments of the present disclosure may be implemented by the terminal device and the server in the system architecture in an interactive manner.
Hereinafter, each step of the metadata query method in the present exemplary embodiment will be described in more detail with reference to the accompanying drawings and examples.
First, the embodiments of the present disclosure provide a query method that can be applied to, but is not limited to, metadata. Metadata acquired from a plurality of databases deployed in a distributed manner is classified and stored, and the classified metadata is processed to obtain a multi-dimensional metadata knowledge graph. This addresses the problem in the prior art that, in ultra-large-scale distributed data management, storing all metadata in one register leads to excessive data storage and poor data processing capability. Because the multi-dimensional metadata knowledge graph is built from metadata classified and stored along different dimensions, a business system querying metadata does not need to traverse all metadata in sequence; it queries the metadata knowledge graph of the dimension that matches the required metadata, which addresses the problem of low query efficiency.
FIG. 2 is a schematic diagram of a metadata query method in an embodiment of the disclosure. As shown in FIG. 2, the metadata query method provided in the embodiment of the disclosure includes the following steps:
S201, acquiring metadata from a plurality of databases deployed in a distributed manner.
In some embodiments, the metadata is provided by a plurality of databases deployed in a distributed manner, each of which may be the database of a business system. The acquired metadata is then transferred centrally into a register or database of the application or system to which the metadata query method is applied. The metadata acquired from the business system databases may include related business information, database information, data table information, field information, data sources, data types, data sets, remark information, and so on.
S202, classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets distributed and deployed.
In some embodiments, the preset rule includes pre-classified data types of metadata, and the metadata obtained in step S201 is classified and stored according to these data types. Classifying and storing the metadata reduces the computation required; a multi-dimensional metadata knowledge graph is subsequently constructed from the classified metadata, and the retrieval efficiency of the distributed database is improved by means of that knowledge graph.
In more detail, the embodiments of the present disclosure may classify metadata according to service information and store the classified service information in a mapped database. The service information includes the service name, the database information, and the table name/field name; metadata can be classified from coarse to fine in that order, and the classified service information is stored in the mapped database.
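The coarse-to-fine classification described above can be illustrated with a minimal Python sketch. The record layout (the keys `service`, `database`, `table`, `field`) and the nested-dictionary store are assumptions made for illustration only; the disclosure does not specify a data format or a storage engine.

```python
from collections import defaultdict

# Hypothetical metadata records; the key names are illustrative assumptions.
records = [
    {"service": "billing", "database": "bill_db", "table": "invoice", "field": "amount"},
    {"service": "billing", "database": "bill_db", "table": "invoice", "field": "due_date"},
    {"service": "crm", "database": "crm_db", "table": "customer", "field": "name"},
]


def classify(records):
    """Group records coarse-to-fine: service -> database -> table -> [fields]."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for r in records:
        tree[r["service"]][r["database"]][r["table"]].append(r["field"])
    return tree


# Each top-level key plays the role of one classified metadata set.
metadata_sets = classify(records)
```

Here each service name acts as one metadata set, with database and table/field levels nested beneath it; a real deployment would presumably persist these sets in the mapped database rather than in process memory.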
S203, processing each of the distributed metadata sets to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to respond to data query instructions sent by each service system and return the metadata to be queried.
In some embodiments, each of the distributed metadata sets is processed to obtain a metadata knowledge graph corresponding to that set. Because the metadata sets are obtained by classifying metadata according to a preset rule, the knowledge graphs corresponding to the sets together form a multi-dimensional metadata knowledge graph. In response to a data query instruction sent by a service system, the metadata indicated by the instruction is queried through the multi-dimensional metadata knowledge graph, which provides a query service over metadata organized by category.
The embodiments of the present disclosure provide a metadata query method. First, metadata is acquired from a plurality of databases deployed in a distributed manner; then, the acquired metadata is classified and stored according to a preset rule to obtain a plurality of metadata sets deployed in a distributed manner; finally, each of the metadata sets is processed to obtain a multi-dimensional metadata knowledge graph. According to the metadata query instruction sent by each service system, the metadata knowledge graph can be used to query the metadata indicated by that instruction.
Compared with the prior art, in which data query efficiency is low during data management, the embodiments of the present disclosure classify and store the acquired metadata and process it to obtain a multi-dimensional metadata knowledge graph. When a service system sends a metadata query instruction, the metadata knowledge graph of the matching dimension can be invoked according to the dimension of the query instruction, thereby achieving the technical effect of querying metadata quickly.
In some embodiments, as shown in FIG. 3, processing each of the distributed metadata sets to obtain a multi-dimensional metadata knowledge graph specifically includes the following steps:
S301, performing association analysis on the metadata in each of the distributed metadata sets, and determining the association relations among the plurality of metadata in each set;
S302, obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata in each metadata set and the association relations among them.
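Steps S301 and S302 can be sketched as follows. The disclosure does not say how association relations are derived, so this example assumes the simplest case: containment relations (a service contains a database, a database contains a table, a table has a field). The node encoding as `(kind, name)` tuples and the adjacency-set representation are likewise assumptions.

```python
from collections import defaultdict

# Hypothetical metadata records; the key names are illustrative assumptions.
records = [
    {"service": "billing", "database": "bill_db", "table": "invoice", "field": "amount"},
    {"service": "billing", "database": "bill_db", "table": "invoice", "field": "customer_id"},
    {"service": "crm", "database": "crm_db", "table": "customer", "field": "customer_id"},
]


def build_graph(records):
    """Return an undirected graph {node: set(neighbors)}; nodes are (kind, name)."""
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for r in records:
        service = ("service", r["service"])
        database = ("database", r["database"])
        table = ("table", r["table"])
        field = ("field", r["field"])
        # Assumed association relations: containment along the hierarchy.
        link(service, database)
        link(database, table)
        link(table, field)
    return graph


graph = build_graph(records)
```

Because nodes with the same kind and name are merged, the two tables that share the field name `customer_id` end up connected through that field node. Whether same-named fields should be treated as associated is a design decision the source leaves open, and a production graph would likely qualify names by database to avoid accidental merges.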
Specifically, through the multi-dimensional metadata knowledge graph, the embodiments of the present disclosure enable knowledge-graph-based associated retrieval of the acquired metadata. This addresses the problem in the related art that massive account order data is complicated, isolated, heterogeneous, multi-source, and difficult to associate, and it improves the management level of internal massive data assets.
In some embodiments, metadata is queried using an automatic hierarchical classification mapping query mode. Specifically, knowledge graphs of different dimensions are returned according to the following four query types: query all, query by service platform name, query by database information, and query by table name/field name. If the query type is query all, the full knowledge graph is returned; if the query type is query by service platform name, the metadata knowledge graph of that service platform is returned; if the query type is query by database information, the metadata knowledge graph of that database is returned; if the query type is query by table name/field name, the knowledge graph associated with that table name or field name is returned.
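The four query types can be illustrated with a small dispatch function. This is a sketch under stated assumptions: the knowledge graph is simplified to a nested dictionary (service, then database, then table, then a list of fields), and the query-type strings `"all"`, `"service"`, `"database"`, and `"table_or_field"` are hypothetical names not found in the source.

```python
# Simplified stand-in for the multi-dimensional knowledge graph (assumption).
tree = {
    "billing": {"bill_db": {"invoice": ["amount", "due_date"]}},
    "crm": {"crm_db": {"customer": ["name", "phone"]}},
}


def query(tree, query_type, name=None):
    """Return the slice of the graph that matches the query type."""
    if query_type == "all":
        return tree
    if query_type == "service":
        return {name: tree[name]} if name in tree else {}
    if query_type == "database":
        return {svc: {name: dbs[name]} for svc, dbs in tree.items() if name in dbs}
    if query_type == "table_or_field":
        result = {}
        for svc, dbs in tree.items():
            for db, tables in dbs.items():
                for table, fields in tables.items():
                    if name == table or name in fields:
                        result.setdefault(svc, {}).setdefault(db, {})[table] = fields
        return result
    raise ValueError(f"unknown query type: {query_type}")


by_field = query(tree, "table_or_field", "amount")
```

A query by service or database name only touches the matching branch, which reflects the idea in the text that a query at a given dimension need not traverse all metadata. The table/field branch in this sketch still scans every table; an index from names to locations would be needed to avoid that.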
In some embodiments, after obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata and the association relations among them, the metadata query method in the embodiments of the present disclosure further includes: storing the multi-dimensional metadata knowledge graph in an in-memory database.
Specifically, the embodiments of the present disclosure use a knowledge graph construction tool to associate and aggregate the acquired metadata, forming a data virtualization layer, and store the constructed metadata knowledge graph in the in-memory database. This in turn provides the service systems with an efficient metadata retrieval service, improves metadata retrieval efficiency, and shortens the time needed to find data.
In more detail, the in-memory database in the embodiments of the present disclosure abandons conventional disk-based data management and redesigns the architecture on the basis that all data resides in memory, with corresponding improvements in data caching, fast algorithms, and parallel operation. Its data processing speed is therefore much faster than that of a conventional database, generally by more than 10 times. Storing the constructed metadata knowledge graph in the in-memory database thus allows faster and more efficient query service to be provided to external service systems.
In some embodiments, after processing each of the distributed metadata sets to obtain the multi-dimensional metadata knowledge graph, the metadata query method in the embodiments of the present disclosure further includes: receiving the metadata query instructions sent by each service system, querying the metadata indicated by each metadata query instruction through the metadata knowledge graph, and sending that metadata to the corresponding service system.
In some embodiments, acquiring metadata from a plurality of databases deployed in a distributed manner includes: receiving, in a passive ingestion mode, application programming interface call information initiated by a plurality of service systems; and acquiring the metadata passed in from the databases of the plurality of service systems deployed in a distributed manner.
In some embodiments, when a plurality of service systems initiate metadata ingestion requests, the application programming interface is called, the metadata is classified from coarse to fine by service name, database information, and table name/field name according to the parameter rules of the programming interface, and the corresponding metadata is stored in a mapped database according to the ingestion type. Specifically, when the ingestion type initiated by the service system is "all", all metadata, including the service name, database name, table name, field name, and other data, is stored in the mapped database. When the ingestion type is "service name", the name of the current service system is stored in the mapped database. When the ingestion type is "database information", the database information is stored in the mapped database; supported databases include MySQL, Hive, Elasticsearch, MongoDB, Oracle, Kafka, PostgreSQL, and the like. When the ingestion type is "table name/field name", the table names and field names in the current database are stored in the mapped database.
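The ingestion-type handling described above might be sketched as follows. The function name `ingest`, the type strings, the payload keys, and the dictionary used as the "mapped database" are all hypothetical. The source lists the supported databases with "and the like", so treating the list as a closed set that rejects other types is an assumption of this sketch.

```python
# Database types named in the text; the source list is open-ended ("and the like").
SUPPORTED_DATABASES = {
    "MySQL", "Hive", "Elasticsearch", "MongoDB", "Oracle", "Kafka", "PostgreSQL",
}

# Which payload keys are stored for each (hypothetical) ingestion type.
KEYS_BY_TYPE = {
    "all": ("service", "database", "table", "field"),
    "service": ("service",),
    "database": ("database", "db_type"),
    "table_field": ("table", "field"),
}


def ingest(store, ingest_type, payload):
    """Store only the parts of the payload that match the ingestion type."""
    if ingest_type not in KEYS_BY_TYPE:
        raise ValueError(f"unknown ingestion type: {ingest_type}")
    if ingest_type == "database" and payload.get("db_type") not in SUPPORTED_DATABASES:
        # Assumption: unsupported database types are rejected outright.
        raise ValueError(f"unsupported database type: {payload.get('db_type')}")
    entry = {key: payload[key] for key in KEYS_BY_TYPE[ingest_type]}
    store.setdefault(ingest_type, []).append(entry)
    return entry


store = {}  # stands in for the mapped database
payload = {"service": "billing", "database": "bill_db", "db_type": "MySQL",
           "table": "invoice", "field": "amount"}
service_entry = ingest(store, "service", payload)
database_entry = ingest(store, "database", payload)
```

Keeping the type-to-keys mapping in one table makes it straightforward to add ingestion types later without touching the function body.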
In some embodiments, when the metadata is classified and stored according to the preset rule to obtain the plurality of metadata sets of the distributed deployment, the metadata query method in the embodiments of the present disclosure further includes: labeling the source and the attribute information of the metadata.
In some embodiments, the attribute information of the metadata in the embodiments of the present disclosure includes: metadata storage time, the user to which the metadata belongs, metadata type, and the like.
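By way of example only, a labeled metadata record carrying the source and the attribute information listed above might be represented as follows. The class name `LabeledMetadata`, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LabeledMetadata:
    """Hypothetical record: one metadata item plus its source and attributes."""
    name: str            # e.g. a table name or field name
    source: str          # service system and database the item came from
    owner: str           # user to which the metadata belongs
    metadata_type: str   # e.g. "service", "database", "table", "field"
    # Storage time, recorded in UTC when the record is created.
    stored_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


record = LabeledMetadata(
    name="invoice.amount",
    source="billing/billing_db",
    owner="alice",
    metadata_type="field",
)
```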
In more detail, in the embodiment of the disclosure, the ingested metadata is subjected to association analysis through a knowledge graph and a manual labeling technology, a knowledge graph based on the metadata is constructed, and an association relationship is established for the metadata.
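The present disclosure does not specify the association analysis algorithm. Purely as an assumed illustration, the sketch below derives containment relations (service, database, table, field) and treats tables that share a field name as associated. The function name `build_triples`, the relation names, and the shared-field heuristic are assumptions of this example and stand in for the knowledge graph and manual labeling techniques mentioned above.

```python
from collections import defaultdict


def build_triples(metadata_sets):
    """Derive (subject, relation, object) triples from classified metadata."""
    triples = set()
    tables_by_field = defaultdict(set)
    for m in metadata_sets:
        service, database = m["service"], m["database"]
        db_node = f"{service}.{database}"
        triples.add((service, "has_database", db_node))
        for table, fields in m["tables"].items():
            table_node = f"{db_node}.{table}"
            triples.add((db_node, "has_table", table_node))
            for name in fields:
                triples.add((table_node, "has_field", name))
                tables_by_field[name].add(table_node)
    # Assumed heuristic: tables sharing a field name are treated as related.
    for tables in tables_by_field.values():
        ordered = sorted(tables)
        for i, left in enumerate(ordered):
            for right in ordered[i + 1:]:
                triples.add((left, "shares_field", right))
    return triples


# Sample metadata sets assumed for illustration only.
triples = build_triples([
    {"service": "billing", "database": "billing_db",
     "tables": {"invoice": ["id", "user_id"]}},
    {"service": "crm", "database": "crm_db",
     "tables": {"customer": ["user_id", "name"]}},
])
```

Here the two tables become associated across service systems because both contain a field named `user_id`.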
In some embodiments, the metadata query method in the embodiments of the present disclosure further includes: opening application programming interfaces to a plurality of service systems, so that the service systems ingest metadata through the application programming interfaces.
In more detail, opening application programming interfaces (APIs) to a plurality of service systems in the embodiments of the present disclosure includes the following: each service system captures the corresponding metadata by connecting to different APIs. The embodiments of the present disclosure define both a metadata automatic integration API and an automatic data directory retrieval API. The constructed metadata knowledge graph is stored in an in-memory database to form a virtual layer that virtualizes the data. Linking and real-time querying of the metadata are realized through three approaches: application integration, real-time analysis, and the knowledge graph. This finally defines the data management and application of the metadata, and connects the whole chain from unified management of the metadata, through relationship processing and analysis, to the final application scenario, thereby assisting planning and use at a higher level and improving the use value of the metadata.
In some embodiments, the metadata query method provided in the embodiments of the present disclosure does not hold local data. By managing metadata and designing a metadata ingestion method, it provides metadata storage, association analysis and fast retrieval services for each service system, thereby realizing storage and efficient retrieval for ultra-large-scale distributed data management. It can satisfy other requirements for distributed management of metadata, has high universality, and provides a more efficient solution for management in the big data industry.
Based on the same inventive concept, the embodiments of the present disclosure also provide a metadata query device, as follows. Since the principle of solving the problem of the embodiment of the device is similar to that of the embodiment of the method, the implementation of the embodiment of the device can be referred to the implementation of the embodiment of the method, and the repetition is omitted.
Fig. 4 shows a schematic diagram of a metadata query apparatus according to an embodiment of the disclosure, as shown in fig. 4, where the apparatus includes:
a metadata acquisition module 401, configured to acquire metadata in a plurality of databases of a distributed deployment;
a metadata storage module 402, configured to store metadata in a classified manner according to a preset rule, so as to obtain a plurality of metadata sets deployed in a distributed manner;
a metadata knowledge graph generation module 403, configured to process each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, where the metadata knowledge graph is used to respond to a data query instruction sent by each service system and return metadata to be queried.
The metadata query device provided by the embodiments of the present disclosure acquires, through the metadata acquisition module, metadata in a plurality of databases of a distributed deployment; classifies and stores, through the metadata storage module, the acquired metadata according to a preset rule to obtain a plurality of metadata sets of the distributed deployment; and processes, through the metadata knowledge graph generation module, each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, where the metadata knowledge graph can return the metadata to be queried according to the data query instruction sent by each service system.
In the prior art, data query efficiency in data management is low. By contrast, the embodiments of the present disclosure classify and store the acquired metadata and process the classified and stored metadata to obtain a multidimensional metadata knowledge graph. When a service system sends a data query instruction, the metadata knowledge graph of the matching dimension can be called according to the dimension of the query instruction, thereby achieving the technical effect of querying the metadata quickly.
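Calling the knowledge graph of the matching dimension can be illustrated with the following minimal sketch, which assumes that each dimension (service, database, table) has its own sub-graph stored as a simple mapping. The dimension names, the mapping `GRAPH_BY_DIMENSION`, and the function `query_dimension` are hypothetical.

```python
# Hypothetical per-dimension sub-graphs: entity -> directly related metadata.
GRAPH_BY_DIMENSION = {
    "service": {"billing": ["billing_db"]},
    "database": {"billing_db": ["invoice", "payment"]},
    "table": {"invoice": ["id", "amount"]},
}


def query_dimension(dimension, entity):
    """Look up an entity only in the sub-graph for the requested dimension."""
    subgraph = GRAPH_BY_DIMENSION.get(dimension)
    if subgraph is None:
        raise KeyError(f"unknown dimension: {dimension}")
    # Return a copy so callers cannot mutate the stored graph.
    return list(subgraph.get(entity, []))


tables = query_dimension("database", "billing_db")
```

Because the lookup is restricted to one sub-graph, a query in this sketch does not need to scan metadata of other dimensions.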
In some embodiments, the metadata knowledge graph generation module in the embodiments of the present disclosure is further configured to perform association analysis on the metadata in each metadata set of the distributed deployment, and determine the association relationships among the plurality of metadata in each metadata set; and to obtain a multidimensional metadata knowledge graph according to the plurality of metadata in each metadata set and the association relationships among them.
In some embodiments, for use after the multidimensional metadata knowledge graph is obtained according to the plurality of metadata and the association relationships among them, the metadata query apparatus in the embodiments of the present disclosure further includes: a metadata knowledge graph storage module, configured to store the multidimensional metadata knowledge graph in an in-memory database.
In some embodiments, for use after each metadata set of the distributed deployment is processed to obtain the multi-dimensional metadata knowledge graph, the metadata query apparatus in the embodiments of the present disclosure further includes: an instruction query module, configured to receive a metadata query instruction sent by each service system, query, through the metadata knowledge graph, the metadata to be queried indicated by the metadata query instruction, and send the metadata to be queried to the corresponding service system.
In some embodiments, the metadata acquisition module in the embodiments of the present disclosure is further configured to receive, in a passive ingestion mode, application programming interface calls initiated by a plurality of service systems; and to acquire the metadata passed in from the databases of the plurality of service systems of the distributed deployment.
In some embodiments, where the metadata is classified and stored according to the preset rule to obtain the plurality of metadata sets of the distributed deployment, the metadata query device in the embodiments of the present disclosure further includes: an information labeling module, configured to label the source and the attribute information of the metadata.
In some embodiments, the metadata query apparatus in the embodiments of the present disclosure further includes: an information interaction module, configured to open application programming interfaces to a plurality of service systems, so that the service systems ingest metadata through the application programming interfaces.
Based on the same inventive concept, a metadata query system is also provided in the embodiments of the present disclosure, as follows. Since the principle of solving the problem of the system embodiment is similar to that of the method embodiment, the implementation of the system embodiment can be referred to the implementation of the method embodiment, and the repetition is omitted.
FIG. 5 shows a schematic diagram of a metadata query system in an embodiment of the disclosure, as shown in FIG. 5, the system comprising:
a metadata management unit 501, connected to each service system, for acquiring metadata in a plurality of service system databases deployed in a distributed manner;
the intermediate information processing unit 502 is connected to the metadata management unit 501, and is configured to store the acquired metadata in a classified manner according to a preset rule, obtain a plurality of metadata sets in a distributed deployment, process each metadata set in the distributed deployment, and obtain a multi-dimensional metadata knowledge graph, where the metadata knowledge graph is used to respond to a data query instruction sent by each service system, and return metadata to be queried.
In the metadata query system provided by the embodiments of the present disclosure, the metadata management unit acquires metadata in a plurality of databases of a distributed deployment; the intermediate information processing unit classifies and stores the acquired metadata according to a preset rule to obtain a plurality of metadata sets of the distributed deployment, and processes each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, where the metadata knowledge graph is used to query, according to the data query instruction sent by each service system, the metadata to be queried indicated by that instruction.
In the prior art, data query efficiency in data management is low. By contrast, the embodiments of the present disclosure classify and store the acquired metadata and process the classified and stored metadata to obtain a multidimensional metadata knowledge graph. When a service system sends a data query instruction, the metadata knowledge graph of the matching dimension can be called according to the dimension of the query instruction, thereby achieving the technical effect of querying the metadata quickly.
In some embodiments, as shown in fig. 5, the metadata query system in the embodiments of the present disclosure further includes: the distributed database 503 is connected to a plurality of service systems, and is used for accessing each service system database, and transmitting metadata in each service system database to the metadata management unit 501.
In more detail, the embodiments of the present disclosure open the metadata ingestion interface by accessing the distributed database of each service system. Adopting a passive metadata ingestion mode is safer and more effective, and only requires distributing a key to each system for authentication.
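The present disclosure states only that a key is distributed to each system for authentication and does not specify the scheme. As one possible illustration, the sketch below assumes that each ingestion request is signed with HMAC-SHA256 using the distributed key. The key table `SERVICE_KEYS`, the functions `sign` and `is_authentic`, and the message layout are assumptions of this example.

```python
import hashlib
import hmac

# Hypothetical key table: one pre-distributed secret per service system.
SERVICE_KEYS = {"billing": b"demo-key-billing"}


def sign(service, body, key):
    """Compute an HMAC-SHA256 signature over the service name and request body."""
    message = service.encode() + b"\n" + body
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def is_authentic(service, body, signature):
    """Accept a passive ingestion request only if its signature matches."""
    key = SERVICE_KEYS.get(service)
    if key is None:
        return False
    expected = sign(service, body, key)
    # Constant-time comparison on bytes avoids leaking timing information.
    return hmac.compare_digest(expected.encode(), signature.encode())


body = b'{"tables": {"invoice": ["id", "amount"]}}'
signature = sign("billing", body, SERVICE_KEYS["billing"])
accepted = is_authentic("billing", body, signature)
```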
In some embodiments, as shown in fig. 5, the upper layer service providing unit of the metadata query system in the embodiments of the present disclosure further provides each service system with APIs for different requirements, including a metadata automation integration API, a metadata query API, a knowledge graph construction API, and an automation data directory retrieval API.
In some embodiments, as shown in fig. 5, the metadata query system in the embodiments of the present disclosure includes: a metadata management unit 501, an intermediate information processing unit 502, a distributed database 503, and an upper layer service providing unit 504. Wherein the distributed database 503 is a real database of each service system, and the access mode is verification through an API request; the metadata management unit 501 mainly performs entity extraction for accessed metadata; the intermediate information processing unit 502 analyzes the metadata relationship, carries out metadata association aggregation, and realizes metadata knowledge graph construction; the upper layer service providing unit 504 provides API services for each external system, and when the service system initiates an associated query service across databases and tables, the intermediate information processing unit in the embodiment of the present disclosure may directly provide a quick query service for the service system.
In some embodiments, as shown in fig. 6, the intermediate information processing unit in the embodiments of the present disclosure includes a Service middleware 600, which comprises four modules: a passive access module 601, a data processing module 602, a knowledge graph construction module 603, and a graph database module 604. In more detail, the passive access module takes identity authentication information, a system name, database information, a database name, and a metadata set in a custom format as input, and outputs whether the data was written successfully. The data processing module ingests the data from the passive access module according to an ingestion type and an ingestion description; specifically, the ingestion type includes ingesting everything, ingesting a sub-service platform, ingesting a data type, and ingesting a table name or field name, and the ingestion description may be all data, a service platform name, a database name, a table name, a field name, and the like. The knowledge graph construction module obtains the metadata set through the passive access module, inputs it into a knowledge graph model for training, outputs a knowledge graph, constructs the metadata association relationships, and writes them into the graph database module. To query the knowledge graph in the graph database module, identity authentication information and an entity name or relation name must be input, and the query result is obtained from the graph data that is output.
In some embodiments, the Service middleware in the embodiments of the present disclosure may directly provide a fast query Service for each Service system.
Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, various aspects of the disclosure may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein generally as a "circuit," "module," or "system."
An electronic device 700 according to such an embodiment of the present disclosure is described below with reference to fig. 7. The electronic device 700 shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 7, the electronic device 700 is embodied in the form of a general purpose computing device. Components of electronic device 700 may include, but are not limited to: the at least one processing unit 701, the at least one memory unit 702, and a bus 703 that connects the different system components (including the memory unit 702 and the processing unit 701).
The storage unit stores program code that can be executed by the processing unit 701, such that the processing unit 701 performs the steps according to the various exemplary embodiments of the present disclosure described in the "exemplary method" section of this specification.
In some embodiments, when the electronic device is used to carry out the metadata query method described above in the present disclosure, the processing unit 701 may perform the following steps of the method embodiments described above:
acquiring metadata in a plurality of databases of a distributed deployment;
classifying and storing the metadata according to a preset rule to obtain a plurality of metadata sets of the distributed deployment; and
processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used for responding to data query instructions sent by each service system and returning metadata to be queried.
The storage unit 702 may include readable media in the form of volatile storage units, such as Random Access Memory (RAM) 7021 and/or cache memory 7022, and may further include Read Only Memory (ROM) 7023.
The storage unit 702 may also include a program/utility 7024 having a set (at least one) of program modules 7025, such program modules 7025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus 703 may be one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 700 may also communicate with one or more external devices 704 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 700, and/or any device (e.g., router, modem, etc.) that enables the electronic device 700 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 705. Also, the electronic device 700 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through the network adapter 706. As shown, the network adapter 706 communicates with other modules of the electronic device 700 via the bus 703. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 700, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer program product comprising a computer program which, when executed by a processor, implements the metadata query method described above.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium, which may be a readable signal medium or a readable storage medium, is also provided, on which a program product capable of implementing the method described above of the present disclosure is stored. In some possible implementations, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the disclosure as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
More specific examples of the computer readable storage medium in the present disclosure may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In this disclosure, a computer readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Alternatively, the program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
In particular implementations, the program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the description of the above embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (12)

1. A method for querying metadata, comprising:
acquiring metadata in a plurality of databases of distributed deployment;
classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets distributed and deployed;
and processing each metadata set of distributed deployment to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used for responding to data query instructions sent by each service system and returning metadata to be queried.
2. The metadata query method as claimed in claim 1, wherein processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge-graph comprises:
performing association analysis on metadata in each metadata set of the distributed deployment, and determining association relations among a plurality of metadata in each metadata set;
and obtaining a multidimensional metadata knowledge graph according to the association relation between the metadata and the metadata in each metadata set.
3. The metadata query method according to claim 2, wherein after obtaining a multi-dimensional metadata knowledge graph from a plurality of metadata and an association relationship between the plurality of metadata, the method further comprises:
storing the multidimensional metadata knowledge graph in an in-memory database.
4. The metadata query method of claim 1, wherein after processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge-graph, the method further comprises:
and receiving metadata query instructions sent by each service system, querying metadata to be queried indicated by the metadata query instructions through the metadata knowledge graph, and sending the metadata to be queried to the corresponding service system.
5. The method of claim 1, wherein obtaining metadata in a plurality of databases of a distributed deployment comprises:
receiving calling application programming interface information initiated by a plurality of service systems by adopting a passive ingestion mode;
and acquiring the metadata transmitted into the multiple service system databases of the distributed deployment.
6. The metadata query method of claim 1, wherein the metadata is classified according to a preset rule for storage, and a plurality of metadata sets of distributed deployment are obtained, the method further comprising:
and marking the source and attribute information of the metadata.
7. The metadata query method of claim 1, wherein the method further comprises:
and opening application programming interfaces for a plurality of service systems, wherein the service systems ingest the metadata through the application programming interfaces.
8. A metadata query device, comprising:
the metadata acquisition module is used for acquiring metadata in a plurality of databases of a distributed deployment;
the metadata storage module is used for classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets of the distributed deployment;
the metadata knowledge graph generation module is used for processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, and the metadata knowledge graph is used for querying, according to the metadata query instruction sent by each service system, the metadata to be queried indicated by the metadata query instruction.
9. A metadata query system, comprising:
the metadata management unit is connected with each service system and used for acquiring metadata in a plurality of service system databases distributed and deployed;
the intermediate information processing unit is connected with the metadata management unit and used for classifying and storing the acquired metadata according to preset rules to obtain a plurality of metadata sets distributed and deployed, processing each metadata set distributed and deployed to obtain a multi-dimensional metadata knowledge graph, and querying metadata to be queried indicated by the metadata query instructions according to metadata query instructions sent by each service system.
10. The metadata query system of claim 9, wherein the system further comprises:
and the distributed database is connected with the metadata management unit and is used for accessing each service system database and transmitting metadata in each service system database to the metadata management unit.
11. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the metadata query method of any of claims 1-7 via execution of the executable instructions.
12. A computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the metadata query method of any of claims 1 to 7.
CN202310854529.9A 2023-07-12 2023-07-12 Metadata query method, device, system and related equipment Pending CN117076518A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310854529.9A CN117076518A (en) 2023-07-12 2023-07-12 Metadata query method, device, system and related equipment

Publications (1)

Publication Number Publication Date
CN117076518A true CN117076518A (en) 2023-11-17

Family

ID=88714243

Country Status (1)

Country Link
CN (1) CN117076518A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination