CN117076518A - Metadata query method, device, system and related equipment - Google Patents
- Publication number
- CN117076518A (CN202310854529.9A)
- Authority
- CN
- China
- Prior art keywords
- metadata
- query
- distributed
- knowledge graph
- service system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Abstract
The disclosure provides a metadata query method, device, and system, and related equipment, and relates to the technical field of big data management. The method comprises: acquiring metadata from a plurality of databases in a distributed deployment; classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets in the distributed deployment; and processing each metadata set to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to respond to data query instructions sent by the service systems and to return the metadata to be queried. The metadata query method and device can, to a certain extent, overcome the low metadata query efficiency of the related art in ultra-large-scale distributed data management.
Description
Technical Field
The disclosure relates to the technical field of big data management, and in particular to a metadata query method, device, and system, and related equipment.
Background
With the advent of the big data era, data management has become increasingly important, data of all kinds plays an important role in business management, and data volumes keep growing. In the related art, storing, querying, and updating data in ultra-large-scale distributed data management requires substantial computing power, so the related art suffers from technical problems such as excessive data storage and low query efficiency.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The disclosure provides a metadata query method, device, and system, and related equipment, which overcome, at least to a certain extent, the low metadata query efficiency of the related art in ultra-large-scale distributed data management.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to one aspect of the present disclosure, there is provided a metadata query method including: acquiring metadata from a plurality of databases in a distributed deployment; classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets in the distributed deployment; and processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to respond to data query instructions sent by the service systems and to return the metadata to be queried.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph includes: performing association analysis on the metadata in each metadata set of the distributed deployment to determine the association relationships among the plurality of metadata in each metadata set; and obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata in each metadata set and the association relationships among them.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, after obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata and the association relationships among them, the method further includes: storing the multi-dimensional metadata knowledge graph in an in-memory database.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, after processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, the method further includes: receiving the metadata query instructions sent by the service systems, querying the metadata indicated by each metadata query instruction through the metadata knowledge graph, and sending the queried metadata to the corresponding service system.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, acquiring metadata from a plurality of databases in a distributed deployment includes: receiving, in a passive ingestion mode, application programming interface call information initiated by a plurality of service systems; and acquiring the metadata passed in from the databases of the plurality of service systems in the distributed deployment.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, when the metadata is classified and stored according to preset rules to obtain a plurality of metadata sets in the distributed deployment, the method further includes: labeling the source and attribute information of the metadata.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, the method further includes: opening application programming interfaces to a plurality of service systems, wherein the service systems ingest the metadata through the application programming interfaces.
According to another aspect of the present disclosure, there is also provided a metadata query apparatus including: a metadata acquisition module, configured to acquire metadata from a plurality of databases in a distributed deployment; a metadata storage module, configured to classify and store the metadata according to preset rules to obtain a plurality of metadata sets in the distributed deployment; and a metadata knowledge graph generation module, configured to process each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to query, according to the metadata query instructions sent by the service systems, the metadata indicated by those instructions.
According to another aspect of the present disclosure, there is also provided a metadata query system including: a metadata management unit, connected to each service system and configured to acquire metadata from the databases of a plurality of service systems in a distributed deployment; and an intermediate information processing unit, connected to the metadata management unit and configured to classify and store the acquired metadata according to preset rules to obtain a plurality of metadata sets in the distributed deployment, to process each metadata set to obtain a multi-dimensional metadata knowledge graph, and to query, according to the metadata query instructions sent by the service systems, the metadata indicated by those instructions.
In some exemplary embodiments of the present disclosure, based on the foregoing scheme, the system further comprises: a distributed database, connected to the metadata management unit and configured to access each service system database and to transmit the metadata in each service system database to the metadata management unit.
According to still another aspect of the present disclosure, there is also provided an electronic apparatus including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the metadata query methods described above via execution of the executable instructions.
According to yet another aspect of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any one of the metadata query methods described above.
The embodiments of the present disclosure provide a metadata query method, device, and system, and related equipment. First, metadata is acquired from a plurality of databases in a distributed deployment; then, the acquired metadata is classified and stored according to preset rules to obtain a plurality of metadata sets in the distributed deployment; finally, each metadata set is processed to obtain a multi-dimensional metadata knowledge graph. According to the data query instructions sent by the service systems, the metadata knowledge graph can be used to query the metadata indicated by each instruction and return it.
To address the low data query efficiency of data management in the related art, the embodiments of the present disclosure classify and store the acquired metadata and process the classified metadata to obtain a multi-dimensional metadata knowledge graph. When a service system sends a data query instruction, the metadata knowledge graph of the matching dimension can be invoked according to the dimension of the query instruction, which achieves the technical effect of querying metadata quickly.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1 is a schematic diagram of a system architecture for applying a metadata query method in an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a metadata query method in an embodiment of the disclosure;
FIG. 3 is a schematic diagram illustrating a method for obtaining a multi-dimensional metadata knowledge graph in an embodiment of the disclosure;
FIG. 4 is a schematic diagram of a metadata query mechanism in an embodiment of the present disclosure;
FIG. 5 illustrates a metadata query system diagram in accordance with an embodiment of the present disclosure;
FIG. 6 is a schematic diagram showing an intermediate information processing unit in an embodiment of the present disclosure;
FIG. 7 shows a schematic diagram of an electronic device to which a metadata query method is applied in an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the disclosed aspects may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
FIG. 1 illustrates an exemplary application system architecture diagram to which the metadata query method of embodiments of the present disclosure may be applied. As shown in fig. 1, the system architecture may include a terminal device 101, a network 102, and a server 103.
The network 102 is the medium that provides a communication link between the terminal device 101 and the server 103, and may be a wired network or a wireless network.
Optionally, the wireless network or wired network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network, including but not limited to a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile, wired, or wireless network, a private network, a virtual private network, or any combination thereof. In some embodiments, data exchanged over the network is represented using techniques and/or formats including HyperText Markup Language (HTML), Extensible Markup Language (XML), and the like. All or some of the links may also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), virtual private network (VPN), and Internet Protocol Security (IPsec). In other embodiments, custom and/or dedicated data communication techniques may also be used in place of or in addition to the data communication techniques described above.
The terminal device 101 may be a variety of electronic devices including, but not limited to, smart phones, tablet computers, laptop portable computers, desktop computers, wearable devices, augmented reality devices, virtual reality devices, and the like.
Optionally, the application clients installed on different terminal devices 101 are the same client, or are clients of the same type of application built for different operating systems. The specific form of the application client may also differ by terminal platform; for example, the application client may be a mobile phone client, a PC client, etc.
The server 103 may be a server providing various services, such as a background management server that supports the operations a user performs with the terminal device 101. The background management server can analyze and process received data such as requests, and feed the processing results back to the terminal device.
Optionally, the server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the present disclosure.
Those skilled in the art will appreciate that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative, and that any number of terminal devices, networks, and servers may be provided as desired. The embodiments of the present disclosure are not limited in this regard.
Under the system architecture described above, the embodiments of the present disclosure provide a metadata query method, which may be performed by any electronic device with computing processing capabilities.
In some embodiments, the metadata query method provided in the embodiments of the present disclosure may be performed by a terminal device of the above system architecture; in other embodiments, the metadata query method provided in the embodiments of the present disclosure may be performed by a server in the system architecture described above; in other embodiments, the metadata query method provided in the embodiments of the present disclosure may be implemented by the terminal device and the server in the system architecture in an interactive manner.
Hereinafter, each step of the metadata query method in the present exemplary embodiment will be described in more detail with reference to the accompanying drawings and examples.
First, the embodiments of the present disclosure provide a query method that can be applied to, but is not limited to, metadata. Metadata acquired from a plurality of databases in a distributed deployment is classified and stored, and the classified metadata is processed to obtain a multi-dimensional metadata knowledge graph. This addresses the problem in the related art that, in ultra-large-scale distributed data management, storing all metadata in a single register leads to excessive data storage and poor data processing capability. Because the knowledge graph is built from metadata classified along different dimensions, a service system querying for metadata does not need to traverse all metadata in sequence; it queries only the metadata knowledge graph of the required dimension, which addresses the problem of low query efficiency.
Fig. 2 is a schematic diagram of a metadata query method in an embodiment of the disclosure, as shown in fig. 2, where the metadata query method provided in the embodiment of the disclosure includes the following steps:
S201, acquiring metadata from a plurality of databases in a distributed deployment.
In some embodiments, the metadata is provided by a plurality of databases in a distributed deployment, each of which may be the database of a service system. The acquired metadata is then transferred centrally into a register or database of the application or system to which the metadata query method is applied. The metadata acquired from the service system databases may include related business information, database information, data table information, field information, data sources, data types, data sets, remark information, and so on.
S202, classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets in the distributed deployment.
In some embodiments, the preset rules include pre-classified metadata data types, and the metadata obtained in step S201 is classified and stored according to these data types. Classified storage reduces the computing power required; a multi-dimensional metadata knowledge graph is subsequently constructed from the classified metadata, and this knowledge graph improves the retrieval efficiency of the distributed database.
In more detail, the embodiments of the present disclosure may classify the metadata according to service information, which includes the service name, the database information, and the table name/field name. The metadata is classified from large to small in that order, and the classified service information is stored in a mapped database.
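The large-to-small classification described above can be pictured with a minimal sketch. The record layout, the sample values, and the `classify` helper are all hypothetical and are not taken from the disclosure; a nested in-process dictionary merely stands in for the mapped database.

```python
from collections import defaultdict

# Hypothetical flat metadata records; field names and values are illustrative only.
records = [
    {"service": "billing", "database": "orders_db", "table": "orders", "field": "order_id"},
    {"service": "billing", "database": "orders_db", "table": "orders", "field": "amount"},
    {"service": "crm", "database": "customer_db", "table": "customers", "field": "customer_id"},
]


def classify(records):
    """Group records from the largest unit (service name) down to the smallest (field name)."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for record in records:
        tree[record["service"]][record["database"]][record["table"]].append(record["field"])
    return tree


# One metadata set per service name, each subdivided by database and table.
metadata_sets = classify(records)
print(sorted(metadata_sets))
print(metadata_sets["billing"]["orders_db"]["orders"])
```

In this sketch each top-level key plays the role of one metadata set, so a later lookup for a single service never has to touch the records of the others.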
S203, processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used to respond to data query instructions sent by the service systems and to return the metadata to be queried.
In some embodiments, each metadata set of the distributed deployment is processed to obtain a metadata knowledge graph corresponding to that metadata set. Because the metadata sets are obtained by classifying the metadata according to preset rules, the metadata knowledge graphs corresponding to the metadata sets together form a multi-dimensional metadata knowledge graph. According to the data query instructions sent by the service systems, the metadata indicated by each instruction is queried through the multi-dimensional metadata knowledge graph, thereby providing a query service over the classified metadata.
The embodiments of the present disclosure provide a metadata query method. First, metadata is acquired from a plurality of databases in a distributed deployment; then, the acquired metadata is classified and stored according to preset rules to obtain a plurality of metadata sets in the distributed deployment; finally, each metadata set is processed to obtain a multi-dimensional metadata knowledge graph. According to the metadata query instructions sent by the service systems, the metadata knowledge graph can be used to query the metadata indicated by each instruction.
To address the low data query efficiency of data management in the related art, the embodiments of the present disclosure classify and store the acquired metadata and process the classified metadata to obtain a multi-dimensional metadata knowledge graph. When a service system sends a metadata query instruction, the metadata knowledge graph of the matching dimension can be invoked according to the dimension of the query instruction, which achieves the technical effect of querying metadata quickly.
In some embodiments, as shown in FIG. 3, processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph specifically includes the following steps:
S301, performing association analysis on the metadata in each metadata set of the distributed deployment to determine the association relationships among the plurality of metadata in each metadata set;
S302, obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata in each metadata set and the association relationships among them.
Specifically, with the multi-dimensional metadata knowledge graph, the acquired metadata supports knowledge-graph-based associated retrieval. This addresses the problem in the related art that massive account order data is complicated, isolated, heterogeneous, multi-source, and difficult to associate, and improves the management of internal mass data assets.
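As a rough illustration of steps S301 and S302, the sketch below derives edges from a handful of records and collects them into a graph. The disclosure does not specify how association analysis is performed, so the rule used here (tables that share a field name are treated as associated) is purely an assumption, as are the record layout, the node naming, and the `build_graph` helper.

```python
from collections import defaultdict

# Hypothetical records: (service name, database, table, field).
records = [
    ("billing", "orders_db", "orders", "customer_id"),
    ("billing", "orders_db", "orders", "amount"),
    ("crm", "customer_db", "customers", "customer_id"),
]


def build_graph(records):
    """Return a set of (source, relation, target) edges describing the metadata."""
    edges = set()
    tables_by_field = defaultdict(set)
    for service, database, table, field in records:
        s = ("service", service)
        d = ("database", f"{service}.{database}")
        t = ("table", f"{service}.{database}.{table}")
        f = ("field", f"{service}.{database}.{table}.{field}")
        # Structural edges follow the service -> database -> table -> field hierarchy.
        edges.update({(s, "contains", d), (d, "contains", t), (t, "contains", f)})
        tables_by_field[field].add(t)
    # Assumed association rule: tables that share a field name are related.
    for field, tables in tables_by_field.items():
        ordered = sorted(tables)
        for i, left in enumerate(ordered):
            for right in ordered[i + 1:]:
                edges.add((left, f"shares_field:{field}", right))
    return edges


graph = build_graph(records)
for edge in sorted(graph):
    print(edge)
```

The cross-service edge between the `orders` and `customers` tables is the kind of link that lets otherwise isolated data sources be retrieved together.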
In some embodiments, metadata is queried in an automatic hierarchical classification mapping query mode. Specifically, knowledge graphs of different dimensions are returned according to the following four query types: query all, query by service platform name, query by database information, and query by table name/field name. If the query type is query all, the full knowledge graph is returned; if the query type is query by service platform name, the metadata knowledge graph of that service platform is returned; if the query type is query by database information, the metadata knowledge graph of that database is returned; and if the query type is query by table name/field name, the knowledge graph associated with that table name or field name is returned.
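The four query types can be sketched as a small dispatch function. This is only an illustration under assumptions: the `CATALOG` structure, the query type strings, and the `query` function are hypothetical, and a nested dictionary stands in for the knowledge graph of each dimension.

```python
# Hypothetical classified metadata: service platform -> database -> table -> fields.
CATALOG = {
    "billing": {"orders_db": {"orders": ["order_id", "amount"]}},
    "crm": {"customer_db": {"customers": ["customer_id", "name"]}},
}


def query(catalog, query_type, value=None):
    """Return the slice of the catalog that matches the requested dimension."""
    if query_type == "all":
        return catalog
    if query_type == "service":
        return {value: catalog[value]} if value in catalog else {}
    if query_type == "database":
        return {s: {value: dbs[value]} for s, dbs in catalog.items() if value in dbs}
    if query_type == "table_or_field":
        result = {}
        for service, dbs in catalog.items():
            for db, tables in dbs.items():
                for table, fields in tables.items():
                    if value == table or value in fields:
                        result.setdefault(service, {}).setdefault(db, {})[table] = fields
        return result
    raise ValueError(f"unknown query type: {query_type}")


print(query(CATALOG, "service", "crm"))
print(query(CATALOG, "table_or_field", "amount"))
```

Only the table name/field name branch walks the whole structure in this sketch; the coarser query types resolve with direct key lookups, which mirrors the idea of answering a query from the graph of the matching dimension.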
In some embodiments, after obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata and the association relationships among them, the metadata query method in the embodiments of the present disclosure further includes: storing the multi-dimensional metadata knowledge graph in an in-memory database.
Specifically, the embodiments of the present disclosure use a knowledge graph construction tool to associate and aggregate the acquired metadata, forming data virtualization, and store the constructed metadata knowledge graph in the in-memory database. This in turn provides the service systems with an efficient metadata retrieval service, improves metadata retrieval efficiency, and shortens the time needed to find data.
In more detail, the in-memory database in the embodiments of the present disclosure abandons the conventional disk-based approach to data management and redesigns the architecture on the basis that all data resides in memory, with corresponding improvements in data caching, fast algorithms, and parallel operations. Its data processing speed is therefore much faster than that of a conventional database, generally more than 10 times faster. Storing the constructed metadata knowledge graph in the in-memory database can thus provide faster and more efficient query services to external service systems.
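The disclosure does not name a particular in-memory database. As a stand-in, the sketch below keeps a few knowledge graph edges in an SQLite database opened with the special `:memory:` path, which holds all data in RAM; the table layout, the sample edges, and the `neighbors` helper are assumptions made for illustration.

```python
import sqlite3

# An SQLite database opened at ":memory:" lives entirely in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (source TEXT, relation TEXT, target TEXT)")
conn.execute("CREATE INDEX idx_edges_source ON edges (source)")

# Hypothetical knowledge graph edges.
edges = [
    ("billing", "contains", "orders_db"),
    ("orders_db", "contains", "orders"),
    ("orders", "contains", "order_id"),
]
conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", edges)


def neighbors(conn, node):
    """Return the (relation, target) pairs reachable in one hop from `node`."""
    rows = conn.execute(
        "SELECT relation, target FROM edges WHERE source = ? ORDER BY target",
        (node,),
    )
    return rows.fetchall()


print(neighbors(conn, "orders_db"))
```

This only demonstrates the storage pattern; it makes no claim about the speedup figure quoted above, which would depend on the actual database product and workload.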
In some embodiments, after each metadata set of the distributed deployment is processed to obtain the multi-dimensional metadata knowledge graph, the metadata query method in the embodiments of the present disclosure further includes: receiving the metadata query instructions sent by each service system, querying, through the metadata knowledge graph, the metadata to be queried indicated by each metadata query instruction, and sending the metadata to be queried to the corresponding service system.
In some embodiments, obtaining metadata in a plurality of databases of a distributed deployment includes: receiving, in a passive ingestion mode, application programming interface call information initiated by a plurality of service systems; and acquiring the metadata passed in from the databases of the plurality of service systems of the distributed deployment.
In some embodiments, when a plurality of service systems initiate metadata ingestion requests by calling the application programming interface, the metadata is classified from coarse to fine by service name, database information, and table name/field name according to the parameter rules of the interface, and the corresponding metadata is stored in the mapped database according to the ingestion type. Specifically: when the ingestion type initiated by the service system indicates all, the metadata, including the service name, database name, table names, field names, and other data, is stored in the mapped database; when the ingestion type indicates the service name, the name of the current service system is stored in the mapped database; when the ingestion type indicates database information, the database information is stored in the mapped database, where the supported databases include MySQL, Hive, Elasticsearch, MongoDB, Oracle, Kafka, PostgreSQL, and the like; and when the ingestion type indicates table name/field name, the table names and field names in the current database are stored in the mapped database.
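The ingestion-type rules above could be sketched as follows. The type strings, the payload layout, and the use of a dictionary as the "mapped database" are hypothetical choices for illustration; the disclosure does not specify the interface parameters.

```python
from collections import defaultdict

# Hypothetical identifiers for the four ingestion types.
INGEST_TYPES = ("all", "service_name", "database_info", "table_field")


def new_store():
    """Create an empty stand-in for the mapped database."""
    return defaultdict(list)


def ingest(store, ingest_type, payload):
    """Store the parts of payload selected by ingest_type, coarse to fine."""
    if ingest_type not in INGEST_TYPES:
        raise ValueError(f"unknown ingestion type: {ingest_type!r}")
    if ingest_type in ("all", "service_name"):
        store["service_name"].append(payload["service_name"])
    if ingest_type in ("all", "database_info"):
        store["database_info"].append(payload["database"])
    if ingest_type in ("all", "table_field"):
        for table, fields in payload["tables"].items():
            store["table_field"].append((table, tuple(fields)))
    return store
```

With this layout, an ingestion of type `"all"` fills all three categories, while the narrower types touch only their own category.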
In some embodiments, when the metadata is classified and stored according to the preset rule to obtain the plurality of metadata sets of the distributed deployment, the metadata query method in the embodiments of the present disclosure further includes: labeling the source and attribute information of the metadata.
In some embodiments, the attribute information of the metadata in the embodiments of the present disclosure includes: metadata storage time, the user to which the metadata belongs, metadata type, and the like.
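One way to carry the source and attribute labels alongside each metadata item is a small record type. The sketch below is an assumption-laden illustration: the class name, field names, and the path-style `source` string are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AnnotatedMetadata:
    """A metadata item labeled with its source and attribute information."""
    name: str
    metadata_type: str   # e.g. "table" or "field"
    source: str          # where the metadata came from
    owner: str           # user to which the metadata belongs
    stored_at: datetime  # metadata storage time


def annotate(name, metadata_type, source, owner, now=None):
    """Attach source and attribute labels; `now` can be injected for testing."""
    stored_at = now if now is not None else datetime.now(timezone.utc)
    return AnnotatedMetadata(name, metadata_type, source, owner, stored_at)
```

Making the record immutable (`frozen=True`) is a design choice for the sketch, so that labels cannot be changed silently after ingestion.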
In more detail, in the embodiment of the disclosure, the ingested metadata is subjected to association analysis through a knowledge graph and a manual labeling technology, a knowledge graph based on the metadata is constructed, and an association relationship is established for the metadata.
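A minimal sketch of association analysis is given below. It derives containment relationships from flat records and adds one illustrative rule that links tables sharing a field name; manually labeled links are merged in at the end. The rule, the relation names, and the node naming scheme are assumptions, and the disclosure does not state how its knowledge graph tooling derives associations.

```python
from itertools import combinations


def build_graph(records, manual_links=()):
    """Build (src, relation, dst) triples from (service, database, table, field) records."""
    edges = set()
    field_owners = {}
    for service, database, table, field_name in records:
        db_node = f"{service}/{database}"
        table_node = f"{db_node}/{table}"
        # containment relationships: service -> database -> table -> field
        edges.add((service, "has_database", db_node))
        edges.add((db_node, "has_table", table_node))
        edges.add((table_node, "has_field", f"{table_node}.{field_name}"))
        field_owners.setdefault(field_name, set()).add(table_node)
    # illustrative rule: tables that share a field name are treated as associated
    for field_name, tables in field_owners.items():
        for left, right in combinations(sorted(tables), 2):
            edges.add((left, f"shares_field:{field_name}", right))
    # manually labeled relationships are added as-is
    edges.update(manual_links)
    return edges
```

In this sketch the automatic rules and the manual labels end up in the same edge set, so later queries do not need to distinguish how a relationship was found.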
In some embodiments, the metadata query method in the embodiments of the present disclosure further includes: opening application programming interfaces to a plurality of service systems, through which the service systems ingest metadata.
In more detail, opening application programming interfaces (APIs) to a plurality of service systems works as follows. A service system ingests the corresponding metadata by connecting to the different APIs. The embodiment of the disclosure also defines a metadata automatic integration API and an automatic data directory retrieval API. The constructed metadata knowledge graph is stored in the in-memory database to form a virtual layer that virtualizes the data, and metadata linking and real-time query are realized through three methods: application integration, real-time analysis, and the knowledge graph. This ultimately defines the data management and application of metadata, and connects the overall chain from metadata management, through relationship processing and analysis, to the final application scenario, which assists planning at a higher level and increases the use value of the metadata.
In some embodiments, the metadata query method provided in the embodiments of the present disclosure holds no local data of its own. By managing metadata and designing a metadata ingestion method, it provides metadata storage, association analysis, and fast retrieval services for each service system. This realizes storage and efficient retrieval for ultra-large-scale distributed data management, can satisfy other requirements for distributed management of metadata, is broadly applicable, and offers a more efficient solution for management in the big data industry.
Based on the same inventive concept, the embodiments of the present disclosure also provide a metadata query device, as described below. Since the device embodiment solves the problem on a principle similar to that of the method embodiment, the implementation of the device embodiment may refer to the implementation of the method embodiment, and repeated details are omitted.
Fig. 4 shows a schematic diagram of a metadata query device according to an embodiment of the disclosure. As shown in Fig. 4, the device includes:
a metadata acquisition module 401, configured to acquire metadata in a plurality of databases of a distributed deployment;
a metadata storage module 402, configured to store the metadata in a classified manner according to a preset rule, so as to obtain a plurality of metadata sets of the distributed deployment;
a metadata knowledge graph generation module 403, configured to process each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, where the metadata knowledge graph is used to respond to the data query instructions sent by each service system and return the metadata to be queried.
The metadata query device provided by the embodiment of the disclosure acquires, through the metadata acquisition module, metadata in a plurality of databases of a distributed deployment; classifies and stores the acquired metadata according to a preset rule through the metadata storage module, to obtain a plurality of metadata sets of the distributed deployment; and processes each metadata set of the distributed deployment through the metadata knowledge graph generation module, to obtain a multi-dimensional metadata knowledge graph. The metadata knowledge graph in the embodiment of the disclosure can return the metadata to be queried according to the data query instructions sent by each service system.
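The division of work among modules 401, 402, and 403 can be pictured with the following sketch. It is a simplified assumption: the "preset rule" is modeled as a caller-supplied key function, the "knowledge graph" is reduced to a mapping from each metadata set to its table names, and the class and method names are invented for illustration.

```python
from collections import defaultdict


class MetadataQueryDevice:
    """Toy pipeline mirroring modules 401 (acquire), 402 (store), 403 (generate)."""

    def __init__(self, classify_key):
        self.classify_key = classify_key        # stands in for the preset rule
        self.metadata_sets = defaultdict(list)  # classified metadata sets
        self.graph = {}

    def acquire(self, databases):
        """Module 401: gather metadata items from several databases."""
        return [item for database in databases for item in database]

    def store(self, metadata):
        """Module 402: classify each item into a metadata set."""
        for item in metadata:
            self.metadata_sets[self.classify_key(item)].append(item)

    def generate_graph(self):
        """Module 403: derive a simplified graph from every metadata set."""
        self.graph = {
            key: sorted({item["table"] for item in items})
            for key, items in self.metadata_sets.items()
        }
        return self.graph

    def query(self, key):
        """Answer a query instruction for one dimension of the graph."""
        return self.graph.get(key, [])
```

A caller would run `store(acquire(...))`, then `generate_graph()`, and afterwards answer queries from the generated structure without touching the source databases again.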
To address the technical problem in the prior art that data query efficiency is low when managing data, the embodiment of the disclosure classifies and stores the acquired metadata and processes the classified metadata to obtain a multidimensional metadata knowledge graph. When a service system sends a data query instruction, the metadata knowledge graph of the matching dimension can be called according to the dimension of the query instruction, which achieves the technical effect of querying metadata quickly.
In some embodiments, the metadata knowledge graph generation module in the embodiments of the present disclosure is further configured to perform association analysis on the metadata in each metadata set of the distributed deployment, and determine the association relationships among the plurality of metadata in each metadata set; and to obtain the multidimensional metadata knowledge graph according to the plurality of metadata in each metadata set and the association relationships among them.
In some embodiments, the metadata query device in the embodiments of the present disclosure further includes: a metadata knowledge graph storage module, configured to store the multidimensional metadata knowledge graph in the in-memory database after the multidimensional metadata knowledge graph is obtained according to the plurality of metadata and the association relationships among them.
In some embodiments, the metadata query device in the embodiments of the present disclosure further includes: an instruction query module, configured to, after each metadata set of the distributed deployment is processed to obtain the multi-dimensional metadata knowledge graph, receive the metadata query instructions sent by each service system, query, through the metadata knowledge graph, the metadata to be queried indicated by each metadata query instruction, and send the metadata to be queried to the corresponding service system.
In some embodiments, the metadata acquisition module in the embodiments of the present disclosure is further configured to receive, in a passive ingestion mode, application programming interface call information initiated by a plurality of service systems; and to acquire the metadata passed in from the databases of the plurality of service systems of the distributed deployment.
In some embodiments, the metadata query device in the embodiments of the present disclosure further includes: an information labeling module, configured to label the source and attribute information of the metadata when the metadata is classified and stored according to the preset rule to obtain the plurality of metadata sets of the distributed deployment.
In some embodiments, the metadata query device in the embodiments of the present disclosure further includes: an information interaction module, configured to open application programming interfaces to a plurality of service systems, through which the service systems ingest metadata.
Based on the same inventive concept, a metadata query system is also provided in the embodiments of the present disclosure, as described below. Since the system embodiment solves the problem on a principle similar to that of the method embodiment, the implementation of the system embodiment may refer to the implementation of the method embodiment, and repeated details are omitted.
Fig. 5 shows a schematic diagram of a metadata query system in an embodiment of the disclosure. As shown in Fig. 5, the system includes:
a metadata management unit 501, connected to each service system, configured to acquire metadata in a plurality of service system databases of a distributed deployment;
an intermediate information processing unit 502, connected to the metadata management unit 501, configured to store the acquired metadata in a classified manner according to a preset rule to obtain a plurality of metadata sets of the distributed deployment, and to process each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, where the metadata knowledge graph is used to respond to the data query instructions sent by each service system and return the metadata to be queried.
In the metadata query system provided by the embodiment of the disclosure, the metadata management unit acquires metadata in a plurality of databases of a distributed deployment; the intermediate information processing unit classifies and stores the acquired metadata according to a preset rule to obtain a plurality of metadata sets of the distributed deployment, and processes each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph. The metadata knowledge graph is used to query, according to the data query instructions sent by each service system, the metadata to be queried indicated by those instructions.
To address the technical problem in the prior art that data query efficiency is low when managing data, the embodiment of the disclosure classifies and stores the acquired metadata and processes the classified metadata to obtain a multidimensional metadata knowledge graph. When a service system sends a data query instruction, the metadata knowledge graph of the matching dimension can be called according to the dimension of the query instruction, which achieves the technical effect of querying metadata quickly.
In some embodiments, as shown in fig. 5, the metadata query system in the embodiments of the present disclosure further includes: the distributed database 503 is connected to a plurality of service systems, and is used for accessing each service system database, and transmitting metadata in each service system database to the metadata management unit 501.
In more detail, the embodiment of the disclosure opens the metadata ingestion interface by accessing the distributed database of each service system. The passive metadata ingestion mode is safer and more effective, because authentication only requires distributing a key to each system.
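Key-based authentication for the passive ingestion interface might look like the sketch below. The disclosure only says that a key is distributed to each system; signing each request body with HMAC-SHA256 is an assumption chosen for illustration, as are the `KeyRegistry` and `sign` names.

```python
import hashlib
import hmac
import secrets


def sign(key, body):
    """Compute the hex HMAC-SHA256 signature a service system would send."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


class KeyRegistry:
    """Issues one secret key per service system and checks request signatures."""

    def __init__(self):
        self._keys = {}

    def issue(self, system_name):
        """Generate and remember a random 32-byte key for a service system."""
        key = secrets.token_bytes(32)
        self._keys[system_name] = key
        return key

    def verify(self, system_name, body, signature):
        """Return True only if the signature matches the key issued to the system."""
        key = self._keys.get(system_name)
        if key is None:
            return False
        expected = sign(key, body)
        # constant-time comparison avoids leaking how many characters matched
        return hmac.compare_digest(expected.encode(), signature.encode())
```

Under this assumed scheme, the ingestion endpoint would call `verify` before accepting any metadata, and a service system without an issued key is simply rejected.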
In some embodiments, as shown in fig. 5, the upper layer service providing unit of the metadata query system in the embodiment of the present disclosure further provides APIs with different requirements for each service system, including a metadata automation integration API, a metadata query API, a knowledge graph construction API, and an automation data directory retrieval API.
In some embodiments, as shown in fig. 5, the metadata query system in the embodiments of the present disclosure includes: a metadata management unit 501, an intermediate information processing unit 502, a distributed database 503, and an upper layer service providing unit 504. Wherein the distributed database 503 is a real database of each service system, and the access mode is verification through an API request; the metadata management unit 501 mainly performs entity extraction for accessed metadata; the intermediate information processing unit 502 analyzes the metadata relationship, carries out metadata association aggregation, and realizes metadata knowledge graph construction; the upper layer service providing unit 504 provides API services for each external system, and when the service system initiates an associated query service across databases and tables, the intermediate information processing unit in the embodiment of the present disclosure may directly provide a quick query service for the service system.
In some embodiments, as shown in Fig. 6, the intermediate information processing unit in the embodiments of the present disclosure includes a service (Service) middleware 600, which specifically includes four modules: a passive access module 601, a data processing module 602, a knowledge graph construction module 603, and a graph database module 604. In more detail, the passive access module takes identity authentication information, a system name, database information, a database name, and a metadata set in a custom format as input, and outputs whether the data was written successfully. The data processing module ingests the data from the passive access module according to an ingestion type and an ingestion description. Specifically, the ingestion type is one of: ingest all, ingest by service platform, ingest by data type, or ingest by table name or field name; the ingestion description may be all data, a service platform name, a database name, a table name, a field name, and the like. The knowledge graph construction module obtains the metadata set through the passive access module, inputs it into a knowledge graph model for training, outputs the knowledge graph, constructs the metadata association relationships, and writes them into the graph database module. To query the knowledge graph in the graph database module, identity authentication information and an entity name or relation name must be provided as input, and the query result is obtained from the graph data that is output.
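The input and output contracts of the middleware modules can be sketched as below. This is a heavily simplified stand-in: the disclosure describes training a knowledge graph model, whereas this sketch derives triples with fixed rules, and the class name, token check, and triple layout are all assumptions.

```python
class ServiceMiddleware:
    """Toy stand-in for modules 601 to 604 of the service middleware."""

    def __init__(self, valid_tokens):
        self._valid_tokens = set(valid_tokens)
        self._pending = []   # data accepted by the passive access module
        self._triples = []   # contents of the graph database module

    def passive_access(self, token, system, database, metadata_set):
        """Module 601: report whether the write succeeded."""
        if token not in self._valid_tokens or not metadata_set:
            return False
        self._pending.append((system, database, list(metadata_set)))
        return True

    def build_graph(self):
        """Modules 602/603 (rule-based stand-in): turn accepted data into triples."""
        for system, database, tables in self._pending:
            self._triples.append((system, "has_database", database))
            for table in tables:
                self._triples.append((database, "has_table", table))
        self._pending.clear()

    def query(self, token, entity=None, relation=None):
        """Module 604: query by entity name and/or relation name after authentication."""
        if token not in self._valid_tokens:
            raise PermissionError("invalid identity authentication information")
        return [
            t for t in self._triples
            if (entity is None or entity in (t[0], t[2]))
            and (relation is None or t[1] == relation)
        ]
```

The boolean returned by `passive_access` corresponds to the success-or-failure output described above, and `query` mirrors the requirement to supply authentication information plus an entity or relation name.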
In some embodiments, the Service middleware in the embodiments of the present disclosure may directly provide a fast query Service for each Service system.
Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, various aspects of the disclosure may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "system."
An electronic device 700 according to such an embodiment of the present disclosure is described below with reference to fig. 7. The electronic device 700 shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 7, the electronic device 700 is embodied in the form of a general purpose computing device. Components of electronic device 700 may include, but are not limited to: the at least one processing unit 701, the at least one memory unit 702, and a bus 703 that connects the different system components (including the memory unit 702 and the processing unit 701).
The storage unit stores program code that can be executed by the processing unit 701, such that the processing unit 701 performs the steps according to the various exemplary embodiments of the present disclosure described in the "exemplary method" section of this specification.
In some embodiments, when the electronic device is used to carry out, for example, the metadata query method described above in the present disclosure, the processing unit 701 may perform the following steps of the method embodiments described above:
acquiring metadata in a plurality of databases of a distributed deployment;
classifying and storing the metadata according to a preset rule to obtain a plurality of metadata sets of the distributed deployment;
processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge graph, where the metadata knowledge graph is used to respond to the data query instructions sent by each service system and return the metadata to be queried.
The storage unit 702 may include readable media in the form of volatile storage units, such as Random Access Memory (RAM) 7021 and/or cache memory 7022, and may further include Read Only Memory (ROM) 7023.
The storage unit 702 may also include a program/utility 7024 having a set (at least one) of program modules 7025, such program modules 7025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus 703 may be one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 700 may also communicate with one or more external devices 704 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 700, and/or any device (e.g., router, modem, etc.) that enables the electronic device 700 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 705. Also, the electronic device 700 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through the network adapter 706. As shown, the network adapter 706 communicates with other modules of the electronic device 700 via the bus 703. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 700, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer program product comprising a computer program which, when executed by a processor, implements the metadata query method described above.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium, which may be a readable signal medium or a readable storage medium, is also provided, on which is stored a program product capable of implementing the method described above in the present disclosure. In some possible implementations, various aspects of the disclosure may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps according to the various exemplary embodiments of the disclosure described in the "exemplary method" section of this specification.
More specific examples of the computer readable storage medium in the present disclosure may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In this disclosure, a computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Alternatively, the program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
In particular implementations, the program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the description of the above embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
Claims (12)
1. A method for querying metadata, comprising:
acquiring metadata in a plurality of databases of distributed deployment;
classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets distributed and deployed;
and processing each metadata set of distributed deployment to obtain a multi-dimensional metadata knowledge graph, wherein the metadata knowledge graph is used for responding to data query instructions sent by each service system and returning metadata to be queried.
2. The metadata query method as claimed in claim 1, wherein processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge-graph comprises:
performing association analysis on metadata in each metadata set of the distributed deployment, and determining association relations among a plurality of metadata in each metadata set;
and obtaining a multi-dimensional metadata knowledge graph according to the plurality of metadata in each metadata set and the association relations among them.
3. The metadata query method according to claim 2, wherein after obtaining the multi-dimensional metadata knowledge graph according to the plurality of metadata and the association relations among them, the method further comprises:
storing the multi-dimensional metadata knowledge graph in an in-memory database.
4. The metadata query method of claim 1, wherein after processing each metadata set of the distributed deployment to obtain a multi-dimensional metadata knowledge-graph, the method further comprises:
and receiving metadata query instructions sent by each service system, querying metadata to be queried indicated by the metadata query instructions through the metadata knowledge graph, and sending the metadata to be queried to the corresponding service system.
5. The method of claim 1, wherein obtaining metadata in a plurality of databases of a distributed deployment comprises:
receiving calling application programming interface information initiated by a plurality of service systems by adopting a passive ingestion mode;
and acquiring the metadata transmitted into the multiple service system databases of the distributed deployment.
6. The metadata query method of claim 1, wherein when the metadata is classified and stored according to the preset rules to obtain the plurality of metadata sets of distributed deployment, the method further comprises:
labeling the source and attribute information of the metadata.
7. The metadata query method of claim 1, wherein the method further comprises:
and opening application programming interfaces for a plurality of service systems, wherein the service systems ingest the metadata through the application programming interfaces.
8. A metadata query device, comprising:
the metadata acquisition module is used for acquiring metadata in a plurality of databases of distributed deployment;
the metadata storage module is used for classifying and storing the metadata according to preset rules to obtain a plurality of metadata sets of distributed deployment;
the metadata knowledge graph generation module is used for processing each metadata set of distributed deployment to obtain a multi-dimensional metadata knowledge graph, and the metadata knowledge graph is used for querying, according to the metadata query instructions sent by each service system, the metadata to be queried indicated by the metadata query instructions.
9. A metadata query system, comprising:
the metadata management unit is connected with each service system and used for acquiring metadata in a plurality of service system databases distributed and deployed;
the intermediate information processing unit is connected with the metadata management unit and used for classifying and storing the acquired metadata according to preset rules to obtain a plurality of metadata sets distributed and deployed, processing each metadata set distributed and deployed to obtain a multi-dimensional metadata knowledge graph, and querying metadata to be queried indicated by the metadata query instructions according to metadata query instructions sent by each service system.
10. The metadata query system of claim 9, wherein the system further comprises:
and the distributed database is connected with the metadata management unit and is used for accessing each service system database and transmitting metadata in each service system database to the metadata management unit.
11. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the metadata query method of any of claims 1-7 via execution of the executable instructions.
12. A computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the metadata query method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310854529.9A CN117076518A (en) | 2023-07-12 | 2023-07-12 | Metadata query method, device, system and related equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310854529.9A CN117076518A (en) | 2023-07-12 | 2023-07-12 | Metadata query method, device, system and related equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117076518A (en) | 2023-11-17 |
Family
ID=88714243
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310854529.9A Pending CN117076518A (en) | 2023-07-12 | 2023-07-12 | Metadata query method, device, system and related equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117076518A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9363195B2 (en) | Configuring cloud resources | |
US20210149895A1 (en) | Query conversion for querying disparate data sources | |
US11681723B1 (en) | Modeling of a non-relational database as a normalized relational database | |
CN108536778B (en) | Data application sharing platform and method | |
JP2022549187A (en) | Machine learning inference calls for database query processing | |
CN111949693B (en) | Data processing device, data processing method, storage medium and electronic equipment | |
US20230023253A1 (en) | Method for processing model parameters, and apparatus | |
US20190370255A1 (en) | Remote query optimization in multi data sources | |
WO2018177032A1 (en) | Method and device for processing response data, client device and electronic device | |
CN113923225A (en) | Distributed architecture-based federated learning platform, method, device and storage medium | |
WO2024021476A1 (en) | Data processing method and apparatus, electronic device and storage medium | |
CN116860854A (en) | Multi-source data merging processing method, device, system and related equipment | |
CN116860941A (en) | Question answering method and device, electronic equipment and storage medium | |
CN112685081A (en) | System migration method and device, electronic equipment and storage medium | |
WO2022262481A1 (en) | Calibration data management system, method, apparatus and device for electronic control unit | |
CN117076518A (en) | Metadata query method, device, system and related equipment | |
CN115794494A (en) | Data backup method, system, device, equipment and medium based on dynamic strategy | |
CN117472555A (en) | Computing power resource allocation method, system, device, equipment and storage medium | |
CN114780361A (en) | Log generation method, device, computer system and readable storage medium | |
US11809992B1 (en) | Applying compression profiles across similar neural network architectures | |
CN115795119B (en) | Haptic feature information acquisition method, device, system, equipment and medium | |
CN112464255A (en) | Data processing method and device, storage medium and electronic equipment | |
CN113127496A (en) | Method, apparatus, medium, and device for determining change data in database | |
CN114448976B (en) | Method, device, equipment, medium and program product for assembling network message | |
US11842077B2 (en) | Method, device, and computer program product for transmitting data for object storage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||