CN112416923A - Metadata management method and device, equipment and storage medium - Google Patents

Metadata management method and device, equipment and storage medium

Info

Publication number
CN112416923A
Authority
CN
China
Prior art keywords
metadata
type
module
instance
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910780607.9A
Other languages
Chinese (zh)
Inventor
吕燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201910780607.9A
Priority to PCT/CN2020/110167 (WO2021032146A1)
Publication of CN112416923A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G06F16/26 Visual data mining; Browsing structured data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Abstract

The application discloses a metadata management method, apparatus, device and computer-readable storage medium. The metadata management apparatus comprises a metadata type management module, a metadata collection module and a metadata storage index module, wherein: the metadata type management module is configured to load metadata types defined in an object-oriented manner and record them in the metadata storage index module; the metadata collection module is configured to obtain, according to a metadata type, the metadata instances corresponding to that type and store them in the metadata storage index module; and the metadata storage index module is configured to store the metadata types and the metadata instances. Because the metadata types are defined in an object-oriented manner, the scheme provided by the embodiment is flexible in design, simple in semantics, easy to reuse, readily extensible and easy to maintain.

Description

Metadata management method and device, equipment and storage medium
Technical Field
Embodiments of the invention relate to a metadata management method, a metadata management apparatus, a metadata management device and a computer-readable storage medium.
Background
Today data volumes are growing rapidly and data is becoming a core competitive asset of governments and enterprises. Through data analysis, people mine the value of data to give management decision makers an accurate basis for judgment.
However, in the course of informatization, governments, enterprises and similar organizations have built numerous heterogeneous data systems, and the massive data scattered across these systems makes data resources complex and difficult to manage and use. Management decision makers cannot examine internal data, the systems, and the relationships among the systems from a unified business perspective. To mine the value of data, metadata management must be carried out first: a global data map is established, data definitions are unified, data flows are marked, data relationships are analyzed, and model changes are managed. This helps managers analyze how each part of the data warehouse relates to the global context, so that the whole can be understood from any local view. Big data technologies allow the value of government and enterprise data to be mined fully, but big data usually involves collection, dissemination and sharing across many data sources, such as mobile personal data, social network data, public data and Internet of Things data, and these processes need the support of metadata management built for big data.
Demand for metadata management is increasing at home and abroad, and metadata management is an important means of data governance for governments and enterprises. Metadata is data that describes data, mainly information describing the attributes of data. Metadata management products in the related art use a conventional data dictionary modeling method and define metadata types along four dimensions: data set, field, element and code set. The resulting metadata types are redundant, and unstructured metadata cannot be designed or managed. Improvements are therefore needed.
Disclosure of Invention
At least one embodiment of the invention provides a metadata management method, a metadata management device, metadata management equipment and a computer-readable storage medium, which are used for supporting multiple metadata types.
At least one embodiment of the present invention provides a metadata management apparatus, including a metadata type management module, a metadata collection module, and a metadata storage index module, where:
the metadata type management module is configured to load metadata types defined in an object-oriented manner and record the metadata types in the metadata storage index module;
the metadata collection module is configured to obtain, according to the metadata type, a metadata instance corresponding to the metadata type and store the metadata instance in the metadata storage index module;
the metadata storage index module is configured to store the metadata type and the metadata instance.
At least one embodiment of the present invention provides a metadata management method, including:
loading a metadata type defined in an object-oriented manner, and storing the metadata type;
and obtaining, according to the metadata type, a metadata instance corresponding to the metadata type, and storing the metadata instance.
At least one embodiment of the present invention provides a metadata management apparatus including a memory and a processor, the memory storing a program that, when read and executed by the processor, implements the metadata management method according to any one of the embodiments.
At least one embodiment of the present invention provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the metadata management method of any of the embodiments.
Compared with the related art, one embodiment of the invention provides a metadata management apparatus in which a metadata type management module loads metadata types defined in an object-oriented manner and records them in a metadata storage index module, and a metadata collection module obtains, according to a metadata type, the metadata instances corresponding to that type and stores them in the metadata storage index module. The scheme departs from the conventional data dictionary definition approach: because metadata types are defined in an object-oriented manner, the design is flexible, the semantics are simple, and the types are easy to reuse, extend and maintain. Metadata of any type can be modeled, so the scheme is applicable to a variety of service fields and is both flexible and general.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification; they illustrate embodiments of the invention and, together with the embodiments, serve to explain the principles of the invention without limiting it.
FIG. 1 is a diagram of a metadata management module according to an embodiment of the present invention;
FIG. 2 is a diagram of a metadata type management module provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a metadata collection module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a metadata visualization maintenance module provided by an embodiment of the invention;
FIG. 5 is a diagram of a metadata type management sub-interface provided by an embodiment of the invention;
FIG. 6 is a schematic diagram of a general-purpose visual metadata maintenance interface provided by an embodiment of the present invention;
FIG. 7 is a diagram of a metadata management module according to another embodiment of the present invention;
FIG. 8 is a flowchart of a method for managing metadata according to an embodiment of the present invention;
FIG. 9 is a flowchart of an implementation in the data sharing field according to an embodiment of the present invention;
FIG. 10 is a flowchart of data standard management using the present application according to an embodiment of the present invention;
FIG. 11 is a flowchart of data governance using the present application according to an embodiment of the present invention;
FIG. 12 is a flowchart of providing data services using the present application according to an embodiment of the present invention;
FIG. 13 is a block diagram of a metadata management apparatus according to an embodiment of the present invention;
FIG. 14 is a block diagram of a computer-readable storage medium provided by an embodiment of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, provided there is no conflict, the embodiments in the present application and the features in the embodiments may be combined with each other arbitrarily.
The steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Also, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
Many governments and enterprises hold internal data in varied forms and under non-uniform standards, and therefore need metadata management that can be extended to manage metadata of any type. With the traditional data dictionary modeling method it is difficult to design association and inheritance relationships among metadata types, and metadata types cannot be defined in terms of object inheritance and extension. In the present application, metadata is defined in an object-oriented manner, so that association, inheritance and containment relationships among metadata types can be designed.
As shown in fig. 1, an embodiment of the present invention provides a metadata management apparatus 100, including: a metadata type management module 101, a metadata collection module 102 and a metadata storage index module 103.
The metadata type management module 101 is configured to load a metadata type defined in an object-oriented manner, and record the metadata type in the metadata storage index module 103.
Each metadata type has a plurality of attributes. When relationships exist between metadata types, those relationships can also be defined; a relationship can be an inheritance relationship, an association relationship, a containment relationship, and so on. Compared with metadata defined by the data dictionary approach in the related art, the metadata types provided by this embodiment can describe both structured and unstructured metadata, and are therefore flexible and general.
Loading a metadata type means adding the metadata type to the metadata storage index module 103. The metadata type to be loaded may come from an externally supplied file or the like, or may be entered by a user through an operation interface.
The metadata collection module 102 is configured to obtain a metadata instance corresponding to the metadata type according to the metadata type, and store the metadata instance in the metadata storage index module 103;
the metadata storage index module 103 is configured to store the metadata type and the metadata instance.
The scheme provided by this embodiment departs from the conventional data dictionary definition approach: because metadata types are defined in an object-oriented manner, the design is flexible, the semantics are simple, and the types are easy to reuse, extend and maintain. Metadata of any type can be modeled, so the scheme is applicable to a variety of service fields and is both flexible and general.
In one embodiment, the step of defining the metadata type is as follows:
(a) design the basic types, including enumeration types, structure types and tag types;
(b) design the object attribute types, where an attribute carries a unique identifier, a required (must-select) identifier, a type identifier, a number identifier, a relation identifier, a default value identifier, and the like;
(c) design the object types, where an object type may contain several object attributes;
(d) design the inheritance relationships of the object types, where one object type may inherit from multiple parent types.
It should be noted that the above steps are only an example; the present application does not limit how metadata types are defined in the object-oriented manner.
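Steps (a) to (d) can be illustrated with a minimal Python sketch. The patent does not prescribe a definition language, so every name here (`Sensitivity`, `AttributeDef`, `MetadataType`, `all_attributes`) is hypothetical, and the reading of the "number identifier" as a single-versus-multiple flag is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Sensitivity(Enum):
    """Step (a): a basic enumeration type (hypothetical example)."""
    PUBLIC = "public"
    INTERNAL = "internal"


@dataclass(frozen=True)
class AttributeDef:
    """Step (b): an object attribute and its descriptive identifiers."""
    name: str
    type_name: str                  # type identifier, e.g. "string"
    unique: bool = False            # unique identifier
    required: bool = False          # required (must-select) identifier
    multiple: bool = False          # number identifier (assumed: one or many values)
    relation: Optional[str] = None  # relation identifier
    default: object = None          # default value identifier


@dataclass
class MetadataType:
    """Steps (c) and (d): an object type with attributes and parent types."""
    name: str
    attributes: List[AttributeDef] = field(default_factory=list)
    parents: List["MetadataType"] = field(default_factory=list)

    def all_attributes(self) -> Dict[str, AttributeDef]:
        """Merge inherited attributes; the type's own attributes win on a name clash."""
        merged: Dict[str, AttributeDef] = {}
        for parent in self.parents:
            merged.update(parent.all_attributes())
        for attr in self.attributes:
            merged[attr.name] = attr
        return merged


# One object type inheriting from two parent types.
data_set = MetadataType("DataSet", [AttributeDef("name", "string", unique=True, required=True)])
taggable = MetadataType("Taggable", [AttributeDef("tags", "string", multiple=True)])
table = MetadataType(
    "Table",
    [AttributeDef("sensitivity", "Sensitivity", default=Sensitivity.INTERNAL)],
    parents=[data_set, taggable],
)
print(sorted(table.all_attributes()))
```

Here `Table` exposes its own attribute together with those inherited from `DataSet` and `Taggable`, which is the kind of reuse that a flat data dictionary does not express directly.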
The scheme provided by the embodiment of the invention has a high degree of abstraction and strong generality, and is applicable to any application field that involves the use of data.
In an embodiment, the metadata type management module 101 is further configured to implement deletion, update, and query of a metadata type.
In one embodiment, as shown in fig. 2, the metadata type management module 101 includes a metadata type loading sub-module 1011 and a metadata type query sub-module 1012, wherein:
the metadata type loading submodule 1011 is configured to implement addition, update, and deletion of a metadata type.
In an embodiment, the metadata type loading submodule 1011 is preloaded with a number of abstract metadata types, such as data set, data object, table, field and processing procedure, which can be inherited.
In one embodiment, after a new metadata type definition is loaded, the enumeration, structure and tag definitions are parsed first to create basic metadata type instances; the inheritance relationships are then parsed to create abstract metadata type instances; the object types are then parsed to create metadata type instances; finally, the association, containment and inheritance relationships among the object types are parsed, a storage request is sent to the metadata storage index module 103, metadata type nodes are created, and edges are added between the metadata type nodes to establish the graph of relationships.
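The loading order described above can be sketched as follows. This is only an illustration: the dictionary layout of `definition`, the `TypeGraph` class standing in for the metadata storage index module, and the relation names are all assumptions, not part of the patent.

```python
from collections import defaultdict


class TypeGraph:
    """In-memory stand-in for the metadata storage index module (assumption)."""

    def __init__(self):
        self.nodes = {}                # type name -> kind
        self.edges = defaultdict(set)  # (source, relation) -> target names

    def add_node(self, name, kind):
        self.nodes[name] = kind

    def add_edge(self, source, relation, target):
        # Edges may only connect type nodes that were created earlier.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError(f"unknown type in edge {source} -> {target}")
        self.edges[(source, relation)].add(target)


def load_types(definition, graph):
    # 1. Basic types: enumerations, structures and tags.
    for name in definition.get("basic", []):
        graph.add_node(name, "basic")
    # 2. Abstract types that other types inherit from.
    for name in definition.get("abstract", []):
        graph.add_node(name, "abstract")
    # 3. Concrete object types.
    for name in definition.get("objects", {}):
        graph.add_node(name, "object")
    # 4. Relationships, added last so that both endpoints already exist.
    for name, spec in definition.get("objects", {}).items():
        for relation in ("inherits", "contains", "associates"):
            for target in spec.get(relation, []):
                graph.add_edge(name, relation, target)
    return graph


definition = {
    "basic": ["Sensitivity"],
    "abstract": ["DataSet"],
    "objects": {
        "Table": {"inherits": ["DataSet"], "contains": ["Column"]},
        "Column": {},
    },
}
graph = load_types(definition, TypeGraph())
print(graph.nodes)
```

Creating all nodes before any edge is what allows `Table` to reference `Column` even though `Column` appears later in the definition.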
The metadata type query sub-module 1012 is configured to interact with the metadata storage index module 103 to view metadata types, including the information of each metadata type and its relationship graph.
In an embodiment, the metadata collection module 102 uses a plug-in metadata collection architecture that bridges to mainstream databases, big data platforms, message interfaces and the like, so that scattered metadata can be obtained automatically. As shown in fig. 3, the metadata collection module 102 includes a metadata collector sub-module 1021 and a metadata collection task sub-module 1022, wherein:
the metadata collector sub-module 1021 is configured to provide a variety of collectors, including but not limited to: automatic collectors for a variety of commonly used, standardized metadata types, such as relational database tables, big data tables, SOAP (Simple Object Access Protocol), REST (Representational State Transfer), JMS (Java Message Service), Elasticsearch and Kafka; an external synchronization metadata collector; and a custom collector interface, which allows an application to implement its own collector so that its metadata is collected automatically.
The metadata collection task sub-module 1022 is configured to maintain automatic collection tasks and external synchronization tasks that use the collectors.
When a user configures automatic collection of the child metadata instances of a resource interface (note that a resource interface is itself a kind of metadata instance), the collector for the corresponding metadata type is started. The collector connects to the resource interface and automatically collects metadata instances, then sends a storage request to the metadata storage index module 103; metadata instance nodes are created, and edges are added between them according to the association and containment relationships among the metadata types of the instances, establishing the graph of relationships.
The scheme provided by this embodiment automatically collects commonly used metadata types according to the standard specifications for that metadata, reducing the work that operation and maintenance personnel would otherwise spend collecting and updating metadata instances by hand.
In an embodiment, the metadata storage index module 103 is further configured to store the relationships between metadata types when storing the metadata types, and to store the relationships between metadata instances when storing the metadata instances. The relationships include association relationships, inheritance relationships, containment relationships, and the like.
In one embodiment, a graph database is used to store both the relationships between the metadata types and the relationships between the metadata instances. Storing these relationships in a graph database makes queries fast and the displayed result clear. In addition, graph search can be used to find metadata instances and their associations flexibly and efficiently, which provides rich query services to external callers and strengthens the traceability of metadata lineage (blood relationships).
In one embodiment, the metadata storage index module 103 storing the metadata type and the metadata instance comprises: the metadata storage index module storing the metadata type and the metadata instance in a column-oriented storage database, for example HBase (HBase is only an example; other column-oriented storage databases can also be used). Exploiting the storage and computing characteristics of big data systems, such as the absence of fixed columns, horizontal scaling and highly concurrent real-time access, all metadata instances and the relationships between them are stored uniformly in one table through a graph database engine, which reduces the table-definition work required by the traditional approach. The scheme provided by this embodiment can store large-scale metadata and serve it at scale, makes it convenient to establish a unified metadata view and a unified, stable data warehouse for big data processing, and provides a foundation for improving the capability and efficiency of data management.
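The idea of keeping instances and relationships in a single table without fixed columns can be illustrated with a toy wide-column table. The row-key scheme (`v:` for instances, `e:` for relationships) and the class below are assumptions made for illustration; they are not the layout used by HBase-backed graph engines or by the patent.

```python
class WideColumnTable:
    """Toy stand-in for a column-oriented store: row key -> {column: value}."""

    def __init__(self):
        self.rows = {}

    def put(self, row_key, columns):
        # Rows need not share the same columns; new columns appear on write.
        self.rows.setdefault(row_key, {}).update(columns)

    def scan(self, prefix):
        """Return rows whose key starts with the prefix, in key order."""
        return {k: v for k, v in sorted(self.rows.items()) if k.startswith(prefix)}


def save_instance(table, instance_id, attributes):
    # Hypothetical row key "v:<id>" holds one metadata instance.
    table.put(f"v:{instance_id}", {f"attr:{k}": v for k, v in attributes.items()})


def save_relation(table, source_id, relation, target_id):
    # Hypothetical row key "e:<source>:<relation>:<target>" holds one edge.
    table.put(f"e:{source_id}:{relation}:{target_id}", {"rel:type": relation})


table = WideColumnTable()
save_instance(table, "orders", {"type": "Table", "owner": "sales"})
save_instance(table, "orders.amount", {"type": "Field", "precision": 10})
save_relation(table, "orders", "contains", "orders.amount")
print(list(table.scan("e:orders:")))
```

Because a table instance and a field instance carry different attributes, neither needs a dedicated schema, which is the reduction in table-definition work that the embodiment describes.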
In one embodiment, while storing a metadata instance, the metadata storage index module 103 extracts information from the instance to create a metadata index, which supports efficient querying (for example, using a search engine such as Solr or Elasticsearch). Because both the storage of metadata instances and the creation of indexes for them are built on big data technology, storage and computing capacity can be scaled out, and metadata management scales accordingly.
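Index creation at storage time can be sketched with a small in-memory inverted index. This swaps in a plain Python dictionary for a search engine such as Solr or Elasticsearch, and `InstanceStore` with its methods is a hypothetical name; updating an already stored instance is not handled.

```python
import re
from collections import defaultdict


class InstanceStore:
    """Stores instances and indexes their attribute text at the same time."""

    def __init__(self):
        self.instances = {}
        self.index = defaultdict(set)  # token -> instance ids

    def save(self, instance_id, attributes):
        self.instances[instance_id] = attributes
        # Extract words from every attribute value and index them.
        for value in attributes.values():
            for token in re.findall(r"\w+", str(value).lower()):
                self.index[token].add(instance_id)

    def search(self, text):
        """Return ids of instances that contain every word of the query."""
        tokens = re.findall(r"\w+", text.lower())
        if not tokens:
            return set()
        result = set(self.index.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.index.get(token, set())
        return result


instances = InstanceStore()
instances.save("t1", {"name": "orders", "comment": "Customer orders table"})
instances.save("t2", {"name": "customers", "comment": "Customer master data"})
print(instances.search("customer orders"))
```

A query is answered from the index alone, without scanning the stored instances, which is the point of building the index during the write.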
In an embodiment, the acquiring, by the metadata collection module 102, the metadata instance corresponding to the metadata type according to the metadata type includes at least one of:
the metadata collection module 102 collects external information by using a collector corresponding to the metadata type to create a metadata instance corresponding to the metadata type;
the metadata collection module 102 receives external synchronization information according to the metadata type to create a metadata instance corresponding to the metadata type.
The external information may be scattered metadata instances, or may be information stored in a database server or the like, such as information of a relational database, an FTP file server, a WEB server, or the like.
In an embodiment, the metadata management apparatus further comprises a metadata visualization maintenance module 104. The metadata visualization maintenance module 104 is configured to provide an operation interface interacting with the metadata management apparatus, where the operation interface includes a sub-interface for managing the metadata instance.
That is, an interactive platform is provided so that users can conveniently manage the metadata types and the metadata instances. Management includes adding, modifying, deleting and querying metadata types and metadata instances. Other functional units can also be provided on the operation interface as required.
In an embodiment, the metadata visualization maintenance module 104 is further configured to provide an interface for managing the metadata types.
In one embodiment, the interface for managing the metadata type is generated based on attributes of the metadata type and attribute extension rules.
Visual presentation is achieved by defining attribute extension rules for the metadata types, where the attribute extension rules include:
(a) attribute visualization rules, for example: editable rules, mask rules, display rules, rules for operations triggered when an attribute value changes, rules for operations triggered when focus leaves an attribute on the interface, and the like;
(b) attribute value domain check rules, for example: string length rules, numeric range rules, date range rules, regular-expression rules, custom service check rules, and the like;
(c) attribute value domain rules, for example: single selection of a metadata instance of a specified type, multiple selection of metadata instances of a specified type, obtaining the value domain by calling a service (with support for several attribute values as parameters), single selection of a parent metadata attribute value, multiple selection of parent metadata attribute values, single selection of a peer metadata attribute value, single selection of a child metadata of a peer metadata attribute, value domain format rules (with support for several attribute values as parameters), custom value domain rules, and the like.
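The check rules in (b) lend themselves to a small, data-driven validator. The sketch below covers only string length, numeric range and regular-expression rules; the function names and the shape of the `rules` mapping are assumptions for illustration.

```python
import re


def length_rule(max_len):
    """String length rule: the value is a string of at most max_len characters."""
    return lambda v: isinstance(v, str) and len(v) <= max_len


def range_rule(low, high):
    """Numeric range rule: the value is a number within [low, high]."""
    return lambda v: (
        isinstance(v, (int, float)) and not isinstance(v, bool) and low <= v <= high
    )


def regex_rule(pattern):
    """Regular-expression rule: the whole value must match the pattern."""
    compiled = re.compile(pattern)
    return lambda v: isinstance(v, str) and compiled.fullmatch(v) is not None


def validate(instance, rules):
    """Apply every rule to its attribute and return the list of failures."""
    errors = []
    for attr, checks in rules.items():
        value = instance.get(attr)
        for label, check in checks:
            if not check(value):
                errors.append(f"{attr}: {label}")
    return errors


rules = {
    "name": [("length <= 8", length_rule(8)), ("identifier", regex_rule(r"[a-z_][a-z0-9_]*"))],
    "retention_days": [("1..3650", range_rule(1, 3650))],
}
print(validate({"name": "orders", "retention_days": 30}, rules))
print(validate({"name": "Orders_2024", "retention_days": 0}, rules))
```

Since the rules are data attached to a metadata type rather than code in a form, a generic maintenance interface can enforce them for any type without being customized.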
In one embodiment, as shown in FIG. 4, the metadata visualization maintenance module 104 includes a universal maintenance interface generation sub-module 1041, and may further include at least one of: a metadata import and export sub-module 1042, a metadata classification sub-module 1043, a metadata modification sub-module 1044, and a metadata search sub-module 1045, wherein:
the universal maintenance interface generation sub-module 1041 is configured to generate an operation interface according to the attributes of the metadata types and their attribute extension rules, the containment relationships between the metadata types, and the association relationships between the metadata types.
When a user enters the metadata visualization maintenance module 104 to maintain a metadata instance, the universal maintenance interface generation sub-module 1041 parses the attributes of the metadata type and their extension rules and creates an operation interface, on which the user's configuration operations are constrained by the attribute extension rules; it parses the containment relationships of the metadata type and creates a drill-down interface, through which the user can drill down to view child metadata; and it parses the association relationships of the metadata type and creates an association interface. The universal maintenance interface generation sub-module 1041 interacts with the metadata storage index module 103 to add, modify, delete and view metadata instances.
The metadata import and export sub-module 1042 is configured to import and export metadata files in various formats and to back up and restore the metadata database. This sub-module is optional.
The metadata classification sub-module 1043 is configured to dynamically add tags to a metadata instance, delete tags, and query metadata instances by tag. This sub-module is optional.
The metadata modification sub-module 1044 is configured to query the change history of a metadata instance and to produce change statistics for metadata instances. This sub-module is optional.
The metadata search sub-module 1045 is configured to search for metadata instances by text. This sub-module is optional.
This embodiment provides a set of rules for maintaining metadata instances visually. It departs from the conventional practice of maintaining metadata through a customized interface and provides a general-purpose visual maintenance method: metadata types can be added or changed at any time, and metadata instances of any type can be maintained visually, so metadata maintenance and management are highly general.
In an embodiment, the metadata management apparatus 100 further comprises a metadata service module 105, which is configured to provide query services for metadata instances, including queries on the lineage (blood relationship) of a metadata instance. The metadata service module 105 interacts with the metadata storage index module 103 to implement a variety of metadata query services. This embodiment provides an enhanced method for tracing metadata lineage at both coarse and fine granularity.
In one embodiment, the metadata service module 105 interacts with the metadata storage index module 103 to obtain, starting from a specified metadata instance node, the paths of a specified depth along outgoing edges and the paths of a specified depth along incoming edges; it then adds fold marks to the parent nodes on the lineage paths according to the containment relationships of the metadata types, and creates and outputs the metadata lineage graph accordingly. With this scheme the lineage graph can be expanded from coarse granularity to fine granularity and folded back from fine granularity to coarse granularity, giving a rich view of how metadata was produced, processed and used.
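The depth-limited traversal and fold marking can be sketched with a breadth-first search over an in-memory graph. `LineageGraph` and its methods are hypothetical, and returning sets of node names (rather than full paths) is a simplification of the lineage graph that the embodiment outputs.

```python
from collections import defaultdict, deque


class LineageGraph:
    def __init__(self):
        self.out_edges = defaultdict(set)  # data flows: source -> targets
        self.in_edges = defaultdict(set)   # reverse of out_edges
        self.children = defaultdict(set)   # containment: parent -> children

    def add_flow(self, source, target):
        self.out_edges[source].add(target)
        self.in_edges[target].add(source)

    def add_containment(self, parent, child):
        self.children[parent].add(child)

    def _walk(self, start, edges, depth):
        """Breadth-first search that stops expanding at the given depth."""
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            node, dist = queue.popleft()
            if dist == depth:
                continue
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        seen.discard(start)
        return seen

    def lineage(self, node, depth):
        upstream = self._walk(node, self.in_edges, depth)
        downstream = self._walk(node, self.out_edges, depth)
        # Nodes that contain child metadata get a fold mark, so a viewer
        # can expand them to finer granularity or collapse them again.
        on_path = upstream | downstream | {node}
        foldable = {n for n in on_path if self.children.get(n)}
        return {"upstream": upstream, "downstream": downstream, "foldable": foldable}


g = LineageGraph()
g.add_flow("raw_orders", "etl_job")
g.add_flow("etl_job", "orders_report")
g.add_flow("orders_report", "dashboard")
g.add_containment("raw_orders", "raw_orders.amount")
print(g.lineage("etl_job", 1))
```

With depth 1 the result stops at the direct neighbours of `etl_job`; raising the depth to 2 also reaches `dashboard` downstream.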
To support lineage tracing at both coarse and fine granularity, the following is done when defining metadata types:
(a) Metadata type generalization
When multiple metadata types share the same semantics and most of their attributes, they can be generalized, and the common part defined as a parent type.
For example: data set is defined as a parent metadata type, and is extended to define metadata types such as data table and structured data file; that is, data set is the parent type, and data table and structured data file are subtypes. Likewise, field is defined as a parent metadata type, and is extended to define metadata types such as table field and structured data file field; that is, field is the parent type, and table field and structured data file field are subtypes.
(b) Metadata type refinement
The metadata type is refined: some of its attributes are modeled further, and child metadata types are established for them.
For example: for an Extract-Transform-Load (ETL) data integration job, attributes such as the input data table, output data table, file and interface of the job are set to the data set metadata type; the metadata type of the job is then refined further, and attributes such as the table fields and file fields that the job processes are set to the field metadata type. Data tracing performed in this way can be refined from the data set level down to the field level. That is, the table fields and file fields (field metadata type) are the child type, and the input data table, output data table, file and interface (data set metadata type) are the parent type.
In an embodiment of the present invention, each module of the metadata management apparatus is designed as a component, and the modules interact with one another through interfaces.
The metadata management apparatus provided by the embodiment of the invention can manage any type of metadata in a big data environment. It moves beyond the customized metadata maintenance mode, supports automatic collection of scattered metadata, and provides storage capacity for large-scale metadata, thereby establishing a unified metadata view. On the basis of the unified metadata view, the lineage (blood relationship) of the metadata can be traced and metadata searches can be carried out conveniently.
As shown in fig. 5, an embodiment of the present invention provides an interface for metadata type maintenance. In this sub-interface, the attributes of a metadata type can be defined, the parent type that the metadata type inherits from can be set, and an attribute extension rule can be defined.
As shown in fig. 6, an embodiment of the present invention provides a visual operation interface. The left side of the operation interface is a metadata type list, and the right side includes: a sub-interface for managing metadata instances (the "metadata information" menu in fig. 6), a sub-interface for querying the change history of metadata instances (the "change history" menu in fig. 6), a sub-interface for querying lineage (the "blood relationship influence" menu in fig. 6), a sub-interface for adding labels to and deleting labels from metadata instances (the "category label" menu in fig. 6), a sub-interface for importing metadata files (the "import" menu in fig. 6), a sub-interface for querying metadata instances (the "please input word keyword search" box in fig. 6), and so on.
It should be noted that the layout and the menu of the operation interface are merely examples, and other operation interfaces may be used to interact with the metadata management apparatus as needed, which is not limited in the present application.
Fig. 7 is a schematic diagram of a metadata management apparatus according to another embodiment of the present invention. As shown in fig. 7, the metadata management apparatus 100 includes a metadata type management module 101, a metadata collection module 102, a metadata storage index module 103, a metadata visualization maintenance module 104, and a metadata service module 105, wherein the metadata storage index module 103 includes a big data platform 1031 and a graph database engine 1032. The graph database engine 1032 processes the metadata type and metadata instance and stores the processed metadata type and metadata instance to the big data platform 1031. A user interacts with the metadata management apparatus 100 through a WEB (WEB page) client 701.
The steps of the user using this metadata management apparatus are as follows:
Step one: after a user models the metadata types according to service requirements (both an offline mode and an online mode are supported), the metadata type management module 101 loads the metadata types, and the metadata storage index module 103 records the metadata types and their maps.
Step two: the metadata visualization maintenance module 104 maintains the metadata instances corresponding to the metadata types, and the metadata storage index module 103 records the metadata instances and their maps.
If the user does not need the visual maintenance function, the same result can be achieved through step three.
Step three: the metadata collection module 102 automatically collects metadata instances or receives externally synchronized metadata instances, and stores the metadata instances and their maps through the metadata storage index module 103.
Step four: the metadata service module 105 exposes various metadata query interfaces for use by third-party systems.
In the embodiment of the invention, any type of metadata can be managed: metadata information of different types and from different geographic locations is extracted and combined, and the heterogeneous metadata is modeled according to business requirements.
The embodiment of the invention provides a complete set of metadata management functions, solves the problem that the prior art cannot manage metadata of any type and at large scale automatically and in a general way, and improves the traceability of metadata lineage (blood relationship).
As shown in fig. 8, an embodiment of the present invention provides a metadata management method, including:
step 801, loading a metadata type defined based on an object-oriented mode, and storing the metadata type;
step 802, obtaining a metadata instance corresponding to the metadata type according to the metadata type, and storing the metadata instance.
In an embodiment, the method further comprises: when storing the metadata types, also using a graph database to store the relationships between the metadata types; and when storing the metadata instances, also using a graph database to store the relationships between the metadata instances.
In one embodiment, the storing the metadata type includes: storing the metadata type using a columnar storage database;
the storing the metadata instance comprises: the metadata instances are stored using a columnar storage database.
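The split between attribute storage and relationship storage can be illustrated with a toy model. The sketch assumes nothing about the actual products used: `ColumnStore` stands in for a columnar storage database, plain sets of triples stand in for the graph database, and all names and relation labels (`inherits`, `feeds`) are hypothetical.

```python
class ColumnStore:
    """Toy column-oriented table: one list per column, rows share an index."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}
        self.size = 0

    def append(self, row):
        for name, values in self.columns.items():
            values.append(row.get(name))
        self.size += 1
        return self.size - 1  # index of the row just written

    def column(self, name):
        return list(self.columns[name])


class MetadataStore:
    """Keeps attributes in column stores and relationships as graph edges."""

    def __init__(self):
        self.types = ColumnStore(["name", "parent"])
        self.instances = ColumnStore(["id", "type", "name"])
        self.type_edges = set()      # (type, relation, type)
        self.instance_edges = set()  # (instance id, relation, instance id)

    def store_type(self, name, parent=None):
        self.types.append({"name": name, "parent": parent})
        if parent is not None:
            self.type_edges.add((name, "inherits", parent))

    def store_instance(self, instance_id, type_name, name, derived_from=()):
        if type_name not in self.types.column("name"):
            raise KeyError(f"unknown metadata type: {type_name}")
        self.instances.append(
            {"id": instance_id, "type": type_name, "name": name})
        for upstream in derived_from:
            self.instance_edges.add((upstream, "feeds", instance_id))


store = MetadataStore()
store.store_type("dataset")
store.store_type("data_table", parent="dataset")
store.store_instance("t1", "data_table", "orders_raw")
store.store_instance("t2", "data_table", "orders_clean", derived_from=["t1"])
```

Keeping relationships in a separate edge structure is what lets a lineage query follow edges without scanning the attribute columns.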
In an embodiment, the obtaining, according to the metadata type, a metadata instance corresponding to the metadata type includes at least one of:
acquiring external information by using a collector corresponding to the metadata type to create a metadata instance corresponding to the metadata type;
and receiving externally synchronized information according to the metadata type to create a metadata instance corresponding to the metadata type.
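The two acquisition paths above, pulling through a type-specific collector and accepting pushed (synchronized) records, can be sketched as below. The class `MetadataCollectionModule`, its method names, and the sample collector are assumptions made for illustration; the collector here only reads a dictionary, where a real one would query a database catalog, file server, or web interface.

```python
class MetadataCollectionModule:
    """Creates metadata instances by pulling (collectors) or by push (sync)."""

    def __init__(self, known_types):
        self.known_types = set(known_types)
        self.collectors = {}  # metadata type name -> collector callable
        self.instances = []

    def register_collector(self, type_name, collector):
        self._require_type(type_name)
        self.collectors[type_name] = collector

    def collect(self, type_name, source):
        """Pull path: run the collector registered for this metadata type."""
        collector = self.collectors[type_name]
        return self._create(type_name, collector(source))

    def receive_sync(self, type_name, records):
        """Push path: accept records that an external system synchronizes."""
        self._require_type(type_name)
        return self._create(type_name, records)

    def _require_type(self, type_name):
        if type_name not in self.known_types:
            raise KeyError(f"unknown metadata type: {type_name}")

    def _create(self, type_name, records):
        created = [{**record, "type": type_name} for record in records]
        self.instances.extend(created)
        return created


def table_collector(source):
    # Hypothetical collector: lists table names from a source description.
    return [{"name": table} for table in source["tables"]]


module = MetadataCollectionModule(known_types=["data_table", "data_service"])
module.register_collector("data_table", table_collector)
pulled = module.collect("data_table", {"tables": ["orders", "customers"]})
pushed = module.receive_sync("data_service", [{"name": "resident_lookup"}])
```

Both paths end in the same `_create` step, so instances look identical to the storage layer regardless of how they arrived.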
In an embodiment, the method further comprises: and receiving a management instruction for managing the metadata type through an operation interface, and executing corresponding management operation on the metadata instance.
In one embodiment, the management instructions include: defining the attributes of the metadata type, wherein the attribute definition of the metadata type satisfies a preset attribute extension rule. It should be noted that the management instructions for the metadata type further include instructions for adding, deleting, updating, and querying the metadata type, and the like.
In an embodiment, the method further comprises: upon receiving a query request for the lineage (blood relationship) of the metadata instance, outputting the lineage information of the metadata instance.
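The source does not spell out the preset attribute extension rule, so the following sketch assumes one plausible rule purely for illustration: a subtype may add new attributes but may not redefine an inherited attribute with a different data type. `TypeRegistry` and its methods are hypothetical names.

```python
class TypeRegistry:
    """Holds metadata type definitions and checks an assumed extension rule."""

    def __init__(self):
        self._types = {}  # type name -> (parent name or None, own attributes)

    def define(self, name, attributes, parent=None):
        if parent is not None and parent not in self._types:
            raise KeyError(f"unknown parent type: {parent}")
        inherited = self.attributes(parent) if parent is not None else {}
        # Assumed rule: inherited attributes keep their declared data type.
        for attr, dtype in attributes.items():
            if attr in inherited and inherited[attr] != dtype:
                raise ValueError(
                    f"{name}.{attr} redefines inherited type {inherited[attr]}")
        self._types[name] = (parent, dict(attributes))

    def attributes(self, name):
        """Return own plus inherited attributes, parents first."""
        parent, own = self._types[name]
        merged = self.attributes(parent) if parent is not None else {}
        merged.update(own)
        return merged


registry = TypeRegistry()
registry.define("dataset", {"name": "string", "owner": "string"})
registry.define("data_table", {"row_count": "integer"}, parent="dataset")
table_attributes = registry.attributes("data_table")
```

A different rule (for example, requiring a naming prefix for extended attributes) would only change the check inside `define`.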
The scheme provided by the embodiment of the invention has a wide range of market applications, including the following:
1) Metadata management according to the invention helps users establish a unified data map and provides an efficient, flexible query service to external callers. Functional modules such as data integration, data security, and data quality all rely on metadata management to obtain the data to be managed and to design the data processing tasks that accomplish the data management goals.
2) In the field of data sharing and exchange, government departments have accumulated massive amounts of data in the course of informatization. Because departmental barriers have created data silos, governments are actively promoting data sharing and exchange among departments. Building a data sharing and exchange system requires collecting and managing the shared resource information of the relevant departments and knowing where the data flows and how it is used. With the scheme provided by the embodiment of the invention, any type of metadata can be modeled and managed, and the lineage (blood relationship) of the data can be traced easily.
3) In the field of open data, enterprises and governments open data to the public and share data resources; the data services that are to be opened externally are themselves metadata.
The following illustrates the application of the present application in different scenarios.
Example 1
A local government needs to uniformly collect data from the public security bureau, the health bureau, and the industry and commerce bureau, store it centrally, and then enable data sharing and exchange among the departments. The invention is described in detail below, taking this government's use of the apparatus as an example:
description of the implementation environment: in this embodiment, a data sharing and exchange system is provided. Each department's data is stored in its own service system, and the service systems are connected to the data sharing and exchange system through a private network. The open interfaces of the service systems take various forms, such as a relational database, an FTP file server, and a WEB server. The data sharing and exchange system needs to collect and store the data of all departments in a unified manner and to provide data sharing and exchange among the departments. The metadata management apparatus of the embodiment of the invention is deployed in the data sharing and exchange system to manage the data information of the shared resources, so that the generation, processing, and use of shared resource objects can be obtained.
As shown in fig. 9, the process includes:
step 901, loading metadata types;
wherein the metadata type is defined by a client, the metadata type being defined based on an object-oriented schema;
in this embodiment, the metadata types include: a resource interface opened by a department service system, a resource object opened by a department service system, a data warehouse of the data sharing and exchange system, a data integration operation of the data sharing and exchange system, and a data subscription of the data sharing and exchange system.
Step 902, automatically collecting metadata instances of the departments' resource objects through the metadata collection module 102 according to the configured resource interface information of the departments;
step 903, according to the data sharing and exchange system information configured by the client, automatically collecting the metadata instances of the data integration operations and the metadata instances of the data subscription applications through the metadata collection module 102;
the data integration operation is created by a client after querying metadata in the data sharing and exchange system, and the data subscription application is submitted by the client after querying metadata in the data sharing and exchange system;
step 904, receiving a client's instruction to view the lineage (blood relationship) of metadata, and outputting the lineage of the queried metadata instance;
step 905, receiving a client's metadata management instruction, and managing metadata instances, where the managing includes at least one of: creating, modifying, deleting, and viewing metadata instances.
Example 2
The detailed description will be given by taking an example in which a certain government department manages various data standard specifications using the metadata management apparatus in the embodiment of the present invention.
Description of the implementation environment: a government department needs to manage the data standards and specifications issued at the national, ministerial, provincial, and local levels. The government department manages the various specifications by means of the metadata management apparatus in the embodiment of the invention. In this embodiment, the metadata management apparatus comprises a metadata type management module, a metadata collection module, a metadata storage index module, and a metadata visualization maintenance module.
As shown in fig. 10, the process includes:
step 1001, loading defined metadata types;
wherein the metadata type is defined by the client based on an object-oriented schema; specifically, the client reviews the various data standard specification documents, extracts the common data standards, and defines the extracted metadata types; the client defines the metadata types of the various data standard specifications according to object inheritance, containment, and dependency relationships;
step 1002, according to the standard data file of the corresponding specification imported by the client in the metadata management device, automatically collecting metadata instances of each standard data specification (in this embodiment, the metadata instances are data standards) by using the file collector of the metadata collection module 102;
step 1003, managing the various data standards according to the client's instructions, wherein the managing comprises at least one of: adding, modifying, deleting, and viewing data standards (i.e., metadata instances).
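The file-collector step of this example (step 1002) can be sketched as follows, under the assumption that the imported standard data file is a CSV with `code`, `name`, and `level` columns. The source does not specify the file format; the function name, column names, and sample rows are hypothetical.

```python
import csv
import io


def collect_data_standards(text, type_name="data_standard"):
    """Parse an imported CSV file into data-standard metadata instances."""
    reader = csv.DictReader(io.StringIO(text))
    instances = []
    for row in reader:
        code = (row.get("code") or "").strip()
        if not code:
            continue  # skip blank or incomplete rows
        instances.append({
            "type": type_name,
            "id": code,
            "name": (row.get("name") or "").strip(),
            "level": (row.get("level") or "").strip(),
        })
    return instances


# Hypothetical file content; codes and names are placeholders.
sample_file = (
    "code,name,level\n"
    "STD-001,Resident identifier format,national\n"
    ",,\n"
    "STD-002,District code list,provincial\n"
)
standards = collect_data_standards(sample_file)
```

Each resulting dictionary carries the metadata type name, so the instances can be stored and later managed (added, modified, deleted, viewed) like any other metadata instance.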
Example 3
Multiple departments of an enterprise operate multiple heterogeneous business systems. The data of all the business systems needs to be managed centrally and governed, so as to improve data quality and ensure data security. The invention is described in detail below, taking this enterprise's use of the metadata management apparatus provided in the embodiment of the invention as an example.
Description of the implementation environment: the enterprise establishes a data management system that is connected to the existing heterogeneous business systems through a private network. Some of the business systems directly open a database or a file server, some open WEB service interfaces, and some actively synchronize data to the data management system. The metadata management apparatus is deployed in the data management system, and all data management functions are implemented on the basis of unified metadata management.
As shown in fig. 11, the process includes:
step 1101, loading the defined metadata type;
wherein the metadata type is defined by the client based on an object-oriented schema, and the metadata type in this embodiment includes:
1. Resource interface opened by a business system
2. Resource object opened by a business system
3. Data warehouse of the data management system
4. Data integration operation of the data management system
5. Data quality operation of the data management system
6. Data security operation of the data management system
Step 1102, acquiring metadata instances of resource objects according to the resource interface information of the business systems configured by the client;
specifically, when the resource interface is a direct connection interface, the metadata collection module is used to automatically collect the metadata instances of the business system's resource objects; and when the resource interface is a synchronization interface, the metadata collection module is used to receive externally synchronized information and create the metadata instances of the resource objects.
Step 1103, according to the address information of the data management system configured by the client, using the metadata collection module 102 to automatically collect the metadata instances of the data integration operations, the metadata instances of the data quality operations, and the metadata instances of the data security operations of the data management system;
the data integration operations, data quality operations, and data security operations are created by the client after querying metadata in the data management system.
During collection, the corresponding metadata instances are collected based on the defined metadata type of the data integration operation, the defined metadata type of the data quality operation, and the defined metadata type of the data security operation of the data management system, respectively.
Step 1104, receiving a client's instruction to view the lineage (blood relationship) of metadata, and outputting the lineage of the queried metadata instance;
step 1105, receiving a client's metadata management instruction, and performing a management operation on a metadata instance, where the management operation on the metadata instance includes at least one of: creating, modifying, deleting, and viewing metadata instances.
Example 4
A government department publishes data services to the public in order to share data resources and realize the social value of the data. The invention is described in detail below, taking this government department's use of the metadata management apparatus provided by an embodiment of the invention as an example:
description of the implementation Environment: the government has established a data service system which registers various data services open to the outside, which the public can view and access. The government department deploys the metadata management device in the embodiment of the invention to manage various data services, and in the embodiment, the metadata management device comprises a metadata type management module, a metadata acquisition module, a metadata storage index module and a metadata service module.
As shown in fig. 12, the process includes:
step 1201, loading a defined metadata type;
wherein the metadata type is designed by a client according to a data service opened by a department, and the metadata type is defined based on an object-oriented mode.
Step 1202, according to the data service definition file submitted by the client (for example, the WSDL of a SOAP service or the YAML of a REST service), using the metadata collection module 102 to parse the data service definition file and create a metadata instance according to the defined metadata type;
step 1203, receiving a data service information query request submitted by a client, and outputting corresponding data service information.
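Step 1202 can be illustrated for the SOAP case with the standard-library XML parser. This is a simplified sketch under stated assumptions: the function name, the `data_service` type name, and the sample WSDL are hypothetical, and the code attaches every `portType` operation to every `service` element rather than resolving bindings as a full WSDL reader would.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace; the "wsdl" prefix is only a local alias for lookups.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def service_instances_from_wsdl(wsdl_text):
    """Create data-service metadata instances from a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text)
    operations = [
        op.get("name")
        for op in root.findall("./wsdl:portType/wsdl:operation", WSDL_NS)
    ]
    instances = []
    for service in root.findall("./wsdl:service", WSDL_NS):
        instances.append({
            "type": "data_service",
            "name": service.get("name"),
            "protocol": "SOAP",
            "operations": operations,
        })
    return instances


# Hypothetical, heavily trimmed definition file for illustration.
sample_wsdl = (
    '<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="Population">'
    '<portType name="PopulationPort">'
    '<operation name="GetResident"/>'
    '<operation name="ListDistricts"/>'
    '</portType>'
    '<service name="PopulationService"/>'
    '</definitions>'
)
services = service_instances_from_wsdl(sample_wsdl)
```

A query request in step 1203 could then be answered by filtering these instances, for example by service name or operation name.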
In conclusion, the scheme provided by the embodiment of the invention has a wide range of market applications and can bring considerable research and economic value.
As shown in fig. 13, an embodiment of the present invention provides a metadata management apparatus 130, including a memory 1310 and a processor 1320, where the memory 1310 stores a program, and when the program is read and executed by the processor 1320, the program implements the metadata management method according to any embodiment.
As shown in fig. 14, an embodiment of the present invention provides a computer-readable storage medium 140, where the computer-readable storage medium 140 stores one or more programs 141, and the one or more programs 141 are executable by one or more processors to implement the metadata management method according to any embodiment.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, and the functional modules/units in the systems and devices disclosed above, may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media, as is known to those skilled in the art.

Claims (13)

1. A metadata management apparatus includes a metadata type management module, a metadata collection module, and a metadata storage index module, wherein:
the metadata type management module is used for loading metadata types defined based on an object-oriented mode and recording the metadata types in the metadata storage index module;
the metadata collection module is used for acquiring a metadata instance corresponding to the metadata type according to the metadata type and storing the metadata instance in the metadata storage index module;
the metadata storage index module is used for storing the metadata type and the metadata instance.
2. The metadata management apparatus according to claim 1, wherein said metadata storage indexing module is further configured to store relationships between said metadata types using a graph database when said metadata types are stored, and to store relationships between said metadata instances using a graph database when said metadata instances are stored.
3. The metadata management apparatus of claim 1, wherein the metadata storage indexing module stores the metadata type and the metadata instance comprises: the metadata storage indexing module stores the metadata type and the metadata instance using a columnar storage database.
4. The metadata management apparatus according to claim 1, wherein the metadata collection module obtains the metadata instance corresponding to the metadata type according to the metadata type, and includes at least one of:
the metadata collection module acquires external information by using a collector corresponding to the metadata type to create a metadata instance corresponding to the metadata type;
and the metadata collection module receives externally synchronized information according to the metadata type to create a metadata instance corresponding to the metadata type.
5. The metadata management apparatus according to any one of claims 1 to 4, further comprising a metadata visualization maintenance module, wherein:
the metadata visualization maintenance module is used for providing an operation interface interacting with the metadata management device, and the operation interface comprises a sub-interface for managing the metadata instance.
6. The metadata management apparatus according to any one of claims 1 to 4, further comprising a metadata service module, wherein the metadata service module is configured to provide a query service for the metadata instance, and the query service includes a query for the lineage (blood relationship) of the metadata instance.
7. A metadata management method, comprising:
loading a metadata type defined based on an object-oriented mode, and storing the metadata type;
and acquiring a metadata instance corresponding to the metadata type according to the metadata type, and storing the metadata instance.
8. The metadata management method according to claim 7, further comprising: when storing the metadata types, a graph database is also used to store relationships between the metadata types, and when storing the metadata instances, a graph database is also used to store relationships between the metadata instances.
9. The metadata management method according to claim 7,
the storing the metadata type includes: storing the metadata type using a columnar storage database;
the storing the metadata instance comprises: the metadata instances are stored using a columnar storage database.
10. The method according to claim 7, wherein said obtaining the metadata instance corresponding to the metadata type according to the metadata type includes at least one of:
acquiring external information by using a collector corresponding to the metadata type to create a metadata instance corresponding to the metadata type;
and receiving external synchronous information according to the metadata type to create a metadata instance corresponding to the metadata type.
11. The method according to any one of claims 7 to 10, further comprising outputting the lineage (blood relationship) information of the metadata instance when receiving a query request for the lineage of the metadata instance.
12. A metadata management apparatus comprising a memory and a processor, the memory storing a program which, when read and executed by the processor, implements the metadata management method according to any one of claims 7 to 11.
13. A computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the metadata management method according to any one of claims 7 to 11.
CN201910780607.9A 2019-08-22 2019-08-22 Metadata management method and device, equipment and storage medium Pending CN112416923A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910780607.9A CN112416923A (en) 2019-08-22 2019-08-22 Metadata management method and device, equipment and storage medium
PCT/CN2020/110167 WO2021032146A1 (en) 2019-08-22 2020-08-20 Metadata management method and apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910780607.9A CN112416923A (en) 2019-08-22 2019-08-22 Metadata management method and device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112416923A true CN112416923A (en) 2021-02-26

Family

ID=74660194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910780607.9A Pending CN112416923A (en) 2019-08-22 2019-08-22 Metadata management method and device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112416923A (en)
WO (1) WO2021032146A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112947864A (en) * 2021-03-29 2021-06-11 南方电网数字电网研究院有限公司 Metadata storage method, device, equipment and storage medium
CN113220555A (en) * 2021-05-18 2021-08-06 北京百度网讯科技有限公司 Method, apparatus, device, medium and product for processing data
CN113297139A (en) * 2021-04-28 2021-08-24 上海淇玥信息技术有限公司 Metadata query method and system and electronic equipment
CN113377741A (en) * 2021-05-28 2021-09-10 中国铁道科学研究院集团有限公司电子计算技术研究所 Method and device for managing metadata of railway engineering design
CN114443913A (en) * 2022-04-06 2022-05-06 创智和宇信息技术股份有限公司 Metadata multi-function multi-condition based user-defined query method, system and medium
CN117312331A (en) * 2023-12-01 2023-12-29 浪潮云信息技术股份公司 Metadata blood-edge analysis method, device, equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1341901A (en) * 2001-01-04 2002-03-27 中国科学院南京土壤研究所 Agricultural ecological multi-dimensional data management technique
US8060514B2 (en) * 2006-08-04 2011-11-15 Apple Inc. Methods and systems for managing composite data files
CN107256247A (en) * 2017-06-07 2017-10-17 九次方大数据信息集团有限公司 Big data data administering method and device
CN107657052A (en) * 2017-10-17 2018-02-02 上海计算机软件技术开发中心 A kind of data governing system based on metadata management
CN108052618B (en) * 2017-12-15 2020-06-30 北京搜狐新媒体信息技术有限公司 Data management method and device
CN109299154B (en) * 2018-11-30 2020-12-18 长城计算机软件与系统有限公司 Big data storage system and method

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112947864A (en) * 2021-03-29 2021-06-11 南方电网数字电网研究院有限公司 Metadata storage method, device, equipment and storage medium
CN112947864B (en) * 2021-03-29 2024-03-08 南方电网数字平台科技(广东)有限公司 Metadata storage method, apparatus, device and storage medium
CN113297139A (en) * 2021-04-28 2021-08-24 上海淇玥信息技术有限公司 Metadata query method and system and electronic equipment
CN113220555A (en) * 2021-05-18 2021-08-06 北京百度网讯科技有限公司 Method, apparatus, device, medium and product for processing data
CN113220555B (en) * 2021-05-18 2023-10-20 北京百度网讯科技有限公司 Method, apparatus, device, medium, and article for processing data
CN113377741A (en) * 2021-05-28 2021-09-10 中国铁道科学研究院集团有限公司电子计算技术研究所 Method and device for managing metadata of railway engineering design
CN114443913A (en) * 2022-04-06 2022-05-06 创智和宇信息技术股份有限公司 Metadata multi-function multi-condition based user-defined query method, system and medium
CN114443913B (en) * 2022-04-06 2022-06-07 创智和宇信息技术股份有限公司 Metadata multi-function multi-condition based user-defined query method, system and medium
CN117312331A (en) * 2023-12-01 2023-12-29 浪潮云信息技术股份公司 Metadata blood-edge analysis method, device, equipment and storage medium
CN117312331B (en) * 2023-12-01 2024-03-29 浪潮云信息技术股份公司 Metadata blood-edge analysis method, device, equipment and storage medium

Also Published As

Publication number Publication date
WO2021032146A1 (en) 2021-02-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination