CN117688008A - Metadata integrated management method and device - Google Patents

Metadata integrated management method and device

Info

Publication number
CN117688008A
Authority
CN
China
Prior art keywords
metadata
meta
attribute
definition
attribute dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311544516.8A
Other languages
Chinese (zh)
Inventor
周朝卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unihub China Information Technology Co Ltd
Original Assignee
Unihub China Information Technology Co Ltd
Application filed by Unihub China Information Technology Co Ltd
Priority to CN202311544516.8A
Publication of CN117688008A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a metadata integrated management method and device. The method comprises the following steps: abstracting a metadata model in terms of metadata objects, meta-attribute dimensions and relationship chains; defining the metadata model, wherein the definition comprises a definition of a unique identification code, a definition of the meta-attribute dimensions and a definition of the relationship chains; abstracting the configuration related to metadata visualization as a special meta-attribute dimension of the metadata object; and storing the metadata model in a table of a relational database, building an index of the meta-attribute dimensions based on the type of each meta-attribute dimension, and storing the relationship chains in a search engine or a graph database for metadata query. The method and the device realize integrated management of the definition, storage, query and display of metadata and provide an automatic extension function.

Description

Metadata integrated management method and device
Technical Field
The invention relates to the field of metadata management, in particular to a metadata integrated management method and device.
Background
Metadata management involves several key processes, including the storage, querying, and visualization of metadata. Because metadata comes in many varieties, linking these processes together and providing efficient queries while accommodating any type of metadata is challenging. Extensibility is a particular problem: existing metadata management processes often struggle to accommodate growing and changing metadata requirements. For example, when a metadata dimension for data quality inspection results needs to be added for MySQL tables, the metadata model usually has to be modified; the change is large in scope, error-prone and time-consuming, so extensibility is very poor.
The main problems with existing metadata management are as follows:
1. Storage limitations and data redundancy: traditional metadata management methods may employ fixed data models or table structures, resulting in storage limitations and data redundancy. When metadata needs to be extended or new attributes added, the data model or table structure often needs to be modified, which can lead to cumbersome changes and data redundancy.
2. Query performance and efficiency: with the increase in metadata types and the increase in data volume, conventional query methods may face performance and efficiency issues. Some queries may require scanning a large number of metadata records, resulting in slower query speeds, impacting user experience and system responsiveness.
3. Lack of flexibility and scalability: traditional metadata management methods often lack flexibility and extensibility. When new metadata types or attributes need to be added, existing data models or code often need to be modified, which results in high coupling and low maintainability.
4. Highly dependent on the developer: traditional metadata management methods typically require the intervention of developers to modify data models, configure query rules, etc., which makes the management process highly dependent on the development team, adding to the complexity and cost of management.
5. Visualization is heavily customized and hard to extend: traditional metadata management methods often lack the ability to extend metadata visualization, so each presentation requires a high degree of custom development.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a metadata integrated management method and device, which realize integrated management of the definition, storage, query and display of metadata and also provide an automatic extension function.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In an embodiment of the present invention, a metadata integrated management method is provided, which includes:
abstracting a metadata model in terms of metadata objects, meta-attribute dimensions and relationship chains;
defining the metadata model, wherein the definition comprises a definition of a unique identification code, a definition of the meta-attribute dimensions and a definition of the relationship chains;
abstracting the configuration related to metadata visualization as a special meta-attribute dimension of the metadata object;
storing the metadata model in a table of a relational database, building an index of the meta-attribute dimensions based on the type of each meta-attribute dimension, and storing the relationship chains in a search engine or a graph database for metadata query.
Further, the meta-attribute dimension is defined by the type of the meta-attribute dimension and the attribute of the meta-attribute dimension; metadata objects are described from different dimensions by defining the types of meta-attribute dimensions and corresponding attribute combinations.
Further, the relationship chain is defined by the type of the relationship chain and the objects upstream and downstream of the relationship chain; the association relationship between metadata objects is established and described by defining the type of relationship chain and specifying the unique identification code of the upstream and downstream objects.
Further, the unique identification code consists of an object type and an object definition containing references to other objects by which the association is established.
In an embodiment of the present invention, there is also provided a metadata integration management apparatus, including:
a metadata model abstraction module, used for abstracting a metadata model in terms of metadata objects, meta-attribute dimensions and relationship chains;
a metadata model definition module, used for defining the metadata model, wherein the definition comprises a definition of a unique identification code, a definition of the meta-attribute dimensions and a definition of the relationship chains;
a metadata visualization module, used for abstracting the configuration related to metadata visualization as a special meta-attribute dimension of the metadata object;
and a metadata model storage and query optimization module, used for storing the metadata model in a table of a relational database, building an index of the meta-attribute dimensions based on the type of each meta-attribute dimension, and storing the relationship chains in a search engine or a graph database for metadata query.
Further, the meta-attribute dimension is defined by the type of the meta-attribute dimension and the attribute of the meta-attribute dimension; metadata objects are described from different dimensions by defining the types of meta-attribute dimensions and corresponding attribute combinations.
Further, the relationship chain is defined by the type of the relationship chain and the objects upstream and downstream of the relationship chain; the association relationship between metadata objects is established and described by defining the type of relationship chain and specifying the unique identification code of the upstream and downstream objects.
Further, the unique identification code consists of an object type and an object definition containing references to other objects by which the association is established.
In an embodiment of the present invention, a computer device is further provided, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the metadata integrated management method.
In an embodiment of the present invention, a computer-readable storage medium storing a computer program for executing the metadata integration management method is also provided.
The beneficial effects are that:
1. The invention makes it easy to add new attributes and types without modifying the whole metadata model, which improves the flexibility and extensibility of metadata management.
2. The relational database ensures the accuracy and integrity of the data, while Elasticsearch can easily be scaled out to multiple nodes for horizontal scaling and load balancing, so the system scales well when storing and querying large amounts of metadata and can handle highly concurrent query requests.
3. By searching and querying according to meta-attribute dimensions, the invention combines the strengths of the Elasticsearch index and the relational database to address query performance, improve query efficiency and return accurate metadata information.
4. The invention abstracts the configuration related to metadata visualization as a meta-attribute dimension of the metadata object, which provides a flexible way to customize the front-end display without directly modifying front-end code, makes the front-end display extensible, and enables the system to adapt to front-end pages with different requirements.
Drawings
FIG. 1 is a flow chart of a metadata integrated management method of the present invention;
FIG. 2 is a flow chart of metadata writing in the present invention;
FIG. 3 is a flow chart of the metadata query of the present invention;
FIG. 4 is a schematic diagram of a metadata-integrated management apparatus according to the present invention;
FIG. 5 is a schematic diagram of the computer device structure of the present invention.
Detailed Description
The principles and spirit of the present invention will be described below with reference to several exemplary embodiments, with the understanding that these embodiments are merely provided to enable those skilled in the art to better understand and practice the invention and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the invention may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms, namely: complete hardware, complete software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to the embodiment of the invention, a metadata integrated management method is provided, which comprises the steps of designing an extensible metadata model, adopting a flexible data storage scheme, optimizing query performance and efficiency, constructing an extensible visual management method and the like. These measures may improve the flexibility, maintainability and ease of metadata management to better meet ever-increasing and changing metadata requirements.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments thereof.
Fig. 1 is a flow chart of a metadata integrated management method according to the present invention. As shown in fig. 1, the method includes:
S1, abstracting a metadata model in terms of metadata objects, meta-attribute dimensions and relationship chains;
S2, defining the metadata model, wherein the definition comprises a definition of a unique identification code, a definition of the meta-attribute dimensions and a definition of the relationship chains;
S3, abstracting the configuration related to metadata visualization as a special meta-attribute dimension of the metadata object;
S4, storing the metadata model in a table of a relational database, building an index of the meta-attribute dimensions based on the type of each meta-attribute dimension, and storing the relationship chains in a search engine or a graph database for metadata query.
It should be noted that although the operations of the method of the present invention are described in a particular order in the above embodiments and the accompanying drawings, this does not require or imply that the operations must be performed in the particular order or that all of the illustrated operations be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform.
In order to more clearly explain the metadata integration management method, a specific embodiment is described below, however, it should be noted that this embodiment is only for better illustrating the present invention and is not meant to limit the present invention unduly.
Examples:
1. Metadata model abstraction
The metadata model must be extensible: when a new type of metadata is added, no code needs to be modified, and the extension is achieved simply by adding new data that conforms to the metadata model.
The metadata model is abstracted from three aspects:
(1) Metadata object: metadata objects are the core components of a metadata model and are used to describe and represent specific entities of data and information resources. A metadata object may be a database, table, field, file, application, chart, etc.
(2) Meta attribute dimension: a meta-attribute dimension is a collection of specific aspects describing a metadata object that provides the ability to describe the metadata object in detail from different dimensions.
The meta-attribute dimension allows metadata objects to be described from multiple angles to obtain more comprehensive, detailed, and accurate information. For example, for a data table object, multiple meta-attribute dimensions may be defined, such as the user to which the data table belongs, the data field, schema field information, global tags, and the like.
Each meta-attribute dimension may contain a plurality of attributes that further subdivide the description of the dimension. Taking the meta attribute dimension of Schema field information as an example, the Schema field information may include a plurality of attributes such as a field name, a data type of a field, a default value, a description of the field, and the like. With these properties, the field structure of the data table can be described and understood in more detail.
By using the meta-attribute dimension, the meta-data object can be described from different angles, so that the description is more comprehensive, detailed and accurate. This flexibility and extensibility makes the meta-attribute dimension an important concept in metadata management, helping to ensure that the meta-data object's description information is more accurate and useful.
(3) Relationship chain: the relationships between metadata objects. Examples include an upstream-downstream relationship (e.g., the data of data table A is generated from data table B, so data table A and data table B are in an upstream-downstream relationship), a containment relationship (e.g., a dashboard contains a plurality of charts, so the dashboard object and the chart objects are in a containment relationship), an ownership relationship (e.g., a user owns a data table, so the user object and the data table object are in an ownership relationship), and the like.
After the abstraction of the metadata model is provided, the metadata model can be defined based on the abstraction of the metadata model, so that comprehensive, detailed and concrete metadata management is realized.
2. Metadata model definition
2.1 Composition of metadata model definitions
(1) Definition of unique identification code: the unique identification code uniquely identifies a metadata object. Its definition rule directly contains references to other objects, which makes identifying a metadata object simpler and more intuitive and removes the need to maintain separate association data or run extra queries. This approach provides efficient metadata management and operational capabilities, making querying and understanding of metadata more convenient and intuitive.
(2) Definition of meta-attribute dimensions
The meta-attribute dimension is defined by the following aspects:
(a) Types of meta-attribute dimensions: for classifying and describing particular aspects of the metadata object. For example, for a metadata object of a data table, the types of meta-attribute dimensions may be defined as: schema information of the data table, home users of the data table, and the like.
(b) Attributes of the meta-attribute dimension: specific aspects of metadata objects are described in detail through a number of attributes in which other objects may be referenced by unique identification codes. For example, for the meta attribute dimension of Schema information of a data table, attribute combinations may be defined including field names, field encodings, field data types, field descriptions, whether primary keys, etc.
By defining the types of meta-attribute dimensions and the corresponding attribute combinations, the meta-data objects can be fully and accurately described from different dimensions. For example, in a metadata object of a data table, by using Schema information whose meta attribute dimension type is the data table, the structure and characteristics of the data table can be described in detail by attribute combinations including attributes of field name, field code, field data type, field description, whether primary key, and the like.
The definition mode enables the meta attribute dimension to have flexibility and extensibility in metadata management, and different aspects of the metadata object can be classified and described according to specific requirements. By defining the types and attribute combinations of the meta-attribute dimensions, various aspects of the information of the metadata object can be more accurately known and used.
(3) Definition of relationship chain
The chain of relationships between metadata objects may be defined by:
(a) Type of relationship chain: determine the type of the relationship chain, such as an upstream-downstream relationship, a containment relationship, or an ownership relationship. An upstream-downstream relationship indicates that the data of one object is generated from another object; a containment relationship indicates that one object contains another object; an ownership relationship indicates that one object owns another object.
(b) Upstream and downstream objects of the relationship chain: defining upstream and downstream object information in the relationship chain, and referencing the upstream and downstream objects using unique identification codes of the objects. By specifying unique identification codes for upstream and downstream objects, connections between objects in the relationship chain may be explicitly indicated.
By defining the type of relationship chain and specifying the unique identification code of the upstream and downstream objects, the association relationship between metadata objects can be established and described. This allows for better understanding and management of dependencies between objects, and thus more efficient analysis, querying and use of metadata.
In summary, the definition of the relationship chain between metadata objects includes the type of relationship chain and the designation of upstream and downstream objects. The definition mode enables the association relation among the objects to be clearly described, and a more comprehensive view angle is provided for metadata management and use.
Based on the above information, metadata of one object can be completely and comprehensively described.
(a) Metadata object type: a specific type of metadata object is determined, such as a database, a data table, a field, a file, a user, a chart, etc. By specifying the object type, the kind and purpose of the object can be clarified.
(b) Unique identification code: unique identification codes are used to ensure identification uniqueness of metadata objects for accurate locating and referencing of metadata objects.
(c) A set of meta-attribute dimensions: a metadata object can be described fully and in detail by defining a combination of meta-attribute dimensions. The meta-attribute dimensions represent different aspects of the metadata object, and combining multiple meta-attribute dimensions provides a more comprehensive and accurate description of the metadata object.
(d) Relationship chain between metadata objects: association relationships are established between metadata objects to record the connections and dependencies between them. By defining and recording relationships between objects, the dependencies and effects between metadata objects can be better understood and managed.
In summary, through the metadata object, the unique identification code, the combination of the metadata attribute dimensions and the relationship chain between the metadata objects, the metadata of one object can be completely and comprehensively described, and meanwhile, the metadata has extremely strong expansibility. Together, these elements provide a detailed description and global view of the metadata object, facilitating efficient management and use of metadata.
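As a minimal sketch of the four elements summarized above (object type, unique identification code, a set of meta-attribute dimensions, and relationship chains), the model can be expressed with plain Python dataclasses. The class names, field names, the dimension names "schema" and "owner", and the attribute keys are illustrative assumptions and do not come from the invention; the identifier string follows the Hive table example given later in this description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class RelationshipChain:
    """One relationship between two metadata objects (hypothetical shape)."""
    relationship_type: str  # e.g. upstream-downstream, containment, ownership
    source_code: str        # unique identification code of the source object
    target_code: str        # unique identification code of the target object


@dataclass
class MetadataObject:
    """A metadata object described by dimensions and relationship chains."""
    object_type: str        # e.g. "dataset"
    code: str               # unique identification code
    # meta-attribute dimension type -> attributes of that dimension
    dimensions: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    chains: List[RelationshipChain] = field(default_factory=list)

    def add_dimension(self, dimension_type: str, attributes: Dict[str, Any]) -> None:
        # Extending the description only adds data; the class is unchanged.
        self.dimensions[dimension_type] = attributes


table = MetadataObject(
    object_type="dataset",
    code="dataset:(dataPlatform:hive,fct_users_created,PROD)",
)
table.add_dimension("schema", {"fields": [{"name": "user_name"}]})
table.add_dimension("owner", {"owners": ["reader:jdoe"]})
table.chains.append(RelationshipChain("ownership", "reader:jdoe", table.code))
```

Adding a further dimension is one more call to add_dimension, which mirrors the point that extension is done by adding data rather than changing the model.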
2.2 Definition of unique identification code
The definition rule of the unique identification code of the metadata object is as follows:
(1) Basic structure of the unique identification code: it consists of an object type and an object definition, in the format <objectType>:<objectDefinition>.
(2) Object type (objectType): the object type indicates the type of metadata object that the unique identification code points to. Common object types include data platform (dataPlatform), data set (dataset), field (schemaField), and the like.
(3) Object definition (objectDefinition): the object definition specifies specific information of the object, and the structure of the object definition may be different according to the type of the object. The object definition may contain references to other objects, with associations established by unique identification codes referencing other objects.
With the above rules, metadata objects may be identified and referenced using unique identification codes. The basic structure of the unique identification code is composed of an object type and an object definition, the object type represents the kind of the object, the object definition specifies specific information of the object, and the unique identification code of other objects can be referenced to establish an association relationship. Such definition rules may help accurately identify and locate objects in metadata.
The unique identification code of a metadata object both identifies the object itself and can reference other objects.
(a) Definition of simple objects: a simple object is defined with a single element. For example, data source platforms such as Hive and MySQL can be defined in the format <dataPlatformType>, where dataPlatformType is the type of data platform, e.g. hive or mysql. The unique identification code of the Hive platform is therefore "dataPlatform:hive", and that of the MySQL platform is "dataPlatform:mysql".
(b) Definition of complex objects: a complex object is defined with two or more elements, which are joined with commas and enclosed in parentheses. If an element references another object, the reference is made using the unique identification code of that object.
In summary, a simple object is defined with a single element, while a complex object is defined with two or more comma-joined elements enclosed in parentheses. Whenever another object needs to be referenced, its unique identification code is used.
Example 1: unique identification coding of Hive table
The Hive table fct_users_created has the object type data set, represented by dataset. Uniquely defining an object of the data set type requires three elements: platform type, table name and environment name. The unique identification code of the platform type of the Hive table is "dataPlatform:hive". The table name is fct_users_created. The environment name is PROD; it distinguishes Hive tables with the same name that may exist in different environments. Thus, the unique identification code of the Hive table is defined as: "dataset:(dataPlatform:hive,fct_users_created,PROD)".
Example 2: unique identification coding of fields of Hive table
The object type of a field of the Hive table is represented by schemaField. Defining an object of the schemaField type requires two elements: the data set and the field name. The data set is represented by the unique identification code of the dataset object, which corresponds to the Hive table: "dataset:(dataPlatform:hive,fct_users_created,PROD)". The unique identification code of the field user_name can therefore be expressed as: "schemaField:(dataset:(dataPlatform:hive,fct_users_created,PROD),user_name)".
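The rules and the two examples above can be sketched in Python as a pair of helpers that build and parse unique identification codes. This is only an illustration: the function names are hypothetical, and the exact textual form (a colon between object type and object definition, no spaces, nested codes allowed inside the parentheses) is an assumption inferred from the examples.

```python
from typing import List, Tuple


def build_code(object_type: str, *elements: str) -> str:
    """Build '<objectType>:<objectDefinition>'.

    One element yields a simple object; two or more elements are joined
    with commas and wrapped in parentheses (complex object).
    """
    if not elements:
        raise ValueError("at least one element is required")
    if len(elements) == 1:
        return f"{object_type}:{elements[0]}"
    return f"{object_type}:({','.join(elements)})"


def split_elements(definition: str) -> List[str]:
    """Split on top-level commas only, so nested codes stay intact."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in definition:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_code(code: str) -> Tuple[str, List[str]]:
    """Return (object type, list of elements) for a unique identification code."""
    object_type, _, definition = code.partition(":")
    if definition.startswith("(") and definition.endswith(")"):
        return object_type, split_elements(definition[1:-1])
    return object_type, [definition]


platform = build_code("dataPlatform", "hive")
dataset = build_code("dataset", platform, "fct_users_created", "PROD")
column = build_code("schemaField", dataset, "user_name")
```

Parsing the field code returns the dataset code as its first element, so the table a field belongs to can be recovered from the identifier alone, without a separate lookup.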
2.3 Definition of meta-attribute dimensions
The meta-attribute dimension defines the type and attributes of the meta-attribute dimension using JSON format. Each meta-attribute dimension type has its specific manner of attribute definition.
The structure of the meta-attribute dimension can be flexibly defined by using the JSON format, so that the structure can adapt to different meta-data objects and attribute requirements. In JSON, the type of meta-attribute dimension may be specified and the corresponding attribute field defined in its attributes.
Each type of meta-attribute dimension has its own way of defining the attribute, which can meet the specific requirements of the different types of meta-attribute dimensions. By using JSON, information such as attribute names, data types, descriptions, and other required attribute fields of the meta-attribute dimension can be defined.
Summarizing, the meta-attribute dimension uses JSON format to define its types and attributes. Through the flexibility of JSON, corresponding attribute structures can be defined for different types of meta-attribute dimensions to meet different meta-data objects and attribute requirements. This definition allows the structure of the meta-attribute dimension to be clear, extensible, and provides flexibility and adaptability.
For example, when the meta-attribute dimension is the Schema information of a data table, the corresponding attributes include field information, creation information, modification information and so on. It can be defined in JSON as follows:
In the above example:
(a) urn specifies the unique identification code of the metadata object: "dataset:(dataPlatform:hive,fct_users_created,PROD)". Parsing the unique identification code yields the platform type, table name and so on of the object.
(b) dimension specifies the type of the meta-attribute dimension as schema meta.
(c) tips defines the attributes of the meta-attribute dimension.
created and lastModified specify the creation information and the modification information; each identifies the creator or modifier through actor, which references an object of type reader by its unique identification code.
fields specifies the field-related attributes, such as the field name, whether the field may be null, the field description and the data type.
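The JSON listing itself is not reproduced in this text. As a hedged sketch of a record consistent with the explanation above, the following Python builds such a structure and serializes it. The top-level keys urn, dimension and tips, and the nested keys created, lastModified, actor and fields, are taken from the explanation; everything else (the dimension value spelling "schemaMeta", the per-field keys, the reader name, and the placeholder timestamps) is an assumption made for illustration.

```python
import json

DATASET = "dataset:(dataPlatform:hive,fct_users_created,PROD)"

schema_record = {
    "urn": DATASET,                 # unique identification code of the object
    "dimension": "schemaMeta",      # assumed spelling of the dimension type
    "tips": {                       # attributes of the meta-attribute dimension
        # time values are placeholders, not real data
        "created": {"time": 0, "actor": "reader:jdoe"},
        "lastModified": {"time": 0, "actor": "reader:jdoe"},
        "fields": [
            {
                "name": "user_name",            # field name
                "nullable": False,              # whether the value may be null
                "description": "name of the user",
                "type": "string",               # data type
            }
        ],
    },
}

# Serialize to JSON text and read it back unchanged.
schema_json = json.dumps(schema_record, indent=2)
restored = json.loads(schema_json)
```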
If a metadata object needs to be extended with a new meta-attribute dimension, a record with the new meta-attribute dimension type is added, for example the dimension type "owner of the table". The JSON is exemplified as follows:
In the above example:
(a) urn specifies the unique identification code of the metadata object: "dataset:(dataPlatform:hive,fct_users_created,PROD)". Parsing the unique identification code yields the platform type, table name and so on of the object.
(b) dimension specifies the type of the meta-attribute dimension as owner.
(c) tips defines the attributes of the meta-attribute dimension.
The owners list specifies the owners of the table; each actor references an object of type reader by its unique identification code (reader:jdoe).
lastModified specifies the modification information.
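To illustrate why adding the owner dimension requires no model change, the sketch below keeps records in an in-memory dictionary keyed by (urn, dimension). The helper names, the dictionary standing in for real storage, the "schemaMeta" spelling and the attribute contents are all assumptions for illustration.

```python
from typing import Any, Dict, List, Tuple

# (unique identification code, dimension type) -> attributes ("tips")
Store = Dict[Tuple[str, str], Dict[str, Any]]


def put_dimension(store: Store, urn: str, dimension: str, tips: Dict[str, Any]) -> None:
    """Add or overwrite one meta-attribute dimension record of an object."""
    store[(urn, dimension)] = tips


def dimensions_of(store: Store, urn: str) -> List[str]:
    """List the dimension types currently recorded for an object."""
    return sorted(d for (u, d) in store if u == urn)


DATASET = "dataset:(dataPlatform:hive,fct_users_created,PROD)"
store: Store = {}

# Existing dimension: schema information of the table.
put_dimension(store, DATASET, "schemaMeta", {"fields": [{"name": "user_name"}]})
before = dict(store[(DATASET, "schemaMeta")])

# Extension: the owner dimension is just one more record.
put_dimension(
    store,
    DATASET,
    "owner",
    {
        "owners": [{"actor": "reader:jdoe"}],
        "lastModified": {"time": 0, "actor": "reader:jdoe"},  # placeholder time
    },
)
```

The schema record is untouched by the second call, which is the property the extension mechanism relies on.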
2.4 Definition of relationship chain
The definition of the relationship chain between metadata objects is as follows:
(1) The type of relationship chain;
(2) A unique identification code of the source object;
(3) The type of source object;
(4) Unique identification code of the target object;
(5) The type of target object.
The relationship chain is also expressed in JSON format. As an example, relationship_type specifies the type of the relationship, source_urn specifies the unique identification code of the source object, source_type specifies the type of the source object, target_urn specifies the unique identification code of the target object, and target_type specifies the type of the target object.
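A relationship chain record with these five fields can be sketched as follows. The helper names, the relationship type values ("ownership", "containment"), the dashboard and chart identifiers, and the use of source_type as the name of the source object type field are assumptions for illustration; a search engine or graph database would replace the plain list used here.

```python
import json
from typing import Dict, List


def make_chain(relationship_type: str, source_urn: str, source_type: str,
               target_urn: str, target_type: str) -> Dict[str, str]:
    """Build one relationship chain record with the five fields listed above."""
    return {
        "relationship_type": relationship_type,
        "source_urn": source_urn,
        "source_type": source_type,
        "target_urn": target_urn,
        "target_type": target_type,
    }


def targets_of(chains: List[Dict[str, str]], source_urn: str,
               relationship_type: str) -> List[str]:
    """Return the target codes reachable from a source by one relationship type."""
    return [
        c["target_urn"]
        for c in chains
        if c["source_urn"] == source_urn
        and c["relationship_type"] == relationship_type
    ]


DATASET = "dataset:(dataPlatform:hive,fct_users_created,PROD)"
chains = [
    make_chain("ownership", "reader:jdoe", "reader", DATASET, "dataset"),
    make_chain("containment", "dashboard:sales", "dashboard", "chart:revenue", "chart"),
]
chain_json = json.dumps(chains[0])
owned = targets_of(chains, "reader:jdoe", "ownership")
```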
3. Metadata visualization
Metadata is displayed on a page according to certain rules (including the access path of the front-end page, the display icon of the metadata, whether a given attribute is displayed, etc.) by configuring front-end visualization elements. To make the front-end display extensible and adaptable to front-end pages with different requirements, the configuration related to metadata visualization can be abstracted as a special meta-attribute dimension of the metadata object.
By adding meta-attribute dimensions, the functions and options of the front-end display can be conveniently extended. Each such meta-attribute dimension represents a display style or configuration option; for example, its attributes may configure the display icon or whether a certain meta-attribute dimension is displayed. By adding new meta-attribute dimensions to the metadata object, new front-end display styles can be added flexibly.
This design is scalable because adding a new front-end presentation style only requires adding a new meta-attribute dimension, without modifying existing code or configuration. The front-end display can be customized according to specific requirements so as to adapt to the requirements of different users and application scenes.
Abstracting the metadata visualization-related configuration into a meta-attribute dimension of the metadata object provides a flexible way to customize the front-end presentation without directly modifying the front-end code. The design makes the front-end display extensible and lets the system adapt to front-end pages with different requirements. Adding a new front-end display style requires only one new meta-attribute dimension, which further simplifies extension.
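A minimal sketch of this idea, assuming the visualization configuration is stored as a dimension named "visualization" with hypothetical attributes path, icon and visibleDimensions (none of these names come from the patent). A generic front-end routine reads that dimension to decide what to show, so changing the presentation only means changing data.

```python
# Hypothetical metadata object: every key under "dimensions" is one
# meta-attribute dimension, and the visualization settings are just one more.
metadata_object = {
    "urn": "reader:jdoe",
    "dimensions": {
        "readerInfo": {"fullName": "John Doe", "email": "jdoe@example.com"},
        "ownership": {"owners": ["reader:admin"]},
        "visualization": {
            "path": "/reader",                    # access path of the page
            "icon": "user",                       # display icon
            "visibleDimensions": ["readerInfo"],  # which dimensions to show
        },
    },
}


def render_view(obj: dict) -> dict:
    """Build a view description driven only by the visualization dimension."""
    dimensions = obj["dimensions"]
    config = dimensions.get("visualization", {})
    visible = config.get("visibleDimensions", [])
    return {
        "path": config.get("path"),
        "icon": config.get("icon"),
        "sections": {name: dimensions[name] for name in visible if name in dimensions},
    }


view = render_view(metadata_object)
```

Showing the ownership dimension as well would only require appending "ownership" to visibleDimensions; render_view itself stays unchanged.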
4. Storage and query optimization of metadata models
The metadata model is stored in a table of a relational database with the following fields:
(1) Unique identification code of the metadata object, joint primary key 1;
(2) Type of the meta-attribute dimension, joint primary key 2;
(3) Attributes of the meta-attribute dimension.
A record is uniquely identified by the unique identification code of the metadata object together with the type of the meta-attribute dimension.
The corresponding data table is shown in Table 1:
TABLE 1
Field | Description
Unique identification code | Joint primary key
Type of the meta-attribute dimension | Joint primary key
Attributes of the meta-attribute dimension | Stored in JSON format
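The table layout above can be sketched with SQLite standing in for the relational database. The table name metadata_model, the column names, and the sample values are assumptions for illustration; only the three fields and the joint primary key come from the description.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE metadata_model (
        urn            TEXT NOT NULL,  -- unique identification code
        dimension_type TEXT NOT NULL,  -- type of the meta-attribute dimension
        attributes     TEXT NOT NULL,  -- attributes, stored as JSON text
        PRIMARY KEY (urn, dimension_type)
    )
    """
)


def upsert_dimension(urn: str, dimension_type: str, attributes: dict) -> None:
    """Insert the record, or replace the existing one with the same joint key."""
    conn.execute(
        "INSERT OR REPLACE INTO metadata_model (urn, dimension_type, attributes) "
        "VALUES (?, ?, ?)",
        (urn, dimension_type, json.dumps(attributes)),
    )


# Writing the same (urn, dimension_type) pair twice keeps a single row.
upsert_dimension("reader:jdoe", "readerInfo", {"fullName": "John Doe"})
upsert_dimension("reader:jdoe", "readerInfo", {"fullName": "John Doe", "active": True})

row_count = conn.execute("SELECT COUNT(*) FROM metadata_model").fetchone()[0]
stored = json.loads(
    conn.execute(
        "SELECT attributes FROM metadata_model WHERE urn = ? AND dimension_type = ?",
        ("reader:jdoe", "readerInfo"),
    ).fetchone()[0]
)
```

Because the two key columns form the primary key, a new dimension type for the same object becomes a new row, and no schema change is needed.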
The relationship chains between metadata objects are stored in a search engine or a graph database so that relationships can be retrieved conveniently. When the relationship chains are stored in Elasticsearch, the corresponding fields are as shown in Table 2:
TABLE 2
Field
Type of relationship
Unique identification code of the source object
Type of the source object
Unique identification code of the target object
Type of the target object
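To show how records with these five fields support relationship lookups, the following sketch uses a plain in-memory list as a stand-in for the search engine or graph database. The relationship types, identifiers and helper names are hypothetical.

```python
from typing import Dict, List

# In-memory stand-in (assumption) for the relationship store; each entry has
# the five fields of Table 2. All values are hypothetical.
relationships: List[Dict[str, str]] = [
    {"relationship_type": "OwnedBy", "source_urn": "dataset:orders",
     "source_type": "dataset", "target_urn": "reader:jdoe", "target_type": "reader"},
    {"relationship_type": "OwnedBy", "source_urn": "dataset:users",
     "source_type": "dataset", "target_urn": "reader:jdoe", "target_type": "reader"},
    {"relationship_type": "DownstreamOf", "source_urn": "dataset:report",
     "source_type": "dataset", "target_urn": "dataset:orders", "target_type": "dataset"},
]


def find_targets(source_urn: str, relationship_type: str) -> List[str]:
    """Follow the chain forward: targets reachable from a source object."""
    return [r["target_urn"] for r in relationships
            if r["source_urn"] == source_urn
            and r["relationship_type"] == relationship_type]


def find_sources(target_urn: str, relationship_type: str) -> List[str]:
    """Follow the chain backward: sources that point at a target object."""
    return [r["source_urn"] for r in relationships
            if r["target_urn"] == target_urn
            and r["relationship_type"] == relationship_type]


owned_by_jdoe = find_sources("reader:jdoe", "OwnedBy")
upstream_of_report = find_targets("dataset:report", "DownstreamOf")
```

In a real search engine the two list comprehensions would become filtered queries on the same fields.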
However, the above storage model leaves one problem unsolved. The metadata model is stored in a relational database with the attributes of the meta-attribute dimension in JSON format, so retrieving records by attribute requires fuzzy matching and a full-table scan, which incurs a large performance cost. To improve the performance of querying data by meta-attribute dimension, metadata is written and queried according to the following procedures:
(1) Writing of metadata, as shown in FIG. 2
(a) Construct an Elasticsearch index based on the meta-attribute dimension;
(b) Write the metadata into the metadata model table first;
(c) Split the attributes of the meta-attribute dimension into multiple fields, and write the fields that need to be searched, together with the unique identification code and the type of the meta-attribute dimension, into the Elasticsearch index. The value obtained by concatenating the unique identification code of the metadata object and the type of the meta-attribute dimension is used as the document ID in the Elasticsearch index, which achieves de-duplication: writing the same record multiple times does not produce duplicate data.
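Steps (a) to (c) can be sketched as follows. This is not the patented implementation: SQLite stands in for the relational database, a plain dict keyed by document ID stands in for the Elasticsearch index, and the names write_metadata, readerInfo and the "#" separator are assumptions.

```python
import json
import sqlite3
from typing import Dict, Iterable

# Stand-ins (assumptions): SQLite plays the relational database and a dict
# keyed by document ID plays the search index for each dimension type.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE metadata_model ("
    " urn TEXT NOT NULL, dimension_type TEXT NOT NULL, attributes TEXT NOT NULL,"
    " PRIMARY KEY (urn, dimension_type))"
)
search_indexes: Dict[str, Dict[str, dict]] = {}


def write_metadata(urn: str, dimension_type: str, attributes: dict,
                   searchable_fields: Iterable[str]) -> str:
    # (a) Build (or reuse) the index for this meta-attribute dimension type.
    index = search_indexes.setdefault(dimension_type, {})
    # (b) Write the full record into the metadata model table first.
    db.execute(
        "INSERT OR REPLACE INTO metadata_model VALUES (?, ?, ?)",
        (urn, dimension_type, json.dumps(attributes)),
    )
    # (c) Keep only the searchable fields, plus the two key values.
    document = {name: attributes[name]
                for name in searchable_fields if name in attributes}
    document["urn"] = urn
    document["dimension_type"] = dimension_type
    # The concatenated key is the document ID, so a repeated write overwrites
    # the earlier document instead of adding a duplicate.
    doc_id = f"{urn}#{dimension_type}"
    index[doc_id] = document
    return doc_id


attrs = {"fullName": "John Doe", "active": True,
         "email": "jdoe@example.com", "displayName": "John"}
for _ in range(3):  # repeated writes of the same record
    doc_id = write_metadata("reader:jdoe", "readerInfo", attrs, ["fullName", "email"])
```

Elasticsearch behaves the same way for this purpose: indexing a document under an ID that already exists replaces the previous document.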
For example, the data written into the metadata model table for the user object is shown in Table 3:
TABLE 3
In the attributes of the meta-attribute dimension, fullName defines the full name, active is set to true, email defines the email address, and displayName defines the display name.
However, querying the metadata model table described above by attributes such as fullName or email requires fuzzy matching and a full-table scan, so the performance overhead is very high.
To solve this problem, an index specific to the type of the meta-attribute dimension of the reader is constructed from the fields to be retrieved, and the data is written into that index. For example, if data must be retrieved by fullName and email, only these two fields need to be written into the Elasticsearch index, with the unique identification code of the metadata object used as the document ID of the Elasticsearch index. The generated Elasticsearch data is shown in Table 4:
TABLE 4
(2) Data query, as shown in FIG. 3
A search query by meta-attribute dimension may proceed as follows:
(a) Select the corresponding Elasticsearch index according to the type of the meta-attribute dimension.
(b) Use the query field to retrieve data from the Elasticsearch index and return the unique identification code. By specifying the query field, matching data is obtained from the Elasticsearch index along with the unique identification code of the associated metadata object. Because Elasticsearch builds an inverted index, this query is very efficient.
(c) Query the metadata model table in the relational database using the unique identification code and the type of the meta-attribute dimension. Because this lookup is based on the primary key, the related metadata is retrieved quickly, with little resource usage and without a full-table scan.
By searching according to the attributes of the meta-attribute dimension in this way, the advantages of both the Elasticsearch index and the relational database are exploited: the query performance problem is solved, query efficiency is improved, and accurate metadata information is obtained.
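The query steps (a) to (c) can be sketched in the same spirit. SQLite again stands in for the relational database and a dict for the search index; the exact-match scan over the dict only imitates what an inverted index lookup would return, and all names and sample values are assumptions.

```python
import json
import sqlite3
from typing import Dict, List

# Setup with stand-ins (assumptions): one table row and one index document.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE metadata_model ("
    " urn TEXT NOT NULL, dimension_type TEXT NOT NULL, attributes TEXT NOT NULL,"
    " PRIMARY KEY (urn, dimension_type))"
)
full_attributes = {"fullName": "John Doe", "active": True,
                   "email": "jdoe@example.com", "displayName": "John"}
db.execute("INSERT INTO metadata_model VALUES (?, ?, ?)",
           ("reader:jdoe", "readerInfo", json.dumps(full_attributes)))
search_indexes: Dict[str, Dict[str, dict]] = {
    "readerInfo": {
        "reader:jdoe#readerInfo": {
            "urn": "reader:jdoe", "dimension_type": "readerInfo",
            "fullName": "John Doe", "email": "jdoe@example.com",
        }
    }
}


def search_by_field(dimension_type: str, field: str, value: str) -> List[dict]:
    # (a) Select the index that belongs to this meta-attribute dimension type.
    index = search_indexes.get(dimension_type, {})
    # (b) Look up matching documents and collect their unique identification codes.
    urns = [doc["urn"] for doc in index.values() if doc.get(field) == value]
    # (c) Fetch the full records from the table by primary key (no table scan).
    results = []
    for urn in urns:
        row = db.execute(
            "SELECT attributes FROM metadata_model"
            " WHERE urn = ? AND dimension_type = ?",
            (urn, dimension_type),
        ).fetchone()
        if row is not None:
            results.append({"urn": urn, **json.loads(row[0])})
    return results


matches = search_by_field("readerInfo", "email", "jdoe@example.com")
```

The index holds only the searchable fields, while the returned records carry every attribute (including displayName and active) because they are read from the table.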
Based on the same inventive concept, the invention also provides a metadata integrated management device. For the implementation of the device, reference may be made to the implementation of the above method; repeated details are omitted. The term "module" as used below may be a combination of software and/or hardware that implements the intended function. Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a schematic diagram of a metadata integrated management apparatus according to the present invention. As shown in fig. 4, the apparatus includes:
A metadata model abstraction module 101, configured to abstract a metadata model from the metadata object, the meta-attribute dimension, and the relationship chain;
A metadata model definition module 102, configured to define the metadata model, including a definition of a unique identification code, a definition of a meta-attribute dimension, and a definition of a relationship chain;
The unique identification code consists of an object type and an object definition; the object definition contains references to other objects, and associations are established by referencing the unique identification codes of those objects.
The meta-attribute dimension is defined by the type of the meta-attribute dimension and the attribute of the meta-attribute dimension; metadata objects are described from different dimensions by defining the types of meta-attribute dimensions and corresponding attribute combinations.
The relationship chain is defined by the type of the relationship chain and the objects upstream and downstream of the relationship chain; the association relationship between metadata objects is established and described by defining the type of relationship chain and specifying the unique identification code of the upstream and downstream objects.
A metadata visualization module 103, configured to abstract the configuration related to metadata visualization into a special meta-attribute dimension of the metadata object;
A metadata model storage and query optimization module 104, configured to store the metadata model in a table of a relational database, construct an index of the meta-attribute dimension based on the type of the meta-attribute dimension, and store the relationship chain in a search engine or a graph database for metadata query.
It should be noted that although several modules of the metadata integration management apparatus are mentioned in the above detailed description, such partitioning is merely exemplary and not mandatory. Indeed, the features and functions of two or more modules described above may be embodied in one module in accordance with embodiments of the present invention. Conversely, the features and functions of one module described above may be further divided into a plurality of modules to be embodied.
Based on the foregoing inventive concept, as shown in fig. 5, the present invention further proposes a computer device 200, including a memory 210, a processor 220, and a computer program 230 stored on the memory 210 and capable of running on the processor 220, where the processor 220 implements the foregoing metadata integrated management method when executing the computer program 230.
Based on the foregoing inventive concept, the present invention also proposes a computer-readable storage medium storing a computer program for executing the foregoing metadata integration management method.
The metadata integrated management method and device provided by the invention have the following advantages:
1. Extensible metadata model: new attributes and types can be added easily without modifying the whole metadata model, which improves the flexibility and extensibility of metadata management.
2. Flexible data storage scheme: querying the relational database ensures the accuracy and completeness of the data, while Elasticsearch can easily be scaled out to multiple nodes for horizontal scaling and load balancing; the scheme therefore scales well when storing and querying large amounts of metadata and can handle highly concurrent query requests.
3. High-performance data query: by searching according to the meta-attribute dimension, the advantages of both the Elasticsearch index and the relational database are exploited, which solves the query performance problem, improves query efficiency, and returns accurate metadata information.
4. Extensible visual management scheme: the configuration related to metadata visualization is abstracted into the meta attribute dimension of the metadata object, a flexible mode is provided for customizing the front-end display without directly modifying the front-end code, the expandability of the front-end display is realized, and the system can adapt to front-end pages with different requirements.
While the spirit and principles of the present invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, nor does the division into aspects imply that features of these aspects cannot be used in combination; that division is made only for convenience of description. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
It should be apparent to those skilled in the art that, based on the technical solutions of the present invention, various modifications or variations can be made without any inventive effort.

Claims (10)

1. A metadata integration management method, the method comprising:
abstracting the metadata model from the metadata object, the meta-attribute dimension and the relationship chain;
defining a metadata model, wherein the metadata model comprises a definition of a unique identification code, a definition of a meta-attribute dimension and a definition of a relationship chain;
abstracting the configuration related to metadata visualization into a special meta-attribute dimension of the metadata object;
storing the metadata model in a table of a relational database, building an index of the meta-attribute dimension based on the type of the meta-attribute dimension, and storing the relationship chain in a search engine or a graph database for metadata query.
2. The metadata integration management method according to claim 1, wherein the meta-attribute dimension is defined by a type of meta-attribute dimension and an attribute of the meta-attribute dimension; metadata objects are described from different dimensions by defining the types of meta-attribute dimensions and corresponding attribute combinations.
3. The metadata integration management method according to claim 1, wherein the relationship chain is defined by a type of a relationship chain and an upstream and downstream object of the relationship chain; the association relationship between metadata objects is established and described by defining the type of relationship chain and specifying the unique identification code of the upstream and downstream objects.
4. The metadata integration management method according to claim 1, wherein the unique identification code is composed of an object type and an object definition, the object definition containing references to other objects, and the association is established by referring to the unique identification code of the other objects.
5. A metadata integration management device, characterized in that the device comprises:
the metadata model abstraction module is used for abstracting the metadata model from the metadata object, the meta-attribute dimension and the relationship chain;
the metadata model definition module is used for defining a metadata model, including a definition of a unique identification code, a definition of a meta-attribute dimension and a definition of a relationship chain;
the metadata visualization module is used for abstracting the configuration related to metadata visualization into a special meta-attribute dimension of the metadata object;
and the metadata model storage and query optimization module is used for storing the metadata model in a table of the relational database, constructing an index of the meta-attribute dimension based on the type of the meta-attribute dimension, and storing the relationship chain in a search engine or a graph database for metadata query.
6. The metadata-integrated management apparatus according to claim 5, wherein the meta-attribute dimension is defined by a type of meta-attribute dimension and an attribute of the meta-attribute dimension; metadata objects are described from different dimensions by defining the types of meta-attribute dimensions and corresponding attribute combinations.
7. The metadata-integrated management apparatus according to claim 5, wherein the relationship chain is defined by a type of a relationship chain and an upstream and downstream object of the relationship chain; the association relationship between metadata objects is established and described by defining the type of relationship chain and specifying the unique identification code of the upstream and downstream objects.
8. The metadata integration management device of claim 5, wherein the unique identification code consists of an object type and an object definition, the object definition containing references to other objects, the association being established by referencing the unique identification code of the other objects.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1-4 when executing the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program for executing the method of any one of claims 1-4.
CN202311544516.8A 2023-11-20 2023-11-20 Metadata integrated management method and device Pending CN117688008A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311544516.8A CN117688008A (en) 2023-11-20 2023-11-20 Metadata integrated management method and device

Publications (1)

Publication Number Publication Date
CN117688008A true CN117688008A (en) 2024-03-12

Family

ID=90134203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311544516.8A Pending CN117688008A (en) 2023-11-20 2023-11-20 Metadata integrated management method and device

Country Status (1)

Country Link
CN (1) CN117688008A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination