CN112685425B - Data asset meta-information processing system and method - Google Patents

Data asset meta-information processing system and method

Publication number
CN112685425B
Authority
CN
China
Prior art keywords
information
meta
data asset
management control
control component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110023049.9A
Other languages
Chinese (zh)
Other versions
CN112685425A (en)
Inventor
林健
余波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongyun Ruilian Wuhan Computing Technology Co ltd
Original Assignee
Dongyun Ruilian Wuhan Computing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongyun Ruilian Wuhan Computing Technology Co ltd
Priority to CN202110023049.9A
Publication of CN112685425A
Application granted
Publication of CN112685425B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a data asset meta-information processing system and method. The system comprises a management control component, a delivery piece construction component, and a meta-information database. The meta-information database stores the meta-information of data assets. The management control component manages and accesses the data asset meta-information and the delivery piece construction process and provides a management control component interface; when a user or external software calls this interface to pass in data asset meta-information, a data asset meta-information description file is used as the input data. The delivery piece construction component responds to a construction instruction by taking the data asset meta-information description file as the rule, taking the data asset input piece set as the input, and producing the data asset delivery piece set as the output. The invention addresses data asset meta-information management and delivery piece construction across various types of software, improving the software development workflow and the efficiency of developing and maintaining system, platform and service software.

Description

Data asset meta-information processing system and method
Technical Field
The invention relates to the field of data processing, in particular to a data asset meta-information processing system and a data asset meta-information processing method.
Background
In an era in which cloud computing, big data and artificial intelligence technologies are widespread, data assets and their delivery pieces are indispensable elements of all kinds of system, platform and service software. A rich variety of data assets attracts users to the software and raises its ecosystem value. Flexible, easy-to-use data asset delivery pieces lower the barrier to using the software and make it easier to distribute and deploy. Taking artificial intelligence cloud service software as an example, the data assets involved are mainly algorithms, models and data sets, and their delivery pieces are generally stored as binary files on an object storage system. At the same time, the meta-information of the data assets is saved in a database of the software. When using the artificial intelligence cloud service software, a user can directly use the data assets preset in the software without preparing and uploading data assets personally. The preset data asset meta-information helps users understand intuitively what each data asset does, and also ensures that the system calls the data assets in the correct way.
Data assets are of great value in software, but software developers often experience trouble in managing and building data assets, particularly in processing meta-information and deliverables for data assets. For example:
The meta-information of data assets is stored in many formats and locations, so management complexity is high. Conventionally, the meta-information of different types of data assets may be stored in different locations and formats; for example, some meta-information is stored as text on a file system, and some is stored as records in a database. Managing meta-information in several locations at once makes software development and maintenance harder.
The dependency relationships among data assets, or between data assets and external resources, are complex, and manual maintenance is error-prone. Data assets often do not work independently but need to cooperate with one another, and the version dependencies among them determine whether such cooperation is feasible. Traditional approaches lack uniform and efficient management of these relationships and are prone to human error, particularly after the data asset versions have gone through multiple iterations.
Different release versions of a piece of software have different data asset requirements, which are easily confused. A piece of software is often released in different versions for different scenarios, and the preset data asset list of each version may differ. How to efficiently manage the data asset meta-information of several release versions at the same time is also a software engineering problem.
The construction process of data asset delivery pieces is tedious and time-consuming, with a low degree of automation. A construction flow exists because the form in which a data asset is delivered to the user may differ from its original form, so it may need to be compiled, packaged, and so on before delivery. The construction flows of different data asset delivery pieces differ and may involve manual intervention, which reduces the efficiency of software delivery.
Although some software developers have noticed these problems and have begun to manage data asset meta-information in versioned repositories and to design delivery piece construction flows with reference to the DevOps model, most existing solutions are developed in isolation and lack a uniform and complete mechanism, and therefore fall short in generality and flexibility. The invention aims to solve the problems of data asset meta-information management and delivery piece construction in various types of software with a general method and system, thereby improving the workflow of software developers and the efficiency of developing and maintaining system, platform and service software.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a data asset meta-information processing system and a method, wherein the system comprises a management control component, a delivery piece construction component and a meta-information database;
the meta-information database is used for storing meta-information of the data assets; the meta-information attribute category of the data asset at least comprises basic information, input piece information and delivery piece information; each field in a meta-information data table storing meta-information of the data assets corresponds to each field in a data asset meta-information description language, wherein the data asset meta-information description language is used for formally describing the attributes of one or more items of data assets, so that the meta-information of the data assets has the functions of structured storage and processing; wherein the meta information data table is a data table in the meta information database;
the management control component is used for managing and accessing the data asset meta-information and the delivery piece construction process and for providing a management control component interface to a user or external software, wherein the user or the external software calls the management control component interface to add, delete, modify and query the meta-information of the data assets stored in the meta-information database, or calls the management control component interface to send a construction instruction to the delivery piece construction component; when a user or external software calls the management control component interface to pass in meta-information of a data asset, a data asset meta-information description file is used as the input data, wherein the data asset meta-information description file is a computer file written in the data asset meta-information description language;
and the delivery piece construction component is used for responding to a construction instruction sent by the user or external software by taking the data asset meta-information description file as the rule, taking the data asset input piece set as the input, and producing the data asset delivery piece set as the output.
Correspondingly, the management control component interface is at least one of a REST API, a command line, or another form of interface feasible in a computer system.
In addition, to achieve the above object, the present invention further provides a processing method based on a data asset meta-information processing system, wherein the data asset meta-information processing system includes a management control component, a delivery piece construction component, and a meta-information database; each field in a meta-information data table storing meta-information of data assets corresponds to a field in a data asset meta-information description language, wherein the data asset meta-information description language is used for formally describing the attributes of one or more data assets, so that the meta-information of the data assets can be stored and processed in a structured manner; wherein the meta-information data table is a data table in the meta-information database;
correspondingly, the processing method comprises the following steps:
calling the management control component interface to send a construction instruction to the delivery piece construction component;
and responding, by the delivery piece construction component, to the construction instruction by taking the data asset meta-information description file as the rule, taking the data asset input piece set as the input, and producing the data asset delivery piece set as the output.
Preferably, the management control component interface comprises a build-data-asset-delivery-piece interface, an interface of the meta-information database, and an interface of the delivery piece construction component;
correspondingly, the step of calling the management control component interface to send a construction instruction to the delivery piece construction component specifically comprises:
receiving, by the management control component, a call from a user or external software to the build-data-asset-delivery-piece interface, and acquiring the retrieval condition of the data asset to be constructed;
calling, by the management control component, the interface of the meta-information database to query the meta-information of the data asset to be constructed;
calling, by the management control component, the interface of the delivery piece construction component, passing in the meta-information of the data asset to be constructed, and starting a construction flow;
correspondingly, the step of responding, by the delivery piece construction component, to the construction instruction by taking the data asset meta-information description file as the rule, taking the data asset input piece set as the input, and producing the data asset delivery piece set as the output specifically comprises:
downloading, by the delivery piece construction component, each input piece according to the input piece information in the data asset meta-information;
calling, by the delivery piece construction component, the corresponding construction process according to the delivery piece information in the data asset meta-information, and constructing each delivery piece;
uploading, by the delivery piece construction component, each delivery piece according to the delivery piece information in the data asset meta-information;
returning, by the management control component, a construction result.
Preferably, the method further comprises:
and calling the management control component interface provided by the management control component to add, delete, modify and query the meta-information of the data assets stored in the meta-information database, and returning the result of the operation.
The invention has the beneficial effects that: the problem of data asset meta-information management and delivery piece construction in various types of software is solved, so that the software development workflow is improved, and the development and maintenance efficiency of systems, platforms and service software is improved.
In particular, the invention can effectively solve the problems encountered by software developers when managing and constructing data assets, especially when processing meta-information and deliverables of data assets. Specifically, the method comprises the following steps:
Effect 1: To address the variety of storage formats and storage locations for data asset meta-information and the resulting management complexity, the invention provides a standardized data asset meta-information description language as the standard storage format and adopts a meta-information database as the centralized storage location, thereby significantly reducing the complexity of meta-information management.
Effect 2: aiming at the problems that the dependency relationship between data assets or between the data assets and external resources is complex and manual maintenance is easy to make mistakes, the invention designs the 'dependency information' related field in the meta-information description language of the data assets, can effectively record the dependency relationship between the data assets or the external resources and avoids the manual maintenance cost.
Effect 3: aiming at the problems that the data asset requirements of different release versions of software are different and are easy to be confused, the invention designs 'version number' related fields in a plurality of parts of a data asset meta-information description language, supports the efficient management of different versions of the same data asset in a versioning mode, and different versions can have independent input pieces, delivery pieces and dependency information without being confused.
Effect 4: aiming at the problems of complicated and time-consuming construction process of the data asset delivery part and low automation degree, the invention provides a standardized meta-information management method and a delivery part construction method, which shield the construction process difference among different data assets as far as possible and are convenient to realize in a computer system and a software engineering process in a unified and automatic mode.
Drawings
FIG. 1 is a signal flow diagram illustrating the structure of a data asset meta-information processing system and the operation of the system according to the present invention;
FIG. 2 is a flow chart illustrating a processing method of a data asset meta-information based processing system according to the present invention;
FIG. 3 is a schematic flow chart of the construction of a data asset delivery part provided by the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples. FIG. 1 shows a data asset meta-information processing system provided by the present invention, the system including a management control component, a delivery piece construction component, and a meta-information database;
the meta-information database is used for storing meta-information of the data assets; the meta-information attribute category of the data asset at least comprises basic information, input piece information and delivery piece information; each field in a meta-information data table storing meta-information of the data assets corresponds to each field in a data asset meta-information description language, wherein the data asset meta-information description language is used for formally describing the attributes of one or more items of data assets, so that the meta-information of the data assets has the functions of structured storage and processing; wherein the meta information data table is a data table in the meta information database;
the management control component is used for managing and accessing the data asset meta-information and the delivery piece construction process and for providing a management control component interface to a user or external software, wherein the user or the external software calls the management control component interface to add, delete, modify and query the meta-information of the data assets stored in the meta-information database, or calls the management control component interface to send a construction instruction to the delivery piece construction component; when a user or external software calls the management control component interface to pass in meta-information of a data asset, a data asset meta-information description file is used as the input data, wherein the data asset meta-information description file is a computer file written in the data asset meta-information description language;
and the delivery piece construction component is used for responding to a construction instruction sent by the user or external software by taking the data asset meta-information description file as the rule, taking the data asset input piece set as the input, and producing the data asset delivery piece set as the output.
In a specific implementation, the management control component interface is at least one of a REST API, a command line, or another form of interface feasible in a computer system.
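As one illustration of the command-line form of the interface, the sketch below maps a command line onto a request for the management control component. The tool name `assetctl`, the subcommand names, and the `--id`/`--type` options are hypothetical and are not specified by the invention; only the split between add/delete/modify/query operations and a build instruction comes from the text.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per management operation (names are assumed)."""
    parser = argparse.ArgumentParser(prog="assetctl")  # hypothetical tool name
    sub = parser.add_subparsers(dest="command", required=True)
    # Operations that pass in meta-information take a description file as input.
    for name in ("add", "update"):
        sub.add_parser(name).add_argument("description_file")
    # Operations that select existing assets take a retrieval condition.
    for name in ("delete", "query", "build"):
        cmd = sub.add_parser(name)
        cmd.add_argument("--id")
        cmd.add_argument("--type")
    return parser


def to_request(argv: list) -> dict:
    """Translate command-line arguments into a request dictionary."""
    args = make_parser().parse_args(argv)
    request = {"operation": args.command}
    if args.command in ("add", "update"):
        request["description_file"] = args.description_file
    else:
        request["condition"] = {
            key: value
            for key, value in (("id", args.id), ("type", args.type))
            if value is not None
        }
    return request
```

For example, `to_request(["build", "--type", "algorithm"])` yields a build request whose retrieval condition selects all algorithm assets.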
It should be noted that the present invention provides a data asset meta-information description language for formally describing the attributes of one or more data assets, so that the data asset meta-information can be stored and processed in a structured manner, thereby laying a foundation for the automatic construction of delivery pieces.
The data asset meta-information description Language is essentially a Domain-Specific Language (DSL). The data asset meta-information description language can be newly designed, and can also be customized or modified based on certain existing computer languages (such as JSON and XML);
the data asset meta-information description language needs to have the following features:
1. Attributes of one or more data assets can be described.
2. For each data asset it describes, the attributes that can be described comprise the following 5 parts:
- basic information
- input piece information
- delivery piece information
- dependency information
- extension information
With respect to feature 1, the present invention requires that the data asset meta-information description language organize the attributes of one or more data assets structurally in a collection such as an array or a list.
For feature 2, the present invention defines each of the above information as follows:
Basic information (general): describes the essential attributes inherent to the data asset itself. The fields involved include the identifier (id), name (name), type (type), version number (version), etc. of the data asset.
Input piece information (inputs): describes the attributes of the input pieces required to generate the delivery pieces of the data asset. It generally takes the form of a collection such as an array or a list; each element in the collection represents one input piece, and its content includes the name (name), type (type), storage address (url), etc. of the input piece.
Delivery piece information (outputs): describes the attributes of the delivery pieces generated for the data asset. It generally takes the form of a collection such as an array or a list; each element in the collection represents one delivery piece, and its content includes the name (name), type (type), storage address (url), etc. of the delivery piece.
Dependency information (dependencies): describes the list of other data assets or external resources that the data asset relies on in order to work properly. It generally takes the form of a collection; each element in the collection represents one depended-on data asset or external resource, and its content may be the identifier (id) of that data asset or external resource, or a tuple consisting of a name (name), a type (type), and a version number range (minimum version number min_version, maximum version number max_version). Dependency information is optional and may be an empty set for data assets that have no dependencies.
Extension information (extensions): describes other attributes related to functions specific to the data asset. The list of extension information fields may differ from one data asset to another. Each field consists of a field name and a field value. Extension information is optional and may be an empty set for data assets that need none.
It should be noted that the present invention does not limit the arrangement order of the 5 types of information in the data asset meta-information description language, wherein the meta-information of the data asset of the present invention must include basic information (general), input piece information (inputs), and delivery piece information (outputs);
optionally, the meta-information of the data asset may further include dependency information (dependencies) and extension information (extensions).
Among them, a computer file written in a data asset meta-information description language is called a data asset meta-information description file. The data asset meta-information description file is the carrier of data asset meta-information in a computer system.
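To make the required and optional parts concrete, the following sketch checks a description file under the assumption that it is JSON-encoded (one of the base languages the text allows) and that the top-level keys are the English labels given in parentheses above. The function name and the defaulting behaviour are illustrative choices, not part of the invention.

```python
import json

REQUIRED_SECTIONS = ("general", "inputs", "outputs")


def validate_description(text: str) -> list:
    """Parse a description file and return its assets with optional sections defaulted."""
    assets = json.loads(text)
    if isinstance(assets, dict):      # accept a single asset as well as a list of assets
        assets = [assets]
    for asset in assets:
        missing = [s for s in REQUIRED_SECTIONS if s not in asset]
        if missing:
            raise ValueError(f"missing required sections: {missing}")
        # Dependency and extension information are optional; treat absence as empty.
        asset.setdefault("dependencies", [])
        asset.setdefault("extensions", {})
    return assets
```

Because the order of the five parts is not restricted, the check looks sections up by key rather than by position.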
In one embodiment, artificial intelligence cloud service software is taken as an example, and a data asset meta-information description language designed for the software is provided. The data assets involved in the software are of 3 types: algorithms, models and data sets, so the data asset meta-information description language needs to cover the attributes of these 3 types of data assets. The element types that make up the data asset delivery piece set may include binary files, container images, and SQL statements.
The description language of the data asset meta-information given in the present embodiment is customized based on the JSON language. The language does not extend the JSON language syntax but rather constrains the field manifest and its organizational structure based on the JSON language.
The general form of the language is given below:
[Listing: general form of the data asset meta-information description language. It appears as an image in the original and is not reproduced here.]
A specific example of describing a data asset in this language is given below. The data asset described in this example is an Inception algorithm (version 3.0) preset in the artificial intelligence cloud service software:
[Listing: example description of the data asset. It appears as an image in the original and is not reproduced here.]
The specific information described by this example is explained as follows:
Basic information: the data asset has the identifier "00000789", the name "Inception", the type "algorithm", and the version number "3.0".
Input piece information: one input piece is required to construct the data asset. The input piece has the name "Inception source code", the type "git_code", and the storage address "https://git.example.com/algorithm/acquisition.git".
Delivery piece information: constructing the data asset generates two delivery pieces. The first delivery piece has the name "Inception package", the type "Python_wheel" (Python installation package), and the storage address "oss:example.com/python_wheel/acceptance.whl". The second delivery piece has the name "Inception SQL" (Inception SQL statement), the type "SQL statement", and the storage address "nfs. Sql/algorithm/acceptance."
Dependency information: constructing the data asset relies on two external resources. The first dependency has the name "python", the type "DEB_package" (DEB installation package), and the version number range "3.5-3.7". The second dependency has the name "tensorflow", the type "Python_wheel" (Python installation package), and the version number range "1.14".
Extension information: the data asset has two items of extension information: "engine_type" is "TensorFlow" and "engine_version" is "1.14".
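The dependency information in this example lends itself to a mechanical check. The sketch below encodes the two dependencies using the name, type, min_version and max_version fields defined earlier and tests whether an installed version falls inside the declared range. Treating the single version "1.14" as a range whose minimum and maximum are equal is an assumption, and the helper names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '3.10' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def satisfies(dependency: dict, installed: str) -> bool:
    """Check that an installed version lies within the dependency's version range."""
    low = parse_version(dependency["min_version"])
    high = parse_version(dependency["max_version"])
    return low <= parse_version(installed) <= high


# The two dependencies from the example; "1.14" is assumed to mean min == max.
DEPENDENCIES = [
    {"name": "python", "type": "DEB_package",
     "min_version": "3.5", "max_version": "3.7"},
    {"name": "tensorflow", "type": "Python_wheel",
     "min_version": "1.14", "max_version": "1.14"},
]
```

Comparing integer tuples rather than raw strings matters here: as strings, "3.10" would sort before "3.5" and "1.9" after "1.14", which would give the wrong answer for both dependencies. Note that this simple comparison treats a longer version such as "3.7.2" as greater than "3.7".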
To further explain the data asset meta-information processing system of the present invention, FIG. 1 also shows the signal flow of the system in operation;
in this embodiment, the data asset meta-information processing system provided by the present invention is used, on the one hand, to store and manage data asset meta-information and provide an access interface for it, and, on the other hand, to generate data asset delivery pieces based on the data asset meta-information and the data asset input pieces.
Specifically, the meta-information database is used to store data asset meta-information. The content stored in the database may be a series of data asset meta-information description files themselves, or an equivalent form of their content after reorganization or conversion. The present invention does not restrict the choice of technology for the meta-information database; a relational database, a NoSQL database, or another storage system equivalent to a database, such as a set of files organized structurally on a file system, may be used.
The management control component is used for managing and accessing the data asset meta-information and the delivery piece construction process and for providing a corresponding interface to a user or external software. Through the interface provided by the component, the user or external software can add, delete, modify and query the data asset meta-information stored in the meta-information database, and can also send a construction instruction to the delivery piece construction component. When a user or external software calls an interface provided by the component, a data asset meta-information description file is used as the input data in all cases that involve passing in data asset meta-information. The present invention does not restrict the form of interface provided by the management control component; a REST API, a command line, or another form of interface feasible in a computer system may be used.
The delivery piece construction component is used to generate data asset delivery pieces. It takes the data asset meta-information description file as the rule, the data asset input piece set as the input, and the data asset delivery piece set as the output. The component accepts instructions from the management control component; its main workflow is to build the input pieces into delivery pieces according to the rule. The invention does not restrict the specific technical mechanism of delivery piece construction. Common technical mechanisms include compiling source code into binary files, packaging resource files such as programs and data into container images, and converting data asset meta-information into SQL statements.
The data asset input piece set and the data asset delivery piece set involved in the delivery piece construction component are described as follows:
First, the data asset input piece set is a set of source material files used to construct the data asset delivery pieces. Typical element types in the set include source code, resource files, configuration files, and the like.
Second, the data asset delivery piece set is a set of output files used to deliver the data asset to a customer or project in productized form. Typical element types in the set include binary files, container images, SQL statements, and the like.
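Because the construction mechanism differs by delivery piece type, one plausible arrangement is a registry that dispatches on the type recorded in the delivery piece information. The sketch below shows such a registry with a single registered mechanism, converting meta-information into an SQL statement. The type string `sql_statement`, the table name `data_asset`, and the column list are assumptions made for illustration; the invention does not prescribe them.

```python
from typing import Callable, Dict

# Registry from delivery piece type to a construction function (hypothetical design).
BUILDERS: Dict[str, Callable[[dict], str]] = {}


def builder(output_type: str):
    """Decorator that registers a construction function for one delivery piece type."""
    def register(fn):
        BUILDERS[output_type] = fn
        return fn
    return register


def sql_literal(value: str) -> str:
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


@builder("sql_statement")
def build_sql(meta: dict) -> str:
    """Convert the basic information of an asset into an INSERT statement."""
    columns = ("id", "name", "type", "version")
    values = ", ".join(sql_literal(meta["general"][c]) for c in columns)
    return f"INSERT INTO data_asset ({', '.join(columns)}) VALUES ({values});"


def build_outputs(meta: dict) -> Dict[str, str]:
    """Construct every delivery piece of an asset with the builder for its type."""
    results = {}
    for piece in meta["outputs"]:
        fn = BUILDERS.get(piece["type"])
        if fn is None:
            raise KeyError(f"no builder registered for type {piece['type']!r}")
        results[piece["name"]] = fn(meta)
    return results
```

Builders for compiled binaries or container images would be registered in the same way and would typically shell out to a compiler or image build tool, which is outside the scope of this sketch.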
Further, in order to better explain the embodiment of fig. 1, the embodiment takes artificial intelligence cloud service software as an example, and provides a data asset meta-information processing system designed for the software. The system is used to manage the meta-information of 3 types of data assets in total, including algorithms, models and data sets, and to build a delivery of these 3 types of data assets.
Table 1 describes, for each of the 3 types of data assets, the input piece set, the delivery piece set, and the construction process of each delivery piece:
TABLE 1
[Table 1 appears as an image in the original and is not reproduced here.]
A detailed description of each component of the data asset meta-information management and delivery part construction system given in the present embodiment is as follows.
(1) The meta-information database is used to store meta-information for algorithms, models, and data sets. The present embodiment implements a meta-information database using a PostgreSQL database, and each type of data asset meta-information is stored in a table in the database. Each field in the table corresponds one-to-one to each field in the data asset meta-information description language.
(2) The management control component is used for managing and accessing the meta-information of the algorithms, models and data sets and their respective delivery piece construction processes, and for providing corresponding interfaces to users or external software. Through the interfaces provided by the component, a user or external software can add, delete, modify and query the meta-information of the algorithms, models and data sets stored in the meta-information database, and can also send a construction instruction to the delivery piece construction component. Whenever a call to these interfaces passes in meta-information of an algorithm, model or data set, the data asset meta-information description file is used as the input data.
(3) The delivery piece construction component is used to generate the delivery pieces of the algorithms, models and data sets. It takes the meta-information description files of the algorithms, models and data sets as rules, the 3 input piece sets listed in Table 1 as inputs, and the 3 delivery piece sets listed in Table 1 as outputs. The component accepts instructions from the management control component; its main workflow is to perform the 3 groups of construction processes listed in Table 1, building the input pieces of the algorithms, models and data sets into delivery pieces.
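As an illustrative, non-limiting sketch of item (1) above, the following shows one way a table per data asset type could be derived from the fields of the description language. The column names are hypothetical (the source only states that table fields correspond one-to-one to description-language fields), and Python's built-in sqlite3 module is used here as a stand-in for the PostgreSQL database of the embodiment so that the sketch runs without a server.

```python
import sqlite3

# Hypothetical field names, loosely following the five attribute categories
# of the meta-information (basic, input piece, delivery piece, dependency,
# extension). The real description language may define different fields.
DESCRIPTION_FIELDS = [
    "name", "version", "basic_info", "input_info",
    "delivery_info", "dependency_info", "extension_info",
]
ASSET_TYPES = ["algorithm", "model", "dataset"]


def create_table_sql(asset_type: str) -> str:
    """Build a CREATE TABLE statement with one column per description field."""
    columns = ", ".join(f"{field} TEXT" for field in DESCRIPTION_FIELDS)
    return f"CREATE TABLE {asset_type} ({columns}, PRIMARY KEY (name, version))"


def init_database(conn: sqlite3.Connection) -> None:
    """Create one table for each type of data asset."""
    for asset_type in ASSET_TYPES:
        conn.execute(create_table_sql(asset_type))


# sqlite3 in-memory database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
init_database(conn)
table_names = sorted(
    row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
```

Keying each table on name and version is a design assumption of this sketch; it reflects the versioned management of data assets described for the system but is not stated as the actual schema.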
Further, based on the data asset meta-information processing system shown in fig. 1, the present embodiment provides a processing method. Referring to fig. 2, which is a schematic flow chart of the processing method based on the data asset meta-information processing system provided by the present invention, the processing method includes the following steps:
Step S10, calling a management control component interface provided by the management control component to add, delete, modify or query the meta-information of the data assets stored in the meta-information database, and returning the result of the operation;
or
Step S20, calling the management control component interface to send a construction instruction to the delivery piece construction component;
and Step S30, the delivery piece construction component responding to the construction instruction by taking the data asset meta-information description file as a rule, taking the data asset input piece set as input piece information, and outputting the data asset delivery piece set as delivery piece information.
It is understood that the processing method can be divided into a meta-information management method of the data asset (corresponding to step S10) and a delivery piece construction method of the data asset (corresponding to steps S20 and S30):
The meta-information management method of the data asset (corresponding to step S10) is used to access and maintain the meta-information of the data asset.
The meta-information management method mainly involves 4 operations: adding, deleting, modifying and querying. The method is performed jointly by the management control component and the meta-information database. The management control component receives the call from a user or external software and checks and converts the meta-information; it then calls the meta-information database, which stores or changes the meta-information; finally, the management control component feeds back the result to the user or the external software.
The flows of the 4 operations of the meta-information management method are described below:
Adding data asset meta-information
This embodiment provides a flow for adding data asset meta-information, in which the management control component interface comprises an add data asset meta-information interface;
correspondingly, the step S10 specifically includes:
the management control component receives the call of a user or external software to the "add data asset meta-information" interface and obtains a data asset meta-information description file;
checking, by the management control component, correctness of the data asset meta-information description file;
retrieving, by the management control component, the meta-information database to detect whether the data asset meta-information description file conflicts with existing data asset meta-information;
if no conflict exists, the management control component converts the data asset meta-information description file into a form accepted by the meta-information database and stores the form into the meta-information database;
if the conflict exists, the management control component reports an error to a user or external software.
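The adding flow above can be sketched as follows. This is a minimal illustration only: the database is modeled as an in-memory dictionary, the description file as an already-parsed dictionary, and the required fields and the conflict rule (same type, name and version already stored) are assumptions not specified by the source.

```python
class MetaInfoError(Exception):
    """Error reported to the user or external software."""


# Hypothetical minimal set of fields a description file must contain.
REQUIRED_FIELDS = ("type", "name", "version")


def add_meta_info(database: dict, description: dict) -> tuple:
    """Validate a parsed description file and store it if no conflict exists."""
    # Check the correctness of the description file.
    missing = [field for field in REQUIRED_FIELDS if field not in description]
    if missing:
        raise MetaInfoError(f"description file lacks fields: {missing}")

    # Detect a conflict with existing meta-information (assumed rule:
    # a record with the same type, name and version is already stored).
    key = (description["type"], description["name"], description["version"])
    if key in database:
        raise MetaInfoError(f"conflict: {key} already exists")

    # Convert to the stored form (here simply a copy) and store it.
    database[key] = dict(description)
    return key
```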
Deleting data asset meta-information
The embodiment provides a process for deleting data asset meta-information, and the management control component interface comprises a data asset meta-information deleting interface;
correspondingly, the step S10 specifically includes:
the management control component receives the call of a user or external software to the delete data asset meta-information interface and obtains the retrieval condition of the meta-information of the data asset;
the management control component retrieves the meta-information database using the retrieval condition and determines whether the meta-information of the data asset to be deleted exists;
if it exists, the management control component calls an interface of the meta-information database to delete the meta-information of the data asset to be deleted from the meta-information database;
if not, the management control component reports an error to a user or external software.
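A minimal sketch of the deleting flow, under the assumption that the retrieval condition is a set of field/value pairs that a stored record must match exactly. The dictionary-based database and the function name are hypothetical.

```python
def delete_meta_info(database: dict, condition: dict) -> int:
    """Delete every record matching the retrieval condition; return the count."""
    # Retrieve the database with the retrieval condition.
    matches = [
        key for key, record in database.items()
        if all(record.get(field) == value for field, value in condition.items())
    ]
    # Report an error if nothing to delete exists.
    if not matches:
        raise LookupError("no data asset meta-information matches the condition")
    # Delete the matching meta-information.
    for key in matches:
        del database[key]
    return len(matches)
```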
Modifying data asset meta-information
The embodiment provides a flow for modifying data asset meta-information, and the management control component interface comprises a data asset meta-information modification interface;
correspondingly, the step S10 specifically includes:
the management control component receives the call of a user or external software to the interface for modifying the data asset meta-information and obtains a data asset meta-information description file;
checking, by the management control component, correctness of the data asset meta-information description file;
the management control component retrieves the meta-information database and determines whether the data asset meta-information description file conflicts with the data asset meta-information stored in the meta-information database;
if no conflict exists, the management control component converts the data asset meta-information description file into a form accepted by the meta-information database so as to update the original record in the meta-information database;
if the conflict exists, the management control component reports an error to a user or external software.
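The modifying flow can be sketched in the same style. The source does not define what counts as a conflict for a modification; this sketch assumes, purely for illustration, that the conflict is the absence of an original record with the same type, name and version. All names are hypothetical.

```python
def modify_meta_info(database: dict, description: dict) -> dict:
    """Update the original record from a parsed description file."""
    # Check the correctness of the description file.
    for field in ("type", "name", "version"):
        if field not in description:
            raise ValueError(f"description file lacks field: {field}")

    # Assumed conflict rule: there must be an original record to update.
    key = (description["type"], description["name"], description["version"])
    if key not in database:
        raise KeyError(key)

    # Merge the new fields into the original record.
    database[key] = {**database[key], **description}
    return database[key]
```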
Querying data asset meta-information
The embodiment provides a process for querying data asset meta-information, wherein the management control component interface comprises a data asset meta-information query interface;
correspondingly, the step S10 specifically includes:
the management control component receives the call of a user or external software to the query data asset meta-information interface and obtains the retrieval condition of the data asset meta-information;
the management control component retrieves the meta-information database using the retrieval condition and determines whether the meta-information of the data asset to be queried exists;
if it exists, the management control component calls an interface of the meta-information database, queries the meta-information of the data asset, and returns it to the user or external software;
if not, the management control component reports an error to a user or external software.
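A minimal sketch of the querying flow, again assuming that a retrieval condition is a set of exact field/value matches and that the database is an in-memory dictionary. Both are illustrative assumptions.

```python
def query_meta_info(database: dict, condition: dict) -> list:
    """Return copies of all records that satisfy the retrieval condition."""
    matches = [
        dict(record) for record in database.values()
        if all(record.get(field) == value for field, value in condition.items())
    ]
    # Report an error if the queried meta-information does not exist.
    if not matches:
        raise LookupError("no data asset meta-information matches the condition")
    return matches
```

Returning copies rather than the stored records keeps callers from changing the database outside the modifying flow; this is a design choice of the sketch, not a requirement stated in the source.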
Further, referring to fig. 3, the delivery piece construction method of the data asset (corresponding to steps S20 and S30) is used for constructing the delivery pieces of a data asset. The delivery piece construction method is executed jointly by the management control component, the meta-information database, and the delivery piece construction component. The management control component receives the call from a user or external software and obtains the meta-information of the data asset to be constructed from the meta-information database; the delivery piece construction component receives the data asset meta-information and executes delivery piece construction and related operations based on the input piece information and delivery piece information in the data asset meta-information; finally, the management control component feeds back the result to the user or external software.
FIG. 3 illustrates the flow for building data asset delivery pieces, in which the management control component interface comprises a build data asset delivery piece interface, an interface of the meta-information database, and an interface of the delivery piece construction component;
correspondingly, the step S20 specifically includes:
step S201, the management control component receives the call of a user or external software to the build data asset delivery piece interface, and obtains the retrieval condition of the data asset to be constructed;
step S202, the management control component calls an interface of the meta-information database to query the meta-information of the data asset to be constructed;
step S203, the management control component calls an interface of the delivery piece construction component, passes in the meta-information of the data asset to be constructed, and starts the construction flow;
correspondingly, the step S30 specifically includes:
step S301, the delivery piece construction component downloads each input piece according to the input piece information in the meta-information of the data asset;
step S302, the delivery piece construction component calls the corresponding construction process according to the delivery piece information in the meta-information of the data asset to construct each delivery piece;
step S303, the delivery piece construction component uploads each delivery piece according to the delivery piece information in the meta-information of the data asset;
step S304, the management control component returns the construction result.
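The construction flow above can be sketched as follows. This is an illustration under stated assumptions: the meta-information is a dictionary with hypothetical "inputs" and "deliveries" lists, each entry carrying a hypothetical "source" or "target" address, and the download, construct and upload operations are supplied by the caller because the source leaves their implementation open.

```python
from typing import Callable


def run_build(meta_info: dict,
              download: Callable[[str], str],
              construct: Callable[[dict, list], str],
              upload: Callable[[str, str], None]) -> str:
    """Delivery piece construction component: download, construct, upload."""
    # Download each input piece.
    local_inputs = [download(piece["source"]) for piece in meta_info["inputs"]]
    # Construct each delivery piece with its construction process.
    built = [(construct(piece, local_inputs), piece["target"])
             for piece in meta_info["deliveries"]]
    # Upload each delivery piece to its target address.
    for artifact, target in built:
        upload(artifact, target)
    return "successful"


def handle_build_request(database: dict, condition: dict,
                         download, construct, upload) -> str:
    """Management control component: look up meta-information, start the build."""
    # Query the meta-information of the data asset to be constructed.
    key = (condition["type"], condition["name"], condition["version"])
    meta_info = database.get(key)
    if meta_info is None:
        return "failed: data asset not found"
    # Pass the meta-information to the construction component and
    # return its result to the caller.
    return run_build(meta_info, download, construct, upload)
```

Passing the three operations in as callables keeps the orchestration independent of any particular storage system or construction tool, which matches the statement that the concrete construction method is not restricted.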
It should be noted that, in step S302, the construction method may differ from one delivery piece to another according to its type. The specific construction method of a delivery piece is determined by its own technical features and is not restricted by the present invention. Each type of delivery piece generally has its own recognized, inherent construction method, so the specific construction method need not be specified in the methods and systems of the present invention. The entity that actually executes the construction operation for each type of delivery piece may be the delivery piece construction component itself, or external software called by the delivery piece construction component.
For example, for binary-type delivery pieces, the construction method is typically to compile source code. When constructing such a delivery piece, the delivery piece construction component calls an external compiler to compile the source code serving as the input piece, thereby generating a binary program as the delivery piece.
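One way to organize type-specific construction methods is a registry that maps each delivery piece type to a builder, sketched below. The type name "binary", the compiler command cc and the function names are assumptions for illustration; the command is run through Python's standard subprocess.run, and the runner can be replaced so that the dispatch logic can be exercised without a compiler.

```python
import subprocess


def build_binary(source_path: str, output_path: str, runner=subprocess.run) -> str:
    """Compile source code into a binary by calling an external compiler."""
    # "cc" is a placeholder compiler; a real system would take the tool
    # and its options from configuration or from the meta-information.
    command = ["cc", source_path, "-o", output_path]
    runner(command, check=True)  # raises CalledProcessError on failure
    return output_path


# Registry of construction methods keyed by delivery piece type.
BUILDERS = {"binary": build_binary}


def build_delivery_piece(piece_type: str, source_path: str,
                         output_path: str, **kwargs) -> str:
    """Dispatch to the construction method registered for the piece type."""
    builder = BUILDERS.get(piece_type)
    if builder is None:
        raise ValueError(f"no construction method for type: {piece_type}")
    return builder(source_path, output_path, **kwargs)
```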
Further, to better explain the flow of building data asset delivery pieces given in fig. 3, the present embodiment takes artificial intelligence cloud service software as an example and gives a delivery piece construction method designed for that software. The method is used to build the delivery pieces of 3 types of data assets: algorithms, models and data sets. The input piece set, the delivery piece set, and the construction process of each delivery piece for these 3 types of data assets are described in Table 1.
The following describes a specific workflow of the delivery part construction method, taking the delivery part construction process of the data asset of "FastRCNN model (version 1.0)" as an example.
(1) The management control component receives the user's call to the "build data asset delivery piece" interface and obtains the retrieval condition of the data asset to be constructed. The request message passed in by the interface call in this example is as follows. The retrieval condition it describes is: a data asset named "FastRCNN", of type "model", version "1.0".
[Request message listing: image BDA0002889253320000191 in the original publication, not reproduced here.]
(2) The management control component calls the interface of the meta-information database to query the meta-information of the data asset to be constructed. The meta-information of the data asset in this example is expressed in the data asset meta-information description language as follows. It describes the two input pieces required to construct the FastRCNN model, namely the respective download addresses of the FastRCNN source code and the COCO data set, and the two delivery pieces expected as output, namely the respective upload addresses of the FastRCNN model and of the SQL statements submitted to the software database.
[Meta-information description file listing: images BDA0002889253320000201 and BDA0002889253320000211 in the original publication, not reproduced here.]
(3) The management control component calls the interface of the delivery piece construction component, passes in the meta-information of the data asset to be constructed, and starts the construction flow. The meta-information passed in this example is the data asset meta-information description file given in step 2.
(4) The delivery piece construction component downloads each input piece according to the input piece information in the data asset meta-information. In this example, the delivery piece construction component needs to download the two input pieces required by the FastRCNN model: the FastRCNN source code and the COCO data set. Their download addresses are https://git.example.com/algorithm/fastcrnn.git in the git source code repository and nfs.example.com:/dataset/coco in the network file system, respectively.
(5) The delivery piece construction component calls the corresponding construction process according to the delivery piece information in the data asset meta-information to construct each delivery piece. In this example, the delivery piece construction component needs to construct the two delivery pieces corresponding to the FastRCNN model: the FastRCNN model and the SQL statements submitted to the software database. Their construction processes are as follows:
FastRCNN model: run the downloaded FastRCNN source code (namely, a model training program), load the COCO data set as training input data, execute the model training process, and finally generate the FastRCNN model.
SQL statements submitted to the software database: run external parsing software for the data asset meta-information description language, which reads in the data asset meta-information description file given in step 2 and translates it into semantically equivalent SQL statements, namely the SQL statements submitted to the software database.
(6) The delivery piece construction component uploads each delivery piece according to the delivery piece information in the data asset meta-information. In this example, the delivery piece construction component needs to upload the two delivery pieces corresponding to the FastRCNN model: the FastRCNN model and the SQL statements submitted to the software database. Their upload addresses are oss://oss.example.com/model/fastrcnn in the object storage system and nfs.example.com:/sql/model/fastrcnn in the network file system, respectively.
(7) The management control component returns the construction result to the user or the external software, and the operation ends. In this example, the returned construction result is "successful".
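The translation of a description file into SQL statements mentioned in step (5) can be illustrated with the sketch below. The source does not give the generated SQL, the target table or the parsing software, so the table name, the field names and the %s placeholder style (an assumption about the database driver) are all hypothetical. The description file is represented as an already-parsed dictionary.

```python
import json


def to_insert_statement(table: str, meta_info: dict) -> tuple:
    """Translate parsed meta-information into a parameterized INSERT statement."""
    # Sort the fields so that the generated statement is deterministic.
    columns = sorted(meta_info)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    # Strings are passed through; structured values are serialized as JSON.
    params = tuple(
        value if isinstance(value, str) else json.dumps(value)
        for value in (meta_info[column] for column in columns)
    )
    return sql, params


# Hypothetical parsed description of the example data asset.
example = {
    "name": "FastRCNN",
    "version": "1.0",
    "inputs": ["nfs.example.com:/dataset/coco"],
}
sql, params = to_insert_statement("model", example)
```

Keeping the values in a separate parameter tuple, rather than formatting them into the statement text, lets the database driver handle quoting when the statements are submitted.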
The invention has the beneficial effects that: the problem of data asset meta-information management and delivery piece construction in various types of software is solved, so that the software development workflow is improved, and the development and maintenance efficiency of the system, the platform and the service software is improved.
In particular, the invention addresses the problems that software developers encounter when managing and constructing data assets, especially when processing the meta-information and delivery pieces of data assets. Specifically:
Effect 1: aiming at the problems that data asset meta-information has varied storage formats and storage locations and is complex to manage, the invention provides a standardized data asset meta-information description language as the standard storage format and adopts a meta-information database as the centralized storage location, thereby reducing the complexity of meta-information management.
Effect 2: aiming at the problems that the dependency relationship between data assets or between the data assets and external resources is complex and manual maintenance is easy to make mistakes, the invention designs the 'dependency information' related field in the meta-information description language of the data assets, so that the dependency relationship between the data assets or the external resources can be effectively recorded, and the manual maintenance cost is avoided.
Effect 3: aiming at the problems that the data asset requirements of different release versions of software are different and are easy to be confused, the invention designs 'version number' related fields in a plurality of parts of a data asset meta-information description language, supports the efficient management of different versions of the same data asset in a versioning mode, and different versions can have independent input pieces, delivery pieces and dependency information without being confused.
Effect 4: aiming at the problems of complicated and time-consuming construction process of data asset delivery parts and low automation degree, the invention provides a standard meta-information management method and a delivery part construction method, which shield the construction process difference among different data assets as much as possible and are convenient to realize in a unified and automatic mode in a computer system and a software engineering process.
Further, the present invention is not limited to the above-mentioned embodiments, and it will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the principle of the present invention, and these modifications and improvements are also considered to be within the scope of the present invention. Those not described in detail in this specification are within the skill of the art.

Claims (9)

1. A data asset meta-information processing system is characterized in that the system comprises a management control component, a delivery member construction component and a meta-information database;
the meta-information database is used for storing meta-information of the data assets; the meta-information attribute category of the data asset at least comprises basic information, input piece information, delivery piece information, dependency information and extension information; wherein, the basic information is used for describing the inherent basic necessary attributes of the data asset; the input piece information is used for describing the attributes of each input piece required for generating the delivery piece of the data asset; the delivery piece information is used for describing the attributes of each delivery piece generated by the data asset; the dependency information is used for describing a list of other data assets or external resources on which the data asset needs to depend when the data asset works normally; the extended information is used for describing other attributes related to functions specific to the data assets;
each field in a meta-information data table storing meta-information of the data assets corresponds to each field in a data asset meta-information description language, wherein the data asset meta-information description language is used for formally describing the attributes of one or more data assets, so that the meta-information of the data assets has the functions of structured storage and processing; wherein the meta information data table is a data table in the meta information database;
the management control component is used for managing and accessing data asset meta-information and a delivery piece construction process and providing a management control component interface for a user instruction or external software, wherein the user instruction or the external software calls the management control component interface to perform addition, deletion, modification and check operations on the meta-information of the data asset stored in the meta-information database, or the user instruction or the external software calls the management control component interface to send a construction instruction to the delivery piece construction component; when a user or external software calls the management control component interface to transmit the meta-information of the data asset, the meta-information description file of the data asset is used as input data, wherein the meta-information description file of the data asset represents a computer file written in the meta-information description language of the data asset;
and the delivery part construction component is used for responding to a construction instruction sent by the user or external software, taking the data asset meta-information description file as a rule, taking the data asset input part set as input part information, and taking the data asset delivery part set as delivery part information to output.
2. The system of claim 1, wherein the interface type of the management control component interface is at least one of a REST API, a command line, and other forms of interfaces available in a computer system.
3. The processing method based on the data asset meta-information processing system is characterized in that the data asset meta-information processing system comprises a management control component, a delivery piece construction component and a meta-information database; each field in a meta-information data table storing meta-information of the data assets corresponds to each field in a data asset meta-information description language, wherein the data asset meta-information description language is used for formally describing the attributes of one or more items of data assets, so that the meta-information of the data assets has the functions of structured storage and processing; wherein, the meta-information data table is a data table in the meta-information database; the meta-information database is used for storing meta-information of the data assets; the meta-information attribute category of the data asset at least comprises basic information, input piece information, delivery piece information, dependency information and extension information; the basic information is used for describing the inherent basic necessary attributes of the data asset; the input piece information is used for describing the attributes of each input piece required for generating the delivery piece of the data asset; the delivery piece information is used for describing the attributes of each delivery piece generated by the data asset; the dependency information is used for describing a list of other data assets or external resources on which the data asset needs to depend when the data asset works normally; the extended information is used for describing other attributes related to functions specific to the data assets;
the management control component is used for managing and accessing data asset meta-information and a delivery piece construction process and providing a management control component interface for a user instruction or external software, wherein the user instruction or the external software calls the management control component interface to perform addition, deletion, modification and check operations on the meta-information of the data asset stored in the meta-information database, or the user instruction or the external software calls the management control component interface to send a construction instruction to the delivery piece construction component; when a user or external software calls the management control component interface to transmit the meta-information of the data asset, the meta-information description file of the data asset is used as input data, wherein the meta-information description file of the data asset represents a computer file written in the meta-information description language of the data asset;
correspondingly, the processing method comprises the following steps:
calling the management control component interface to send a construction instruction to the delivery member construction component;
and responding to the construction instruction by the delivery member construction component, taking the data asset meta-information description file as a rule, taking the data asset input member set as input member information, and taking the data asset delivery member set as delivery member information for outputting.
4. The process of claim 3, wherein the management control component interface comprises a build data asset delivery component interface, an interface of the meta-information database, and an interface of the delivery component;
correspondingly, the step of calling the management control component interface to send a building instruction to the delivered part building component specifically includes:
the management control component receives the call of a user or external software to the construction data asset delivery part interface to obtain the retrieval condition of the data asset to be constructed;
calling an interface of the meta-information database by the management control component, and inquiring the meta-information of the data asset to be constructed;
calling an interface of the delivery part construction component by the management control component, transmitting meta-information of the data asset to be constructed, and starting a construction flow to obtain a construction process;
correspondingly, the step of responding to the building instruction by the delivery part building component, using the data asset meta-information description file as a rule, using the data asset input part set as input part information, and using the data asset delivery part set as delivery part information to output specifically includes:
downloading each input piece by the delivery piece construction component according to the input piece information in the meta information of the data assets respectively;
calling a corresponding construction process by the delivery member construction component according to the delivery member information in the meta information of the data asset, and respectively constructing each delivery member;
the delivery member construction component uploads each delivery member according to the delivery member information in the data asset meta-information;
returning, by the management control component, a build result.
5. The process of claim 4, wherein the process further comprises:
and calling a management control component interface provided by the management control component to perform the operation of increasing, deleting, modifying and checking the meta information of the data assets stored in the meta information database, and returning the operation result of increasing, deleting, modifying and checking.
6. The process of claim 5, wherein the management control component interface comprises an add-on data asset meta-information interface;
correspondingly, the step of calling the management control component interface provided by the management control component to perform the operations of adding, deleting, modifying and checking the meta information of the data asset stored in the meta information database and returning the operation results of adding, deleting, modifying and checking includes the following steps:
the management control component receives the call of a user or external software to a newly added data asset meta-information interface to obtain a data asset meta-information description file;
checking, by the management control component, correctness of the data asset meta-information description file;
retrieving, by the management control component, the meta-information database to detect whether the data asset meta-information description file conflicts with existing data asset meta-information;
if no conflict exists, the management control component converts the data asset meta-information description file into a form accepted by the meta-information database and stores the form into the meta-information database;
if the conflict exists, the management control component reports an error to a user or external software.
7. The process of claim 5, wherein the management control component interface comprises a delete data asset meta information interface;
correspondingly, the step of calling the management control component interface provided by the management control component to perform the operations of adding, deleting, modifying and checking the meta information of the data asset stored in the meta information database and returning the operation results of adding, deleting, modifying and checking includes the following steps:
receiving the call of a user or external software to a deleted data asset meta-information interface by the management control component to obtain the retrieval condition of the meta-information of the data asset;
retrieving, by the management control component, the meta-information database using retrieval conditions of the meta-information of the data asset, detecting existence of meta-information of the data asset to be deleted, and determining whether the meta-information of the data asset to be deleted exists;
if the data assets exist, the management control component calls an interface of the meta-information database, and the meta-information of the data assets to be deleted is deleted from the meta-information database;
if not, the management control component reports an error to a user or external software.
8. The process of claim 5, wherein the management control component interface comprises a modify data asset meta information interface;
correspondingly, the step of calling the management control component interface provided by the management control component to perform the operations of adding, deleting, modifying and checking the meta information of the data asset stored in the meta information database and returning the operation results of adding, deleting, modifying and checking includes the following steps:
the management control component receives the call of a user or external software to the interface for modifying the data asset meta-information and obtains a data asset meta-information description file;
checking, by the management control component, correctness of the data asset meta-information description file;
retrieving, by the management control component, the meta-information database, detecting whether the data asset meta-information description file conflicts with data asset meta-information stored in the meta-information database, and determining whether a conflict exists;
if no conflict exists, the management control component converts the data asset meta-information description file into a form accepted by the meta-information database so as to update the original record in the meta-information database;
if the conflict exists, the management control component reports an error to a user or external software.
9. The process of claim 5, wherein the management control component interface comprises an interface for querying data asset meta-information;
correspondingly, the step of calling the management control component interface provided by the management control component to add, delete, modify and query the meta-information of the data assets stored in the meta-information database, and of returning the results of those operations, comprises the following steps:
the management control component receives a call from a user or external software to the interface for querying data asset meta-information and obtains the retrieval conditions for the data asset meta-information;
the management control component searches the meta-information database using the retrieval conditions to determine whether the meta-information of the data asset to be queried exists;
if it exists, the management control component calls an interface of the meta-information database, queries the meta-information of the data asset to be queried, and returns it to the user or external software;
if it does not exist, the management control component reports an error to the user or external software.
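The query steps recited in claim 9 can be sketched in the same way. This is an illustrative Python sketch under stated assumptions, not the patented implementation: the retrieval conditions are assumed to be a mapping from field name to exact value, the searchable fields (`id`, `name`, `version`) and the table name `asset_meta` are hypothetical, and an in-memory SQLite database stands in for the meta-information database.

```python
import json
import sqlite3

# Assumption: these are the only fields a caller may search on.
QUERYABLE_FIELDS = ("id", "name", "version")


class MetaInfoNotFound(Exception):
    """Error reported to the user or external software when nothing matches."""


def make_demo_db():
    # Assumption: an in-memory SQLite table stands in for the meta-information database.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE asset_meta ("
        "id TEXT PRIMARY KEY, name TEXT, version TEXT, body TEXT)"
    )
    return db


def query_asset_meta(db, conditions):
    # Step 1: obtain the retrieval conditions and reject unsupported fields.
    unknown = set(conditions) - set(QUERYABLE_FIELDS)
    if not conditions or unknown:
        raise ValueError(f"unsupported retrieval conditions: {sorted(unknown)}")
    # Field names come from the whitelist above; values are bound as parameters.
    where = " AND ".join(f"{field} = ?" for field in conditions)
    params = tuple(conditions.values())

    # Step 2: check whether the meta-information to be queried exists.
    found = db.execute(
        f"SELECT EXISTS(SELECT 1 FROM asset_meta WHERE {where})", params
    ).fetchone()[0]
    if not found:
        # Step 4: nothing matched, so report an error.
        raise MetaInfoNotFound(f"no data asset matches {dict(conditions)}")

    # Step 3: query the matching records and return them to the caller.
    rows = db.execute(
        f"SELECT body FROM asset_meta WHERE {where} ORDER BY id", params
    ).fetchall()
    return [json.loads(body) for (body,) in rows]
```

The separate existence check follows the order of steps in the claim; in practice a single query whose empty result triggers the error would be equivalent.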
CN202110023049.9A 2021-01-08 2021-01-08 Data asset meta-information processing system and method Active CN112685425B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110023049.9A CN112685425B (en) 2021-01-08 2021-01-08 Data asset meta-information processing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110023049.9A CN112685425B (en) 2021-01-08 2021-01-08 Data asset meta-information processing system and method

Publications (2)

Publication Number Publication Date
CN112685425A (en) 2021-04-20
CN112685425B (en) 2022-06-17

Family

ID=75456519

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202110023049.9A Active CN112685425B (en) 2021-01-08 2021-01-08 Data asset meta-information processing system and method

Country Status (1)

Country Link
CN (1) CN112685425B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008020932A2 (en) * 2006-07-10 2008-02-21 Elephantdrive, Llc Abstracted and optimized online backup and digital asset management service
CN102365634A (en) * 2009-01-30 2012-02-29 伊斯曼柯达公司 System for managing distributed assets and metadata
CN102968685A (en) * 2012-10-26 2013-03-13 广东电子工业研究院有限公司 Account information asset management system and method thereof
CN105404974A (en) * 2015-12-01 2016-03-16 国网江西省电力公司信息通信分公司 Data capitalization method and apparatus and management platform
CN106202452A (en) * 2016-07-15 2016-12-07 复旦大学 The uniform data resource management system of big data platform and method
CN110163458A (en) * 2018-02-23 2019-08-23 徐峰 Data assets management and monitoring method based on artificial intelligence technology
CN111506779A (en) * 2020-04-20 2020-08-07 东云睿连(武汉)计算技术有限公司 Object version and associated information management method and system facing data processing
CN111813378A (en) * 2020-07-08 2020-10-23 北京迪力科技有限责任公司 Code base construction system, method and related device
CN111966866A (en) * 2020-08-11 2020-11-20 福建博思数字科技有限公司 Data asset management method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Ontology-based semantic retrieval in software asset management"; Guo Jia et al.; Computer Engineering (《计算机工程》); 2009-07-20; pp. 93-95, 113 *

Also Published As

Publication number Publication date
CN112685425A (en) 2021-04-20

Similar Documents

Publication Publication Date Title
US10127250B2 (en) Data transformation system, graphical mapping tool and method for creating a schema map
US8918447B2 (en) Methods, apparatus, systems and computer readable mediums for use in sharing information between entities
US9940108B2 (en) Automated merging in a software development environment
US9201558B1 (en) Data transformation system, graphical mapping tool, and method for creating a schema map
US7054858B2 (en) System and method for retrieval of objects from object to relational mappings
US20070050762A1 (en) Build optimizer tool for efficient management of software builds for mobile devices
US20080098037A1 (en) Markup language based database upgrades
US10726040B2 (en) Lossless conversion of database tables between formats
JP2021530766A (en) Issuance to data warehouse
CN104965735A (en) Apparatus for generating upgrade SQL script
US20150254073A1 (en) System and Method for Managing Versions of Program Assets
US8707260B2 (en) Resolving interdependencies between heterogeneous artifacts in a software system
Alonso et al. Towards a polyglot data access layer for a low-code application development platform
CN112988280B (en) Configuration data processing method and device
CN108416035B (en) Disconf-based unified management method for database mapping files
CN112685425B (en) Data asset meta-information processing system and method
US9244706B2 (en) Command line shell command generation based on schema
CN112486990B (en) Method and equipment for describing synchronous database table structure according to model
EP1881420B1 (en) Mark Up Language Based Database Upgrades
CN117348916B (en) Script generation method, device, equipment and storage medium
US20230280990A1 (en) System and method for source code translation using an intermediate language
CN103678616A (en) Obtaining method and system for product structure information of product data management system
CN112905153B (en) Software parallel construction method and device for software defined satellite
CN115543960B (en) Dynamic modeling method and system for business object
CN117632133A (en) Dependency management method, device, equipment and storage medium based on net

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant