CN112906907B - Method and system for layering management and distribution of machine learning pipeline model - Google Patents
- Publication number
- CN112906907B (application CN202110313978.3A)
- Authority
- CN
- China
- Prior art keywords
- model
- machine learning
- mirror image
- pipeline
- warehouse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/61—Installation
- G06F8/63—Image based installation; Cloning; Build to order
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a method and system for hierarchical management and distribution of machine learning pipeline models. The system comprises a model management client and a model image registry. The model management client comprises a command-line management tool module and a Docker backend module. The command-line management tool module provides tools for building, uploading, and downloading machine learning pipeline models from the command line. The Docker backend module provides an API that receives requests from the command-line tool, supports users in managing machine learning pipeline models as custom images, and dispatches each request to the appropriate module for execution. The model image registry supports uploading and downloading of pipeline models by users, performs layered storage management of model files and model attributes on the server side, and supports an image-registry scheme conforming to the OCI standard.
Description
Technical Field
The invention relates to the field of model repositories in computer storage systems, and in particular to a method and system for hierarchical management and distribution of machine learning pipeline models.
Background
Machine learning model distribution systems in industry generally follow one scheme: a model repository built on a file system or object storage system, with a single model file as the basic storage object. The user uploads the model to the repository through an SDK or a UI. After upload, the repository stores the model, or the model's metadata, in a storage backend that it maintains itself. When the model is needed for inference, the user downloads it through the SDK or interface provided by the repository and runs the inference service. This approach does not distinguish between a single model and a pipeline model (i.e., a combination of several models arranged in a workflow).
In practice, this approach stores the pipeline model as a single model file, which cannot meet the flexibility requirements of pipeline model training and deployment:
a) The user may want to reorganize and recombine the models in the pipeline into a new pipeline model;
b) The user may want to extract one or more models from the pipeline and retrain or deploy them separately.
in addition, complex pipeline models present the following performance challenges:
c) Slower pipeline model upload and download speeds;
d) Higher model persistence storage costs.
Disclosure of Invention
The invention aims to overcome the shortcomings of the existing scheme, namely inflexible management and distribution of pipeline models and poor performance, by providing a method and system for hierarchical management and distribution of machine learning pipeline models. Borrowing the way an image registry distributes images, the method stores and distributes machine learning pipeline models so that each machine learning model in the pipeline is stored as a single layer of the image's file system. The computational relationship graph of the pipeline is defined with a DAG; the DAG is then serialized and stored as a model of its own (the DAG model), preserving the pipeline's computational relationships, and this DAG model is stored as a separate, topmost layer in the file system.
The invention is realized by the following technical scheme:
a method of machine learning pipeline model hierarchical management and distribution, comprising the steps of:
s1: obtaining a plurality of machine learning models, defining pipeline relations among the machine learning models by using a DAG, and obtaining the DAG model through serialization operation;
s2: customizing a configuration file;
s3: generating a mirror image construction script according to the self-defined configuration file;
s4: constructing the plurality of machine learning models and the DAG model according to the mirror image construction script, and generating a Docker mirror image of a machine learning pipeline model and a corresponding workpiece type file;
s5: and pushing the Docker mirror image of the machine learning pipeline model and the corresponding workpiece type file to a model mirror image warehouse.
By borrowing the way an image registry distributes images, the method stores and distributes machine learning pipeline models so that each machine learning model in the pipeline is stored as a single layer of the image's file system; the computational relationship graph of the pipeline is defined with a DAG, which is serialized and stored as the DAG model, and the DAG model is stored as a separate, topmost layer. Complex pipeline models can thus be shared among data scientists while keeping training and serving consistent across parties. Users can build and test arbitrarily complex models on the basis of one another's work, which also opens the door to the more complex model structures required by ensemble learning, multi-task learning, and federated learning, and can dynamically perform model operations and custom evaluations.
Further, the DAG is used to define the pipeline relationship among the plurality of machine learning models; the DAG that invokes the models is written in a programming language and is stored as a model through a serialization operation.
Further, the hierarchical relationship of the plurality of machine learning models and the DAG model within the Docker image to be built is specified through the custom configuration file.
Further, the custom configuration file specifically includes: the read locations of the machine learning models; the constraint that each image layer of the Docker image contains exactly one machine learning model; and the build order and build level of the image layers corresponding to the machine learning models and to the DAG model. That is, the n machine learning models correspond in order to layer 1, layer 2, ..., layer n-1, layer n, and the configuration file must also describe the build order and build level of the image layer corresponding to the DAG model, i.e., layer n+1, located at the very top.
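A minimal sketch of such a configuration file, expressed as JSON-compatible data (the field names `models`, `path`, `layer`, and `dag_model` are illustrative assumptions, not the patent's actual schema):

```python
import json

# Hypothetical configuration: read locations, one model per layer,
# and the DAG model pinned to the top layer n + 1.
config = {
    "models": [
        {"name": "preprocess", "path": "./models/preprocess.pkl", "layer": 1},
        {"name": "classifier", "path": "./models/classifier.pkl", "layer": 2},
    ],
    "dag_model": {"path": "./models/dag.pkl", "layer": 3},
}

# Sanity check the layering rule described in the text:
# n model layers below, DAG model at layer n + 1.
n = len(config["models"])
assert config["dag_model"]["layer"] == n + 1

print(json.dumps(config, indent=2))
```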
Further, the specific process of generating the image build script includes: after the model management client reads and parses the configuration file, it generates an image build script conforming to the Docker specification.
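Because each Docker `COPY` instruction produces one image layer, script generation can be sketched as emitting one `COPY` per model in layer order, with the DAG model last. This is a sketch under assumptions: the function name and the configuration layout are hypothetical, not the patent's implementation.

```python
# Turn a parsed configuration into a Dockerfile-style build script.
# One COPY per model (in layer order) yields one image layer per model;
# the DAG model is copied last, becoming the topmost layer.
def generate_build_script(config):
    lines = ["FROM scratch"]
    for model in sorted(config["models"], key=lambda m: m["layer"]):
        lines.append(f"COPY {model['path']} /models/{model['name']}")
    lines.append(f"COPY {config['dag_model']['path']} /models/dag")
    return "\n".join(lines)

config = {
    "models": [{"name": "classifier", "path": "classifier.pkl", "layer": 2},
               {"name": "preprocess", "path": "preprocess.pkl", "layer": 1}],
    "dag_model": {"path": "dag.pkl", "layer": 3},
}
print(generate_build_script(config))
```

Sorting by the configured layer number means the order of entries in the configuration file does not affect the resulting layer hierarchy.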
Further, the plurality of machine learning models and the DAG model are built according to the image build script; the model management client builds a Docker image of the machine learning pipeline model conforming to the Docker specification and, by extending the Docker backend functionality, generates an artifact Manifest file of the pipeline model conforming to the OCI standard.
Further, the structure of the Manifest file follows the OCI distribution specification. The Manifest is a JSON file comprising two parts: a Config part and a Layers part. The Config part records the image's configuration (its metadata) and is used to display information in a registry UI and to distinguish builds for different operating systems. The Layers part consists of multiple layers whose mediaType is application/vnd.oci.image.layer.v1.tar in the OCI standard. The Config and each entry of Layers are stored in the model image registry as Blobs, with the digest of the pipeline model image serving as the key. The image registry provides a low-latency infrastructure for pipeline model management and distribution and can substantially reduce model storage space.
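The Manifest structure described here can be illustrated by constructing an OCI-style manifest in code. Digests follow the registry convention of "sha256:" plus the SHA-256 hex of the blob content; the layer bytes below are placeholders, not real model layers.

```python
import hashlib
import json

# Content-addressed digest, as registries use for Blob keys.
def digest(blob: bytes) -> str:
    return "sha256:" + hashlib.sha256(blob).hexdigest()

# Placeholder blob contents: two model layers plus the DAG-model layer.
layers = [b"model-layer-1", b"model-layer-2", b"dag-model-layer"]
config_blob = json.dumps({"pipeline": "demo"}).encode()

manifest = {
    "schemaVersion": 2,
    "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "size": len(config_blob),
        "digest": digest(config_blob),
    },
    "layers": [
        {"mediaType": "application/vnd.oci.image.layer.v1.tar",
         "size": len(b), "digest": digest(b)}
        for b in layers
    ],
}
print(json.dumps(manifest, indent=2))
```

Because every Blob is keyed by its content digest, layers shared between pipeline images are stored only once, which is where the storage savings come from.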
Further, pushing the Docker image of the machine learning pipeline model to the model image registry specifically includes: the model management client interacts with the registry through a push command and uploads the pipeline model image; the Config and each entry of Layers are stored in the registry as Blobs, with the digest of the pipeline model image serving as the key.
Further, the model management client pulls the Docker image of the machine learning pipeline model from the model image registry, specifically: the client interacts with the registry through a pull command to download the pipeline model image; after requesting the Manifest from the registry, the client downloads all Blobs (the Config and all Layers) in parallel using their digests.
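The parallel pull described here might be sketched as follows; `fetch_blob` is a hypothetical stand-in for an HTTP GET against the registry's blob endpoint (`/v2/<name>/blobs/<digest>` in the Docker Registry API), not a real network call.

```python
import concurrent.futures

# Stand-in for downloading one blob by digest from the registry.
def fetch_blob(digest: str) -> bytes:
    return f"content-of-{digest}".encode()  # hypothetical transfer

# Pull flow: read digests from the Manifest, then download the Config
# blob and every layer blob in parallel.
def pull_image(manifest: dict) -> dict:
    digests = [manifest["config"]["digest"]]
    digests += [layer["digest"] for layer in manifest["layers"]]
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        blobs = list(pool.map(fetch_blob, digests))
    return dict(zip(digests, blobs))

manifest = {"config": {"digest": "sha256:aaa"},
            "layers": [{"digest": "sha256:bbb"}, {"digest": "sha256:ccc"}]}
blobs = pull_image(manifest)
print(sorted(blobs))  # ['sha256:aaa', 'sha256:bbb', 'sha256:ccc']
```

Downloading blobs concurrently rather than sequentially is what gives the layered scheme its speed advantage over transferring one monolithic model file.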
A system for hierarchical management and distribution of machine learning pipeline models comprises a model management client and a model image registry; the model management client comprises a command-line management tool module and a Docker backend module.
The model management client pulls the Docker image of a machine learning pipeline model from the model image registry, and interacts with the registry through a push command to upload a pipeline model image to it.
The command-line management tool module provides tools for building, uploading, and downloading machine learning pipeline models from the command line.
The Docker backend module provides an API that receives requests from the command-line tool, supports users in managing machine learning pipeline models as custom images, and dispatches each request to the appropriate module for execution.
The model image registry provides the Docker Registry API to receive requests from the model management client, supports users in uploading and downloading pipeline models, performs layered storage management of model files and model attributes on the server side, and supports an image-registry scheme conforming to the OCI standard.
Further, the model management client supports the OCI standard and performs layered storage management of the model files and model attributes it holds.
Further, the model management client pulls the Docker image of the machine learning pipeline model from the model image registry, specifically: the client interacts with the registry through a pull command to download the pipeline model image; after requesting the Manifest file from the registry, the client downloads all Blobs (the Config part and the Layers part) in parallel using their digests.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention relates to a method and a system for hierarchically managing and distributing machine learning pipeline models, which support the storage and distribution of the machine learning pipeline models by referring to a mirror image warehouse distribution mirror image mode, so that each machine learning model in the pipeline is stored as a single layer in a file system in the mirror image; the computational relationship graph of the machine-learned pipeline model is defined with the DAG, which is then named as the model of the DAG to store the computational relationship of the pipeline, and the DAG model is stored as the uppermost separate layer in the file system. The invention can lead data scientists to share complex pipeline models and ensure the consistency of multiparty training and service. And can build and test any complex model on the basis of the work of the users, which also provides the possibility for more complex model structures required by the integrated learning, the multitasking learning and the federal learning technologies, and simultaneously enables the users to dynamically realize model operation and custom evaluation. Moreover, the invention uses mirroring to encapsulate the pipeline model, provides a low-latency pipeline model management and distribution infrastructure with a mirrored warehouse, and can greatly save the space for model storage.
Drawings
The accompanying drawings, which are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention. In the drawings:
FIG. 1 is a system frame diagram of the present invention;
FIG. 2 is a flow chart of the system of the present invention.
Detailed Description
For the purpose of making apparent the objects, technical solutions and advantages of the present invention, the present invention will be further described in detail with reference to the following examples and the accompanying drawings, wherein the exemplary embodiments of the present invention and the descriptions thereof are for illustrating the present invention only and are not to be construed as limiting the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that: no such specific details are necessary to practice the invention. In other instances, well-known structures, circuits, materials, or methods have not been described in detail in order not to obscure the invention.
Throughout the specification, references to "one embodiment," "an embodiment," "one example," or "an example" mean: a particular feature, structure, or characteristic described in connection with the embodiment or example is included within at least one embodiment of the invention. Thus, the appearances of the phrases "in one embodiment," "in an example," or "in an example" in various places throughout this specification are not necessarily all referring to the same embodiment or example. Furthermore, the particular features, structures, or characteristics may be combined in any suitable combination and/or sub-combination in one or more embodiments or examples. Moreover, those of ordinary skill in the art will appreciate that the illustrations provided herein are for illustrative purposes and that the illustrations are not necessarily drawn to scale. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
In the description of the present invention, it should be understood that the terms "front", "rear", "left", "right", "upper", "lower", "vertical", "horizontal", "high", "low", "inner", "outer", etc. indicate orientations or positional relationships based on the drawings, are merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the scope of the present invention.
Embodiment one:
As shown in FIG. 2, a method for hierarchical management and distribution of machine learning pipeline models includes the following steps:
S1: obtain a plurality of machine learning models, define the pipeline relationship among them using a DAG, and obtain the DAG model through a serialization operation;
S2: customize a configuration file;
S3: generate an image build script according to the custom configuration file;
S4: build the plurality of machine learning models and the DAG model according to the image build script, generating a Docker image of the machine learning pipeline model and a corresponding artifact Manifest file;
S5: push the Docker image of the machine learning pipeline model and the corresponding artifact Manifest file to a model image registry.
By borrowing the way an image registry distributes images, the method stores and distributes machine learning pipeline models so that each machine learning model in the pipeline is stored as a single layer of the image's file system; the computational relationship graph of the pipeline is defined with a DAG, which is serialized and stored as the DAG model, and the DAG model is stored as a separate, topmost layer. Complex pipeline models can thus be shared among data scientists while keeping training and serving consistent across parties. Users can build and test arbitrarily complex models on the basis of one another's work, which also opens the door to the more complex model structures required by ensemble learning, multi-task learning, and federated learning, and can dynamically perform model operations and custom evaluations.
The DAG is used to define the pipeline relationship among the plurality of machine learning models; the DAG that invokes the models is written in a programming language and is stored as a model through a serialization operation.
The hierarchical relationship of the plurality of machine learning models and the DAG model within the Docker image to be built is specified through the custom configuration file.
The custom configuration file specifically includes: the read locations of the machine learning models; the constraint that each image layer of the Docker image contains exactly one machine learning model; and the build order and build level of the image layers corresponding to the machine learning models and to the DAG model. That is, the n machine learning models correspond in order to layer 1, layer 2, ..., layer n-1, layer n, and the configuration file must also describe the build order and build level of the image layer corresponding to the DAG model, i.e., layer n+1, located at the very top.
The specific process of generating the image build script includes: after the model management client reads and parses the configuration file, it generates an image build script conforming to the Docker specification.
The plurality of machine learning models and the DAG model are built according to the image build script; the model management client builds a Docker image of the machine learning pipeline model conforming to the Docker specification and, by extending the Docker backend functionality, generates an artifact Manifest file of the pipeline model conforming to the OCI standard.
The structure of the Manifest file follows the OCI distribution specification. The Manifest is a JSON file comprising two parts: a Config part and a Layers part. The Config part records the image's configuration (its metadata) and is used to display information in a registry UI and to distinguish builds for different operating systems. The Layers part consists of multiple layers whose mediaType is application/vnd.oci.image.layer.v1.tar in the OCI standard. The Config and each entry of Layers are stored in the model image registry as Blobs, with the digest of the pipeline model image serving as the key. The image registry provides a low-latency infrastructure for pipeline model management and distribution and can substantially reduce model storage space.
Pushing the Docker image of the machine learning pipeline model to the model image registry specifically includes: the model management client interacts with the registry through a push command and uploads the pipeline model image; the Config and each entry of Layers are stored in the registry as Blobs, with the digest of the pipeline model image serving as the key.
The model management client pulls the Docker image of the machine learning pipeline model from the model image registry, specifically: the client interacts with the registry through a pull command to download the pipeline model image; after requesting the Manifest from the registry, the client downloads all Blobs (the Config and all Layers) in parallel using their digests.
As shown in FIG. 1, a system for hierarchical management and distribution of machine learning pipeline models comprises a model management client and a model image registry; the model management client comprises a command-line management tool module and a Docker backend module.
The model management client pulls the Docker image of a machine learning pipeline model from the model image registry, and interacts with the registry through a push command to upload a pipeline model image to it.
The command-line management tool module provides tools for building, uploading, and downloading machine learning pipeline models from the command line.
The Docker backend module provides an API that receives requests from the command-line tool, supports users in managing machine learning pipeline models as custom images, and dispatches each request to the appropriate module for execution.
The model image registry provides the Docker Registry API to receive requests from the model management client, supports users in uploading and downloading pipeline models, performs layered storage management of model files and model attributes on the server side, and supports an image-registry scheme conforming to the OCI standard.
The model management client supports the OCI standard and performs layered storage management of the model files and model attributes it holds.
The model management client pulls the Docker image of the machine learning pipeline model from the model image registry, specifically: the client interacts with the registry through a pull command to download the pipeline model image; after requesting the Manifest file from the registry, the client downloads all Blobs (the Config part and the Layers part) in parallel using their digests.
The key steps are as follows:
The developer of the pipeline model builds a Docker image according to the machine learning pipeline model image layering method and the Manifest specification;
The developer pushes the built image to the machine learning pipeline image registry;
The model user pulls the image from the registry using the client; the pull process conforms to the standard image registry service API:
The client first requests the image's Manifest from the registry;
The client pulls the image's Config information;
The client pulls all file layers containing the pipeline model.
The foregoing detailed description further explains the objects, technical solutions, and beneficial effects of the invention. It should be understood that the above are merely specific embodiments of the invention and are not intended to limit its scope of protection; any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.
Claims (10)
1. A method for hierarchical management and distribution of machine learning pipeline models, comprising the following steps:
S1: obtaining a plurality of machine learning models, defining the pipeline relationship among them using a DAG, and obtaining the DAG model through a serialization operation;
S2: customizing a configuration file;
S3: generating an image build script according to the custom configuration file;
S4: building the plurality of machine learning models and the DAG model according to the image build script, and generating a Docker image of a machine learning pipeline model and a corresponding artifact Manifest file;
S5: pushing the Docker image of the machine learning pipeline model and the corresponding artifact Manifest file to a model image registry.
2. The method of claim 1, wherein the DAG is configured to define a pipeline relationship between a plurality of machine learning models; the DAG called by a plurality of machine learning models is completed by using a programming language, and the DAG is stored as a model through a serialization operation.
3. The method for hierarchical management and distribution of machine learning pipeline models according to claim 1, wherein the hierarchical relationship of the plurality of machine learning models and DAG models in the Docker mirror image to be constructed is divided by a custom configuration file.
4. A method for hierarchically managing and distributing a machine-learning pipeline model according to claim 3, wherein the custom configuration file specifically comprises: the method comprises the steps that a configuration file needs to be assigned to a read position of a machine learning model, each mirror image layer in a Docker mirror image to be divided by the configuration file comprises a constraint relation of the machine learning model, a construction sequence and a construction hierarchy of the mirror image layers corresponding to a plurality of machine learning models to be divided by the configuration file, and a construction sequence and a construction hierarchy of the mirror image layers corresponding to a DAG model to be divided by the configuration file.
5. The method for hierarchically managing and distributing machine-learning pipeline models according to claim 1, wherein the specific process of generating the mirror image construction script comprises: and after the model management client side reads and analyzes the configuration file operation, generating a mirror image construction script conforming to the Docker specification.
6. The method for hierarchically managing and distributing machine learning pipeline models according to claim 1, wherein the plurality of machine learning models and the DAG model are constructed according to the mirror image construction script, and a Docker mirror image of the machine learning pipeline model conforming to the Docker specification is constructed by a generating model management client and a workpiece type management file of the machine learning pipeline model conforming to the OCI standard is generated by expanding a Docker background function.
7. The method for hierarchical management and distribution of machine learning pipeline models according to claim 6, wherein the management file is structured according to the OCI distribution specification; the Manifest is a JSON file comprising two parts: a Config part and a Layers part; the Config part records the configuration of the image and constitutes its metadata, used to display information in the image repository UI and to distinguish builds for different operating systems; the Layers part consists of multiple layers whose media type in the OCI standard is application/vnd.oci.image.layer.v1; each layer in the Config part and the Layers part is stored in the model image repository as a Blob, with the digest of the pipeline model image serving as the Key.
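The Manifest structure described in claim 7 can be sketched with stock OCI media types; the helper names and blob contents below are illustrative assumptions, not the patent's implementation:

```python
import hashlib

def digest_of(blob: bytes) -> str:
    """Content digest under which a blob is stored (the Key)."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()

def _descriptor(media_type: str, blob: bytes) -> dict:
    """OCI-style content descriptor: media type, digest, and size."""
    return {"mediaType": media_type, "digest": digest_of(blob), "size": len(blob)}

def build_manifest(config_blob: bytes, layer_blobs: list) -> dict:
    """Assemble a Manifest in the shape the OCI image spec defines:
    a Config descriptor plus one Layers entry per blob, each
    referenced by digest."""
    return {
        "schemaVersion": 2,
        "config": _descriptor("application/vnd.oci.image.config.v1+json", config_blob),
        "layers": [
            _descriptor("application/vnd.oci.image.layer.v1.tar+gzip", b)
            for b in layer_blobs
        ],
    }

manifest = build_manifest(b'{"os": "linux"}', [b"model-a", b"dag-model"])
```

Because every entry is addressed by digest, identical model layers shared between pipeline images are stored only once in the repository.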
8. A system for hierarchical management and distribution of machine learning pipeline models, characterized by comprising a model management client and a model image repository; the model management client comprises a command-line management tool module and a Docker daemon module;
the model management client pulls Docker images of machine learning pipeline models from the model image repository, and interacts with the model image repository through a push command to upload pipeline model images to it;
the command-line management tool module is used for providing tools to build, upload, and download machine learning pipeline models from the command line;
the Docker daemon module is used for providing an API that receives requests from the command-line management tool, supporting users in managing machine learning pipeline models as custom images, and dispatching machine learning pipeline models to different modules according to the request to perform the corresponding work;
the model image repository is used for providing the Docker Registry API to receive requests from the model management client, supporting users in uploading and downloading machine learning pipeline models, performing layered storage management of model files and model attributes on the server side, and supporting an image repository scheme conforming to the OCI standard;
the system obtains a plurality of machine learning models; defines the pipeline relations among them with a DAG; obtains a DAG model through a serialization operation, the DAG model defining the computation graph of the machine learning pipeline model; stores the DAG model as the single topmost layer in the file system; builds the machine learning models and the DAG model according to the image build script, generating the Docker image of the machine learning pipeline model and the corresponding artifact-type file; and pushes the Docker image and the corresponding artifact-type file of the machine learning pipeline model to the model image repository.
9. The system for hierarchical management and distribution of machine learning pipeline models of claim 8, wherein the model management client supports the OCI standard, and model files and model attributes in the model management client are managed with layered storage.
10. The system for hierarchical management and distribution of machine learning pipeline models according to claim 8, wherein the model management client pulling Docker images of machine learning pipeline models from the model image repository specifically comprises: the model management client interacts with the model image repository through a pull command to download a machine learning pipeline model image; after requesting the Manifest file from the model image repository, the model management client downloads all Blobs, comprising the Config part and the Layers part, in parallel using their digests.
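The pull flow of claim 10 can be sketched with a toy in-memory blob store standing in for the repository; the digests, contents, and fetch function are simulated stand-ins, not a real Docker Registry client:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy in-memory blob store standing in for the remote model image
# repository; digests and contents are illustrative.
REMOTE_BLOBS = {
    "sha256:aaa": b"config",
    "sha256:bbb": b"model-layer",
    "sha256:ccc": b"dag-layer",
}

def fetch_blob(digest: str) -> bytes:
    """Stand-in for one blob download request, keyed by digest."""
    return REMOTE_BLOBS[digest]

def pull_image(manifest: dict) -> dict:
    """After the Manifest is fetched, download the Config blob and all
    Layers blobs in parallel using their digests."""
    digests = [manifest["config"]] + manifest["layers"]
    with ThreadPoolExecutor(max_workers=4) as pool:
        blobs = list(pool.map(fetch_blob, digests))
    return dict(zip(digests, blobs))

manifest = {"config": "sha256:aaa", "layers": ["sha256:bbb", "sha256:ccc"]}
local = pull_image(manifest)
```

Parallel, digest-keyed downloads are what let the client skip blobs it already holds locally.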
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110313978.3A CN112906907B (en) | 2021-03-24 | 2021-03-24 | Method and system for layering management and distribution of machine learning pipeline model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112906907A CN112906907A (en) | 2021-06-04 |
CN112906907B true CN112906907B (en) | 2024-02-23 |
Family
ID=76106214
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110313978.3A Active CN112906907B (en) | 2021-03-24 | 2021-03-24 | Method and system for layering management and distribution of machine learning pipeline model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112906907B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114327754B (en) * | 2021-12-15 | 2022-10-04 | 中电信数智科技有限公司 | Mirror image exporting and assembling method based on container layering technology |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107111787A (en) * | 2014-09-08 | 2017-08-29 | Pivotal Software, Inc. | Stream processing
CN108040074A (en) * | 2018-01-26 | 2018-05-15 | 华南理工大学 | A kind of real-time network unusual checking system and method based on big data |
EP3376361A2 (en) * | 2017-10-19 | 2018-09-19 | Pure Storage, Inc. | Ensuring reproducibility in an artificial intelligence infrastructure |
CN108984257A (en) * | 2018-07-06 | 2018-12-11 | 无锡雪浪数制科技有限公司 | A kind of machine learning platform for supporting custom algorithm component |
CN109740765A (en) * | 2019-01-31 | 2019-05-10 | 成都品果科技有限公司 | A kind of machine learning system building method based on Amazon server |
WO2019143412A1 (en) * | 2018-01-19 | 2019-07-25 | Umajin Inc. | Configurable server kit |
CN110266771A (en) * | 2019-05-30 | 2019-09-20 | 天津神兔未来科技有限公司 | Distributed intelligence node and distributed swarm intelligence system dispositions method |
CN110287171A (en) * | 2019-06-28 | 2019-09-27 | 北京九章云极科技有限公司 | A kind of data processing method and system |
WO2019184750A1 (en) * | 2018-03-30 | 2019-10-03 | Huawei Technologies Co., Ltd. | Deep learning task scheduling method and system and related apparatus
CN110543464A (en) * | 2018-12-12 | 2019-12-06 | 广东鼎义互联科技股份有限公司 | Big data platform applied to smart park and operation method |
CN110716744A (en) * | 2019-10-21 | 2020-01-21 | 中国科学院空间应用工程与技术中心 | Data stream processing method, system and computer readable storage medium |
US10565093B1 (en) * | 2018-10-09 | 2020-02-18 | International Business Machines Corporation | Providing cognitive intelligence across continuous delivery pipeline data |
CN111353609A (en) * | 2020-02-28 | 2020-06-30 | 平安科技(深圳)有限公司 | Machine learning system |
CN111901294A (en) * | 2020-06-09 | 2020-11-06 | 北京迈格威科技有限公司 | Method for constructing online machine learning project and machine learning system |
CN112148810A (en) * | 2020-11-10 | 2020-12-29 | 南京智数云信息科技有限公司 | User portrait analysis system supporting custom label |
CN112418438A (en) * | 2020-11-24 | 2021-02-26 | 国电南瑞科技股份有限公司 | Container-based machine learning procedural training task execution method and system |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10026041B2 (en) * | 2014-07-12 | 2018-07-17 | Microsoft Technology Licensing, Llc | Interoperable machine learning platform |
US10936969B2 (en) * | 2016-09-26 | 2021-03-02 | Shabaz Basheer Patel | Method and system for an end-to-end artificial intelligence workflow |
WO2018069260A1 (en) * | 2016-10-10 | 2018-04-19 | Proekspert AS | Data science versioning and intelligence systems and methods |
US11922564B2 (en) * | 2017-06-05 | 2024-03-05 | Umajin Inc. | Generative content system that supports location-based services and methods therefor |
US10671434B1 (en) * | 2017-10-19 | 2020-06-02 | Pure Storage, Inc. | Storage based artificial intelligence infrastructure |
EP3985684A1 (en) * | 2018-07-18 | 2022-04-20 | NVIDIA Corporation | Virtualized computing platform for inferencing, advanced processing, and machine learning applications |
US20200125639A1 (en) * | 2018-10-22 | 2020-04-23 | Ca, Inc. | Generating training data from a machine learning model to identify offensive language |
US20200193221A1 (en) * | 2018-12-17 | 2020-06-18 | At&T Intellectual Property I, L.P. | Systems, Methods, and Computer-Readable Storage Media for Designing, Creating, and Deploying Composite Machine Learning Applications in Cloud Environments |
US11616839B2 (en) * | 2019-04-09 | 2023-03-28 | Johnson Controls Tyco IP Holdings LLP | Intelligent edge computing platform with machine learning capability |
US20200401930A1 (en) * | 2019-06-19 | 2020-12-24 | Sap Se | Design of customizable machine learning services |
US11966856B2 (en) * | 2019-07-26 | 2024-04-23 | Live Nation Entertainment, Inc. | Enhanced validity modeling using machine-learning techniques |
US11663523B2 (en) * | 2019-09-14 | 2023-05-30 | Oracle International Corporation | Machine learning (ML) infrastructure techniques |
US11562267B2 (en) * | 2019-09-14 | 2023-01-24 | Oracle International Corporation | Chatbot for defining a machine learning (ML) solution |
Non-Patent Citations (2)
Title |
---|
A practical tutorial on bagging and boosting based ensembles for machine learning: Algorithms, software tools, performance study, practical perspectives and opportunities; S. González et al.; Inf. Fusion; 205-237 *
Multi-version container image loading method based on slice reuse; Lu Zhigang et al.; Journal of Software; Vol. 31, No. 6; 1875-1888 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||