CN112906907B - Method and system for layering management and distribution of machine learning pipeline model - Google Patents


Info

Publication number
CN112906907B
CN112906907B
Authority
CN
China
Prior art keywords
model
machine learning
mirror image
pipeline
warehouse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110313978.3A
Other languages
Chinese (zh)
Other versions
CN112906907A (en)
Inventor
董昕
郭勇
梁艳
王杰
杨雅志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Technological University CDTU
Original Assignee
Chengdu Technological University CDTU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Technological University CDTU filed Critical Chengdu Technological University CDTU
Priority to CN202110313978.3A priority Critical patent/CN112906907B/en
Publication of CN112906907A publication Critical patent/CN112906907A/en
Application granted granted Critical
Publication of CN112906907B publication Critical patent/CN112906907B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment
    • G06F8/61Installation
    • G06F8/63Image based installation; Cloning; Build to order
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44505Configuring for program initiating, e.g. using registry, configuration files
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method and a system for layered management and distribution of a machine learning pipeline model. The system comprises a model management client and a model image repository. The model management client comprises a command-line management tool module and a Docker backend module. The command-line management tool module provides tools for building, uploading, and downloading machine learning pipeline models from the command line. The Docker backend module provides an API that receives requests from the command-line management tool, supports users in managing machine learning pipeline models as custom images, and dispatches each request to the appropriate module for the corresponding work. The model image repository supports users uploading and downloading machine learning pipeline models, performs layered storage management of model files and model attributes on the server side, and supports an image-repository scheme conforming to the OCI standard.

Description

Method and system for layering management and distribution of machine learning pipeline model
Technical Field
The invention relates to the field of model repositories in computer storage systems, and in particular to a method and system for layered management and distribution of machine learning pipeline models.
Background
Machine learning model distribution systems in industry generally follow one scheme: a model repository built on a file system or object storage system, with a single model file as the basic storage object. The user uploads the model to the model repository via an SDK or UI. After upload, the model repository stores the model, or the model's metadata, in a storage backend that it maintains itself. When the model is needed for inference, the user downloads it through the SDK or interface provided by the model repository and runs the inference service. This approach does not distinguish between a single model and a pipeline model (i.e., a combination of several models organized as a workflow).
In fact, this approach stores the pipeline model as a single model file, which cannot meet the flexibility requirements of pipeline model training and deployment:
a) The user wants to recombine the models in a pipeline to form a new pipeline model;
b) The user wants to extract one or more models from the pipeline and retrain or deploy them separately.
In addition, complex pipeline models present the following performance challenges:
c) Slow pipeline model upload and download speeds;
d) High model persistence storage costs.
Disclosure of Invention
The invention aims to overcome the low flexibility and poor performance of pipeline model management and distribution in existing schemes, and provides a method and system for layered management and distribution of a machine learning pipeline model. Storage and distribution of the machine learning pipeline model follow the way an image repository distributes images, so that each machine learning model in the pipeline is stored as a single file-system layer within the image. The computational relationship graph of the machine learning pipeline is defined with a DAG; the DAG is then serialized as a model of its own, the DAG model, to store the computational relationships of the pipeline, and the DAG model is stored as the topmost separate layer in the file system.
The invention is realized by the following technical scheme:
a method of machine learning pipeline model hierarchical management and distribution, comprising the steps of:
s1: obtaining a plurality of machine learning models, defining pipeline relations among the machine learning models by using a DAG, and obtaining the DAG model through serialization operation;
s2: customizing a configuration file;
s3: generating a mirror image construction script according to the self-defined configuration file;
s4: constructing the plurality of machine learning models and the DAG model according to the mirror image construction script, and generating a Docker mirror image of a machine learning pipeline model and a corresponding workpiece type file;
s5: and pushing the Docker mirror image of the machine learning pipeline model and the corresponding workpiece type file to a model mirror image warehouse.
The method supports storing and distributing machine learning pipeline models in the way an image repository distributes images, so that each machine learning model in the pipeline is stored as a single file-system layer within the image; the computational relationship graph of the pipeline is defined with a DAG, the DAG is serialized as its own model to store the pipeline's computational relationships, and the DAG model is stored as the topmost separate layer in the file system. Complex pipeline models can thus be shared among data scientists while keeping multi-party training and serving consistent. Users can also build and test arbitrarily complex models on this basis, which opens the door to the more complex model structures required by ensemble learning, multi-task learning, and federated learning, while letting users dynamically perform model operations and custom evaluation.
Further, the DAG is configured to define the pipeline relationships among the plurality of machine learning models; the DAG that invokes the machine learning models is written in a programming language, and the DAG is saved as a model through a serialization operation.
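As an illustrative sketch (the node names, the adjacency representation, and the choice of JSON serialization are assumptions, not prescribed by the method), a pipeline DAG over three models can be expressed and serialized like this:

```python
import json

# Hypothetical pipeline of three models expressed as a DAG:
# preprocess -> classifier -> postprocess
pipeline_dag = {
    "nodes": ["preprocess", "classifier", "postprocess"],
    "edges": [("preprocess", "classifier"), ("classifier", "postprocess")],
}

def topological_order(dag):
    """Return the nodes in an order that respects the pipeline's edges."""
    incoming = {n: 0 for n in dag["nodes"]}
    for _, dst in dag["edges"]:
        incoming[dst] += 1
    ready = [n for n, c in incoming.items() if c == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for src, dst in dag["edges"]:
            if src == node:
                incoming[dst] -= 1
                if incoming[dst] == 0:
                    ready.append(dst)
    return order

# Serialize the DAG so it can be stored as the topmost image layer.
serialized = json.dumps({"nodes": pipeline_dag["nodes"],
                         "edges": [list(e) for e in pipeline_dag["edges"]]})
```

Any serialization format (JSON, pickle, a framework-native format) would do; the essential point is that the serialized DAG becomes one more storable model artifact.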
Further, the hierarchical relationship of the plurality of machine learning models and the DAG model in the Docker image to be built is divided through a custom configuration file.
Further, the custom configuration file specifically includes: the location from which each machine learning model is read; which machine learning models each image layer of the Docker image contains; the build order and build hierarchy of the image layers corresponding to the plurality of machine learning models; and the build order and build hierarchy of the image layer corresponding to the DAG model. That is, the n machine learning models correspond in turn to layer 1, layer 2, ……, layer n-1, and layer n; the configuration file must also describe the build order and hierarchy of the image layer corresponding to the DAG model, i.e. layer n+1, located at the topmost level.
Further, the specific process of generating the image build script includes: after the model management client reads and parses the configuration file, it generates an image build script conforming to the Docker specification.
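A minimal sketch of this step, under assumptions of my own: the configuration layout, field names, model paths, and the Dockerfile-style output are all illustrative, standing in for whatever format the client actually uses. Each model becomes one COPY instruction (one image layer), and the DAG model becomes the final, topmost layer:

```python
# Hypothetical config: model i occupies layer i, the DAG model layer n+1.
config = {
    "models": [  # listed in build order: layer 1 .. layer n
        {"name": "preprocess", "path": "models/preprocess.pkl"},
        {"name": "classifier", "path": "models/classifier.pkl"},
    ],
    "dag": {"path": "models/pipeline_dag.json"},  # topmost layer, n+1
    "base_image": "scratch",
}

def generate_build_script(cfg):
    """Emit a Dockerfile-style build script: one COPY per model layer,
    then the serialized DAG model as the final (topmost) layer."""
    lines = [f"FROM {cfg['base_image']}"]
    for model in cfg["models"]:
        lines.append(f"COPY {model['path']} /pipeline/{model['name']}")
    lines.append(f"COPY {cfg['dag']['path']} /pipeline/dag")
    return "\n".join(lines)

script = generate_build_script(config)
```

Because each COPY instruction produces its own image layer, this ordering realizes the layering described above: models in layers 1..n, the DAG model in layer n+1.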
Further, the plurality of machine learning models and the DAG model are built according to the image build script: the model management client builds a Docker image of the machine learning pipeline model conforming to the Docker specification and, after the Docker backend functionality is extended, generates an artifact-type Manifest file of the machine learning pipeline model conforming to the OCI standard.
Further, the structure of the Manifest file follows the OCI distribution specification. The Manifest is a JSON file comprising two parts: a Config part and a Layers part. The Config part records the configuration of the image, i.e. the image's metadata, used for displaying information in the image repository's UI and for distinguishing builds for different operating systems. The Layers part consists of multiple layers whose mediaType is application/vnd.oci.image.layer.v1 in the OCI standard. Each layer of the Config part and the Layers part is stored in the model image repository as a Blob, with the digest of the pipeline model image serving as the Key. The image repository thus provides a low-latency infrastructure for pipeline model management and distribution and can substantially reduce model storage space.
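The Manifest layout described above can be sketched as follows. The blob contents are made up for illustration; the mediaType strings, digest, and size fields follow the OCI image specification. Keying the blob store by digest is also what yields the storage saving, since identical layers shared between pipeline images are stored once:

```python
import hashlib
import json

def digest(blob: bytes) -> str:
    """Content-addressed key for a blob, as used by OCI registries."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()

# Illustrative blobs: the image config plus one blob per pipeline layer.
config_blob = json.dumps({"os": "linux"}).encode()
layer_blobs = [b"model-layer-1", b"model-layer-2", b"dag-layer"]

manifest = {
    "schemaVersion": 2,
    "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "digest": digest(config_blob),
        "size": len(config_blob),
    },
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "digest": digest(blob),
            "size": len(blob),
        }
        for blob in layer_blobs
    ],
}

# Content-addressed blob store: a layer shared by several pipeline
# images appears under one key and is therefore stored only once.
blob_store = {digest(b): b for b in [config_blob, *layer_blobs]}
```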
Further, pushing the Docker image of the machine learning pipeline model to the model image repository specifically includes: the model management client interacts with the model image repository through a push command and uploads the pipeline model image to the repository; each layer of Config and Layers is stored in the model image repository as a Blob, with the digest of the pipeline model image serving as the Key.
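A toy sketch of the push interaction, with an in-memory stand-in for the repository (the class and method names are hypothetical). Because blobs are keyed by digest, a layer the repository already holds is not uploaded again:

```python
import hashlib

def sha256_digest(blob: bytes) -> str:
    return "sha256:" + hashlib.sha256(blob).hexdigest()

class ModelImageRepository:
    """In-memory stand-in for the model image repository."""
    def __init__(self):
        self.blobs = {}

    def has_blob(self, dg):
        # Real registries expose a HEAD-style existence check per digest.
        return dg in self.blobs

    def put_blob(self, blob):
        dg = sha256_digest(blob)
        self.blobs[dg] = blob
        return dg

def push(repo, blobs):
    """Upload only the blobs the repository does not already hold."""
    uploaded = 0
    for blob in blobs:
        if not repo.has_blob(sha256_digest(blob)):
            repo.put_blob(blob)
            uploaded += 1
    return uploaded

repo = ModelImageRepository()
first = push(repo, [b"layer-a", b"layer-b"])      # both layers are new
second = push(repo, [b"layer-a", b"dag-layer"])   # layer-a is deduplicated
```

The deduplication is what makes pushing a re-combined pipeline cheap: only the changed layers (typically just the DAG layer) travel over the network.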
Further, the model management client pulls the machine learning pipeline model Docker image from the model image repository, which specifically includes: the model management client interacts with the model image repository through a pull command and downloads the machine learning pipeline model image; after requesting the Manifest from the model image repository, the model management client uses the digests to download all Blobs in parallel, including the Config and all Layers.
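The pull flow can be sketched as below. The in-memory blob store and the short digest strings stand in for a real registry (a real client would issue HTTP requests against the registry API); after the Manifest is known, the Config and all Layers are fetched concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy blob store standing in for the model image repository.
blob_store = {
    "sha256:cfg": b"image-config",
    "sha256:l1": b"model-layer-1",
    "sha256:l2": b"model-layer-2",
    "sha256:dag": b"dag-layer",
}

# Manifest already fetched from the repository in a prior request.
manifest = {
    "config": {"digest": "sha256:cfg"},
    "layers": [{"digest": d} for d in ("sha256:l1", "sha256:l2", "sha256:dag")],
}

def fetch_blob(dg):
    # A real client would GET /v2/<name>/blobs/<digest> here.
    return dg, blob_store[dg]

def pull(manifest):
    """Download the Config and all Layers in parallel, keyed by digest."""
    digests = [manifest["config"]["digest"]] + \
              [layer["digest"] for layer in manifest["layers"]]
    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(pool.map(fetch_blob, digests))

blobs = pull(manifest)
```

Downloading layers in parallel rather than as one monolithic model file is what gives the layered scheme its speed advantage for large pipelines.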
A system for layered management and distribution of machine learning pipeline models comprises a model management client and a model image repository; the model management client comprises a command-line management tool module and a Docker backend module.
The model management client pulls the Docker image of a machine learning pipeline model from the model image repository; the model management client interacts with the model image repository through a push command and uploads the pipeline model image to the repository.
The command-line management tool module provides tools for building, uploading, and downloading machine learning pipeline models from the command line.
The Docker backend module provides an API that receives requests from the command-line management tool, supports users in managing machine learning pipeline models as custom images, and dispatches each request to the appropriate module to carry out the corresponding work.
The model image repository provides a Docker Registry API that receives requests from the model management client, supports users uploading and downloading machine learning pipeline models, and performs layered storage management of model files and model attributes on the server side; an image-repository scheme conforming to the OCI standard is supported.
Further, the model management client supports the OCI standard, and the model files and model attributes in the model management client are managed with layered storage.
Further, the model management client pulls the machine learning pipeline model Docker image from the model image repository, which specifically includes: the model management client interacts with the model image repository through a pull command and downloads the machine learning pipeline model image; after requesting the Manifest file from the model image repository, the model management client uses the digests to download all Blobs in parallel, including the Config part and the Layers part.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention relates to a method and a system for hierarchically managing and distributing machine learning pipeline models, which support the storage and distribution of the machine learning pipeline models by referring to a mirror image warehouse distribution mirror image mode, so that each machine learning model in the pipeline is stored as a single layer in a file system in the mirror image; the computational relationship graph of the machine-learned pipeline model is defined with the DAG, which is then named as the model of the DAG to store the computational relationship of the pipeline, and the DAG model is stored as the uppermost separate layer in the file system. The invention can lead data scientists to share complex pipeline models and ensure the consistency of multiparty training and service. And can build and test any complex model on the basis of the work of the users, which also provides the possibility for more complex model structures required by the integrated learning, the multitasking learning and the federal learning technologies, and simultaneously enables the users to dynamically realize model operation and custom evaluation. Moreover, the invention uses mirroring to encapsulate the pipeline model, provides a low-latency pipeline model management and distribution infrastructure with a mirrored warehouse, and can greatly save the space for model storage.
Drawings
The accompanying drawings, which are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention. In the drawings:
FIG. 1 is a system framework diagram of the present invention;
FIG. 2 is a flow chart of the method of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the examples and the accompanying drawings; the exemplary embodiments of the present invention and their descriptions are for illustration only and are not to be construed as limiting the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known structures, circuits, materials, or methods have not been described in detail in order not to obscure the invention.
Throughout the specification, references to "one embodiment," "an embodiment," "one example," or "an example" mean: a particular feature, structure, or characteristic described in connection with the embodiment or example is included within at least one embodiment of the invention. Thus, the appearances of the phrases "in one embodiment," "in an example," or "in an example" in various places throughout this specification are not necessarily all referring to the same embodiment or example. Furthermore, the particular features, structures, or characteristics may be combined in any suitable combination and/or sub-combination in one or more embodiments or examples. Moreover, those of ordinary skill in the art will appreciate that the illustrations provided herein are for illustrative purposes and that the illustrations are not necessarily drawn to scale. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
In the description of the present invention, it should be understood that the terms "front", "rear", "left", "right", "upper", "lower", "vertical", "horizontal", "high", "low", "inner", "outer", etc. indicate orientations or positional relationships based on the drawings, are merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the scope of the present invention.
Embodiment one:
As shown in FIG. 2, a method for layered management and distribution of a machine learning pipeline model includes the following steps:
s1: obtaining a plurality of machine learning models, defining pipeline relations among the machine learning models by using a DAG, and obtaining the DAG model through serialization operation;
s2: customizing a configuration file;
s3: generating a mirror image construction script according to the self-defined configuration file;
s4: constructing a plurality of machine learning models and DAG models according to the mirror image construction script, and generating a Docker mirror image of the machine learning pipeline model and a corresponding workpiece type file;
s5: and pushing the Docker mirror image of the machine learning pipeline model and the corresponding workpiece type file to a model mirror image warehouse.
The method supports storing and distributing machine learning pipeline models in the way an image repository distributes images, so that each machine learning model in the pipeline is stored as a single file-system layer within the image; the computational relationship graph of the pipeline is defined with a DAG, the DAG is serialized as its own model to store the pipeline's computational relationships, and the DAG model is stored as the topmost separate layer in the file system. Complex pipeline models can thus be shared among data scientists while keeping multi-party training and serving consistent. Users can also build and test arbitrarily complex models on this basis, which opens the door to the more complex model structures required by ensemble learning, multi-task learning, and federated learning, while letting users dynamically perform model operations and custom evaluation.
The DAG is used to define the pipeline relationships among the plurality of machine learning models; the DAG that invokes the machine learning models is written in a programming language, and the DAG is saved as a model through a serialization operation.
The hierarchical relationship of the plurality of machine learning models and the DAG model in the Docker image to be built is divided through a custom configuration file.
The custom configuration file specifically includes: the location from which each machine learning model is read; which machine learning models each image layer of the Docker image contains; the build order and build hierarchy of the image layers corresponding to the plurality of machine learning models; and the build order and build hierarchy of the image layer corresponding to the DAG model. That is, the n machine learning models correspond in turn to layer 1, layer 2, ……, layer n-1, and layer n; the configuration file must also describe the build order and hierarchy of the image layer corresponding to the DAG model, i.e. layer n+1, located at the topmost level.
The specific process of generating the image build script includes: after the model management client reads and parses the configuration file, it generates an image build script conforming to the Docker specification.
The plurality of machine learning models and the DAG model are built according to the image build script: the model management client builds a Docker image of the machine learning pipeline model conforming to the Docker specification and, after the Docker backend functionality is extended, generates an artifact-type Manifest file of the machine learning pipeline model conforming to the OCI standard.
The structure of the Manifest file follows the OCI distribution specification. The Manifest is a JSON file comprising two parts: a Config part and a Layers part. The Config part records the configuration of the image, i.e. the image's metadata, used for displaying information in the image repository's UI and for distinguishing builds for different operating systems. The Layers part consists of multiple layers whose mediaType is application/vnd.oci.image.layer.v1 in the OCI standard. Each layer of the Config part and the Layers part is stored in the model image repository as a Blob, with the digest of the pipeline model image serving as the Key. The image repository provides a low-latency infrastructure for pipeline model management and distribution and can substantially reduce model storage space.
Pushing the Docker image of the machine learning pipeline model to the model image repository specifically includes: the model management client interacts with the model image repository through a push command and uploads the pipeline model image to the repository; each layer of Config and Layers is stored in the model image repository as a Blob, with the digest of the pipeline model image serving as the Key.
The model management client pulls the machine learning pipeline model Docker image from the model image repository, which specifically includes: the model management client interacts with the model image repository through a pull command and downloads the machine learning pipeline model image; after requesting the Manifest from the model image repository, the model management client uses the digests to download all Blobs in parallel, including the Config and all Layers.
As shown in FIG. 1, a system for layered management and distribution of machine learning pipeline models comprises a model management client and a model image repository; the model management client comprises a command-line management tool module and a Docker backend module.
The model management client pulls the Docker image of the machine learning pipeline model from the model image repository; the model management client interacts with the model image repository through a push command and uploads the pipeline model image to the repository.
The command-line management tool module provides tools for building, uploading, and downloading machine learning pipeline models from the command line.
The Docker backend module provides an API that receives requests from the command-line management tool, supports users in managing machine learning pipeline models as custom images, and dispatches each request to the appropriate module to carry out the corresponding work.
The model image repository provides a Docker Registry API that receives requests from the model management client, supports users uploading and downloading machine learning pipeline models, and performs layered storage management of model files and model attributes on the server side; an image-repository scheme conforming to the OCI standard is supported.
The model management client supports the OCI standard, and the model files and model attributes in the model management client are managed with layered storage.
The model management client pulls the machine learning pipeline model Docker image from the model image repository, which specifically includes: the model management client interacts with the model image repository through a pull command and downloads the machine learning pipeline model image; after requesting the Manifest file from the model image repository, the model management client uses the digests to download all Blobs in parallel, including the Config part and the Layers part.
The key steps are as follows:
The developer of the pipeline model builds a Docker image according to the machine learning pipeline model image layering method and the Manifest specification;
The developer delivers the built image to the machine learning pipeline image repository;
The model user pulls the image from the image repository with the client; the pull process conforms to the standard image repository service API:
The client first requests the Manifest of the image from the image repository;
The client pulls the image's Config information;
The client pulls all file layers containing the pipeline model.
The foregoing description of the embodiments illustrates the general principles of the invention; it is not intended to limit the scope of the invention to the particular embodiments, and any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A method for layered management and distribution of a machine learning pipeline model, comprising the steps of:
S1: obtaining a plurality of machine learning models, defining the pipeline relationships among them with a DAG, and obtaining the DAG model through a serialization operation;
S2: customizing a configuration file;
S3: generating an image build script from the custom configuration file;
S4: building the plurality of machine learning models and the DAG model according to the image build script, and generating a Docker image of the machine learning pipeline model and the corresponding artifact-type file;
S5: pushing the Docker image of the machine learning pipeline model and the corresponding artifact-type file to a model image repository.
2. The method for layered management and distribution of a machine learning pipeline model according to claim 1, wherein the DAG is configured to define the pipeline relationships among the plurality of machine learning models; the DAG that invokes the machine learning models is written in a programming language, and the DAG is saved as a model through a serialization operation.
3. The method for layered management and distribution of a machine learning pipeline model according to claim 1, wherein the hierarchical relationship of the plurality of machine learning models and the DAG model in the Docker image to be built is divided through a custom configuration file.
4. The method for layered management and distribution of a machine learning pipeline model according to claim 3, wherein the custom configuration file specifically includes: the location from which each machine learning model is read; which machine learning models each image layer of the Docker image contains; the build order and build hierarchy of the image layers corresponding to the plurality of machine learning models; and the build order and build hierarchy of the image layer corresponding to the DAG model.
5. The method for layered management and distribution of a machine learning pipeline model according to claim 1, wherein the specific process of generating the image build script includes: after the model management client reads and parses the configuration file, it generates an image build script conforming to the Docker specification.
6. The method for layered management and distribution of a machine learning pipeline model according to claim 1, wherein the plurality of machine learning models and the DAG model are built according to the image build script: the model management client builds a Docker image of the machine learning pipeline model conforming to the Docker specification and, after the Docker backend functionality is extended, generates an artifact-type Manifest file of the machine learning pipeline model conforming to the OCI standard.
7. The method for hierarchical management and distribution of machine learning pipeline models according to claim 6, wherein the structure of the management file follows the OCI distribution specification; the Manifest is a JSON file comprising two parts: a Config part and a Layers part; the Config part records the configuration of the image and is the metadata of the image, used to display information in the UI of the image repository and to distinguish builds for different operating systems; the Layers part consists of multiple layers whose media type in the OCI standard is application/vnd.oci.image.layer.v1; each layer in the Config part and the Layers part is stored in the model image repository as a Blob, with the digest of the pipeline model image serving as its Key.
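The Config/Layers manifest structure and digest-keyed Blob storage of claim 7 can be illustrated with a minimal sketch. The media types follow the published OCI image specification; the blob contents are placeholders, and a real implementation would add further fields (annotations, platform descriptors) omitted here.

```python
import hashlib
import json

def blob_digest(data: bytes) -> str:
    # OCI blobs are content-addressed: the sha256 digest is the Key under
    # which each blob is stored in the model image repository.
    return "sha256:" + hashlib.sha256(data).hexdigest()

config_blob = json.dumps({"os": "linux", "architecture": "amd64"}).encode()
layer_blob = b"...serialized model layer..."  # placeholder layer content

# Minimal manifest with the two parts claim 7 names: Config and Layers.
manifest = {
    "schemaVersion": 2,
    "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "digest": blob_digest(config_blob),
        "size": len(config_blob),
    },
    "layers": [{
        "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
        "digest": blob_digest(layer_blob),
        "size": len(layer_blob),
    }],
}
```

Because the digest is derived from the blob's bytes, identical model layers referenced by different pipeline images deduplicate to a single stored blob.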
8. A system for hierarchical management and distribution of machine learning pipeline models, characterized by comprising a model management client and a model image repository; the model management client comprises a command-line management tool module and a Docker daemon module;
the model management client pulls Docker images of machine learning pipeline models from the model image repository; the model management client interacts with the model image repository through a push command to upload pipeline model images to the model image repository;
the command-line management tool module provides tools for building, uploading, and downloading machine learning pipeline models through a command-line interface;
the Docker daemon module provides an API to receive requests from the command-line management tool, supports users in managing machine learning pipeline models as custom images, and dispatches each request to the appropriate module to perform the corresponding work;
the model image repository provides a Docker Registry API to receive requests from the model management client, supports users in uploading and downloading machine learning pipeline models, performs hierarchical storage management of model files and model attributes on the server side, and supports an image repository scheme conforming to the OCI standard;
the method comprises the steps of obtaining a plurality of machine learning models, defining pipeline relations among the machine learning models by using DAGs, obtaining a DAG model through serialization operation, defining a calculation relation diagram of the machine learning pipeline model by using the DAG model, storing the DAG model as an uppermost single layer in a file system, constructing the machine learning models and the DAG model according to mirror image construction scripts, generating Docker images and corresponding workpiece type files of the machine learning pipeline model, and pushing the Docker images and the corresponding workpiece type files of the machine learning pipeline model to a model image warehouse.
9. The system for hierarchical management and distribution of machine learning pipeline models according to claim 8, wherein the model management client supports the OCI standard, and model files and model attributes in the model management client are managed with hierarchical storage.
10. The system for hierarchical management and distribution of machine learning pipeline models according to claim 8, wherein the model management client pulls Docker images of machine learning pipeline models from the model image repository, specifically comprising: the model management client interacts with the model image repository through a pull command to download a machine learning pipeline model image; after sending a request for the Manifest file to the model image repository, the model management client downloads all Blobs, comprising the Config part and the Layers part, in parallel by digest.
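The pull flow of claim 10, manifest first and then parallel digest-addressed blob downloads, can be sketched as below. A real client would issue HTTP requests against the OCI distribution API endpoints (`/v2/<name>/manifests/<reference>` and `/v2/<name>/blobs/<digest>`); here an in-memory dictionary stands in for the registry, and the digests are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the registry's blob store, keyed by digest.
registry_blobs = {
    "sha256:aaa": b"config blob",
    "sha256:bbb": b"model layer 1",
    "sha256:ccc": b"dag layer",
}

# Step 1: the client fetches the Manifest, which references every blob
# (the Config part and each entry of the Layers part) by digest.
manifest = {
    "config": {"digest": "sha256:aaa"},
    "layers": [{"digest": "sha256:bbb"}, {"digest": "sha256:ccc"}],
}

def fetch_blob(digest: str) -> tuple:
    # In a real client this would be a GET /v2/<name>/blobs/<digest>.
    return digest, registry_blobs[digest]

# Step 2: download all referenced blobs in parallel.
digests = [manifest["config"]["digest"]] + [l["digest"] for l in manifest["layers"]]
with ThreadPoolExecutor(max_workers=4) as pool:
    pulled = dict(pool.map(fetch_blob, digests))
```

Since blobs are independent and content-addressed, the downloads have no ordering constraint, which is what makes the parallel fetch safe.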
CN202110313978.3A 2021-03-24 2021-03-24 Method and system for layering management and distribution of machine learning pipeline model Active CN112906907B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110313978.3A CN112906907B (en) 2021-03-24 2021-03-24 Method and system for layering management and distribution of machine learning pipeline model

Publications (2)

Publication Number Publication Date
CN112906907A CN112906907A (en) 2021-06-04
CN112906907B true CN112906907B (en) 2024-02-23

Family

ID=76106214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110313978.3A Active CN112906907B (en) 2021-03-24 2021-03-24 Method and system for layering management and distribution of machine learning pipeline model

Country Status (1)

Country Link
CN (1) CN112906907B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327754B (en) * 2021-12-15 2022-10-04 中电信数智科技有限公司 Mirror image exporting and assembling method based on container layering technology

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107111787A (en) * 2014-09-08 2017-08-29 皮沃塔尔软件公司 Stream process
CN108040074A (en) * 2018-01-26 2018-05-15 华南理工大学 A kind of real-time network unusual checking system and method based on big data
EP3376361A2 (en) * 2017-10-19 2018-09-19 Pure Storage, Inc. Ensuring reproducibility in an artificial intelligence infrastructure
CN108984257A (en) * 2018-07-06 2018-12-11 无锡雪浪数制科技有限公司 A kind of machine learning platform for supporting custom algorithm component
CN109740765A (en) * 2019-01-31 2019-05-10 成都品果科技有限公司 A kind of machine learning system building method based on Amazon server
WO2019143412A1 (en) * 2018-01-19 2019-07-25 Umajin Inc. Configurable server kit
CN110266771A (en) * 2019-05-30 2019-09-20 天津神兔未来科技有限公司 Distributed intelligence node and distributed swarm intelligence system dispositions method
CN110287171A (en) * 2019-06-28 2019-09-27 北京九章云极科技有限公司 A kind of data processing method and system
WO2019184750A1 (en) * 2018-03-30 2019-10-03 华为技术有限公司 Deep learning task scheduling method and system and related apparatus
CN110543464A (en) * 2018-12-12 2019-12-06 广东鼎义互联科技股份有限公司 Big data platform applied to smart park and operation method
CN110716744A (en) * 2019-10-21 2020-01-21 中国科学院空间应用工程与技术中心 Data stream processing method, system and computer readable storage medium
US10565093B1 (en) * 2018-10-09 2020-02-18 International Business Machines Corporation Providing cognitive intelligence across continuous delivery pipeline data
CN111353609A (en) * 2020-02-28 2020-06-30 平安科技(深圳)有限公司 Machine learning system
CN111901294A (en) * 2020-06-09 2020-11-06 北京迈格威科技有限公司 Method for constructing online machine learning project and machine learning system
CN112148810A (en) * 2020-11-10 2020-12-29 南京智数云信息科技有限公司 User portrait analysis system supporting custom label
CN112418438A (en) * 2020-11-24 2021-02-26 国电南瑞科技股份有限公司 Container-based machine learning procedural training task execution method and system

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10026041B2 (en) * 2014-07-12 2018-07-17 Microsoft Technology Licensing, Llc Interoperable machine learning platform
US10936969B2 (en) * 2016-09-26 2021-03-02 Shabaz Basheer Patel Method and system for an end-to-end artificial intelligence workflow
WO2018069260A1 (en) * 2016-10-10 2018-04-19 Proekspert AS Data science versioning and intelligence systems and methods
US11922564B2 (en) * 2017-06-05 2024-03-05 Umajin Inc. Generative content system that supports location-based services and methods therefor
US10671434B1 (en) * 2017-10-19 2020-06-02 Pure Storage, Inc. Storage based artificial intelligence infrastructure
EP3985684A1 (en) * 2018-07-18 2022-04-20 NVIDIA Corporation Virtualized computing platform for inferencing, advanced processing, and machine learning applications
US20200125639A1 (en) * 2018-10-22 2020-04-23 Ca, Inc. Generating training data from a machine learning model to identify offensive language
US20200193221A1 (en) * 2018-12-17 2020-06-18 At&T Intellectual Property I, L.P. Systems, Methods, and Computer-Readable Storage Media for Designing, Creating, and Deploying Composite Machine Learning Applications in Cloud Environments
US11616839B2 (en) * 2019-04-09 2023-03-28 Johnson Controls Tyco IP Holdings LLP Intelligent edge computing platform with machine learning capability
US20200401930A1 (en) * 2019-06-19 2020-12-24 Sap Se Design of customizable machine learning services
US11966856B2 (en) * 2019-07-26 2024-04-23 Live Nation Entertainment, Inc. Enhanced validity modeling using machine-learning techniques
US11663523B2 (en) * 2019-09-14 2023-05-30 Oracle International Corporation Machine learning (ML) infrastructure techniques
US11562267B2 (en) * 2019-09-14 2023-01-24 Oracle International Corporation Chatbot for defining a machine learning (ML) solution


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A practical tutorial on bagging and boosting based ensembles for machine learning: Algorithms, software tools, performance study, practical perspectives and opportunities; S. González et al.; Information Fusion; 205-237 *
A multi-version container image loading method based on shard reuse (基于分片复用的多版本容器镜像加载方法); Lu Zhigang et al.; Journal of Software (《软件学报》); Vol. 31, No. 6; 1875-1888 *

Also Published As

Publication number Publication date
CN112906907A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
US9348482B2 (en) Modeling system for graphic user interface
US8963961B2 (en) Fractal whiteboarding
US9787752B2 (en) Hotspot editor for a user interface
CN104463957B (en) A kind of three-dimensional scenic Core Generator integrated approach based on material
US10635408B2 (en) Method and apparatus for enabling agile development of services in cloud computing and traditional environments
CN107077752A (en) Data visualization extensible architecture
KR20060050401A (en) Maintaining graphical presentations based on user customizations
JP2006114014A (en) Graphical method for navigation in database of modeled object
CN112906907B (en) Method and system for layering management and distribution of machine learning pipeline model
CN107402906A (en) Dynamic content layout in application based on grid
CN107729304A (en) Interacted with the document as application
US20230108560A1 (en) Methods and Systems for Representation, Composition and Execution of Artificial Intelligence Centric Applications
CN108958731B (en) Application program interface generation method, device, equipment and storage medium
US10417924B2 (en) Visual work instructions for assembling product
CN105279599B (en) Content Management System
JP2006178991A (en) Method and system for graphically navigating among stored multiple objects
CN115469784A (en) User interface and method for generating new artifacts based on existing artifacts
US8140977B2 (en) Hosted data visualization service
US11113037B2 (en) Software performance modification
Caraceni et al. I-media-cities, a searchable platform on moving images with automatic and manual annotations
US20070055928A1 (en) User workflow lists to organize multimedia files
US11921997B2 (en) User interfaces and methods for generating a new artifact based on existing artifacts
CN114707680B (en) Aircraft 3D model generation method and device, electronic equipment and readable medium
EP4343715A1 (en) Determining 3d models corresponding to an image
US20170131954A1 (en) Managing Printing to Reduce Physical Media Waste

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant