CN114721674A - Model deployment method, device, equipment and storage medium - Google Patents

Model deployment method, device, equipment and storage medium Download PDF

Info

Publication number
CN114721674A
CN114721674A (application CN202210448435.7A)
Authority
CN
China
Prior art keywords
model
test
file
configuration information
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210448435.7A
Other languages
Chinese (zh)
Inventor
龚乐诚
马兴宇
康平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co Ltd
Original Assignee
Shanghai Pudong Development Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Development Bank Co Ltd filed Critical Shanghai Pudong Development Bank Co Ltd
Priority to CN202210448435.7A priority Critical patent/CN114721674A/en
Publication of CN114721674A publication Critical patent/CN114721674A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/60 Software deployment
    • G06F 8/61 Installation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/70 Software maintenance or management
    • G06F 8/71 Version control; Configuration management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a model deployment method, apparatus, device and storage medium. The method comprises the following steps: obtaining model configuration information, metadata and a model file, wherein the model configuration information comprises environment image information; uploading the model file and the model configuration information to a local directory of a test Kubernetes cluster, and writing the model configuration information into a background database; generating a first yaml configuration file according to the model configuration information; and receiving a test instruction and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file. The method can simplify the model deployment process, support dynamic elastic scaling, and allow the total computing resources of the model to be increased at any time.

Description

Model deployment method, device, equipment and storage medium
Technical Field
Embodiments of the invention relate to the field of computer technology, and in particular to a model deployment method, apparatus, device and storage medium.
Background
As artificial intelligence technology based on big data, machine learning and deep learning is applied ever more deeply across industries, the management and deployment of machine learning models built from big data have become key links in putting artificial intelligence to use in those industries.
Although the model management and deployment methods and systems available on the market can uniformly manage model metadata and the running state of model services, and thus solve the problem of model management and deployment to some extent, these systems still have several disadvantages:
1. The model service is deployed manually on a single server; updating the model service later requires a shutdown, and deployment is difficult.
2. Model computing resources cannot be monitored, and unified management of post-deployment model evaluation results is lacking.
3. Elastic capacity expansion of model computing resources and intelligent configuration of those resources are not supported.
Disclosure of Invention
Embodiments of the present invention provide a model deployment method, apparatus, device and storage medium, which address the loss of metadata information in model management, the lack of model evaluation results, the absence of computing resource monitoring during model deployment, and the lack of elastic capacity expansion. The method can simplify the model deployment process, improve developer efficiency, support dynamic elastic scaling, allow the total computing resources of the model to be increased at any time, and better monitor the computing resources of the model.
In a first aspect, an embodiment of the present invention provides a model deployment method, including:
obtaining model configuration information, metadata and a model file, wherein the model configuration information comprises: environment image information;
uploading the model file and the model configuration information to a local directory of a test Kubernetes cluster, and writing the model configuration information into a background database;
generating a first yaml configuration file according to the model configuration information;
and receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
Further, receiving a test instruction, and sending the test instruction to the test Kubernetes cluster so that the test Kubernetes cluster deploys the model according to the first yaml configuration file includes:
receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, and the first target pod acquires the model file according to a local directory and performs testing according to the model file.
Further, after receiving the test instruction and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates the first target pod according to the first yaml configuration file, and the first target pod acquires the model file according to the local directory and performs the test according to the model file, the method further includes:
acquiring computing resource occupation data sent by the testing Kubernetes cluster;
and determining computing resource parameters according to the computing resource occupation data, and writing the computing resource parameters into a background database.
Further, after determining the computing resource parameters according to the computing resource occupation data and writing the computing resource parameters into the background database, the method further includes:
generating a second yaml configuration file according to the computing resource parameters and the model configuration information;
receiving an issuing instruction, and sending the issuing instruction to a production environment Kubernetes cluster, so that the production environment Kubernetes cluster performs production deployment according to the second yaml configuration file;
and obtaining a model operation log and a model evaluation result by accessing a gRPC port, and displaying the model operation log and the model evaluation result.
In a second aspect, an embodiment of the present invention further provides a model deployment apparatus, where the apparatus includes:
an obtaining module, configured to obtain model configuration information, metadata and a model file, where the model configuration information includes: environment image information;
an uploading module, configured to upload the model file and the model configuration information to a local directory of a test Kubernetes cluster and write the model configuration information into a background database;
a generating module, configured to generate a first yaml configuration file according to the model configuration information;
and a receiving module, configured to receive a test instruction and send the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
Further, the receiving module is specifically configured to:
receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, and the first target pod acquires the model file according to a local directory and performs testing according to the model file.
Further, the receiving module is further configured to:
receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, the first target pod acquires the model file according to a local directory, and computing resource occupation data sent by the test Kubernetes cluster are acquired after testing is performed according to the model file;
and determining computing resource parameters according to the computing resource occupation data, and writing the computing resource parameters into a background database.
Further, the receiving module is further configured to:
after determining the computing resource parameters according to the computing resource occupation data and writing the computing resource parameters into a background database, generating a second yaml configuration file according to the computing resource parameters and the model configuration information;
receiving an issuing instruction, and sending the issuing instruction to a production environment Kubernetes cluster, so that the production environment Kubernetes cluster performs production deployment according to the second yaml configuration file;
and obtaining a model operation log and a model evaluation result by accessing a gRPC port, and displaying the model operation log and the model evaluation result.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the model deployment method according to any one of the embodiments of the present invention.
In a fourth aspect, the embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the model deployment method according to any one of the embodiments of the present invention.
Embodiments of the invention obtain model configuration information, metadata and a model file, where the model configuration information includes environment image information; upload the model file and the model configuration information to a local directory of a test Kubernetes cluster, and write the model configuration information into a background database; generate a first yaml configuration file according to the model configuration information; and receive a test instruction and send it to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file. This solves the problems of metadata information loss in model management, the lack of model evaluation results, the absence of computing resource monitoring in model deployment, and the lack of elastic capacity expansion. The method can simplify the model deployment process, improve developer efficiency, support dynamic elastic scaling, allow the total computing resources of the model to be increased at any time, and better monitor the computing resources of the model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of a model deployment method in an embodiment of the invention;
FIG. 2 is a schematic structural diagram of a model deployment apparatus according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an electronic device in an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a computer-readable storage medium containing a computer program in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be rearranged. A process may be terminated when its operations are completed, but may also have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, and the like.
The term "include" and variations thereof as used herein are intended to be open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment".
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not construed as indicating or implying relative importance.
Fig. 1 is a flowchart of a model deployment method provided in an embodiment of the present invention. This embodiment is applicable to model deployment scenarios, and the method may be executed by the model deployment apparatus in the embodiment of the present invention, which may be implemented in software and/or hardware. As shown in fig. 1, the model deployment method specifically includes the following steps:
s110, obtaining model configuration information, metadata and a model file, wherein the model configuration information comprises: the environment mirrors information.
The obtaining mode of the environment mirror image information can be that the environment mirror image information of model training is selected from a mirror image library.
Specifically, the mode of obtaining the model configuration information, the metadata, and the model file may be: filling other model configuration information and metadata except the environment mirror image information in a model factory, selecting the environment mirror image information of model training in a mirror image library, and uploading a local model file to a platform.
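As a minimal illustration of the information gathered in this step, the sketch below uses hypothetical field and registry names; the embodiment only requires that the configuration include the environment image information selected from the image library.

```python
# Hypothetical shape of the model configuration information and metadata
# collected from the model factory form; all field names are illustrative.
model_config = {
    "model_name": "credit-risk-scorer",      # metadata filled in by the developer
    "model_version": "v1",
    "framework": "sklearn",                   # built-in fixed framework or user-defined
    "environment_image": "registry.example.com/ml-base/python3.9-sklearn:latest",
}
model_file = "model.pkl"                      # local model file uploaded to the platform
```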
S120, uploading the model file and the model configuration information to a local directory of the test Kubernetes cluster, and writing the model configuration information into a background database.
Specifically, the model file and the model configuration information are uploaded to a local directory of the test Kubernetes cluster, and the model configuration information is written into the background database. For example, the model developer fills in, in the model factory, the model configuration information and metadata other than the environment image information, selects the environment image for model training from the image library, and uploads the local model file to the platform; the platform then packages the model file and configuration information uploaded by the model developer, uploads them to a local directory of the test Kubernetes cluster, and writes the configuration information into the background database.
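A sketch of this step is given below, under assumptions the embodiment does not fix: the background database is stood in for by SQLite, and the upload to the test cluster's local directory is modeled as a plain file copy.

```python
import json
import shutil
import sqlite3
from pathlib import Path

def upload_and_register(model_file: Path, config: dict,
                        cluster_local_dir: Path, db_path: Path) -> Path:
    """Package the model file and configuration into the cluster-local directory
    that the test pod will later mount, and record the configuration."""
    target_dir = cluster_local_dir / config["model_name"] / config["model_version"]
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(model_file, target_dir / model_file.name)
    (target_dir / "model_config.json").write_text(json.dumps(config))

    # Write the model configuration information into the background database
    # (SQLite only as a stand-in; the embodiment does not name the database).
    with sqlite3.connect(str(db_path)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS model_config "
            "(name TEXT, version TEXT, environment_image TEXT, config_json TEXT)"
        )
        conn.execute(
            "INSERT INTO model_config VALUES (?, ?, ?, ?)",
            (config["model_name"], config["model_version"],
             config["environment_image"], json.dumps(config)),
        )
    return target_dir
```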
S130, generating a first yaml configuration file according to the model configuration information.
Specifically, the first yaml configuration file may be generated according to the model configuration information as follows: the platform automatically generates the yaml configuration file of the pod used for model training on the test Kubernetes cluster, and the pod obtains the model file by mounting the local directory.
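The sketch below shows one way the first yaml configuration file could be produced from the model configuration information; the mount path, labels and use of a hostPath volume are assumptions, the key point from the embodiment being that the pod obtains the model file by mounting the local directory.

```python
import yaml  # PyYAML

def build_test_pod_yaml(config: dict, local_model_dir: str) -> str:
    """Render the first yaml configuration file for the test-cluster pod."""
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"{config['model_name']}-test",
            "labels": {"app": config["model_name"], "stage": "test"},
        },
        "spec": {
            "containers": [{
                "name": "model",
                # Environment image selected from the image library.
                "image": config["environment_image"],
                "volumeMounts": [{"name": "model-volume", "mountPath": "/models"}],
            }],
            # The pod mounts the local directory holding the uploaded model file.
            "volumes": [{"name": "model-volume",
                         "hostPath": {"path": local_model_dir}}],
            "restartPolicy": "Never",
        },
    }
    return yaml.safe_dump(pod, sort_keys=False)
```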
S140, receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
Specifically, the test instruction may be received as follows: the model developer applies for deploying the model in the test and trial-run environment.
Specifically, the test instruction is received and sent to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file. For example, the model developer applies for deploying the model in the test and trial-run environment, and after resource auditing, the platform deploys the model on the test Kubernetes cluster.
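One way the platform could act on the test instruction after resource auditing is sketched below; it assumes the official kubernetes Python client, a kubeconfig context named test-cluster and a model-test namespace, none of which are specified by the embodiment.

```python
import yaml
from kubernetes import client, config

def deploy_to_test_cluster(first_yaml: str, namespace: str = "model-test") -> None:
    """Create the test pod described by the first yaml configuration file."""
    config.load_kube_config(context="test-cluster")   # credentials for the test cluster
    manifest = yaml.safe_load(first_yaml)
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=manifest)
```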
Optionally, receiving a test instruction, and sending the test instruction to the test Kubernetes cluster so that the test Kubernetes cluster deploys the model according to the first yaml configuration file includes:
receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, and the first target pod acquires the model file according to a local directory and performs testing according to the model file.
Optionally, after receiving the test instruction and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates the first target pod according to the first yaml configuration file, and the first target pod obtains the model file according to the local directory and performs the test according to the model file, the method further includes:
acquiring computing resource occupation data sent by the test Kubernetes cluster;
and determining computing resource parameters according to the computing resource occupation data, and writing the computing resource parameters into a background database.
Specifically, the computing resource occupation data sent by the test Kubernetes cluster may be acquired as follows: the Kubernetes cluster monitoring component Prometheus monitors the computing resource occupation data while the model runs.
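A sketch of pulling the computing resource occupation data from Prometheus over its HTTP query API is given below; the Prometheus URL, label selector and choice of cAdvisor metrics are assumptions, since the embodiment only states that Prometheus monitors the resource occupation.

```python
import requests

def fetch_resource_usage(prometheus_url: str, pod_name: str) -> dict:
    """Query Prometheus for the CPU and memory occupation of the test pod."""
    queries = {
        "cpu_cores": f'rate(container_cpu_usage_seconds_total{{pod="{pod_name}"}}[5m])',
        "memory_bytes": f'container_memory_working_set_bytes{{pod="{pod_name}"}}',
    }
    usage = {}
    for key, query in queries.items():
        resp = requests.get(f"{prometheus_url}/api/v1/query", params={"query": query})
        resp.raise_for_status()
        results = resp.json()["data"]["result"]
        # Keep the largest observed value across the pod's containers.
        usage[key] = max((float(r["value"][1]) for r in results), default=0.0)
    return usage
```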
Optionally, after determining the computing resource parameter according to the computing resource occupation data and writing the computing resource parameter into the background database, the method further includes:
generating a second yaml configuration file according to the computing resource parameters and the model configuration information;
receiving an issuing instruction, and sending the issuing instruction to a production environment Kubernetes cluster, so that the production environment Kubernetes cluster performs production deployment according to the second yaml configuration file;
and obtaining a model operation log and a model evaluation result by accessing a gRPC port, and displaying the model operation log and the model evaluation result.
Specifically, the platform automatically configures the most appropriate computing resource parameters for the model by analyzing the data on the resources occupied by the model, and writes these parameters into the platform background database. The developer publishes the model with one click on the platform, and the platform uploads the packaged model file and configuration file to the production environment Kubernetes cluster. Using the computing resource configuration information, image information and model configuration information in the database, the platform automatically generates the yaml configuration file of the pod used for model training on the production environment Kubernetes cluster, and externally exposes a gRPC port for the model operation log and model result evaluation. The platform displays the model operation log and the model result evaluation page on the front end by accessing the port exposed by the model pod.
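The data analysis that yields the most appropriate computing resource parameters is not detailed in the embodiment; the sketch below assumes a simple headroom rule and shows how the derived parameters and an assumed gRPC port could be folded into the second yaml configuration file before production release.

```python
def derive_resource_parameters(usage: dict, headroom: float = 1.3) -> dict:
    """Turn measured occupation (e.g. from Prometheus) into requests/limits;
    the headroom factor, floors and 2x limit rule are illustrative assumptions."""
    cpu_m = max(100, int(usage["cpu_cores"] * 1000 * headroom))        # millicores
    mem_mi = max(128, int(usage["memory_bytes"] / 2**20 * headroom))   # MiB
    return {
        "requests": {"cpu": f"{cpu_m}m", "memory": f"{mem_mi}Mi"},
        "limits":   {"cpu": f"{cpu_m * 2}m", "memory": f"{mem_mi * 2}Mi"},
    }

def build_production_pod(test_pod: dict, resources: dict, grpc_port: int = 50051) -> dict:
    """Extend the test manifest with the derived resources and expose the gRPC
    port used for model logs and result evaluation (port number assumed)."""
    container = test_pod["spec"]["containers"][0]
    container["resources"] = resources
    container["ports"] = [{"containerPort": grpc_port, "name": "grpc"}]
    test_pod["metadata"]["labels"]["stage"] = "production"
    return test_pod
```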
In a specific example, an embodiment of the present invention provides a multi-language model management and deployment system based on a Kubernetes cluster, including: a model factory management module, an image library management module, a model resource quota management module, a model service management module and a model result monitoring module. The model factory management module is used for uniformly managing models and storing model metadata, with models uploaded as local model files; the image library management module is used for providing base environment image support for model training and release, and a user can freely select base environment images supporting different modeling languages; the resource quota management module optimally allocates model computing resources based on how the model behaves during testing and trial running; the model service management module is used for deploying the model and managing its life cycle according to the model management and resource quota management output by the model factory management module and the resource quota management module; and the model result monitoring module is used for monitoring the model running status and the model evaluation results in real time. The relevant functional modules of the system are as follows:
1. The functions of the model factory management module further include: model parameter management, model version management, and one-click service publishing.
2. The base environment images include built-in images and user-defined images; a built-in image contains the dependency environment of a fixed framework, and a user-defined image contains a user-defined environment beyond the fixed-framework dependencies.
3. The resource quota management module provides resource application, resource approval, resource monitoring and elastic scaling of capacity (a scaling sketch follows this list).
4. Model deployment in the service management module supports both built-in fixed frameworks and user-defined frameworks.
5. The model result monitoring module provides a visual interface and monitors, in real time, the performance and running status of each model service after the service management module has deployed it.
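The elastic scaling in item 3 is not tied to a specific Kubernetes mechanism in this example; one common way to realize it is a HorizontalPodAutoscaler, sketched below under the assumption that the model service is wrapped in a Deployment, with illustrative names and thresholds.

```python
import yaml

# Minimal HorizontalPodAutoscaler manifest (autoscaling/v2) scaling the model
# service between 1 and 5 replicas on CPU utilization; all values are assumptions.
hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "credit-risk-scorer-hpa"},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment",
                           "name": "credit-risk-scorer"},
        "minReplicas": 1,
        "maxReplicas": 5,
        "metrics": [{
            "type": "Resource",
            "resource": {"name": "cpu",
                         "target": {"type": "Utilization", "averageUtilization": 70}},
        }],
    },
}
print(yaml.safe_dump(hpa, sort_keys=False))
```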
In another specific example, the machine learning model management platform of the Kubernetes cluster-based multi-language model management and deployment system is implemented as follows:
step 1: the model developer fills in model configuration information and metadata in the model factory in addition to the environment mirror image information and uploads the local model file to the platform.
Step 2: the model developer selects an environmental image of the model training in the image library.
And step 3: and the platform packs and uploads the model file and the configuration information uploaded by the model developer to a local directory for testing the Kubernetes cluster.
And 4, step 4: the platform writes configuration information input by a model developer into a background database, automatically generates a yaml configuration file of a pod for testing Kubernets cluster and used for model training, and acquires the model file in a mode that the pod mounts a local directory.
And 5: a model developer applies for deploying a model in a test and test running environment, deploys the model in a Kubernetes cluster after a platform is subjected to resource audit, and calculates resource occupation data when a Kubernetes cluster monitoring component Prometeus monitors the model to run.
Step 6: and the platform automatically configures the most appropriate computing resource parameters for the model by performing data analysis on the resources occupied by the model and writes the most appropriate computing resource parameters into a platform background database.
And 7: a developer publishes a model at a platform key, and the platform uploads a packaged model file and a packaged configuration file to a Kubernetes cluster in a production environment.
And step 8: and the platform automatically generates a yaml configuration file of pod used for model training of a Kubernets cluster in a production environment through computing resource parameters and model configuration information in the database, and externally exposes a GRPC port for model operation logs and model result evaluation.
And step 9: and the platform displays the model operation log and the model result evaluation page at the front end by accessing the port outside the model pod.
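Step 9 does not specify how the platform reads the model operation log; the sketch below assumes it is read through the Kubernetes API, while the model result evaluation would be fetched over the exposed gRPC port using a client stub generated from the platform's own proto definition, which the embodiment does not disclose and is therefore omitted here.

```python
from kubernetes import client, config

def fetch_model_log(pod_name: str, namespace: str = "model-prod") -> str:
    """Read the model operation log for display on the platform front end;
    the kubeconfig context and namespace names are illustrative."""
    config.load_kube_config(context="prod-cluster")
    return client.CoreV1Api().read_namespaced_pod_log(name=pod_name, namespace=namespace)
```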
By analyzing the data on the resources occupied by the model in the test and trial-run environment, the embodiment of the invention automatically configures the most appropriate computing resource parameters for the model. A Kubernetes cluster is used for model training, and the model training task runs as a pod, which simplifies the model deployment process and supports dynamic expansion of computing resources.
The Kubernetes cluster-based multi-language model management and deployment system of the embodiment of the invention simplifies the model deployment process and improves developer efficiency. It supports dynamic elastic scaling and can increase the total model computing resources at any time. It also provides better monitoring of the model's computing resources.
In the technical solution of this embodiment, model configuration information, metadata and a model file are obtained, where the model configuration information includes environment image information; the model file and the model configuration information are uploaded to a local directory of the test Kubernetes cluster, and the model configuration information is written into a background database; a first yaml configuration file is generated according to the model configuration information; and a test instruction is received and sent to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file. This solves the problems of metadata information loss in model management, the lack of model evaluation results, the absence of computing resource monitoring in model deployment, and the lack of elastic scaling. The method can simplify the model deployment process, improve developer efficiency, support dynamic elastic scaling, allow the total computing resources of the model to be increased at any time, and better monitor the computing resources of the model.
Fig. 2 is a schematic structural diagram of a model deployment apparatus according to an embodiment of the present invention. This embodiment is applicable to model deployment scenarios; the apparatus may be implemented in software and/or hardware and may be integrated in any device providing a model deployment function. As shown in fig. 2, the model deployment apparatus specifically includes: an obtaining module 210, an uploading module 220, a generating module 230 and a receiving module 240.
The obtaining module 210 is configured to obtain model configuration information, metadata and a model file, where the model configuration information includes: environment image information;
the uploading module 220 is configured to upload the model file and the model configuration information to a local directory of the test Kubernetes cluster and write the model configuration information into a background database;
a generating module 230, configured to generate a first yaml configuration file according to the model configuration information;
the receiving module 240 is configured to receive a test instruction and send the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
Optionally, the receiving module is specifically configured to:
receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, and the first target pod acquires the model file according to a local directory and performs testing according to the model file.
Optionally, the receiving module is further configured to:
after receiving a test instruction and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, the first target pod obtains the model file according to a local directory, and testing is performed according to the model file, acquiring computing resource occupation data sent by the test Kubernetes cluster;
and determining computing resource parameters according to the computing resource occupation data, and writing the computing resource parameters into a background database.
Optionally, the receiving module is further configured to:
after determining the computing resource parameters according to the computing resource occupation data and writing the computing resource parameters into a background database, generating a second yaml configuration file according to the computing resource parameters and the model configuration information;
receiving an issuing instruction, and sending the issuing instruction to a production environment Kubernetes cluster, so that the production environment Kubernetes cluster performs production deployment according to the second yaml configuration file;
and obtaining a model operation log and a model evaluation result by accessing a gRPC port, and displaying the model operation log and the model evaluation result.
The product can execute the method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
In the technical solution of this embodiment, model configuration information, metadata and a model file are obtained, where the model configuration information includes environment image information; the model file and the model configuration information are uploaded to a local directory of the test Kubernetes cluster, and the model configuration information is written into a background database; a first yaml configuration file is generated according to the model configuration information; and a test instruction is received and sent to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file. This solves the problems of metadata information loss in model management, the lack of model evaluation results, the absence of computing resource monitoring in model deployment, and the lack of elastic capacity expansion. The method can simplify the model deployment process, improve developer efficiency, support dynamic elastic scaling, allow the total computing resources of the model to be increased at any time, and better monitor the computing resources of the model.
Fig. 3 is a schematic structural diagram of an electronic device in an embodiment of the present invention. FIG. 3 illustrates a block diagram of an exemplary electronic device 12 suitable for use in implementing embodiments of the present invention. The electronic device 12 shown in fig. 3 is only an example and should not bring any limitations to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 3, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), Digital Video Disc (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. System memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with electronic device 12, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. In the electronic device 12 of the present embodiment, the display 24 is not provided as a separate body but is embedded in the mirror surface, and when the display surface of the display 24 is not displayed, the display surface of the display 24 and the mirror surface are visually integrated. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), Wide Area Network (WAN), and/or a public Network such as the internet) via the Network adapter 20. As shown, the network adapter 20 communicates with the other modules of the electronic device 12 over the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, Redundant processing units, external disk drive Arrays, disk array (RAID) systems, tape drives, and data backup storage systems, to name a few.
The processing unit 16 executes programs stored in the system memory 28 to perform various functional applications and data processing, for example, to implement the model deployment method provided by the embodiment of the present invention:
obtaining model configuration information, metadata and a model file, wherein the model configuration information comprises: environment image information;
uploading the model file and the model configuration information to a local directory of a test Kubernetes cluster, and writing the model configuration information into a background database;
generating a first yaml configuration file according to the model configuration information;
and receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
Fig. 4 is a schematic structural diagram of a computer-readable storage medium containing a computer program according to an embodiment of the present invention. An embodiment of the present invention provides a computer-readable storage medium 61 on which a computer program 610 is stored; when executed by one or more processors, the program implements the model deployment method provided by the embodiments of the present application:
obtaining model configuration information, metadata and a model file, wherein the model configuration information comprises: environment image information;
uploading the model file and the model configuration information to a local directory of a test Kubernetes cluster, and writing the model configuration information into a background database;
generating a first yaml configuration file according to the model configuration information;
and receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of an element does not in some cases constitute a limitation on the element itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. Those skilled in the art will appreciate that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions will now be apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method of model deployment, comprising:
obtaining model configuration information, metadata and a model file, wherein the model configuration information comprises: environment image information;
uploading the model file and the model configuration information to a local directory of a test Kubernetes cluster, and writing the model configuration information into a background database;
generating a first yaml configuration file according to the model configuration information;
and receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
2. The method of claim 1, wherein receiving a test instruction and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file, comprises:
receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, and the first target pod acquires the model file according to a local directory and performs testing according to the model file.
3. The method according to claim 2, wherein after receiving the test instruction and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates the first target pod according to the first yaml configuration file, the first target pod acquires the model file according to a local directory, and performs the test according to the model file, the method further comprises:
acquiring computing resource occupation data sent by the test Kubernetes cluster;
and determining computing resource parameters according to the computing resource occupation data, and writing the computing resource parameters into a background database.
4. The method of claim 3, wherein after determining the computing resource parameters according to the computing resource occupation data and writing the computing resource parameters into the background database, the method further comprises:
generating a second yaml configuration file according to the computing resource parameters and the model configuration information;
receiving an issuing instruction, and sending the issuing instruction to a production environment Kubernetes cluster, so that the production environment Kubernetes cluster performs production deployment according to the second yaml configuration file;
and obtaining a model operation log and a model evaluation result by accessing a gRPC port, and displaying the model operation log and the model evaluation result.
5. A model deployment apparatus, comprising:
an obtaining module, configured to obtain model configuration information, metadata and a model file, where the model configuration information includes: environment image information;
an uploading module, configured to upload the model file and the model configuration information to a local directory of a test Kubernetes cluster and write the model configuration information into a background database;
a generating module, configured to generate a first yaml configuration file according to the model configuration information;
and a receiving module, configured to receive a test instruction and send the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster deploys the model according to the first yaml configuration file.
6. The apparatus of claim 5, wherein the receiving module is specifically configured to:
receiving a test instruction, and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, and the first target pod acquires the model file according to a local directory and performs testing according to the model file.
7. The apparatus of claim 6, wherein the receiving module is further configured to:
after receiving a test instruction and sending the test instruction to the test Kubernetes cluster, so that the test Kubernetes cluster generates a first target pod according to the first yaml configuration file, the first target pod obtains the model file according to a local directory, and testing is performed according to the model file, acquire computing resource occupation data sent by the test Kubernetes cluster;
and determine computing resource parameters according to the computing resource occupation data, and write the computing resource parameters into a background database.
8. The apparatus of claim 7, wherein the receiving module is further configured to:
after determining the computing resource parameters according to the computing resource occupation data and writing the computing resource parameters into a background database, generate a second yaml configuration file according to the computing resource parameters and the model configuration information;
receive an issuing instruction, and send the issuing instruction to a production environment Kubernetes cluster, so that the production environment Kubernetes cluster performs production deployment according to the second yaml configuration file;
and obtain a model operation log and a model evaluation result by accessing a gRPC port, and display the model operation log and the model evaluation result.
9. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the processors to implement the method of any of claims 1-4.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by one or more processors, implements the method according to any one of claims 1-4.
CN202210448435.7A 2022-04-26 2022-04-26 Model deployment method, device, equipment and storage medium Pending CN114721674A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210448435.7A CN114721674A (en) 2022-04-26 2022-04-26 Model deployment method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210448435.7A CN114721674A (en) 2022-04-26 2022-04-26 Model deployment method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114721674A true CN114721674A (en) 2022-07-08

Family

ID=82246715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210448435.7A Pending CN114721674A (en) 2022-04-26 2022-04-26 Model deployment method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114721674A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024041035A1 (en) * 2022-08-23 2024-02-29 网络通信与安全紫金山实验室 Machine learning model management method and device, model management platform, and storage medium

Similar Documents

Publication Publication Date Title
EP3511836A1 (en) Generation of automated testing scripts by converting manual test cases
US9811442B2 (en) Dynamic trace level control
US20210072965A1 (en) Deploying microservices across a service infrastructure
US11568242B2 (en) Optimization framework for real-time rendering of media using machine learning techniques
WO2023087764A1 (en) Algorithm application element packaging method and apparatus, device, storage medium, and computer program product
CN111340220A (en) Method and apparatus for training a predictive model
CN113505302A (en) Method, device and system for supporting dynamic acquisition of buried point data and electronic equipment
CN114721674A (en) Model deployment method, device, equipment and storage medium
US11061739B2 (en) Dynamic infrastructure management and processing
US10972548B2 (en) Distributed system deployment
US10200271B2 (en) Building and testing composite virtual services using debug automation
CN113378346A (en) Method and device for model simulation
CN115809119A (en) Monitoring method, system and device for container arrangement engine
CN112328184B (en) Cluster capacity expansion method, device, equipment and storage medium
CN112422648B (en) Data synchronization method and system
CN113253991A (en) Task visualization processing method and device, electronic equipment and storage medium
CN116820354B (en) Data storage method, data storage device and data storage system
CN116467178B (en) Database detection method, apparatus, electronic device and computer readable medium
CN116561015B (en) Map application testing method, electronic device and computer readable medium
CN110022244B (en) Method and apparatus for transmitting information
CN116643946A (en) Method, device, equipment and medium for collecting performance data
CN115630251A (en) Method and system for realizing prerendering static page based on microservice
CN113360417A (en) Test method, session modifier, electronic device, and medium
CN112817737A (en) Method and device for calling model in real time
CN114840441A (en) Application performance capacity estimation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination