CN110413294B - Service release system, method, device and equipment - Google Patents


Info

Publication number
CN110413294B
Authority
CN
China
Prior art keywords
service
target
image
deep learning
target service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910720644.0A
Other languages
Chinese (zh)
Other versions
CN110413294A (en)
Inventor
王磊
周文泽
陆新龙
吴冕冠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd (ICBC)
Priority to CN201910720644.0A
Publication of CN110413294A
Application granted
Publication of CN110413294B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/60: Software deployment
    • G06F8/61: Installation
    • G06F8/63: Image based installation; Cloning; Build to order
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a service release system, method, apparatus, and device. The system comprises: a storage server for receiving and storing target code input by a user; a model repository server for storing a plurality of pre-trained training models; an image construction module for, upon receiving a service release request from the user, obtaining the target code from the storage server, pulling a target training model from the model repository server according to the target code, and building a target service image from a preset deep learning base framework image, the target code, and the target training model; an image repository for receiving and storing target service images; and a plurality of computing nodes for pulling the target service image from the image repository and generating the target service from it. In the embodiments of the application, once the user sends a service release request, the service release system generates the target service automatically, which effectively improves service release efficiency.

Description

Service release system, method, device and equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular to a service release system, method, apparatus, and device.
Background
Deep learning generally involves several major steps: selecting an algorithm and framework, training a model, tuning the model, integrating the model with an application, and deploying it. In existing deep-learning-based service release, engineers typically have to write the training model, the service code, and the service release framework; manually introduce the model package and the various dependency packages required for deep learning; and manually build images and deploy the released service on a cloud platform. The overall process is complex, each step requires tedious manual work, and considerable expertise and access to substantial compute and storage are also needed. Manual service release therefore consumes a great deal of time, involves a complicated process, and yields low release efficiency.
No effective solution to the above problems has yet been proposed.
Disclosure of Invention
The embodiments of the application provide a service release system, method, apparatus, and device to address the problems in the prior art that manual service release consumes a great deal of time, involves a complicated process, and yields low release efficiency.
The embodiments of the application provide a service release system comprising: a storage server for receiving and storing target code input by a user; a model repository server for storing a plurality of training models pre-trained on a deep learning base framework; an image construction module for, upon receiving a service release request from the user, obtaining the target code input by the user from the storage server, pulling the corresponding training model from the model repository server according to the target code as the target training model, building a target service image from a preset deep learning base framework image, the target code, and the target training model, and pushing the target service image to an image repository; the image repository, communicatively connected to the image construction module, for storing the preset deep learning base framework image and for receiving and storing the target service image; and a plurality of computing nodes, communicatively connected to the image repository, for pulling the target service image from the image repository and generating the target service from it.
In one embodiment, the preset deep learning base framework image includes a deep learning base framework and/or the dependency packages required for deep learning training.
In one embodiment, the computing node is configured to generate the target service by deploying a container from the target service image, where the container is deployed via Kubernetes.
In one embodiment, the system further comprises an optimization server for monitoring the request concurrency of the target service in real time, comparing the monitored concurrency with a preset concurrency, and adaptively increasing or decreasing the number of Pods of the target service according to the comparison result.
The embodiments of the application also provide a service release method comprising: receiving a service release request from a user; obtaining the target code input by the user from a storage server according to the service release request; pulling the corresponding training model from a model repository server according to the target code as the target training model; and building a target service image from a preset deep learning base framework image, the target code, and the target training model, where the target service image is used by a computing node to generate the target service.
In one embodiment, building a target service image from a preset deep learning base framework image, the target code, and the target training model includes: obtaining the preset deep learning base framework image from an image repository, where the target training model was pre-trained on the deep learning base framework; building the target service image via a Dockerfile from the preset deep learning base framework image, the target code, and the target training model; and pushing the target service image to the image repository.
In one embodiment, after the target service image is built, the method further includes: a computing node with available computing capacity among the plurality of computing nodes pulls the target service image from the image repository; and, from the target service image, the target service is generated on that node by deploying a container via Kubernetes.
In one embodiment, after the target service is generated, the method further includes: monitoring the request concurrency of the target service in real time; comparing the monitored concurrency with a preset concurrency; and adaptively increasing or decreasing the number of Pods of the target service according to the comparison result.
The embodiments of the application also provide a service release apparatus comprising: a receiving module for receiving a service release request from a user; an obtaining module for obtaining the target code input by the user from a storage server according to the service release request; a pulling module for pulling the corresponding training model from the model repository server according to the target code as the target training model; and a construction module for building a target service image from a preset deep learning base framework image, the target code, and the target training model, where the target service image is used by a computing node to generate the target service.
The embodiments of the application provide a service release system in which a plurality of training models pre-trained on a deep learning base framework are stored in advance in the model repository server, so that when the image construction module receives a service release request from a user, it can pull the corresponding training model from the model repository server as the target training model according to the target code input by the user. The image construction module can then automatically build a target service image from the preset deep learning base framework image, the target code, and the target training model and push it to the image repository, which receives and stores it. A plurality of computing nodes communicatively connected to the image repository can pull the target service image and generate the target service from it. Thus, once the user sends a service release request, the service release system automatically performs image construction, service generation, and related operations without manual building or deployment, effectively reducing the time required for service release and improving release efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and constitute a part of this specification, illustrate the application and together with the description serve to explain it. In the drawings:
Fig. 1 is a schematic structural diagram of a service release system according to an embodiment of the present application;
Fig. 2 is a schematic diagram of the steps of a service release method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a service release method according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a service release apparatus according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a service release device according to an embodiment of the present application.
Detailed Description
The principles and spirit of the present application will be described below with reference to several exemplary embodiments. It should be understood that these embodiments are presented merely to enable those skilled in the art to better understand and practice the application and are not intended to limit the scope of the application in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the application may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
Consider existing deep-learning-based service release: engineers typically have to write the training model, the service code, and the service release framework; manually introduce the model package and the various dependency packages required for deep learning; and manually build images and deploy the released service on a cloud platform. The overall process is complex, each step requires tedious manual work, and considerable expertise and access to substantial compute and storage are also needed. Manual service release therefore consumes a great deal of time, involves a complicated process, and yields low release efficiency.
To address the above problems, an embodiment of the present invention provides a service release system. As shown in fig. 1, the system may include a storage server 11, a model repository server 12, an image construction module 13, an image repository 14, and a plurality of computing nodes 15. The five components of the service release system are described below:
1) The storage server 11 is used to receive and store the target code input by the user.
After the user finishes writing the code, the target code can be uploaded to the storage server 11, so that it can later be pulled from the storage server 11 during image construction. The target code may include a specified model name, so that the image construction module 13 can pull the corresponding training model from the model repository server 12 according to the model name specified in the code. Moreover, the target code input by the user can be just a code fragment, namely the business code written by the developer; complete application program code is not needed.
The business code may be a piece of code written in the format specified by the deep learning base framework, which requires a fixed function entry; the code fragment therefore has to run in combination with the deep learning base framework and a training model. Business code is the code that directly implements the user's requirement. For example, if the user needs to query certain data, the code that queries the database directly and returns the result is the business code.
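For illustration only, a business-code fragment of this kind might look like the following sketch. The entry-point name predict, the load_model hook, and the model name text_classifier are assumptions for the example, not details fixed by the application; the sketch presumes a TensorFlow-style base framework image that places the pulled model under a configured directory and invokes the fixed function entry for each request.

    import tensorflow as tf

    MODEL_NAME = "text_classifier"  # assumed model name; the image construction
                                    # module would pull the matching training model
    _model = None

    def load_model(model_dir: str) -> None:
        # Assumed hook: called once after the target training model has been
        # placed under the directory specified at image-build time.
        global _model
        _model = tf.saved_model.load(model_dir)

    def predict(request: dict) -> dict:
        # Assumed fixed function entry invoked by the base framework for
        # each incoming request.
        fn = _model.signatures["serving_default"]
        outputs = fn(tf.constant(request["inputs"]))
        return {key: value.numpy().tolist() for key, value in outputs.items()}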
2) The model repository server 12 is configured to store a plurality of training models pre-trained on the deep learning base framework.
Before performing deep learning, it is usually necessary to choose an appropriate deep learning base framework, which may include, but is not limited to, at least one of: TensorFlow, Caffe, Theano, MXNet, Torch, PyTorch. Training models pre-trained on the deep learning base framework can be stored in the model repository server 12 before service release, so that at image-construction time the corresponding training model can be pulled from the model repository server 12 as the target training model according to the target code input by the user.
The plurality of training models can be basic, commonly used models pre-trained on the deep learning base framework when the service release system is built, to avoid spending unnecessary time training deep learning models. In one embodiment, they may instead be models that the user trains on the deep learning base framework according to their own needs and circumstances and then stores in the model repository server 12, so that the models used later meet the user's requirements. It will be appreciated that how training models are obtained and stored in the model repository server 12 can be determined by the actual situation; the application places no limit on this.
3) The image construction module 13 is configured to, upon receiving a service release request from a user, obtain the target code input by the user from the storage server 11, pull the corresponding training model from the model repository server 12 according to the target code as the target training model, build a target service image from a preset deep learning base framework image, the target code, and the target training model, and push the target service image to the image repository 14.
The service release request may carry a service release configuration, which may include, but is not limited to, at least one of: service name, cluster area, function description, auto-scaling rule configuration, resource configuration (e.g., memory, CPU quota), and environment variable configuration. If the user does not set a service release configuration, the system can assign the service a default automatically generated service name, a default cluster area, a default auto-scaling rule, a default resource configuration, and so on. The default auto-scaling rule may, for example, be set so that the service accepts 80 concurrent requests per second; of course, this default can be determined by the actual situation, and the application places no limit on it.
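For concreteness, a release configuration of the shape just described might be expressed as the following sketch; every key name and value is an assumption for the example, since the application does not define a concrete schema:

    # Hypothetical service release configuration; names and values are illustrative.
    release_config = {
        "service_name": "text-classifier-svc",   # default: auto-generated
        "cluster_area": "cluster-east-1",         # default: system-chosen
        "description": "text classification inference service",
        "auto_scaling": {
            "concurrency_per_second": 80,         # default mentioned above
        },
        "resources": {"memory": "2Gi", "cpu": "1"},
        "env": {"LOG_LEVEL": "info"},
    }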
Upon receiving a service release request from a user, the image construction module 13 obtains the target code input by the user from the storage server 11 and, according to the model name specified in the target code, pulls the corresponding training model from the model repository server 12 as the target training model. The training model can be pulled from the model repository server 12 via wget over the FTP protocol; wget is a free tool for automatically downloading files from the network and supports HTTP, HTTPS, and FTP, three of the most common TCP/IP protocols.
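A minimal sketch of such a pull, assuming a hypothetical repository host and archive layout (neither appears in the original):

    import subprocess

    def pull_model(model_name: str, dest_dir: str = "/models") -> None:
        # Download the target training model from the model repository server
        # via wget over FTP; the host and path are illustrative assumptions.
        url = f"ftp://model-repo.internal/models/{model_name}.tar.gz"
        subprocess.run(["wget", "-q", "-P", dest_dir, url], check=True)

    pull_model("text_classifier")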
The deep learning base framework, the dependency packages required for deep learning, and the base image of the operating system can be packaged into a deep learning base framework image, and the service image can then be built on top of this base image, so that the user only needs to concentrate on writing code and need not attend to the deep learning framework, dependency packages, and similar concerns. The deep learning base framework image can be cached on the cluster node servers in advance to improve service release efficiency. The deep learning base framework may include, but is not limited to, at least one of: TensorFlow, Caffe, Theano, MXNet, Torch, PyTorch.
To ensure that the application service can start up normally, the target code and the target training model can be placed under a specified directory when the target service image is built. Further, the preset deep learning base framework image may be pulled from the image repository 14, the target training model having been pre-trained on that deep learning base framework. From the preset deep learning base framework image, the target code, and the target training model, the target service image is built via a Dockerfile and pushed to the image repository 14.
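A Dockerfile of the kind described might look like the following sketch; the base-image tag, directory layout, and start command are assumptions, as the application fixes none of them:

    # Illustrative Dockerfile: layer the user's code fragment and the pulled
    # training model on top of the preset deep learning base framework image.
    FROM registry.internal/base/deeplearning-tensorflow:latest

    # Place the target code and target training model under the specified
    # directories expected by the framework's fixed function entry.
    COPY target_code.py /app/target_code.py
    COPY models/text_classifier/ /models/text_classifier/

    ENV MODEL_DIR=/models/text_classifier
    EXPOSE 8080

    # Assumes the base image ships a runner that imports the code fragment's
    # fixed entry point and serves it over HTTP.
    CMD ["python", "-m", "framework.runner", "--code", "/app/target_code.py"]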
4) The image repository 14 is communicatively connected to the image construction module 13 and is used to store the preset deep learning base framework image and to receive and store the target service image.
The image repository 14 stores image files; the preset deep learning base framework image can be stored there in advance, avoiding the extra, unnecessary time of building the base framework image during service release. The image repository 14 also receives and stores the target service image pushed by the image construction module 13, so that the plurality of computing nodes 15 can pull it before generating the target service. In one embodiment, the image repository 14 may sit on a remote server; when the application program is deployed, the cluster nodes (servers) pull the image remotely. That is, the image repository 14 provides remote centralized storage and distribution of images.
In one embodiment, the image may be built via a Dockerfile, a file with a specific syntax that scripts the Docker image build process and makes it quick to build the desired (custom) image. Docker involves three basic concepts: image, container, and repository. An image is the precondition for running a Docker container, and a repository is where images are stored. A Docker image can be thought of as a special filesystem that contains, besides the files, libraries, resources, and configuration the container needs at runtime, some configuration parameters prepared for runtime (e.g., anonymous volumes, environment variables, users). An image contains no dynamic data, its content does not change after it is built, and a Docker image repository enables the distribution of Docker images.
5) The plurality of computing nodes 15 are communicatively connected to the image repository 14 and are used to pull the target service image from the image repository 14 and generate the target service from it.
If it is determined that at least one of the computing nodes 15 currently has available computing capacity, the target service image can be pulled from the image repository 14, and the target service can be generated by deploying containers on those nodes from the target service image and the service release configuration described above. If the user has not set a service release configuration, the target service can be generated with the system's default service release configuration.
To improve service release efficiency, Kubernetes can be used to automatically schedule the computing nodes and deploy containers to generate the target service. Kubernetes is an open-source system for managing containerized applications across multiple hosts in a cloud platform; its goal is to make deploying containerized applications simple and efficient, and it provides mechanisms for application deployment, planning, updating, and maintenance. Deploying by container isolates applications from one another: each container has its own filesystem, processes in different containers do not affect each other, and computing resources can be partitioned.
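As a sketch of this deployment step using the official Kubernetes Python client (the application does not prescribe how Kubernetes is driven, so the client choice, namespace, port, and replica count are assumptions):

    from kubernetes import client, config

    def deploy_target_service(name: str, image: str, replicas: int = 1) -> None:
        # Create a Deployment whose Pods run containers built from the
        # target service image.
        config.load_kube_config()  # use load_incluster_config() inside the cluster
        container = client.V1Container(
            name=name,
            image=image,  # e.g. "registry.internal/services/text-classifier-svc:latest"
            ports=[client.V1ContainerPort(container_port=8080)],
        )
        template = client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": name}),
            spec=client.V1PodSpec(containers=[container]),
        )
        spec = client.V1DeploymentSpec(
            replicas=replicas,
            selector=client.V1LabelSelector(match_labels={"app": name}),
            template=template,
        )
        deployment = client.V1Deployment(
            metadata=client.V1ObjectMeta(name=name), spec=spec
        )
        client.AppsV1Api().create_namespaced_deployment(
            namespace="default", body=deployment
        )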
In one embodiment, the service release system may further include an optimization server for monitoring the request concurrency of the target service in real time, comparing the monitored concurrency with the preset concurrency, and adaptively increasing or decreasing the number of Pods of the target service according to the comparison result. When Kubernetes deploys an application, it creates Pods to host the application instances; a Pod is a Kubernetes resource abstraction representing a group of one or more application containers together with some resources shared by those containers.
After the target service is generated, the per-second concurrency of external requests to the target service can be monitored in real time and compared with the preset concurrency in the service release configuration. If the monitored concurrency is lower than the preset concurrency, the number of Pods of the target service can be adaptively decreased; if it is higher, the number of Pods can be adaptively increased. Increasing or decreasing the target service's Pod count in this way adjusts the concurrency the target service can accept.
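A minimal sketch of this adjustment with the Kubernetes Python client, assuming the monitored concurrency arrives as a number and the per-Pod capacity comes from the release configuration (both assumptions for the example):

    import math
    from kubernetes import client, config

    PER_POD_CONCURRENCY = 80  # preset per-instance concurrency (default above)

    def rescale(name: str, observed_concurrency: float) -> None:
        # Raise or lower the Pod count so capacity tracks observed demand;
        # may drop to 0 when there is no traffic, matching the scale-to-zero
        # behavior described in the specific embodiment below.
        config.load_kube_config()
        desired = math.ceil(observed_concurrency / PER_POD_CONCURRENCY)
        client.AppsV1Api().patch_namespaced_deployment_scale(
            name=name,
            namespace="default",
            body={"spec": {"replicas": desired}},
        )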
From the above description, it can be seen that the embodiments of the application achieve the following technical effects: because a plurality of training models pre-trained on the deep learning base framework are stored in advance in the model repository server, the corresponding training model can be pulled from the model repository server as the target training model, according to the target code input by the user, whenever the image construction module receives a service release request. The image construction module can then automatically build a target service image from the preset deep learning base framework image, the target code, and the target training model and push it to the image repository, which receives and stores it. The plurality of computing nodes communicatively connected to the image repository can pull the target service image and generate the target service from it. Thus, after the user sends a service release request, no manual building or deployment is needed; the service release system automatically performs image construction, service generation, and related operations, effectively reducing the time required for service release and improving release efficiency.
Based on the above service release system, the embodiments of the application also provide a service release method which, as shown in fig. 2, may include the following steps:
S201: receive a service release request from a user.
The service release request may carry a service release configuration, which may include, but is not limited to, at least one of: service name, cluster area, function description, auto-scaling rule configuration, resource configuration (e.g., memory, CPU quota), and environment variable configuration. If the user does not set a service release configuration, the system can apply a default one, assigning the service a default automatically generated service name, a default cluster area, a default auto-scaling rule, and a default resource configuration. The default auto-scaling rule may, for example, be set so that the service accepts 80 concurrent requests per second; of course, this default can be determined by the actual situation, and the application places no limit on it.
The user may trigger the service release request by clicking a service release button on the client interface, by entering a code instruction, or in any other feasible manner; the application places no limit on this.
S202: obtain the target code input by the user from the storage server according to the service release request.
The user may input and store the target code in the service release system before requesting service release. The target code may be just a code fragment, namely the business code written by the developer; complete application program code is not needed. The business code may be a piece of code written in the format specified by the deep learning base framework, which requires a fixed function entry, so the code fragment has to run in combination with the deep learning base framework and a training model. Business code is the code that directly implements the user's requirement; for example, if the user needs to query certain data, the code that queries the database directly and returns the result is the business code. After the service release request is received, the target code previously input by the user and stored in the storage server can therefore be obtained according to the request.
S203: pull the corresponding training model from the model repository server as the target training model according to the target code.
The target code may include a specified model name, so the corresponding training model can be pulled from the model repository server as the target training model according to that name. Before performing deep learning, it is usually necessary to choose an appropriate deep learning base framework, which may include, but is not limited to, at least one of: TensorFlow, Caffe, Theano, MXNet, Torch, PyTorch.
Training models pre-trained on the deep learning base framework can be stored in the model repository server before service release, so that at image-construction time the corresponding training model can be pulled from the model repository server as the target training model according to the target code input by the user.
S204: build a target service image from the preset deep learning base framework image, the target code, and the target training model, where the target service image is used by a computing node to generate the target service.
The deep learning base framework, the dependency packages required for deep learning, and the base image of the operating system can be packaged into a deep learning base framework image, and the service image can then be built on top of this base image, so that the user only needs to concentrate on writing code and need not attend to the framework, dependency packages, and similar concerns. The deep learning base framework image can be cached in the image repository in advance to improve service release efficiency. The deep learning base framework may include, but is not limited to, at least one of: TensorFlow, Caffe, Theano, MXNet, Torch, PyTorch.
The preset deep learning base framework image may be obtained from the image repository before the target service image is built, the target training model having been pre-trained on that framework. The target service image can be built via a Dockerfile from the preset deep learning base framework image, the target code, and the target training model, and then pushed to the image repository.
The image repository stores image files and can receive and store the pushed target service image so that it can be pulled before the target service is generated. In one embodiment, the image repository can sit on a remote server; when an application program is deployed, the cluster node servers pull the image remotely, i.e., the image repository provides remote centralized storage and distribution of images.
Further, it can be determined whether any of the computing nodes currently has available computing capacity; if so, that node can pull the target service image from the image repository. From the target service image and the service release configuration, the target service can then be generated on that node by deploying a container, and the target service can be released. In one embodiment, Kubernetes may be used to automatically schedule the available computing nodes and deploy a container on each of them to generate the target service.
Kubernetes, as noted above, is an open-source system for managing containerized applications across multiple hosts in a cloud platform; its goal is to make deploying containerized applications simple and efficient, and it provides mechanisms for application deployment, planning, updating, and maintenance. Deploying by container isolates applications from one another: each container has its own filesystem, processes in different containers do not affect each other, and computing resources can be partitioned.
After the target service is released, to improve resource utilization, the per-second concurrency of external requests to the target service can be monitored in real time and compared with the preset concurrency in the service release configuration. If the monitored concurrency is lower than the preset concurrency, the number of Pods of the target service can be adaptively decreased; if it is higher, the number of Pods can be adaptively increased. Increasing or decreasing the target service's Pod count in this way adjusts the concurrency the target service can accept. A Pod is a Kubernetes resource abstraction representing a group of one or more application containers together with some resources shared by those containers.
The above method is described below with reference to a specific embodiment. It should be noted, however, that this specific embodiment is intended only to better illustrate the present application and not to unduly limit it.
An embodiment of the application provides a service release method which, as shown in fig. 3, may include:
Step 1: the developer stores the written, model-trained business code on the code server and fills in the service release configuration. The business code need only be a code fragment; complete application program code is not required. Alternatively, the user may leave the service release configuration unfilled, in which case the system deploys with the default configuration and generates the service directly.
The service release configuration may include, but is not limited to, at least one of: service name, cluster area, function description, auto-scaling rule configuration, resource configuration (e.g., memory, CPU quota), and environment variable configuration. If the user does not set a service release configuration, the system can apply a default one, assigning the service a default automatically generated service name, a default cluster area, a default auto-scaling rule, and a default resource configuration, with the default auto-scaling rule set so that the service accepts 80 concurrent requests per second.
Step 2: according to the business code written by the developer and the service release configuration, pull the training model corresponding to the business code from the model repository server via wget over the FTP protocol.
Step 3: from the deep learning base framework image, combined with the business code and the training model, automatically build the service image via the Dockerfile provided by the system and push it to the image repository.
To ensure that the application starts up properly, the business code and the training model are placed under a specified directory. In this embodiment the deep learning base framework image is fixed to the TensorFlow deep learning framework: the TensorFlow framework, the dependency packages required for deep learning, and the operating system base image are packaged into a unified deep learning base framework image, and the developer builds the service image on top of it, so the developer only needs to attend to their own business code and not to framework dependencies and similar concerns.
The deep learning base framework image is stored in the image repository in advance, and the service release system may be applied to a distributed cluster comprising a plurality of computing nodes. The image repository stores image files and is generally placed on a remote server; when an application program is deployed, the cluster node servers pull the images remotely, i.e., images are centrally stored and distributed through the image repository.
Step 4: from the service image and the service release configuration, call Kubernetes to deploy containers directly and generate the cloud service.
Before the cloud service is generated, the service image can be pulled from the image repository; Kubernetes then automatically schedules the plurality of computing nodes and deploys containers on them to generate the cloud service from the service image and the service release configuration.
Step 5: the released service scales automatically based on the concurrency of external requests (counting the number of requests per second to the service). That is, when there are no requests, the running service instances can be scaled down to 0; when the request concurrency exceeds the preset concurrency, the service scales out horizontally according to the configured concurrency supported per instance.
The external requests may be HTTP requests. A service instance is a Pod of the application program; a Pod is a Kubernetes resource abstraction and the smallest organizational unit in Kubernetes, representing a group of one or more application containers together with some resources shared by those containers, and generally corresponds to one application program. A service instance can be understood as one running copy of the application; a service can have several copies running at the same time, i.e., the service can be scaled horizontally by increasing its Pod count.
Based on the same inventive concept, the embodiments of the application also provide a service release apparatus, as described in the following embodiments. Since the principle by which the service release apparatus solves the problem is similar to that of the service release method, its implementation can refer to the implementation of the method, and the repetition is not restated. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated. Fig. 4 is a block diagram of a service release apparatus according to an embodiment of the present application; as shown in fig. 4, it may include a receiving module 401, an obtaining module 402, a pulling module 403, and a construction module 404, described below.
The receiving module 401 may be configured to receive a service release request from a user.
The obtaining module 402 may be configured to obtain the target code input by the user from a storage server according to the service release request.
The pulling module 403 may be configured to pull the corresponding training model from the model repository server as the target training model according to the target code.
The construction module 404 may be configured to build a target service image from a preset deep learning base framework image, the target code, and the target training model, where the target service image is used by a computing node to generate the target service.
In one embodiment, the construction module 404 may include: an obtaining unit for obtaining the preset deep learning base framework image from the image repository, where the target training model was pre-trained on the deep learning base framework; a building unit for building the target service image via a Dockerfile from the preset deep learning base framework image, the target code, and the target training model; and a pushing unit for pushing the target service image to the image repository.
In one embodiment, the service release apparatus may further include: a pulling unit by which a computing node with available computing capacity among the plurality of computing nodes pulls the target service image from the image repository; and a target service generation unit for generating the target service on that node from the target service image by deploying a container via Kubernetes.
In one embodiment, the service release apparatus may further include: a monitoring unit for monitoring the request concurrency of the target service in real time; a comparison unit for comparing the monitored concurrency with a preset concurrency; and a processing unit for adaptively increasing or decreasing the number of Pods of the target service according to the comparison result.
The embodiments of the application also provide an electronic device; fig. 5 is a schematic diagram of the structure of an electronic device based on the service release method provided by the embodiments of the application. The electronic device may include an input device 51, a processor 52, and a memory 53. The input device 51 may be used to input the target code. The processor 52 may be configured to receive a service release request from a user; obtain the target code input by the user from a storage server according to the service release request; pull the corresponding training model from the model repository server as the target training model according to the target code; and build a target service image from the preset deep learning base framework image, the target code, and the target training model, where the target service image is used by a computing node to generate the target service. The memory 53 may be used to store the target code, the target training model, the deep learning base framework image, the target service image, and other parameters.
In this embodiment, the input device may be one of the main means by which the user exchanges information with the computer system. It may include a keyboard, mouse, camera, scanner, light pen, handwriting tablet, voice input apparatus, and so on; it is used to input raw data, and the programs that process those data, into the computer, and may also receive data transmitted from other modules, units, and devices. The processor may be implemented in any suitable manner; for example, it may take the form of a microprocessor or processor with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on. The memory may be a memory device for storing information in modern information technology. It may comprise several levels: in a digital system, anything that can store binary data may be memory; in an integrated circuit, a circuit with a storage function but no physical form is also called memory, e.g., RAM or a FIFO; in a system, a storage device in physical form is also called memory, e.g., a memory bank or a TF card.
In this embodiment, the specific functions and effects of the electronic device can be understood with reference to the other embodiments and are not repeated here.
The embodiments of the application also provide a computer storage medium based on the service release method, storing computer program instructions which, when executed, implement: receiving a service release request from a user; obtaining the target code input by the user from a storage server according to the service release request; pulling the corresponding training model from the model repository server as the target training model according to the target code; and building a target service image from the preset deep learning base framework image, the target code, and the target training model, where the target service image is used by a computing node to generate the target service.
In this embodiment, the storage medium includes, but is not limited to, random access memory (RAM), read-only memory (ROM), cache, a hard disk drive (HDD), or a memory card. The memory may be used to store the computer program instructions. The network communication unit may be an interface, configured according to a standard prescribed by a communication protocol, for performing network connection communication.
In this embodiment, the functions and effects of the program instructions stored in the computer storage medium can be understood with reference to the other embodiments and are not repeated here.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the application described above may be implemented with a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of computing devices. Alternatively, they may be implemented in program code executable by computing devices, so that they may be stored in a storage device and executed by computing devices; in some cases the steps shown or described may be performed in a different order than shown. They may also be fabricated separately as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, the embodiments of the application are not limited to any specific combination of hardware and software.
Although the present application provides the method operation steps described in the above embodiments or flowcharts, more or fewer steps may be included in the method based on routine practice or without inventive effort. For steps with no logically necessary causal relationship, the execution order is not limited to that provided by the embodiments of the application. When performed in an actual apparatus or end product, the methods may be executed sequentially or in parallel (e.g., in a parallel-processor or multithreaded environment) according to the embodiments or figures.
It is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and many applications other than the examples provided will be apparent to those of skill in the art upon reading the above description. The scope of the application should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, and various modifications and variations can be made to the embodiments of the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (11)

1. A service release system, comprising:
a storage server for receiving and storing target code input by a user;
a model repository server for storing a plurality of training models pre-trained on a deep learning base framework;
an image construction module for, upon receiving a service release request from the user, obtaining the target code input by the user from the storage server, pulling a corresponding training model from the model repository server according to the target code as a target training model, building a target service image from a preset deep learning base framework image, the target code, and the target training model, and pushing the target service image to an image repository;
the image repository, communicatively connected to the image construction module, for storing the preset deep learning base framework image and for receiving and storing the target service image; and
a plurality of computing nodes, communicatively connected to the image repository, for pulling the target service image from the image repository and generating a target service from the target service image;
wherein the service release request carries a service release configuration comprising at least one of: service name, cluster area, function description, auto-scaling rule configuration, resource configuration, and environment variable configuration.
2. The system of claim 1, wherein the preset deep learning base framework image comprises: a deep learning base framework and/or dependency packages required for deep learning training.
3. The system of claim 1, wherein the computing node is configured to generate the target service by deploying a container from the target service image, wherein the container is deployed via Kubernetes.
4. The system of claim 3, further comprising:
an optimization server for monitoring the request concurrency of the target service in real time, comparing the monitored concurrency with a preset concurrency, and adaptively increasing or decreasing the number of Pods of the target service according to the comparison result.
5. A service release method based on the service release system of any one of claims 1 to 4, comprising:
receiving a service release request from a user;
obtaining the target code input by the user from a storage server according to the service release request;
pulling a corresponding training model from a model repository server according to the target code as a target training model; and
building a target service image from a preset deep learning base framework image, the target code, and the target training model, wherein the target service image is used by a computing node to generate a target service;
wherein the service release request carries a service release configuration comprising at least one of: service name, cluster area, function description, auto-scaling rule configuration, resource configuration, and environment variable configuration.
6. The method of claim 5, wherein building the target service image from the preset deep learning base framework image, the target code, and the target training model comprises:
obtaining the preset deep learning base framework image from an image repository, wherein the target training model was pre-trained on the deep learning base framework;
building the target service image via a Dockerfile from the preset deep learning base framework image, the target code, and the target training model; and
pushing the target service image to the image repository.
7. The method of claim 5, further comprising, after building the target service image:
pulling, by a computing node with available computing capacity among the plurality of computing nodes, the target service image from the image repository; and
generating the target service on that computing node from the target service image by deploying a container via Kubernetes.
8. The method of claim 7, further comprising, after generating the target service:
monitoring the request concurrency of the target service in real time;
comparing the monitored concurrency with a preset concurrency; and
adaptively increasing or decreasing the number of Pods of the target service according to the comparison result.
9. A service release apparatus, comprising:
a receiving module configured to receive a service release request from a user;
an acquisition module configured to acquire the target code input by the user from a storage server according to the service release request;
a pulling module configured to pull a corresponding training model from a model warehouse server as the target training model according to the target code; and
a construction module configured to construct a target service image based on a preset deep learning base framework image, the target code, and the target training model, wherein the target service image is used by a computing node to generate the target service;
wherein the service release request carries a service release configuration comprising at least one of the following: a service name, a cluster area, a function description, an auto-scaling rule configuration, a resource configuration, and an environment variable configuration.
10. A service release device comprising a processor and a memory storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 5 to 8.
11. A computer-readable storage medium having stored thereon computer instructions which, when executed, implement the steps of the method of any one of claims 5 to 8.
CN201910720644.0A 2019-08-06 2019-08-06 Service release system, method, device and equipment Active CN110413294B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910720644.0A CN110413294B (en) 2019-08-06 2019-08-06 Service release system, method, device and equipment


Publications (2)

Publication Number Publication Date
CN110413294A (en) 2019-11-05
CN110413294B (en) 2023-09-12

Family

ID=68366036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910720644.0A Active CN110413294B (en) 2019-08-06 2019-08-06 Service release system, method, device and equipment

Country Status (1)

Country Link
CN (1) CN110413294B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113127006B (en) * 2019-12-30 2024-03-29 北京懿医云科技有限公司 Service deployment method, device, equipment and storage medium
CN111240698A (en) * 2020-01-14 2020-06-05 北京三快在线科技有限公司 Model deployment method and device, storage medium and electronic equipment
CN111327692A (en) * 2020-02-05 2020-06-23 北京百度网讯科技有限公司 Model training method and device and cluster system
CN111290778B (en) * 2020-02-06 2024-03-01 网易(杭州)网络有限公司 AI model packaging method, platform and electronic equipment
CN111414233A (en) * 2020-03-20 2020-07-14 京东数字科技控股有限公司 Online model reasoning system
CN111459509A (en) * 2020-03-27 2020-07-28 北京金山云网络技术有限公司 Container mirror image construction method and device and server
CN111461349A (en) * 2020-04-07 2020-07-28 中国建设银行股份有限公司 Modeling method and system
CN113626179B (en) * 2020-05-09 2023-08-22 烽火通信科技股份有限公司 Universal artificial intelligent model training method and system
CN111651191B (en) * 2020-05-12 2023-03-17 北京仁科互动网络技术有限公司 Single application packaging method and system applied to microservice framework
CN113806624B (en) * 2020-06-15 2024-03-08 阿里巴巴集团控股有限公司 Data processing method and device
CN111610989B (en) * 2020-06-17 2023-09-29 中国人民解放军国防科技大学 Application publishing/updating method and system for offline container cloud environment
CN111611087B (en) * 2020-06-30 2023-03-03 中国人民解放军国防科技大学 Resource scheduling method, device and system
CN112181599B (en) * 2020-10-16 2023-05-16 中国联合网络通信集团有限公司 Model training method, device and storage medium
CN112214285A (en) * 2020-10-22 2021-01-12 厦门渊亭信息科技有限公司 Docker-based model service deployment system
CN112288096A (en) * 2020-10-22 2021-01-29 济南浪潮高新科技投资发展有限公司 Rapid building and releasing method for machine learning model mirror image based on rapid machine learning model
CN112311605B (en) * 2020-11-06 2023-12-22 北京格灵深瞳信息技术股份有限公司 Cloud platform and method for providing machine learning service
CN112596741B (en) * 2020-11-16 2022-08-30 新华三大数据技术有限公司 Video monitoring service deployment method and device
CN112379919B (en) * 2020-11-23 2024-04-09 刘亚虹 Service customization method and device, electronic equipment and storage medium
CN112799782B (en) * 2021-01-20 2024-04-12 北京迈格威科技有限公司 Model generation system, method, electronic device and storage medium
CN113064696A (en) * 2021-03-25 2021-07-02 网易(杭州)网络有限公司 Cluster system capacity expansion method, device and medium
CN115563063A (en) * 2021-07-01 2023-01-03 马上消费金融股份有限公司 Model construction method and device and electronic equipment
CN114115857B (en) * 2021-10-29 2024-04-05 北京邮电大学 Machine learning model automatic production line construction method and system
CN114268661B (en) * 2021-11-19 2024-04-30 科大讯飞股份有限公司 Service scheme deployment method, device, system and equipment
CN114153525B (en) * 2021-11-30 2024-01-05 国电南瑞科技股份有限公司 AI model servitization sharing method and system for power grid regulation and control service
CN114281706B (en) * 2021-12-30 2023-09-12 北京瑞莱智慧科技有限公司 Model evaluation method, system and storage medium
CN114968271A (en) * 2022-05-26 2022-08-30 北京金堤科技有限公司 Model deployment method and device, electronic equipment and storage medium
CN115826995B (en) * 2022-10-31 2023-07-14 北京凯思昊鹏软件工程技术有限公司 Distributed mirror image construction system
CN115562690B (en) * 2022-12-05 2023-04-18 杭州未名信科科技有限公司 Algorithm service processing method, device and medium based on Docker container
CN115756733B (en) * 2023-01-10 2023-04-14 北京数原数字化城市研究中心 Container mirror image calling system and container mirror image calling method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109582315A (en) * 2018-10-26 2019-04-05 北京百度网讯科技有限公司 Service privatization method, apparatus, computer equipment and storage medium
WO2019095936A1 (en) * 2017-11-15 2019-05-23 腾讯科技(深圳)有限公司 Method and system for building container mirror image, and server, apparatus and storage medium


Also Published As

Publication number Publication date
CN110413294A (en) 2019-11-05

Similar Documents

Publication Publication Date Title
CN110413294B (en) Service release system, method, device and equipment
CN108205442B (en) Edge computing platform
KR102414096B1 (en) Create and deploy packages for machine learning on end devices
US10554577B2 (en) Adaptive resource scheduling for data stream processing
JP7138150B2 (en) DISTRIBUTED TRAINING METHOD, SYSTEM, DEVICE, STORAGE MEDIUM, AND PROGRAM
US11514304B2 (en) Continuously provisioning large-scale machine learning models
JP7421511B2 (en) Methods and apparatus, electronic devices, readable storage media and computer programs for deploying applications
CN112165691B (en) Content delivery network scheduling method, device, server and medium
CN108848092A (en) The processing method and processing device of micro services gray scale publication based on call chain
CN109634686A (en) A kind of method and system by BMC remote configuration server state
KR102073678B1 (en) Method and apparatus for firmware virtualization
US8570905B2 (en) Adaptive enterprise service bus (ESB) runtime system and method
CN102362261A (en) Input content to application via web browser
CN113220420A (en) Service monitoring method, device, equipment, storage medium and computer program product
CN106257418B (en) Techniques for evaluating an application by using an auxiliary application
CN103391312A (en) Resource offline downloading method and device
JP7114772B2 (en) Certificate sending method, certificate receiving method, cloud and terminal equipment
EP4060496A2 (en) Method, apparatus, device and storage medium for running inference service platform
KR20200029574A (en) Simulator, simulation device, and simulation method
CN115248692A (en) Device and method for supporting cloud deployment of multiple deep learning framework models
CN107547591A (en) Upgrade server, set top box, set top box upgrading file delivery method and system
CN112328301A (en) Method and device for maintaining consistency of operating environments, storage medium and electronic equipment
CN110138774A (en) A kind of hold-up interception method of the general CC attack of dynamic configuration
CN113204425B (en) Method, device, electronic equipment and storage medium for process management internal thread
JP2015156104A (en) Self-controlling system, self-controlling device, self-controlling method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant