CN110780914B - Service publishing method and device - Google Patents


Info

Publication number: CN110780914B (application CN201810856676.9A)
Authority: CN (China)
Prior art keywords: file, service, target service, program, configuration
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN110780914A
Inventors: 汤人杰, 杨巧节, 金天骄, 方炜, 于祥兵
Original and current assignees: China Mobile Communications Group Co Ltd; China Mobile Group Zhejiang Co Ltd
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Zhejiang Co Ltd
Priority to CN201810856676.9A
Application granted; publication of CN110780914A and CN110780914B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 - Arrangements for software engineering
    • G06F 8/70 - Software maintenance or management
    • G06F 8/71 - Version control; Configuration management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 - Arrangements for software engineering
    • G06F 8/60 - Software deployment
    • G06F 8/61 - Installation
    • G06F 8/63 - Image based installation; Cloning; Build to order

Abstract

An embodiment of the invention provides a service publishing method and device. The method comprises the following steps: acquiring a program file and a configuration file of a target service; generating an image file of the target service according to the program file and the configuration file; and creating and starting an instance container of the target service according to the image file, and providing the target service externally through the instance container. In the embodiment of the invention, the orchestration and publishing processes of services are simple and efficient, the method is applicable to various types of services, unified service encapsulation is provided, a standardized service calling interface is exposed externally, and standardized encapsulation and agile publishing of diverse models are realized. Moreover, based on containerized elastic services, the resources and capacity of the instance containers can be dynamically adjusted, realizing flexible scheduling of target service resources and high maintainability.

Description

Service publishing method and device
Technical Field
The embodiment of the invention relates to the technical field of mobile communication, in particular to a service publishing method and device.
Background
Artificial Intelligence (AI) is a technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. As a branch of computer science, AI aims to understand the essence of intelligence and to produce intelligent machines that can react in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems, among others. Since its inception, the theory and technology of artificial intelligence have steadily matured and their fields of application have expanded; the scientific and technological products that artificial intelligence brings in the future are expected to be "containers" of human intelligence.
The concept of AI first appeared in the 1950s, at which time it was mainly based on expert systems. In the 1970s, machine learning developed vigorously with the invention of neural network algorithms, but limited computing power meant that only shallow neural networks could be trained. With the advent of the cloud computing and big data era, computing power and training data sets are no longer bottlenecks, and deep learning can achieve better accuracy than traditional machine learning algorithms as the training data set grows.
Artificial intelligence technology, represented by deep learning, is being deeply integrated with the national economy, and its application prospects are very broad. To accelerate the application of a deep learning model, the model needs to be converted into a standard microservice and then orchestrated and published. In the prior art, however, service orchestration and publishing are mainly realized through customized development: the model is encapsulated into a microservice and then applied to a specific business scenario. This scheme has the following problems:
1. Low efficiency of model publishing
Code development is required for every model release, and deep learning frameworks come in many varieties, such as TensorFlow and Caffe, so the learning cost for developers is high.
2. Non-uniform service standards
Services developed in the prior art are personalized services; they lack unified standards and increase the difficulty of application calling.
3. Lack of service elasticity
Independently deployed personalized services lack an elastic coping mechanism in the face of irregular calling by applications.
4. Poor service maintainability
There is no detailed statistics system or complete monitoring system for, e.g., the calling situation of each service.
Disclosure of Invention
The embodiment of the invention provides a service publishing method and device, which are used to solve the prior-art problems of low model publishing efficiency, non-uniform service standards, and poor maintainability in service orchestration and publishing.
In one aspect, an embodiment of the present invention provides a service publishing method, where the method includes:
acquiring a program file and a configuration file of a target service;
generating an image file of the target service according to the program file and the configuration file;
and creating and starting an instance container of the target service according to the image file, and providing the target service to the outside through the instance container.
In another aspect, an embodiment of the present invention provides a service publishing device, where the device includes:
the acquisition module is used for acquiring a program file and a configuration file of the target service;
the file generation module is used for generating an image file of the target service according to the program file and the configuration file;
and the service publishing module is used for creating and starting an instance container of the target service according to the image file and providing the target service to the outside through the instance container.
In another aspect, the embodiment of the present invention further provides an electronic device, which includes a memory, a processor, a bus, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the service publishing method when executing the program.
In still another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in the service publishing method.
According to the service publishing method and device provided by the embodiment of the invention, the program file and configuration file of the target service are acquired; an image file of the target service is generated according to the program file and the configuration file; an instance container of the target service is created and started according to the image file; and an environment image in which the target service runs is created through the instance container, through which the target service is provided externally. In the embodiment of the invention, the orchestration and publishing processes of services are simple and efficient, the method is applicable to various types of services, unified service encapsulation is provided, a standardized service calling interface is exposed externally, and standardized encapsulation and agile publishing of diverse models are realized. Moreover, based on containerized elastic services, the resources and capacity of the instance containers can be dynamically adjusted, realizing flexible scheduling of target service resources and high maintainability.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a service publishing method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a first exemplary service publishing architecture in accordance with an embodiment of the present invention;
fig. 3 is a block diagram of a service publishing apparatus of a second example of the embodiment of the present invention;
FIG. 4 is a method flow diagram of a second example of an embodiment of the present invention;
fig. 5 is a block diagram of a service publishing device according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the given embodiments without creative effort shall fall within the protection scope of the present invention.
Fig. 1 shows a flowchart of a service publishing method according to an embodiment of the present invention.
As shown in fig. 1, the service publishing method provided in the embodiment of the present invention specifically includes the following steps:
step 101, acquiring a program file and a configuration file of a target service.
The target service is a service to be orchestrated and/or published in the current operating environment; it may be an AI service matured through deep-learning training, or another atomic service. In the embodiment of the present invention, an atomic service is the smallest service unit in the current operating environment. The program file of the target service differs according to its attributes: for a model-class service, the program file is a service model file and includes models such as a computation graph and a data flow; for an application-class service, the program file is the program code package of the application.
The configuration file includes parameters of the target service, such as the usage framework of the target service and a list of input/output parameters that meet a preset interface specification.
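As an illustration of what such a configuration file might contain, the following sketch builds a minimal JSON payload. The patent does not fix a concrete schema, so every field name here is an assumption made for demonstration.

```python
# Illustrative sketch only: all field names are assumptions, not from the patent.
import json

def make_service_config(framework, inputs, outputs, replicas=1):
    """Build a minimal configuration-file payload for a target service."""
    config = {
        "framework": framework,   # usage framework of the target service
        "interface": {
            "inputs": inputs,     # input parameter list per the interface spec
            "outputs": outputs,   # output parameter list per the interface spec
        },
        "replicas": replicas,     # desired instance-container count
    }
    return json.dumps(config, indent=2)

example = make_service_config(
    framework="TensorFlow",
    inputs=[{"name": "image", "type": "bytes"}],
    outputs=[{"name": "label", "type": "string"}],
)
```

A real deployment would extend this with the special-processing flag and microservice operating parameters the description mentions later.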
As a first example, referring to fig. 2, fig. 2 provides a service publishing architecture (hereinafter referred to as the architecture) that provides service orchestration and publishing functions and acquires the program file and configuration file of the target service during orchestration of the service.
Step 102, generating an image file of the target service according to the program file and the configuration file.
The image file is generated from the program file and the configuration file and is used to provide the target service. The program file and configuration file are packaged, through preset processing, into the image file of the target service, so as to create an environment image in which the target service runs.
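The packaging step can be pictured as rendering an image build description from the two files. The sketch below is purely illustrative; the base-image name, paths, and entrypoint are assumptions, not details from the patent.

```python
# Hypothetical sketch of turning a program file and configuration file into an
# image build description; names and paths are illustrative assumptions.
def render_dockerfile(base_image, program_file, config_file, entrypoint):
    """Render a Dockerfile-style build description for the target service."""
    lines = [
        f"FROM {base_image}",                      # environment image for the service
        f"COPY {program_file} /app/{program_file}",
        f"COPY {config_file} /app/{config_file}",
        "WORKDIR /app",
        f'CMD ["{entrypoint}"]',                   # start serving on container launch
    ]
    return "\n".join(lines)

dockerfile = render_dockerfile(
    base_image="tensorflow/serving:latest",
    program_file="model.pb",
    config_file="service.json",
    entrypoint="serve.sh",
)
```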
Step 103, creating and starting an instance container of the target service according to the image file, and providing the target service to the outside through the instance container.
The instance container is a virtual container for running a service instance; as a form of resource sharing, it provides great flexibility. An instance container is created and started according to the image file. During creation of the instance container, the program file is processed into a service that can be called both externally and internally; the processed service is merged into the previous runtime environment image and finally packaged into a base image of the current operating environment containing the target service. The target service can then be provided externally through the instance container in the current operating environment.
Containerized instance deployment enables elastic services: it provides elastic capacity expansion and reduction, the response of the target service and the resource overhead of each instance container can be monitored, and the number of containers can be dynamically adjusted. Optionally, the instance container may be a Docker container or a Mesos container.
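The dynamic adjustment of container count described above can be sketched as a simple scaling rule. The thresholds and the load metric are illustrative assumptions; the patent does not prescribe a specific policy.

```python
# Sketch of an elastic scaling decision; thresholds are illustrative assumptions.
def scale_decision(current, cpu_load, min_replicas=1, max_replicas=10,
                   upper=0.8, lower=0.2):
    """Return the new container count given the average load of the instances."""
    if cpu_load > upper and current < max_replicas:
        return current + 1     # expand capacity under heavy load
    if cpu_load < lower and current > min_replicas:
        return current - 1     # shrink capacity when mostly idle
    return current             # otherwise keep the current count
```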
With continued reference to FIG. 2, three instance containers are provided in FIG. 2, one each for model-class encapsulation, application-class encapsulation, and service-class encapsulation.
In the embodiment of the invention, the program file and configuration file of the target service are acquired; an image file of the target service is generated according to the program file and the configuration file; an instance container of the target service is created and started according to the image file; and an environment image in which the target service runs is created through the instance container, through which the target service is provided externally. The orchestration and publishing processes of services are simple and efficient, the method is applicable to various types of services, unified service encapsulation is provided, a standardized service calling interface is exposed externally, and standardized encapsulation and agile publishing of diverse models are realized. Moreover, based on containerized elastic services, the resources and capacity of the instance containers can be dynamically adjusted, realizing flexible scheduling of target service resources and high maintainability. The embodiment of the invention thus solves the prior-art problems of low model publishing efficiency, non-uniform service standards, and poor maintainability in service orchestration and publishing.
Optionally, in this embodiment of the present invention, the step of obtaining the program file and the configuration file of the target service includes:
and when the attribute of the target service is model-class encapsulation or application-class encapsulation, receiving the program file and configuration file of the target service.
When the attribute (type) of the target service is model-class or application-class encapsulation, the program file and configuration file of the target service are received; they may be uploaded by a user or by a third-party device.
Specifically, a model-class-encapsulated target service is an encapsulation based on an algorithm model; application-class encapsulation is used in scenarios such as combining logic algorithms across multiple atomic service models or preprocessing logic algorithms that run before the model algorithms.
Referring to fig. 2, each instance container in fig. 2 corresponds to a target service with a different attribute: model-class encapsulation, application-class encapsulation, or service-class encapsulation. Optionally, the architecture shown in fig. 2 may provide a User Interface (UI) so that the user can select the attribute of the target service when uploading the program file and/or configuration file.
Optionally, in this embodiment of the present invention, after the step of receiving the program file and the configuration file of the target service, the method further includes:
and converting the program file, according to the configuration file, into a first format file conforming to the Remote Procedure Call (RPC) interface protocol and a second format file conforming to the Hypertext Transfer Protocol (HTTP) interface protocol.
To provide a standardized external service calling interface, after the program file and configuration file of the target service are received, the program file is processed, according to the input/output parameters in the configuration file, into a first format file conforming to the Remote Procedure Call (RPC) interface protocol, used for remote calls, and into a second format file conforming to the Hypertext Transfer Protocol (HTTP) interface protocol, which provides standard HTTP interface services for internal application calls.
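The dual RPC/HTTP exposure can be illustrated by wrapping one model function with two adapters that share a uniform input/output convention. This is a toy sketch: no real RPC or HTTP stack is used, and the JSON response envelope is an assumption.

```python
# Illustrative only: one model function exposed through a call-style (RPC-like)
# adapter and a JSON-body (HTTP-like) adapter; the envelope is an assumption.
import json

def make_adapters(model_fn, input_names):
    def rpc_call(*args):                 # positional arguments, RPC-style
        return model_fn(dict(zip(input_names, args)))

    def http_call(body):                 # JSON request body, HTTP-style
        params = json.loads(body)
        result = model_fn({k: params[k] for k in input_names})
        return json.dumps({"code": 0, "data": result})

    return rpc_call, http_call

# a toy "model" standing in for a trained network: adds two numbers
rpc, http = make_adapters(lambda p: p["a"] + p["b"], ["a", "b"])
```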
It can be understood that, when the attribute of the target service is application-class encapsulation, if the algorithm in the program file already implements HTTP interface output, it does not need to be converted into a second format file; only the input/output format of the interface needs to be standardized according to the HTTP interface specification.
Unified service encapsulation is thus provided, a standardized service calling interface is exposed externally, and standardized encapsulation and agile publishing of diverse models are realized.
Referring to fig. 2, after the program file and configuration file are uploaded through a standard Representational State Transfer (REST) interface, the framework internally converts the program file into files in the RPC-interface and HTTP-interface formats.
Optionally, in the embodiment of the present invention, when the attribute of the target service is model-class encapsulation, the program file is a model file of the target service, where the model file includes: a computation graph, a data flow, variable parameters, and/or a signature; the configuration file includes: the model usage framework of the model file, an interface specification, and/or configuration parameters of the target service.
When the attribute of the target service is application-class encapsulation, the program file is a program code file of the target service, and the configuration file includes: the code language format and/or dependency packages of the program code file.
On the one hand, for model-class encapsulation, the program file is the model file of the target service, comprising the computation graph, data flow, variable parameters, and/or signature of the algorithm model, and possibly some auxiliary files; the configuration file comprises the model usage framework of the model file, an interface specification, and/or configuration parameters of the target service, where the interface specification includes an input/output parameter list conforming to the interface specification. Optionally, the configuration file may further include a flag indicating whether the target service requires special processing and the configuration of the relevant operating parameters of the microservice published from the model.
On the other hand, for application-class encapsulation, the program file is the program code file of the target service, and the configuration file includes parameters such as the code language format and/or dependency packages of the program code file, used to create an environment image adapted to running the logic algorithm of the target service.
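The two upload payload shapes described above might be modeled as follows. The field names are assumptions made for illustration; the patent does not define a concrete data structure.

```python
# Sketch of the two package shapes; field names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class ModelPackage:                # attribute: model-class encapsulation
    model_file: str                # computation graph, data flow, variables, signature
    framework: str                 # model usage framework, e.g. "TensorFlow"
    interface_spec: dict           # input/output parameter lists

@dataclass
class ApplicationPackage:          # attribute: application-class encapsulation
    code_file: str                 # program code file of the application
    language: str                  # code language format
    dependencies: list = field(default_factory=list)  # dependency packages

pkg = ModelPackage("model.pb", "TensorFlow",
                   {"inputs": ["image"], "outputs": ["label"]})
app = ApplicationPackage("pipeline.py", "python", ["numpy"])
```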
Optionally, in this embodiment of the present invention, the step of obtaining the program file and the configuration file of the target service includes:
when an orchestration request for service orchestration is received, acquiring the atomic services to be orchestrated and the configuration information of the atomic services;
generating a computation graph configuration file according to the atomic services and the configuration information;
and acquiring preset program files of the atomic services, combining the program files of the atomic services to obtain the program file of the target service, and combining the configuration information to obtain the configuration file of the target service.
As another embodiment, when an orchestration request for service orchestration is received, the orchestration request may be a user's arrangement of atomic service icons.
The atomic services to be orchestrated and their configuration information are acquired from the orchestration request and converted into a computation graph configuration file of the atomic services. The computation graph configuration file comprises the list of atomic services required by the overall service (the target service) and the configuration information of each atomic service, which includes the input/output parameter list of each atomic service and the correspondence of input/output parameters between the atomic services.
After the computation graph configuration file is generated, the preset program files of the atomic services are acquired and combined into the program file of the target service; the configuration information is combined into the configuration file of the target service, and the atomic services are combined into the target service. The service orchestration process is simple and efficient and is applicable to various types of services.
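The orchestration step, combining atomic services into a computation graph configuration file and a merged program file, can be sketched as follows. The schema, with its linear wiring of one service's outputs into the next's inputs, is an assumption for illustration.

```python
# Sketch of combining atomic services into a computation-graph configuration
# and a merged target-service program file; the schema is an assumption.
def orchestrate(atomic_services):
    """atomic_services: list of dicts, each with a service name and program file."""
    graph_config = {
        "services": [s["name"] for s in atomic_services],
        "wiring": [   # connect each service's outputs to the next service's inputs
            {"from": a["name"], "to": b["name"]}
            for a, b in zip(atomic_services, atomic_services[1:])
        ],
    }
    merged_program = [s["program"] for s in atomic_services]  # combined program files
    return graph_config, merged_program

graph, program = orchestrate([
    {"name": "preprocess", "program": "pre.py"},
    {"name": "classify", "program": "clf.py"},
])
```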
Optionally, in this embodiment of the present invention, the step of creating and starting the instance container of the target service includes:
creating and starting instance containers of the target service according to the configured number of instance containers included in the configuration file;
or
creating and starting a preset number of instance containers of the target service.
Containerized instance deployment enables elastic services with elastic capacity expansion and reduction. The number of instance containers may be determined by a limit in the configuration file or preset to a default number; the response of the target service and the resource overhead of each instance container can be monitored, and the number of containers can be dynamically adjusted.
Optionally, in this embodiment of the present invention, after the step of creating and starting the instance container of the target service, the method further includes:
and distributing Graphics Processing Unit (GPU) video memory resources for the instance container according to the configuration file, and monitoring GPU state information of the instance container.
The example container may be allocated with Graphics Processing Unit (GPU) video memory resources according to the configuration file, and the allocation of the GPU resources supports fine-grained segmentation in units of video memory, and may monitor GPU state information of the example container.
Specifically, the monitoring of the GPU state information may adopt a Master-slave architecture, for example, slave services are deployed on each instance container equipped with a GPU to monitor the GPU video memory resources on the instance container, and each slave service sends real-time monitoring information to the Master for aggregation. The Master controls the GPU resources of the whole device and responds to a GPU application request of an application end to distribute and dispatch the GPU, and after the application is finished running, the Master updates and recycles the GPU resources in time through monitoring information reported by the Slave, so that the management, control and dispatch of the GPU resources of the whole framework are completed.
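The Master-side bookkeeping described above might look like the following sketch, with video memory allocated at megabyte granularity. The class and method names are assumptions; a real system would also handle the real-time Slave reporting channel.

```python
# Illustrative Master bookkeeping for Slave-reported GPU memory, allocated at
# megabyte granularity; all names here are assumptions, not from the patent.
class GpuMaster:
    def __init__(self):
        self.capacity = {}    # container -> total video memory (MB)
        self.used = {}        # container -> allocated video memory (MB)

    def report(self, container, total_mb):
        """Slave on a GPU-equipped container reports its memory capacity."""
        self.capacity[container] = total_mb
        self.used.setdefault(container, 0)

    def allocate(self, mb):
        """Grant a fine-grained slice on any container with enough free memory."""
        for c, total in self.capacity.items():
            if total - self.used[c] >= mb:
                self.used[c] += mb
                return c
        return None           # no container can satisfy the request

    def release(self, container, mb):
        """Reclaim memory after the application finishes running."""
        self.used[container] = max(0, self.used[container] - mb)

master = GpuMaster()
master.report("container-1", 8192)
```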
Optionally, appropriate CPU resources may also be allocated for the instance container based on the characteristics of the service to achieve the desired performance.
In the above embodiment of the present invention, the program file and configuration file of the target service are acquired; an image file of the target service is generated according to the program file and the configuration file; an instance container of the target service is created and started according to the image file; and an environment image in which the target service runs is created through the instance container, through which the target service is provided externally. The orchestration and publishing processes of services are simple and efficient, the method is applicable to various types of services, unified service encapsulation is provided, a standardized service calling interface is exposed externally, and standardized encapsulation and agile publishing of diverse models are realized. Moreover, based on containerized elastic services, the resources and capacity of the instance containers can be dynamically adjusted, realizing flexible scheduling of target service resources and high maintainability.
As a second example, the service publishing method provided by the present invention is described below with reference to fig. 3 and 4.
Fig. 3 provides a service publishing apparatus 300, including: an encapsulation engine 301, an orchestration engine 302, and a load balancing module 303; the encapsulation engine 301 includes an encapsulation processing module 3011 and a micro-service conversion module 3012; the orchestration engine 302 includes a visualization processing module 3021.
Referring to fig. 4, after the service publishing process is started, the working process of the service publishing device 300 mainly includes the following steps:
Step 401, determining whether the attribute of the target service is service orchestration: if not, go to step 402;
if yes, go to step 403: acquire the atomic services to be orchestrated and their configuration information, generate a computation graph configuration file, and go to step 406;
The visualization processing module 3021 receives the user's arrangement of atomic service icons and converts it into a computation graph configuration file of the atomic services, which includes the configuration information defined for the overall service, the list of required atomic services, the input/output parameter list of each atomic service, and the correspondence between the input/output parameters of the atomic services; it then transmits the computation graph configuration file to the encapsulation processing module 3011. The encapsulation processing module 3011 creates an environment image adapted to running the high-level algorithm according to the run-time dependency parameters of each atomic service in the computation graph configuration file, and passes the computation graph configuration file to the micro-service conversion module 3012.
Step 402, judging whether the attribute of the target service is the service class encapsulation: if not, go to step 405;
if yes, go to step 404, receive configuration file, and go to step 406;
step 405, determining the attribute of the target service as a model class package or an application class package, receiving a program file and a configuration file, and executing step 406;
the package processing module 3011 receives a program file and a configuration file.
Step 406, converting the program file into a first format file conforming to an RPC interface protocol and a second format file conforming to an HTTP interface protocol, generating a mirror image file of the target service, and executing step 407;
the micro-service conversion module 3012 converts the program file into a first format file and a second format file, and transmits the first format file and the second format file back to the encapsulation processing module 3011, where the encapsulation processing module 3011 generates a mirror image file of the target service and creates a basic mirror image of the operating environment.
Step 407, create and start an instance container of the target service.
Step 408, register the instance container with the load balancing module 303 and provide the target service externally.
Further, the operation of the service publishing apparatus 300 in fig. 3 is described below for target services with different attributes, taking AI services as an example.
(1) Model type package
The model class encapsulation sequentially executes step 401, step 402, step 405, step 406, step 407 and step 408 in fig. 4;
specifically, the model class encapsulation is mainly directed to encapsulation of core model algorithms in the AI generic service capability. The model developer uploads the model file and the configuration file to the packaging processing module 3011; the model file comprises a calculation graph, a data stream, input and output of related variables, a signature, an auxiliary file and the like, and the configuration file comprises a model using frame, an input/output parameter list conforming to the device interface specification, a sign indicating whether the data stream needs special processing or not, configuration of related operation parameters of the microservice issued by the model and the like; the encapsulation processing module 3011 creates an environment mirror image adapted to the operation of the model algorithm according to parameters such as a model usage frame.
Next, the configuration file and the model file are transferred to the microservice conversion module 3012, which processes the model file into RPC interface and standard HTTP interface services for remote invocation according to the input/output parameters in the configuration, and returns to the encapsulation processing module 3011. The encapsulation processing module 3011 incorporates the processed service into the previous runtime image, and finally encapsulates the service and the runtime image into a base image.
Thereafter, the basic image and configuration file are input into the load balancing module 303, and according to the service instance operation parameter N (not set as a device default value) input in the model configuration, the load balancing module 303 creates a container of N operation service instances based on the image and automatically registers the container on the load balancer.
Finally, the load balancing module 303 exposes a unified interface through which the AI microservice is provided, so that AI developers in different scenes can quickly call and build on it. Meanwhile, the load balancing module 303 dynamically expands the number of containers holding running service instances according to the load of call requests, thereby achieving efficient and dynamic allocation of resources.
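The dynamic expansion behavior can be illustrated with a simple scaling rule. The formula, bounds, and parameter names below are assumptions for illustration; the document does not specify the actual scaling algorithm.

```python
# Illustrative sketch of the load balancing module's scaling decision:
# grow or shrink the number of service-instance containers with the
# observed request rate, within configured bounds. The formula and
# parameter names are assumptions, not the patent's algorithm.
import math

def desired_replicas(current_qps, qps_per_instance, min_n=1, max_n=16):
    """Return how many instance containers should be running."""
    if qps_per_instance <= 0:
        raise ValueError("qps_per_instance must be positive")
    n = math.ceil(current_qps / qps_per_instance)
    # Clamp to the configured minimum and maximum instance counts.
    return max(min_n, min(max_n, n))
```

A controller loop would periodically compare `desired_replicas(...)` with the number of registered containers and create or remove instances to close the gap.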
(2) Application class encapsulation
The application class encapsulation sequentially executes step 401, step 402, step 405, step 406, step 407 and step 408 in fig. 4;
the application class encapsulation mainly aims at the encapsulation of key logic algorithm parts except a core model algorithm in AI service capability, and is usually used for combining logic algorithms among a plurality of atomic models, preprocessing logic algorithms before model algorithms and other scenes.
The developer uploads the corresponding program code file and configuration file to the packaging processing module 3011, which creates an environment image adapted to running the logic algorithm according to parameters such as the code language format and dependency packages.
Then, the program code file and the configuration file are transferred to the microservice conversion module 3012, and the subsequent processes (step 406, step 407, and step 408) are the same as the above-mentioned model class package, and are not described herein again in the embodiments of the present invention.
(3) Service class encapsulation
The service class encapsulation sequentially executes step 401, step 402, step 404, step 406, step 407 and step 408 in fig. 4;
the service class mainly aims at introducing mature AI service from the outside and unifies the integrated scene through the device interface specification. The packaging processing module 3011 packages the AI service into a base image including the service operating environment only by uploading a configuration file describing external service related parameters (such as an access address, an input/output parameter list, whether the parameters need to be converted into a flag, and issuing micro-service related operating parameter configuration) through the service class packaging engine 301 of the device.
Then, the basic image and the configuration file are input into the load balancing module 303, and the subsequent processes (step 406, step 407, and step 408) are the same as the above model class encapsulation, which is not described herein again in this embodiment of the present invention.
(4) Service orchestration
The service orchestration sequentially performs the steps 401, 403, 406, 407, and 408 in fig. 4;
the module algorithm of the target service of each attribute is packaged into a series of AI atomic services through the processing of the packaging processing module 3011. For high-level services that require the combination of several atomic services, the orchestration engine 302 of the present apparatus will provide a visual service orchestration function.
The orchestration engine 302 receives the configuration information of the AI atomic services and converts it into a computation graph configuration file. The computation graph configuration file includes the list of atomic services required by the overall (target) service and the configuration information of each atomic service, which comprises each atomic service's input/output parameter list and the correspondences between the input/output parameters of the atomic services.
The packaging processing module 3011 creates an environment image adapted to running the high-level algorithm according to the runtime dependency parameters of each atomic service in the computation graph configuration file, and transmits the computation graph configuration file to the microservice conversion module 3012.
The microservice conversion module 3012 links the related atomic services according to the correspondences between the atomic services in the computation graph configuration file and transmits the result to the packaging processing module 3011, which encapsulates it into a high-level service image containing a general AI service interface with unified input/output and its operating environment.
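The wiring of atomic services by their input/output correspondences can be sketched as a small pipeline runner. The computation graph layout below (`services`, `input_map`) is an illustrative assumption, not a format defined by the patent.

```python
# Minimal sketch of chaining atomic services using a computation-graph
# configuration: each entry in "input_map" maps an available value (an
# initial input or an earlier service's namespaced output) to one of this
# service's input parameters. The config layout is an assumption.

def run_pipeline(graph_config, services, initial_inputs):
    """Execute atomic services in configured order, wiring parameters."""
    values = dict(initial_inputs)
    for step in graph_config["services"]:
        name = step["name"]
        # Gather this service's inputs from earlier outputs / initial inputs.
        kwargs = {dst: values[src] for src, dst in step["input_map"].items()}
        outputs = services[name](**kwargs)
        # Namespace each output so later steps can reference it.
        for out_name, v in outputs.items():
            values["%s.%s" % (name, out_name)] = v
    return values

# Example: a preprocessing atomic service feeding a classification one.
graph_config = {
    "services": [
        {"name": "pre", "input_map": {"text": "raw"}},
        {"name": "cls", "input_map": {"pre.clean": "s"}},
    ]
}
services = {
    "pre": lambda raw: {"clean": raw.strip()},
    "cls": lambda s: {"label": "long" if len(s) > 5 else "short"},
}
result = run_pipeline(graph_config, services, {"text": "  hello world  "})
```

The namespacing (`"pre.clean"`) is one simple way to express the parameter correspondences the text describes; a production orchestrator would also validate types against each atomic service's parameter list.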
Then, the advanced service image and the computation graph configuration file are input into the load balancing module 303, and the subsequent processes (step 406, step 407, and step 408) are the same as the above model class encapsulation, which is not described herein again in this embodiment of the present invention.
In addition, the apparatus shown in fig. 3 also provides the following functions:
(1) Multi-version switching: AI services are exploratory in nature and require continuous trial-and-error iteration to achieve the desired effect. To enable seamless switching among multiple service versions, the apparatus provides a gray-scale (canary) release function, which achieves smooth online upgrade of a service without terminating it completely.
(2) Service-level monitoring: the apparatus provides fine-grained monitoring functions at the platform level, the service level and the container level. A service failure can be traced to a specific host or container.
(3) Multi-user management: the apparatus supports a multi-user management function. A user can apply for hardware resources such as CPUs and GPUs, and create and manage the user's own applications and services.
In the above example, the online publishing platform is independent of the underlying implementation and resource configuration, achieving agile online publishing of models from multiple deep learning frameworks in different scenes with high publishing efficiency; modeling personnel can concentrate on model training without attending to how the model is served, which accelerates applying AI models to production. Model class, application class and service class encapsulation are supported, and seamless, service-imperceptible switching among multiple model versions is achieved: different versions of an AI service can be upgraded without perceptible interruption through the gray-scale release capability provided by the online service platform. Services can be monitored in real time and faults can be traced: operation and maintenance personnel perceive an abnormal service state immediately and can quickly locate the specific service instance through the complete service call logs provided by the online service platform, so maintainability is high. Through mixed CPU/GPU scheduling, dynamic capacity expansion and the like, highly available, highly concurrent, low-latency online services can be provided for production, with strong scalability.
The service publishing method provided by the embodiment of the present invention is described above, and a service publishing apparatus provided by the embodiment of the present invention is described below with reference to the accompanying drawings.
Referring to fig. 5, an embodiment of the present invention provides a service publishing apparatus, where the apparatus includes:
the obtaining module 501 is configured to obtain a program file and a configuration file of a target service.
The target service is a service to be orchestrated and/or published in the current operating environment, and may be an AI service that has matured through deep learning training, or another atomic service; in the embodiment of the present invention, an atomic service refers to the smallest service unit in the current operating environment. The program file corresponding to the target service differs according to its attribute: for a model class service, the program file is a service model file including such elements as the computation graph and data flow; for an application class service, the program file is the program code package of the application.
The configuration file includes parameters of the target service, such as the usage framework of the target service and a list of input/output parameters conforming to a preset interface specification.
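As a purely illustrative example (every field name is assumed, not prescribed by the text), a model class configuration might look like the following, expressed as a Python dict for readability:

```python
# Hypothetical model-class configuration file; all field names are
# assumptions chosen to mirror the parameters described in the text.
model_config = {
    "framework": "tensorflow",                 # model usage framework
    "input_params": [{"name": "image", "type": "bytes"}],
    "output_params": [{"name": "label", "type": "str"}],
    "special_stream_handling": False,          # flag: data stream needs special processing?
    "runtime": {"instance_count": 2, "gpu_mem_mb": 2048},  # microservice operating params
}
```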
The file generating module 502 is configured to generate an image file of the target service according to the program file and the configuration file.
The image file is generated from the program file and the configuration file and is used for providing the target service. The program file and the configuration file are packaged, through preset processing, into the image file of the target service, so as to create an environment image for running the target service.
The service publishing module 503 is configured to create and start an instance container of the target service according to the image file, and provide the target service to the outside through the instance container.
The instance container is a virtual container for running an instance; as a resource sharing mode, it provides great flexibility. An instance container is created and started according to the image file. During creation, the program file is processed into a service that can be called externally and internally, the processed service is merged into the previous running environment image, and the result is finally packaged into a base image of the current operating environment containing the target service, so that the target service can be provided externally through the instance container in the current operating environment.
Instance deployment based on containers enables elastic service with capacity expansion and reduction: the response of the target service and the resource overhead of each instance container can be monitored, and the number of containers can be dynamically adjusted accordingly. Optionally, the instance container may be a Docker container or a Mesos container.
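Creating a configured or preset number of instance containers can be sketched as planning logic whose output a container runtime such as Docker would then execute. All names and the default count below are assumptions for illustration.

```python
# Illustrative sketch of instance-container creation: derive container run
# specifications from the image reference and the configuration file, using
# the configured instance count or a preset default when absent. A real
# deployment would hand these specs to a container runtime (e.g. Docker).

DEFAULT_INSTANCES = 2  # assumed preset number of instance containers

def plan_instance_containers(image_ref, config):
    """Return one run spec per service instance container."""
    n = config.get("instance_count", DEFAULT_INSTANCES)
    return [
        {
            "name": "%s-%d" % (config["service_name"], i),
            "image": image_ref,
            "register_with_lb": True,  # auto-register on the load balancer
        }
        for i in range(n)
    ]
```

Keeping planning separate from execution makes the scaling decision easy to test and lets the same specs drive Docker, Mesos, or any other runtime.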
Optionally, in this embodiment of the present invention, the obtaining module 501 includes:
and the receiving submodule is used for receiving the program file and the configuration file of the target service when the attribute of the target service is the model class package or the application class package.
Optionally, in an embodiment of the present invention, the apparatus further includes:
The format conversion module is configured to convert the program file, according to the configuration file, into a first format file conforming to a Remote Procedure Call (RPC) interface protocol and a second format file conforming to a HyperText Transfer Protocol (HTTP) interface protocol.
Optionally, in this embodiment of the present invention, when the attribute of the target service is the model class package, the program file is a model file of the target service, and the model file includes: a computation graph, a data flow, variable parameters, and/or a signature; the configuration file includes: a model usage framework of the model file, an interface specification, and/or configuration parameters of the target service;
when the attribute of the target service is the application class package, the program file is a program code file of the target service; the configuration file includes: a code language format and/or a dependency package of the program code file.
Optionally, in this embodiment of the present invention, the obtaining module 501 includes:
the acquisition submodule is used for acquiring atomic services to be arranged and configuration information of the atomic services when an arrangement request for arranging the services is received;
generating a calculation graph configuration file according to the atomic service and the configuration information;
and acquiring a preset program file of the atomic service, combining the program files of the atomic service to obtain a program file of the target service, and combining to obtain a configuration file of the target service according to the configuration information.
Optionally, in this embodiment of the present invention, the service publishing module 503 includes:
the first creating submodule is used for creating and starting the instance container of the target service according to the configuration number of the instance container included in the configuration file;
or
And the second creating submodule is used for creating and starting a preset number of example containers of the target service.
Optionally, in an embodiment of the present invention, the apparatus further includes:
The resource allocation module is configured to allocate Graphics Processing Unit (GPU) video memory resources to the instance container according to the configuration file, and to monitor GPU state information of the instance container.
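GPU video-memory allocation and monitoring can be illustrated with a simple quota tracker. The structure and field names are assumptions; a real implementation would query the GPU driver (e.g. via NVML) rather than track quotas in process memory.

```python
# Hedged sketch of GPU video-memory allocation for instance containers:
# grant each container its configured memory quota when the GPU has enough
# free memory, and expose per-container state for monitoring. All names
# here are illustrative assumptions.

class GpuAllocator:
    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.grants = {}  # container name -> allocated MB

    def allocate(self, container, request_mb):
        """Grant request_mb to the container if it fits; return success."""
        free = self.total_mb - sum(self.grants.values())
        if request_mb > free:
            return False  # not enough video memory; caller may queue or retry
        self.grants[container] = request_mb
        return True

    def status(self):
        """Snapshot of GPU state information for monitoring."""
        used = sum(self.grants.values())
        return {"total_mb": self.total_mb, "used_mb": used,
                "free_mb": self.total_mb - used,
                "containers": dict(self.grants)}
```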
In the above embodiment of the present invention, the obtaining module 501 obtains the program file and the configuration file of the target service; the file generating module 502 generates the image file of the target service according to the program file and the configuration file; and the service publishing module 503 creates and starts the instance container of the target service according to the image file, creates the environment image for running the target service through the instance container, and provides the target service to the outside. In the embodiment of the invention, the service orchestration and publishing process is simple and efficient, applies to many types of services, provides unified service encapsulation and a standardized external service calling interface, and achieves standardized encapsulation and agile publishing of diverse models. Based on containerized elastic service, the resources and capacity of instance containers can be adjusted dynamically, achieving flexible scheduling of the resources used by the target service, with high maintainability.
Fig. 6 is a schematic structural diagram of an electronic device according to yet another embodiment of the present invention.
Referring to fig. 6, an embodiment of the present invention provides an electronic device, which includes a memory 61, a processor 62, a bus 63, and a computer program stored in the memory 61 and executable on the processor 62. The memory 61 and the processor 62 communicate with each other through the bus 63.
The processor 62 is configured to call the program instructions in the memory 61 to implement the method as provided in the above-mentioned embodiment of the present invention when executing the program.
In another embodiment, the processor, when executing the program, implements the method of:
acquiring a program file and a configuration file of a target service;
generating an image file of the target service according to the program file and the configuration file;
and creating and starting an instance container of the target service according to the image file, and providing the target service to the outside through the instance container.
The electronic device provided in the embodiment of the present invention may be configured to execute the program corresponding to the method in the embodiment of the method, and details of this implementation are not described again.
According to the electronic device provided by the embodiment of the invention, the program file and the configuration file of the target service are acquired; the image file of the target service is generated according to the program file and the configuration file; the instance container of the target service is created and started according to the image file; and the environment image for running the target service is created through the instance container, providing the target service to the outside. In the embodiment of the invention, the service orchestration and publishing process is simple and efficient, applies to many types of services, provides unified service encapsulation and a standardized external service calling interface, and achieves standardized encapsulation and agile publishing of diverse models. Based on containerized elastic service, the resources and capacity of instance containers can be adjusted dynamically, achieving flexible scheduling of the resources used by the target service, with high maintainability.
A further embodiment of the invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in the method as provided in the above-described embodiments of the invention.
In another embodiment, the program when executed by a processor implements a method comprising:
acquiring a program file and a configuration file of a target service;
generating an image file of the target service according to the program file and the configuration file;
and creating and starting an instance container of the target service according to the image file, and providing the target service to the outside through the instance container.
In the non-transitory computer-readable storage medium provided in the embodiment of the present invention, when the program is executed by the processor, the method in the embodiment of the method is implemented, and details of the implementation are not repeated.
The non-transitory computer readable storage medium provided by the embodiment of the invention obtains a program file and a configuration file of a target service, generates an image file of the target service according to the program file and the configuration file, creates and starts an instance container of the target service according to the image file, creates the environment image for running the target service through the instance container, and provides the target service to the outside. In the embodiment of the invention, the service orchestration and publishing process is simple and efficient, applies to many types of services, provides unified service encapsulation and a standardized external service calling interface, and achieves standardized encapsulation and agile publishing of diverse models. Based on containerized elastic service, the resources and capacity of instance containers can be adjusted dynamically, achieving flexible scheduling of the resources used by the target service, with high maintainability.
Yet another embodiment of the present invention discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-described method embodiments, for example, comprising:
acquiring a program file and a configuration file of a target service;
generating an image file of the target service according to the program file and the configuration file;
and creating and starting an instance container of the target service according to the image file, and providing the target service to the outside through the instance container.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. A service publishing method, comprising:
acquiring a program file and a configuration file of a target service;
generating an image file of the target service according to the program file and the configuration file;
according to the image file, creating and starting an instance container of the target service, and providing the target service to the outside through the instance container;
the step of obtaining the program file and the configuration file of the target service comprises the following steps:
when an arrangement request for arranging service is received, acquiring atomic service to be arranged in the arrangement request and configuration information of the atomic service;
generating a calculation graph configuration file according to the atomic service and the configuration information, wherein the calculation graph configuration file comprises an atomic service list required by a target service and configuration information of each atomic service;
acquiring a preset program file of the atomic service based on the calculation graph configuration file, combining the program files of the atomic service to obtain a program file of the target service, and combining to obtain a configuration file of the target service according to the configuration information;
the step of obtaining the program file and the configuration file of the target service comprises the following steps:
when the attribute of the target service is model class packaging or application class packaging, receiving a program file and a configuration file of the target service;
when the attribute of the target service is the model class package, the program file is a model file of the target service, and the model file includes: a computation graph, a data flow, variable parameters, and/or a signature; the configuration file includes: a model usage framework of the model file, an interface specification, and/or configuration parameters of the target service;
when the attribute of the target service is the application class package, the program file is a program code file of the target service; the configuration file includes: a code language format and/or a dependency package of the program code file.
2. The method of claim 1, wherein after the step of receiving the program file and the configuration file of the target service, the method further comprises:
and converting the program file into a first format file conforming to a Remote Procedure Call (RPC) interface protocol and a second format file conforming to a hypertext transfer protocol (HTTP) interface protocol according to the configuration file.
3. The method of claim 1, wherein the step of creating and launching the instance container of the target service comprises:
creating and starting the instance container of the target service according to the configuration number of the instance container included in the configuration file;
or
And creating and starting a preset number of example containers of the target service.
4. The method of claim 1, wherein after the step of creating and initiating the instance container for the target service, the method further comprises:
and distributing Graphics Processing Unit (GPU) video memory resources for the instance container according to the configuration file, and monitoring GPU state information of the instance container.
5. A service publishing apparatus, comprising:
the acquisition module is used for acquiring a program file and a configuration file of the target service;
the file generation module is used for generating an image file of the target service according to the program file and the configuration file;
the service publishing module is used for creating and starting an instance container of the target service according to the image file and providing the target service to the outside through the instance container;
the acquisition module comprises an acquisition submodule for:
when an arrangement request for service arrangement is received, acquiring atomic services to be arranged in the arrangement request and configuration information of the atomic services;
generating a calculation graph configuration file according to the atomic service and the configuration information, wherein the calculation graph configuration file comprises an atomic service list required by a target service and configuration information of each atomic service;
acquiring a preset program file of the atomic service based on the calculation graph configuration file, combining the program files of the atomic service to obtain a program file of the target service, and combining to obtain a configuration file of the target service according to the configuration information;
the acquisition module includes:
the receiving submodule is used for receiving a program file and a configuration file of the target service when the attribute of the target service is model class packaging or application class packaging;
when the attribute of the target service is the model class package, the program file is a model file of the target service, and the model file includes: a computation graph, a data flow, variable parameters, and/or a signature; the configuration file includes: a model usage framework of the model file, an interface specification, and/or configuration parameters of the target service;
when the attribute of the target service is the application class package, the program file is a program code file of the target service; the configuration file includes: a code language format and/or a dependency package of the program code file.
6. An electronic device, comprising a memory, a processor, a bus, and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the service distribution method according to any one of claims 1 to 4 when executing the program.
7. A non-transitory computer-readable storage medium having stored thereon a computer program, characterized in that: the program when executed by a processor implements the steps in the service publishing method as claimed in any one of claims 1 to 4.
CN201810856676.9A 2018-07-31 2018-07-31 Service publishing method and device Active CN110780914B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810856676.9A CN110780914B (en) 2018-07-31 2018-07-31 Service publishing method and device


Publications (2)

Publication Number Publication Date
CN110780914A CN110780914A (en) 2020-02-11
CN110780914B true CN110780914B (en) 2022-12-27

Family

ID=69382827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810856676.9A Active CN110780914B (en) 2018-07-31 2018-07-31 Service publishing method and device

Country Status (1)

Country Link
CN (1) CN110780914B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111367534B (en) * 2020-03-19 2023-05-09 远光软件股份有限公司 Service arrangement method and system based on application environment
CN111857653A (en) * 2020-05-26 2020-10-30 伏羲科技(菏泽)有限公司 Micro service management method and device
CN112015372B (en) * 2020-07-24 2022-12-23 北京百分点科技集团股份有限公司 Heterogeneous service arranging method, processing method and device and electronic equipment
CN111913715A (en) * 2020-07-30 2020-11-10 上海数策软件股份有限公司 Micro-service based machine learning automation process management and optimization system and method
CN111917586A (en) * 2020-08-07 2020-11-10 北京凌云雀科技有限公司 Container bandwidth adjusting method, server and storage medium
CN112288096A (en) * 2020-10-22 2021-01-29 济南浪潮高新科技投资发展有限公司 Rapid building and releasing method for machine learning model mirror image based on rapid machine learning model
CN112527357A (en) * 2020-12-14 2021-03-19 中国平安人寿保险股份有限公司 Service hot loading updating method and device, computer equipment and storage medium
CN114691231A (en) * 2020-12-29 2022-07-01 深圳云天励飞技术股份有限公司 Data flow arrangement method and device, readable storage medium and terminal equipment
CN112817581A (en) * 2021-02-20 2021-05-18 中国电子科技集团公司第二十八研究所 Lightweight intelligent service construction and operation support method
CN113407347B (en) * 2021-06-30 2023-02-24 北京百度网讯科技有限公司 Resource scheduling method, device, equipment and computer storage medium
CN113703784A (en) * 2021-08-25 2021-11-26 上海哔哩哔哩科技有限公司 Data processing method and device based on container arrangement
CN113918232A (en) * 2021-09-07 2022-01-11 深圳云天励飞技术股份有限公司 Method, device, server and storage medium for calling algorithm service
CN114003248B (en) * 2021-10-29 2022-11-25 深圳萨摩耶数字科技有限公司 Model management method and device, electronic equipment and storage medium
CN115618239B (en) * 2022-12-16 2023-04-11 四川金信石信息技术有限公司 Management method, system, terminal and medium for deep learning framework training

Citations (4)

Publication number Priority date Publication date Assignee Title
CN105677356A (en) * 2016-01-11 2016-06-15 上海雷腾软件股份有限公司 Operation and maintenance method and device
CN105959138A (en) * 2016-04-29 2016-09-21 深圳前海大数点科技有限公司 Micro-service dynamic disposition system and method based on cloud calculation
CN108052333A (en) * 2017-12-11 2018-05-18 北京紫优能源科技有限公司 A kind of power scheduling centralized control system standardization Automation arranging method and framework
CN108279892A (en) * 2018-02-27 2018-07-13 郑州云海信息技术有限公司 It is a kind of to split the method, apparatus and equipment that large-scale application service is micro services

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
WO2008017001A2 (en) * 2006-08-02 2008-02-07 Moka5, Inc. Sharing live appliances
CN101557297B (en) * 2009-05-14 2011-06-22 中兴通讯股份有限公司 Internet-based open telecommunication service generation system and method thereof
US8627309B2 (en) * 2010-02-25 2014-01-07 Microsoft Corporation Automated deployment and servicing of distributed applications
CN102999338A (en) * 2012-11-20 2013-03-27 北京思特奇信息技术股份有限公司 Business development method and device
CN104954232A (en) * 2014-03-28 2015-09-30 杭州华为企业通信技术有限公司 Method and device for service combination in network
CN105306542B (en) * 2015-09-25 2018-12-14 北京奇艺世纪科技有限公司 System for integrating web services
US10402181B2 (en) * 2016-07-18 2019-09-03 Airwatch Llc Generating and optimizing deployment configurations for enrolled devices
US10095539B2 (en) * 2016-07-25 2018-10-09 International Business Machines Corporation Automated data structure-driven orchestration of complex server provisioning tasks
CN106775912A (en) * 2016-12-15 2017-05-31 广州视源电子科技股份有限公司 Software distribution method and system
CN107092489B (en) * 2017-04-13 2020-06-30 中国联合网络通信集团有限公司 Processing method and system based on application version release
CN108008961A (en) * 2017-11-30 2018-05-08 郑州云海信息技术有限公司 Image management method and system for a PaaS platform
CN108306866A (en) * 2018-01-16 2018-07-20 厦门明延科技有限公司 Enterprise service bus platform and data analysis method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
[Architect] Microservice Architecture -- REST and RPC; laomo8; <https://blog.csdn.net/laomo_bible/article/details/79677677>; 2018-03-24; pp. 1-3 *

Also Published As

Publication number Publication date
CN110780914A (en) 2020-02-11

Similar Documents

Publication Publication Date Title
CN110780914B (en) Service publishing method and device
CN108510082B (en) Method and device for processing machine learning model
CN106789339B (en) Distributed cloud simulation method and system based on lightweight virtualization framework
CN103279390B (en) Parallel processing system oriented to small-job optimization
CN109117252B (en) Method and system for task processing based on container and container cluster management system
CN104536937A (en) Big data appliance realizing method based on CPU-GPU heterogeneous cluster
CN105808341B (en) Method, device and system for resource scheduling
CN112581578A (en) Cloud rendering system based on software definition
CN112860403A (en) Cluster load resource scheduling method, device, equipment, medium and product
CN111984385A (en) Task scheduling method and task scheduling device based on decorative BIM model
CN114327399A (en) Distributed training method, apparatus, computer device, storage medium and product
CN115879323B (en) Automatic driving simulation test method, electronic equipment and computer readable storage medium
CN110580527B (en) Method and device for generating universal machine learning model and storage medium
CN103780640B (en) Multimedia cloud computing simulation method
CN111459621A (en) Cloud simulation integration and scheduling method and device, computer equipment and storage medium
CN114924851A (en) Training task scheduling method and device, electronic equipment and storage medium
Justino et al. Outsourcing resource-intensive tasks from mobile apps to clouds: Android and aneka integration
CN110532060A (en) Hybrid network environment data collection method and system
CN115686805A (en) GPU resource sharing method and device, and GPU resource sharing scheduling method and device
CN116402318B (en) Multi-stage computing power resource distribution method and device for power distribution network and network architecture
CN108829516B (en) Resource virtualization scheduling method for graphic processor
CN112817581A (en) Lightweight intelligent service construction and operation support method
CN114327856A (en) Data processing method and device, electronic equipment and storage medium
CN117076057B (en) AI service request scheduling method, device, equipment and medium
CN116991558B (en) Computing power resource scheduling method, multi-architecture cluster, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant