CN113723610B - Dynamic updating method, device and equipment for inference framework and readable storage medium - Google Patents


Info

Publication number
CN113723610B
CN113723610B (application CN202111006550.0A)
Authority
CN
China
Prior art keywords
data processing
request
predictor
module
framework
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111006550.0A
Other languages
Chinese (zh)
Other versions
CN113723610A (en)
Inventor
陈清山 (Chen Qingshan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202111006550.0A
Publication of CN113723610A
Application granted
Publication of CN113723610B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application discloses a dynamic updating method for an inference framework, comprising: obtaining an inference framework, wherein the inference framework comprises a predictor, a front data processing module, a rear data processing module, an interpreter and an Ingress module, and the front data processing module, the rear data processing module and the interpreter are optional components; receiving an update request, wherein the update request is a request for adding or deleting optional components; and modifying the routing rule of the Ingress module according to the update request so that a request is scheduled in sequence to each component of the inference framework. The method thus describes the inference process completely through the inference framework; by modularizing the framework and setting the routing rules of the Ingress module, components can be dynamically added and deleted without redeploying the whole inference service, which significantly improves the flexibility of the inference process. The application also provides a corresponding apparatus, device and computer-readable storage medium for dynamically updating the inference framework, whose technical effects correspond to those of the method.

Description

Dynamic updating method, device and equipment for inference framework and readable storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a computer readable storage medium for dynamically updating an inference framework.
Background
The inference process is the application of a trained deep learning network in the actual production environment. Unlike the training process, the inference process faces unknown data rather than data of a fixed specification. It therefore requires preprocessing of the input data and, similarly, post-processing of the output data; this pre/post-processing component is called a transformer. In addition, in the actual inference process the model often needs to be interpreted to understand its working principle; this interpretation component is called an explainer. The specific data flow is shown in FIG. 1.
The current inference framework does not cover the whole inference process: it contains only the predictor, without the transformer and the explainer. After the service goes online, if the pre/post-processing module or the interpreter needs to be temporarily added or deleted, the whole inference service must be redeployed and the gateway and routes reset, which makes the deployment process complex and error-prone.
Disclosure of Invention
The purpose of the application is to provide a method, a device, equipment and a computer readable storage medium for dynamically updating an inference framework, which are used for solving the problem that the current inference service is difficult to flexibly add and delete components. The specific scheme is as follows:
in a first aspect, the present application provides a method for dynamically updating an inference framework, including:
obtaining an inference framework, wherein the inference framework comprises a predictor, a front data processing module, a rear data processing module, an interpreter and an Ingress module, and the front data processing module, the rear data processing module and the interpreter are optional components;
receiving an update request, wherein the update request is a request for adding or deleting optional components;
and modifying the routing rule of the Ingress module according to the update request, so that the request is sequentially scheduled to each component of the reasoning framework.
Optionally, before acquiring the inference framework, the method further includes:
defining a data structure of the component, including a component type and a component configuration parameter;
a data structure of the inference framework is defined.
Optionally, when the inference framework includes only a predictor, the routing rule of the Ingress module is: a default URL is associated with the predictor so that a request accesses the predictor via the default URL.
Optionally, the modifying, according to the update request, a routing rule of the Ingress module includes:
when the inference framework comprises only a predictor, if the received update request is a request for adding the front and rear data processing modules, the routing rule of the Ingress module is modified to: after the front and rear data processing modules are deployed, the default URL is associated with the front and rear data processing modules so that a request accesses them via the default URL and is then scheduled to the predictor URL to access the predictor.
Optionally, the modifying, according to the update request, a routing rule of the Ingress module includes:
when the reasoning framework only comprises a predictor and a front data processing module and a back data processing module, if the received update request is a request for adding an interpreter, modifying the routing rule of the Ingress module into: associating a default URL to the front and rear data processing modules to control requests to access the front and rear data processing modules through the default URL, after deployment of the interpreter, scheduling requests to the interpreter URL to access the interpreter, and then scheduling requests to the predictor URL to access the predictor.
Optionally, the request is sequentially dispatched to respective components of the inference framework, including:
when a request is received, judging whether the reasoning framework comprises a front data processing module and a rear data processing module or not;
if yes, setting the top-layer flag bit to 1, and setting the default URL as the URL of the front and rear data processing modules so as to control the request to pass through the front and rear data processing modules first;
then judging whether the inference framework has an interpreter;
if yes, judging whether the top-layer flag bit is 1; if yes, recording the port number of the interpreter so as to control the request to be scheduled to the interpreter after passing through the front and rear data processing modules; otherwise, setting the default URL as the URL of the interpreter so as to control the request to pass through the interpreter first, and setting the top-layer flag bit to 1;
finally, judging whether the top-layer flag bit is 1; if yes, recording the port number of the predictor so as to control the request to be dispatched to the predictor after passing through the preceding components; otherwise, setting the default URL as the URL of the predictor so as to control the request to access the predictor directly.
In a second aspect, the present application provides an inference framework dynamic update apparatus, including:
an acquisition module, configured to acquire an inference framework, wherein the inference framework comprises a predictor, a front data processing module, a rear data processing module, an interpreter and an Ingress module, and the front data processing module, the rear data processing module and the interpreter are optional components;
the receiving module is used for receiving an update request, wherein the update request is a request for adding or deleting optional components;
and the updating module is used for modifying the routing rule of the Ingress module according to the updating request so that the request is sequentially scheduled to each component of the reasoning framework.
Optionally, the method further comprises:
a definition module for defining a data structure of the component, including a component type and a component configuration parameter; a data structure of the inference framework is defined.
In a third aspect, the present application provides an inference framework dynamic update apparatus, including:
a memory: for storing a computer program;
a processor: for executing the computer program to implement the inference framework dynamic update method as described above.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program for implementing the inference framework dynamic update method as described above when executed by a processor.
The method for dynamically updating the inference framework provided by the application comprises the following steps: the method comprises the steps of obtaining an inference framework, wherein the inference framework comprises a predictor, a front data processing module, a rear data processing module, an interpreter and an Ingress module, and the front data processing module, the rear data processing module and the interpreter are optional components; receiving an update request, wherein the update request is a request for adding or deleting optional components; and modifying the routing rule of the Ingress module according to the update request so that the request is sequentially scheduled to each component of the inference framework. Therefore, the method realizes complete description of the reasoning process through the reasoning framework, modularizes the reasoning framework and sets the routing rule of the Ingress module, so that the components can be dynamically added and deleted without redeploying the whole reasoning service, and the flexibility of the reasoning process is obviously improved.
In addition, the application further provides a device, equipment and a computer readable storage medium for dynamically updating the reasoning framework, and the technical effects of the device and the equipment correspond to those of the method, and are not repeated here.
Drawings
For a clearer description of embodiments of the present application or of the prior art, the drawings that are used in the description of the embodiments or of the prior art will be briefly described, it being apparent that the drawings in the description that follow are only some embodiments of the present application, and that other drawings may be obtained from these drawings by a person of ordinary skill in the art without inventive effort.
FIG. 1 is a data flow diagram of an inference service;
FIG. 2 is a flowchart of a first embodiment of a method for dynamically updating an inference framework provided in the present application;
fig. 3 is a schematic diagram of an inference framework in a second embodiment of a method for dynamically updating an inference framework provided in the present application;
fig. 4 is a schematic diagram of a request scheduling flow of a second embodiment of a dynamic update method of an inference framework provided in the present application;
fig. 5 is a schematic route diagram one of a second embodiment of a dynamic update method for an inference framework provided in the present application;
fig. 6 is a second schematic routing diagram of a second embodiment of a dynamic update method for an inference framework provided in the present application;
fig. 7 is a schematic routing diagram III of a second embodiment of a dynamic update method for an inference framework provided in the present application;
fig. 8 is a schematic diagram of an embodiment of a dynamic update apparatus for an inference framework provided in the present application.
Detailed Description
In order to provide a better understanding of the present application, the present application is described in further detail below with reference to the drawings and specific embodiments. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art without inventive effort based on the embodiments of the present application fall within the scope of protection of the present application.
The core of the application is to provide a method, an apparatus, a device and a computer-readable storage medium for dynamically updating an inference framework; the inference framework is modularized and request scheduling is performed based on the routing rules of the Ingress module, so that components can be dynamically added and deleted.
An embodiment of a dynamic update method for an inference framework provided in the present application is described below, with reference to fig. 2, where the embodiment includes:
s21, acquiring an inference framework, wherein the inference framework comprises a predictor, a front data processing module, a rear data processing module, an interpreter and an Ingress module, and the front data processing module, the rear data processing module and the interpreter are optional components;
s22, receiving an update request, wherein the update request is a request for adding or deleting optional components;
s23, according to the update request, the routing rule of the Ingress module is modified, so that the request is sequentially scheduled to each component of the reasoning framework.
In this embodiment, the inference service refers to a process of applying a trained neural network to an actual task.
Specifically, a data structure of the component is predefined, wherein the data structure comprises a component type and a component configuration parameter; and defining a data structure of an inference framework, wherein the data structure comprises a predictor, a front and back data processing module, an interpreter and an Ingress module, the predictor is an essential component, and the front and back data processing module and the interpreter are optional components.
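As a rough illustration, the data structures just described can be sketched as follows. The field names are assumptions chosen for illustration; the patent does not specify concrete identifiers:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Component:
    # Component type, e.g. "predictor", "transformer" or "explainer",
    # together with its configuration parameters.
    kind: str
    config: dict = field(default_factory=dict)

@dataclass
class InferenceService:
    predictor: Component                     # mandatory component
    transformer: Optional[Component] = None  # optional front/rear data processing
    explainer: Optional[Component] = None    # optional interpreter
    # Routing rules held by the Ingress module, the data router among the components.
    ingress_rules: dict = field(default_factory=dict)
```

Under this sketch, adding or deleting an optional component amounts to setting or clearing the corresponding field and rewriting `ingress_rules`, without touching the predictor.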
Since the front and back data processing modules and interpreters are optional components, dynamic additions and deletions may be made. The update request in this embodiment mainly refers to a request to delete or add an optional component.
The Ingress module directs external traffic to the interior and implements request scheduling among the three components. Externally, the whole inference framework exposes only one default URL; in this embodiment, the Ingress module schedules requests among the three components of the inference framework according to its routing rules. Therefore, when the components of the inference framework change, the routing rules of the Ingress module must be updated in step.
Specifically, when the inference framework includes only predictors, the routing rule of the Ingress module is: the default URL is associated with the predictor to control access of the request to the predictor via the default URL.
When the inference framework comprises only a predictor, if the received update request is a request for adding the front and rear data processing modules, the routing rule of the Ingress module is modified to: after the front and rear data processing modules are deployed, the default URL is associated with the front and rear data processing modules so that a request accesses them via the default URL and is then scheduled to the predictor URL to access the predictor.
Similarly, when the reasoning framework only includes the predictor, if the received update request is a request for adding the interpreter, the routing rule of the Ingress module is modified as follows: after deploying the interpreter, a default URL is associated to the interpreter to control the request to access the interpreter through the default URL, after which the request is scheduled to the predictor URL to access the predictor.
When the inference framework comprises only a predictor and the front and rear data processing modules, if the received update request is a request for adding an interpreter, the routing rule of the Ingress module is modified to: the default URL is associated with the front and rear data processing modules so that a request accesses them via the default URL; after the interpreter is deployed, the request is dispatched to the interpreter URL to access the interpreter, and thereafter to the predictor URL to access the predictor.
Similarly, when the reasoning framework only comprises the predictor and the interpreter, if the received update request is a request for adding the front and rear data processing modules, modifying the routing rule of the Ingress module into: after deploying the front and rear data processing modules, associating a default URL to the front and rear data processing modules to control access of requests to the front and rear data processing modules through the default URL, scheduling the requests to the interpreter URL to access the interpreter, and then scheduling the requests to the predictor URL to access the predictor.
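The update cases above follow one pattern: the default URL always points at the first component present in the order transformer, explainer, predictor, and each later component is appended to the scheduling chain. A minimal sketch of this rule (the function and its names are hypothetical, not from the patent):

```python
def modify_routing(optional_components: set, update: tuple) -> list:
    """Apply an update request ('add' or 'delete' an optional component) and
    return the new order in which the Ingress module schedules a request."""
    action, component = update
    if action == "add":
        optional_components.add(component)
    elif action == "delete":
        optional_components.discard(component)
    # The default URL targets the first entry; later entries are reached in turn.
    chain = [c for c in ("transformer", "explainer") if c in optional_components]
    chain.append("predictor")  # the mandatory predictor always terminates the chain
    return chain
```

For example, `modify_routing(set(), ("add", "transformer"))` yields `["transformer", "predictor"]`, matching the first modification case above.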
The method for dynamically updating the reasoning framework provided by the embodiment comprises the following steps: the method comprises the steps of obtaining an inference framework, wherein the inference framework comprises a predictor, a front data processing module, a rear data processing module, an interpreter and an Ingress module, and the front data processing module, the rear data processing module and the interpreter are optional components; receiving an update request, wherein the update request is a request for adding or deleting optional components; and modifying the routing rule of the Ingress module according to the update request so that the request is sequentially scheduled to each component of the inference framework. Therefore, the method realizes complete description of the reasoning process through the reasoning framework, modularizes the reasoning framework and sets the routing rule of the Ingress module, so that the components can be dynamically added and deleted without redeploying the whole reasoning service, and the flexibility of the reasoning process is obviously improved.
The second embodiment of the dynamic update method for the inference framework provided in the present application will be described in detail below.
Specifically, the whole inference process is abstracted into an inference service comprising three components: a predictor, a transformer and an explainer. The predictor is a mandatory component, while the transformer and the explainer are optional components.
In this embodiment, the relevant data structures are defined as follows:
the component type is of data type string; the data structure of a component includes the component type and the component configuration parameters; the data structure of the inference service includes a predictor (mandatory component), front and rear data processing modules (optional component), an interpreter (optional component) and an Ingress module (the data router among the three components). The components are deployed as PodSpecs, and the Ingress module as an Ingress, in k8s (Kubernetes, a container orchestration framework); the flow is shown in FIG. 4:
when a request is received, it is judged whether the inference framework includes the front and rear data processing modules; if yes, the top-layer flag bit is set to 1 and the default URL is set as the URL of the front and rear data processing modules so as to control the request to pass through them first. Then it is judged whether the inference framework has an interpreter; if yes, it is judged whether the top-layer flag bit is 1: if it is, the port number of the interpreter is recorded so that the request is scheduled to the interpreter after passing through the front and rear data processing modules; otherwise, the default URL is set as the URL of the interpreter so that the request passes through the interpreter first, and the top-layer flag bit is set to 1. Finally, it is judged whether the top-layer flag bit is 1; if it is, the port number of the predictor is recorded so that the request is dispatched to the predictor after passing through the preceding components; otherwise, the default URL is set as the URL of the predictor so that the request accesses the predictor directly.
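The flag-bit flow can be sketched as follows. This sketch assumes the top-layer flag records whether the default URL has already been claimed by an earlier component; the identifiers are illustrative, and real port numbers are replaced by component names:

```python
def build_ingress(has_transformer: bool, has_explainer: bool) -> dict:
    """Decide which component owns the default URL and which components the
    request is forwarded to afterwards, following the FIG. 4 flow."""
    default_url = None  # component associated with the single external URL
    forward = []        # components reached afterwards via recorded ports
    top_flag = 0
    if has_transformer:
        top_flag = 1
        default_url = "transformer"       # request passes pre/post processing first
    if has_explainer:
        if top_flag == 1:
            forward.append("explainer")   # scheduled after the data processing modules
        else:
            default_url = "explainer"     # no transformer: explainer takes the default URL
            top_flag = 1
    if top_flag == 1:
        forward.append("predictor")       # predictor is always reached last
    else:
        default_url = "predictor"         # only the predictor exists
    return {"default_url": default_url, "forward": forward}
```

For instance, `build_ingress(True, True)` produces the full chain: the default URL routes to the transformer, after which the request is forwarded to the explainer and then the predictor.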
In the design of the ingress, the inference service is abstracted into a virtual service that serves as the top layer, and each component serves as the bottom layer. The top-level service decides the setting of the ingress rules according to the components actually present in the inference service. For example, suppose an inference service called Test needs to be deployed. If the inference service has only a predictor, the ingress rule is as shown in FIG. 5: the top-level service is associated directly with the predictor, and all requests can access it directly through http://test.sample.com.
If a transformer is then to be added, the ingress rule is modified after the transformer is deployed, as shown in FIG. 6. The top-level service http://test.sample.com is associated with the transformer; a request first passes through the transformer and is then dispatched to http://test.sample.com/predict (the predictor). As FIG. 6 shows, deploying the transformer does not require modifying the deployment of the predictor; only the service ingress rule needs to be modified after the transformer deployment completes, which reduces the deployment difficulty and the chance of errors.
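The transition between FIG. 5 and FIG. 6 can be mimicked with a toy path table. The dict below stands in for a Kubernetes Ingress resource, and the path layout is an assumption for illustration, not the patent's actual rules:

```python
def ingress_for(components: list) -> dict:
    """Map URL paths of the Test service to backend components: the default
    path goes to the first component; later components keep their own sub-paths."""
    rules = {"/": components[0]}
    for comp in components[1:]:
        rules["/" + comp] = comp
    return rules

before = ingress_for(["predictor"])                # FIG. 5: predictor only
after = ingress_for(["transformer", "predictor"])  # FIG. 6: transformer added
# Only the ingress rules change between the two; the predictor's own
# deployment is left untouched.
```

This mirrors the point made above: adding the transformer rewrites the routing table but not the predictor's deployment policy.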
Similarly, when the explainer is added, the ingress rule is as shown in FIG. 7.
When the inference service includes the transformer, the explainer and the predictor, if one component is to be removed, the ingress rule can likewise be updated directly without affecting the other components.
The following describes an inference framework dynamic update apparatus provided in the embodiments of the present application, where the inference framework dynamic update apparatus described below and the inference framework dynamic update method described above may be referred to correspondingly with each other.
As shown in fig. 8, the inference framework dynamic updating apparatus of the present embodiment includes:
an obtaining module 81, configured to obtain an inference framework, where the inference framework includes a predictor, front and rear data processing modules, an interpreter, and an Ingress module, and where the front and rear data processing modules and the interpreter are optional components;
a receiving module 82, configured to receive an update request, where the update request is a request for adding or deleting an optional component;
and the updating module 83 is configured to modify the routing rule of the Ingress module according to the update request, so that the request is sequentially scheduled to each component of the inference framework.
In some specific embodiments, further comprising:
a definition module for defining a data structure of the component, including a component type and a component configuration parameter; a data structure of the inference framework is defined.
The inference framework dynamic updating device in this embodiment is used to implement the foregoing inference framework dynamic updating method, so that a specific implementation of the device can be found in the foregoing embodiment part of the inference framework dynamic updating method, which is not described herein again.
In addition, the application also provides a device for dynamically updating the reasoning framework, which comprises the following steps:
a memory: for storing a computer program;
a processor: for executing the computer program to implement the inference framework dynamic update method as described above.
Finally, the present application provides a computer readable storage medium having stored thereon a computer program for implementing the inference framework dynamic update method as described above when executed by a processor.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The foregoing has outlined rather broadly the more detailed description of the present application and the principles and embodiments of the present application have been presented in terms of specific examples, which are provided herein to assist in the understanding of the method and core concepts of the present application; meanwhile, as those skilled in the art will have modifications in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims (7)

1. A method for dynamically updating an inference framework, comprising:
obtaining an inference framework, wherein the inference framework comprises a predictor, a front data processing module, a rear data processing module, an interpreter and an Ingress module, and the front data processing module, the rear data processing module and the interpreter are optional components;
receiving an update request, wherein the update request is a request for adding or deleting optional components;
modifying the routing rule of the Ingress module according to the update request, so that the request is sequentially scheduled to each component of the reasoning framework;
when the reasoning framework only comprises a predictor and an Ingress module, the routing rule of the Ingress module is as follows: associating a default URL to the predictor to control access to the predictor via the default URL;
the modifying the routing rule of the Ingress module according to the update request includes:
when the inference framework comprises only a predictor and an Ingress module, if the received update request is a request for adding the front and rear data processing modules, modifying the routing rule of the Ingress module into: after deploying the front and rear data processing modules, associating the default URL to the front and rear data processing modules to control the request to access the front and rear data processing modules through the default URL, and then scheduling the request to the predictor URL to access the predictor;
the modifying the routing rule of the Ingress module according to the update request includes:
when the reasoning framework only comprises a predictor, an Ingress module and a front-back data processing module, if the received update request is a request for adding an interpreter, modifying a routing rule of the Ingress module into: associating a default URL to the front and rear data processing modules to control requests to access the front and rear data processing modules through the default URL, after deployment of the interpreter, scheduling requests to the interpreter URL to access the interpreter, and then scheduling requests to the predictor URL to access the predictor.
2. The method of claim 1, further comprising, prior to obtaining the inference framework:
defining a data structure of the component, including a component type and a component configuration parameter;
a data structure of the inference framework is defined.
3. The method of claim 1 or 2, wherein the request is dispatched to the various components of the inference framework in turn, comprising:
when a request is received, judging whether the reasoning framework comprises a front data processing module and a rear data processing module or not;
if yes, the top mark position is 1, and a default URL is set as the URL of the front and rear data processing modules so as to control the request to pass through the front and rear data processing modules;
judging whether an interpreter exists in the reasoning framework;
if so, judging whether the top-level flag bit is 1; if not, setting the default URL to the URL of the interpreter so that the request passes through the interpreter; otherwise, recording the port number of the interpreter so that the request is dispatched to the interpreter after passing through the pre/post data processing module;
judging whether the top-level flag bit is 1; if not, setting the default URL to the URL of the predictor so that the request passes through the predictor; otherwise, recording the port number of the predictor so that the request is dispatched to the predictor after passing through the preceding components.
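The flag-bit dispatch of claim 3 can be sketched as follows. This is only one plausible reading of the claim, not the patented implementation, and the component names and port numbers are invented for illustration:

```python
# One reading of the claim-3 dispatch logic: the default URL is bound
# to the first component present (pre/post data processing module
# first), and later components are reached via recorded port numbers.

def plan_dispatch(present: set, ports: dict):
    top_flag = 1 if "transformer" in present else 0
    default_url = "transformer" if top_flag else None
    recorded = []                      # (component, port) hops after the entry point
    for comp in ("explainer", "predictor"):
        if comp not in present:
            continue
        if default_url is None:        # nothing owns the default URL yet
            default_url = comp
        else:                          # reached after the earlier components
            recorded.append((comp, ports[comp]))
    return default_url, recorded
```

For example, `plan_dispatch({"predictor"}, {"predictor": 8080})` yields `("predictor", [])`, while a framework with all three components routes the default URL to the data processing module and records the interpreter and predictor ports for the subsequent hops.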
4. An inference framework dynamic update apparatus, comprising:
an acquisition module, configured to acquire an inference framework, the inference framework comprising a predictor, a pre/post data processing module, an interpreter and an Ingress module, wherein the pre/post data processing module and the interpreter are optional components;
a receiving module, configured to receive an update request, the update request being a request to add or delete an optional component; and
an updating module, configured to modify the routing rule of the Ingress module according to the update request so that requests are dispatched in turn to the components of the inference framework;
wherein the inference framework dynamic updating apparatus is specifically configured such that:
when the inference framework comprises only a predictor and an Ingress module, the routing rule of the Ingress module is: associating the default URL with the predictor so that requests access the predictor through the default URL;
the updating module is specifically configured to:
when the inference framework comprises only a predictor and an Ingress module, if the received update request is a request to add the pre/post data processing module, modify the routing rule of the Ingress module into: after the pre/post data processing module is deployed, associating the default URL with the pre/post data processing module so that requests access the pre/post data processing module through the default URL, and then dispatching the request to the predictor URL to access the predictor;
the updating module is further specifically configured to:
when the inference framework comprises only a predictor, an Ingress module and the pre/post data processing module, if the received update request is a request to add an interpreter, modify the routing rule of the Ingress module into: associating the default URL with the pre/post data processing module so that requests access the pre/post data processing module through the default URL; after the interpreter is deployed, dispatching the request to the interpreter URL to access the interpreter, and then dispatching the request to the predictor URL to access the predictor.
5. The apparatus of claim 4, further comprising:
a definition module, configured to define a data structure of the components, including a component type and component configuration parameters, and to define a data structure of the inference framework.
6. An inference framework dynamic updating device, comprising:
a memory for storing a computer program; and
a processor for executing the computer program to implement the inference framework dynamic update method of any one of claims 1 to 3.
7. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, is adapted to implement the inference framework dynamic update method of any of claims 1 to 3.
CN202111006550.0A 2021-08-30 2021-08-30 Dynamic updating method, device and equipment for inference framework and readable storage medium Active CN113723610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111006550.0A CN113723610B (en) 2021-08-30 2021-08-30 Dynamic updating method, device and equipment for inference framework and readable storage medium


Publications (2)

Publication Number Publication Date
CN113723610A CN113723610A (en) 2021-11-30
CN113723610B true CN113723610B (en) 2023-07-28

Family

ID=78679289


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111629061A (en) * 2020-05-28 2020-09-04 苏州浪潮智能科技有限公司 Inference service system based on Kubernetes

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11741365B2 (en) * 2018-05-14 2023-08-29 Tempus Labs, Inc. Generalizable and interpretable deep learning framework for predicting MSI from histopathology slide images




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant