CN112036558A - Model management method, electronic device, and medium - Google Patents

Model management method, electronic device, and medium

Info

Publication number
CN112036558A
Authority
CN
China
Prior art keywords
hardware
model
information
request
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910483980.8A
Other languages
Chinese (zh)
Inventor
李雨洺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201910483980.8A
Publication of CN112036558A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/44 Program or device authentication
    • G06F 21/445 Program or device authentication by mutual authentication, e.g. between devices or programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Neurology (AREA)
  • Stored Programmes (AREA)

Abstract

The model management method is applied to a server and includes: receiving a call request from a client, where the call request includes hardware information of the client and identification information of the model to be called; determining, from a plurality of models corresponding to the identification information, a target model adapted to the hardware information; and sending the target model to the client. The present disclosure also provides an electronic device and a computer-readable storage medium.

Description

Model management method, electronic device, and medium
Technical Field
The present disclosure relates to a model management method, an electronic device, and a medium.
Background
Deep learning is an artificial intelligence technique developed in recent years. It uses multiple neural network layers (convolutional layers, recurrent neural networks, etc.) and is computationally intensive, but it achieves much higher accuracy than earlier image algorithms based on hand-crafted feature extraction. Deep learning can be divided into two parts: training and inference. Traditionally, the inference framework is kept consistent with the training framework; for example, a model trained with Caffe is used for inference with Caffe, and a model trained with TensorFlow is used for inference with TensorFlow. However, this may be inefficient for hardware vendors, who wish to implement accelerated inference frameworks based on the characteristics of their own hardware.
Many hardware vendors have indeed launched inference acceleration frameworks. Their approach is to convert models from popular frameworks into their own proprietary formats (since the hardware cannot directly support the optimizations of every generic framework) and then perform inference through their own accelerated inference SDKs. For example, Intel's OpenVINO supports inference acceleration on Intel CPUs, GPUs, and the Movidius embedded device, and NVIDIA's TensorRT supports converting Caffe, TensorFlow, and ONNX models into the TensorRT format.
These various inference acceleration platforms bring great engineering difficulty to device-side inference. When an intelligent application (i.e., an application that includes deep learning inference) is developed, the upper-layer application calls a specific inference API (application programming interface) directly, which greatly limits the extensibility and compatibility of the application.
Disclosure of Invention
One aspect of the present disclosure provides a model management method applied to a server, the method including receiving a call request from a client, where the call request includes hardware information of the client and identification information of a called model, determining a target model adapted to the hardware information from a plurality of models corresponding to the identification information, and sending the target model to the client.
Optionally, before receiving the call request, the method further includes, in response to obtaining the registration request, respectively converting the models to be registered into a plurality of models adapted to different hardware, and allocating identification information to the models to be registered.
Optionally, the determining, from the plurality of models corresponding to the identification information, the target model adapted to the hardware information includes determining, when the hardware information indicates that the user side has a plurality of pieces of hardware, one piece of hardware from the plurality of pieces of hardware as the target hardware based on processing costs and loads of the plurality of pieces of hardware, and determining, from the plurality of models corresponding to the identification information, the target model adapted to the target hardware.
Optionally, the method further includes obtaining a delegation request, receiving data to be processed based on the delegation request, processing the data to be processed through the target model to obtain a processing result, and sending the processing result.
Optionally, the invocation request further includes verification information, and the method further includes refusing to send the target model if the verification information fails to verify.
Another aspect of the present disclosure provides a model management method applied to a client, the method including sending a call request to a server, the call request including hardware information of the client and identification information of a called model, and receiving a target model from the server, the target model including a model adapted to the hardware information among a plurality of models corresponding to the identification information.
Optionally, the method further includes sending a registration request to the server, for registering the model of the user side to the server.
Optionally, the method further includes sending a delegation request for delegating the server to process the data to be processed, and receiving a processing result.
Optionally, the invocation request further includes check information, the data of the target model is divided into a first part and a second part, and the method further includes caching at most the data of the first part and not caching the data of the second part.
Another aspect of the present disclosure provides a model management apparatus including a first receiving module, a determining module, and a first sending module. The first receiving module is configured to receive a call request from a user side, where the call request includes hardware information of the user side and identification information of a called model. The determining module is configured to determine, from a plurality of models corresponding to the identification information, a target model adapted to the hardware information. The first sending module is configured to send the target model to the user side.
Optionally, the apparatus further includes a first registration module, configured to, in response to obtaining the registration request, respectively convert the models to be registered into a plurality of models adapted to different hardware, and allocate identification information to the models to be registered.
Optionally, the determining module includes a first determining submodule and a second determining submodule. The first determining submodule is configured to determine, when the hardware information indicates that the user side has multiple pieces of hardware, one piece of hardware from the multiple pieces of hardware as the target hardware based on the processing costs and loads of the multiple pieces of hardware. The second determining submodule is configured to determine, from a plurality of models corresponding to the identification information, a target model adapted to the target hardware.
Optionally, the apparatus further includes an obtaining module, a second receiving module, a processing module, and a second sending module. The obtaining module is configured to obtain a delegation request. The second receiving module is configured to receive data to be processed based on the delegation request. The processing module is configured to process the data to be processed through the target model to obtain a processing result. The second sending module is configured to send the processing result.
Optionally, the invocation request further includes verification information, and the apparatus further includes a rejection module, configured to reject sending of the target model when verification of the verification information fails.
Another aspect of the present disclosure provides a model management apparatus including a third sending module and a third receiving module. The third sending module is configured to send a call request to a server, where the call request includes the hardware information of the user side and the identification information of a called model. The third receiving module is configured to receive a target model from the server, where the target model is the model, among the plurality of models corresponding to the identification information, that is adapted to the hardware information.
Optionally, the apparatus further includes a second registration module, configured to send a registration request to a server, where the registration request is used to register the model of the user side with the server.
Optionally, the apparatus further includes a fourth sending module and a fourth receiving module. The fourth sending module is configured to send a delegation request for delegating the server to process the data to be processed. The fourth receiving module is configured to receive the processing result.
Optionally, the call request further includes check information, the data of the target model is divided into a first part and a second part, and the apparatus further includes a cache module configured to cache at most the data of the first part and not cache the data of the second part.
Another aspect of the disclosure provides an electronic device comprising a processor and a memory. The memory has stored thereon a computer program which, when executed by the processor, causes the processor to perform the method as described above.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the method as described above when executed.
Another aspect of the disclosure provides a computer program comprising computer executable instructions for implementing the method as described above when executed.
According to the method, the upper-layer application calls a model at the server by its identification information instead of directly calling a specific model. The upper-layer application can thus remain hardware-independent, and the server can select a correspondingly optimized model according to the hardware information of the caller.
Drawings
For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIG. 1 schematically shows a schematic diagram of an application scenario of a model management method according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow diagram of a model management method according to an embodiment of the disclosure;
fig. 3A schematically illustrates a flowchart of determining a target model adapted to the hardware information from a plurality of models corresponding to the identification information according to an embodiment of the present disclosure;
FIG. 3B schematically shows a flow diagram for accepting delegated processing data according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow diagram of a model management method according to another embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a model management apparatus according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a block diagram of a model management apparatus according to another embodiment of the present disclosure; and
fig. 7 schematically shows a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). Where a convention analogous to "at least one of A, B, or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon for use by or in connection with an instruction execution system.
The embodiment of the disclosure provides a model management method applied to a server, the method includes receiving a calling request from a user side, the calling request including hardware information of the user side and identification information of a called model, determining a target model adapted to the hardware information from a plurality of models corresponding to the identification information, and sending the target model to the user side.
Fig. 1 schematically shows a schematic diagram of an application scenario of a model management method according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a scenario in which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, but does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to the embodiment may include clients 101, 102, 103, a network 104 and a server 105. The network 104 is used to provide a medium for communication links between the clients 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the clients 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various messaging client applications may be installed on the clients 101, 102, 103, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software (by way of example only).
The clients 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the clients 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the user side.
According to the embodiments of the present disclosure, the clients 101, 102, 103 may register models with the server 105, and may also obtain a model from the server 105 when calling it. The method of the embodiments of the present disclosure enables the clients 101, 102, 103 to conveniently call models from the server 105.
It should be noted that the methods described in fig. 2, fig. 3A and fig. 3B below may be executed by the server 105, and accordingly, the model management apparatus described in fig. 5 below may be generally disposed in the server 105. The methods described in fig. 2, 3A and 3B below can also be performed by a server different from the server 105 and capable of communicating with the clients 101, 102, 103 and/or the server 105. Accordingly, the model management apparatus described in fig. 5 below can also be disposed in a server different from the server 105 and capable of communicating with the clients 101, 102, 103 and/or the server 105.
The method described in fig. 4 below may be performed, for example, by the clients 101, 102, 103, and accordingly, the model management apparatus described in fig. 6 below may generally be disposed in the clients 101, 102, 103. The method described in fig. 4 below can also be performed by a client different from the clients 101, 102, 103 and capable of communicating with the clients 101, 102, 103 and/or the server 105. Accordingly, the model management apparatus described in fig. 6 below can also be disposed in a client different from the clients 101, 102, 103 and capable of communicating with the clients 101, 102, 103 and/or the server 105.
It should be understood that the number of clients, networks, and servers in fig. 1 is merely illustrative. There may be any number of clients, networks, and servers, as desired for an implementation.
FIG. 2 schematically shows a flow diagram of a model management method according to an embodiment of the disclosure.
As shown in fig. 2, the method is applied to the server, and includes operations S210 to S230.
In operation S210, a call request from a user side is received, where the call request includes hardware information of the user side and identification information of a called model.
In operation S220, a target model adapted to the hardware information is determined from a plurality of models corresponding to the identification information.
In operation S230, the target model is transmitted to the user terminal.
According to the embodiments of the present disclosure, the method can be implemented as a unified inference abstraction layer that serves as a multi-model platform for registering models and serving model calls.
For example, if an upper-layer application on the client needs to call a certain face detection model, it sends the identification information of that model, such as a Universally Unique Identifier (UUID), and the model is called through this identification information. In contrast, in the related art the application must call a specific model and must provide the explicit address where the model is stored in order to complete the call.
When the call request is sent, the hardware information of the client may be sent with it, including the hardware type, the hardware vendor, and the like. Alternatively, the hardware information may be described explicitly, or it may be any feature information capable of reflecting the hardware type or hardware vendor, for example a feature of the form of the call request that can be used to distinguish different hardware.
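As a concrete illustration (not part of the original disclosure), such a call request might be serialized as follows; every field name below is hypothetical, since the patent only requires that the request carry the hardware information and the identification information of the called model:
```python
import json
import uuid

# Hypothetical call request sent from the client; all field names are illustrative.
call_request = {
    "model_id": str(uuid.uuid4()),  # identification information of the called model
    "hardware_info": [              # hardware information of the client
        {"platform": "nvidia-tensorrt", "type": "GPU", "load_percent": 10},
        {"platform": "intel-openvino", "type": "CPU", "load_percent": 85},
    ],
    "verification_info": "opaque-credential",  # optional check information
}
print(json.dumps(call_request, indent=2))
```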
According to the embodiments of the present disclosure, after receiving the call request, the server determines, according to the identification information, the model that the client needs to call, such as a certain face detection model.
According to the embodiment of the disclosure, before receiving the call request, the method further includes, in response to obtaining the registration request, respectively converting the models to be registered into a plurality of models adapted to different hardware, and allocating identification information to the models to be registered.
For example, a client A may submit a model X0 to the server, requesting that X0 be registered with the server. The model X0 may be a model that has already been optimized for the hardware of client A, but if X0 runs on other hardware, optimal performance cannot be achieved, and the model may even be incompatible. After receiving the registration request, the server of the embodiments of the present disclosure may use conversion components provided by different hardware vendors to convert the model X0 into a plurality of models X1, X2, X3, ..., Xn for a plurality of hardware platforms, and allocate identification information to this group of models to complete the registration.
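A minimal sketch of this registration flow is given below; the patent names no concrete converter API, so VENDOR_CONVERTERS and register_model are hypothetical stand-ins for tools such as the model converters shipped with vendor inference SDKs:
```python
import uuid

# Hypothetical per-vendor conversion components; the lambdas are placeholder
# stubs standing in for real converters provided by hardware vendors.
VENDOR_CONVERTERS = {
    "intel-openvino": lambda model: model + ".openvino",
    "nvidia-tensorrt": lambda model: model + ".tensorrt",
}

MODEL_REGISTRY = {}  # identification information -> {platform: adapted model}

def register_model(source_model: str) -> str:
    """Convert the submitted model for each hardware platform and allocate
    identification information to the resulting group of models."""
    model_id = str(uuid.uuid4())
    MODEL_REGISTRY[model_id] = {
        platform: convert(source_model)
        for platform, convert in VENDOR_CONVERTERS.items()
    }
    return model_id  # returned to the registering client
```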
Therefore, after the server determines the model to be called by the user side according to the identification information, the server can further determine a suitable model from the plurality of models according to the hardware information of the user side to serve as the target model to return to the user side.
In the embodiments of the present disclosure, a model is no longer bound to specific hardware; to the application, the identification information describes only a capability, such as image recognition or natural language processing. When the client application calls that capability, the server selects the model file best suited to the current hardware, thereby improving the compatibility of the system.
Fig. 3A schematically illustrates a flowchart of determining a target model adapted to the hardware information from a plurality of models corresponding to the identification information according to an embodiment of the present disclosure.
As shown in fig. 3A, the method includes operations S311 to S312.
In operation S311, when the hardware information indicates that the user side has a plurality of pieces of hardware, one piece of hardware is determined as a target piece of hardware from the plurality of pieces of hardware based on processing costs and loads of the plurality of pieces of hardware.
In operation S312, a target model adapted to the target hardware is determined from a plurality of models corresponding to the identification information.
According to the embodiments of the present disclosure, the client may have multiple pieces of hardware, for example multiple GPUs and multiple CPUs, and the server may store information about the processing cost of running the corresponding model on each piece of hardware. For example, for calls to the different models of the same identification information, suppose processing with the first hardware takes a first duration and processing with the second hardware takes a second duration. If the first duration is shorter than the second duration, the first hardware is preferred over the second hardware, and the model corresponding to the hardware information of the first hardware may be selected for the client.
According to the embodiments of the present disclosure, the corresponding hardware can also be selected based on the load that running the model's processing task would generate on each piece of hardware. For example, for calls to the different models of the same identification information, processing with the first hardware would place a first percentage of load on the first hardware, and processing with the second hardware would place a second percentage of load on the second hardware. If the first percentage is smaller than the second percentage, running on the first hardware has less impact on the client, so the first hardware is preferred over the second hardware, and the model corresponding to the hardware information of the first hardware may be selected for the client.
In the traditional design, an application call is fixed to certain hardware; if the load on that hardware at a given client is very high, the requirement cannot be met, even though other hardware sits in an idle state.
According to the embodiments of the present disclosure, the client can also include in the request actual information indicating the current load conditions of its multiple pieces of hardware, and the server can select for the client a model file corresponding to hardware that is suitable for running it according to those load conditions. For example, if the first hardware is in a nearly full-load state and the second hardware is relatively idle, the server may select the model corresponding to the hardware information of the second hardware and send it to the client, so that the client performs processing on the second hardware.
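The selection step can be sketched as follows, assuming the server keeps a per-platform processing-cost table and the request reports each hardware's current load; the scoring rule (cost scaled by load) is one plausible combination of the two criteria, not a formula given in the patent:
```python
def choose_target_hardware(hardware_info, cost_table):
    """Pick the hardware whose expected processing cost, scaled by its
    current load, is lowest; both inputs are illustrative structures."""
    def score(hw):
        cost = cost_table.get(hw["platform"], float("inf"))
        return cost * (1.0 + hw.get("load_percent", 0) / 100.0)
    return min(hardware_info, key=score)

# Example: the nearly idle GPU wins over the heavily loaded CPU even
# though their raw processing costs are comparable.
cost_table = {"nvidia-tensorrt": 1.0, "intel-openvino": 1.2}
hardware_info = [
    {"platform": "nvidia-tensorrt", "load_percent": 10},
    {"platform": "intel-openvino", "load_percent": 85},
]
target = choose_target_hardware(hardware_info, cost_table)
# The target model is then the variant registered for that platform, e.g.
# MODEL_REGISTRY[model_id][target["platform"]] in the registration sketch above.
```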
FIG. 3B schematically shows a flow diagram for accepting delegated processing data according to an embodiment of the present disclosure.
As shown in fig. 3B, the method includes operations S321 to S324.
In operation S321, a delegation request is obtained.
In operation S322, data to be processed is received based on the delegation request.
In operation S323, the data to be processed is processed by the target model, and a processing result is obtained.
In operation S324, the processing result is transmitted.
According to the embodiments of the present disclosure, when the hardware of the client is insufficient to complete the processing task well, a delegation request can be sent to the server; the processing is then executed at the server, and only the processing result is returned to the client.
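A server-side sketch of this delegated processing follows, reusing choose_target_hardware from the sketch above; it assumes, purely for illustration, that registered model entries are objects exposing a run() method, which the patent does not specify:
```python
def handle_delegation(request, registry, cost_table):
    """Process the delegated data through the target model on the server
    and return only the processing result to the client."""
    # Assumption: registry values are model objects with a run() method.
    target = choose_target_hardware(request["hardware_info"], cost_table)
    model = registry[request["model_id"]][target["platform"]]
    result = model.run(request["data"])  # process the data to be processed
    return {"result": result}            # only the result travels back
```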
Conventional model encryption may be implemented through hardware encryption. However, since most such hardware is highly customized equipment and hardware-based model encryption is strongly tied to the hardware, traditional hardware encryption has poor universality. According to the embodiments of the present disclosure, the call request further includes verification information, and the method further includes refusing to send the target model when verification of the verification information fails. The server of the embodiments of the present disclosure can thus also protect the model, solving the problem of encrypting the server-side model and preventing data leakage.
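The refusal path might look like the following sketch, again reusing choose_target_hardware from the earlier sketch; verify() is a hypothetical stand-in, since the patent leaves the verification scheme open:
```python
def verify(verification_info):
    # Hypothetical credential check; a real deployment would validate a
    # signed token or similar secret here.
    return verification_info == "opaque-credential"

def handle_call_request(request, registry, cost_table):
    """Send the target model only when the verification information passes."""
    if not verify(request.get("verification_info")):
        return {"error": "verification failed"}  # refuse to send the target model
    target = choose_target_hardware(request["hardware_info"], cost_table)
    return {"model": registry[request["model_id"]][target["platform"]]}
```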
FIG. 4 schematically shows a flow diagram of a model management method according to another embodiment of the present disclosure.
As shown in fig. 4, the method is applied to the user side, and includes operations S410 and S420.
In operation S410, a call request is sent to a server, where the call request includes hardware information of the client and identification information of a called model.
In operation S420, a target model is received from the server, where the target model is the model, among the plurality of models corresponding to the identification information, that is adapted to the hardware information.
According to the embodiment of the disclosure, the method further includes sending a registration request to the server, for registering the model of the user side to the server.
According to the embodiment of the present disclosure, the call request may further include actual load information of a plurality of pieces of hardware.
According to the embodiments of the present disclosure, the method further includes sending a delegation request for delegating the server to process the data to be processed, and receiving the processing result. For example, when the hardware of the client is insufficient to complete the processing task well, a delegation request may be sent asking the server to execute the call, with only the processing result returned to the client.
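From the client side, the delegation round trip could look like the sketch below; the /delegate endpoint and the JSON field names are hypothetical, as the patent does not define a wire protocol:
```python
import json
import urllib.request

def delegate(server_url, model_id, data, verification_info):
    """Ask the server to run the model and return only the processing result."""
    body = json.dumps({
        "model_id": model_id,              # identification information
        "data": data,                      # data to be processed
        "verification_info": verification_info,
    }).encode("utf-8")
    req = urllib.request.Request(
        server_url + "/delegate",          # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]   # only the processing result returns
```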
According to an embodiment of the present disclosure, the call request further includes check information, the data of the target model is divided into a first part and a second part, and the method further includes caching at most the data of the first part and not caching the data of the second part. The client may thus cache part of the received model to avoid the inefficiency of transmitting the complete model every time the model is called repeatedly. However, the method of the embodiments of the present disclosure limits what may be cached to part of the model's data rather than all of it: a model held at the client could leak if left unprotected, and caching only the first part cannot cause the model to leak, while the second part must be obtained from the server with the verification information each time the model is used, thereby protecting the model.
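A client-side sketch of this split cache follows; the two fetch helpers are hypothetical stubs standing in for verified server round trips:
```python
def fetch_model_parts(model_id, verification_info):
    # Hypothetical server round trip returning both parts of the model data.
    return b"first-part|", b"second-part"

def fetch_second_part(model_id, verification_info):
    # Hypothetical server round trip returning only the second part.
    return b"second-part"

class ModelCache:
    """Cache at most the first part of a model; never cache the second."""
    def __init__(self):
        self._first_parts = {}  # model_id -> cached first part only

    def load(self, model_id, verification_info):
        first = self._first_parts.get(model_id)
        if first is None:
            first, second = fetch_model_parts(model_id, verification_info)
            self._first_parts[model_id] = first  # the second part is discarded
        else:
            # The second part must be re-fetched (with verification) on every use.
            second = fetch_second_part(model_id, verification_info)
        return first + second  # reassemble the full model data
```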
According to the method, the upper-layer application calls a model at the server by its identification information instead of directly calling a specific model. The upper-layer application can thus remain hardware-independent, and the server can select a correspondingly optimized model according to the hardware information of the caller.
Fig. 5 schematically shows a block diagram of a model management apparatus 500 according to an embodiment of the present disclosure.
As shown in fig. 5, the model management apparatus 500 includes a first receiving module 510, a determining module 520, and a first transmitting module 530. The model management device 500 may perform the various methods described above with reference to fig. 2, 3A, and 3B.
The first receiving module 510, for example, performs operation S210 described with reference to fig. 2 above, for receiving a call request from a user side, where the call request includes hardware information of the user side and identification information of a called model.
The determining module 520, for example, performs operation S220 described with reference to fig. 2 above, for determining a target model adapted to the hardware information from a plurality of models corresponding to the identification information.
The first sending module 530, for example, performs the operation S230 described with reference to fig. 2 above, for sending the target model to the user end.
According to the embodiment of the present disclosure, the apparatus 500 may further include a first registration module, configured to, in response to obtaining a registration request, respectively convert models to be registered into a plurality of models adapted to different hardware, and allocate identification information to the models to be registered.
According to an embodiment of the present disclosure, the determination module 520 may include a first determination submodule and a second determination submodule.
The first determining sub-module, for example, performs operation S311 described with reference to fig. 3A above, and is configured to determine one hardware from the plurality of hardware as the target hardware based on processing costs and loads of the plurality of hardware when the hardware information indicates that the user side has the plurality of hardware.
A second determining sub-module, for example, performs operation S312 described with reference to fig. 3A above, for determining a target model adapted to the target hardware from the plurality of models corresponding to the identification information.
According to the embodiment of the present disclosure, the apparatus 500 may further include an obtaining module, a second receiving module, a processing module, and a second sending module.
The obtaining module, for example, performs operation S321 described with reference to fig. 3B above, for obtaining the delegation request.
The second receiving module, for example, performs operation S322 described with reference to fig. 3B above, and is configured to receive the data to be processed based on the delegation request.
The processing module, for example, executes the operation S323 described with reference to fig. 3B above, for processing the data to be processed by the target model, so as to obtain a processing result.
The second sending module, for example, executes the operation S324 described with reference to fig. 3B above, for sending the processing result.
According to the embodiment of the present disclosure, the invocation request further includes verification information, and the apparatus 500 may further include a rejecting module, configured to reject sending of the target model when the verification information fails to be verified.
FIG. 6 schematically shows a block diagram of a model management apparatus 600 according to an embodiment of the present disclosure.
As shown in fig. 6, the model management apparatus 600 includes a third transmitting module 610 and a third receiving module 620. The model management apparatus 600 may perform the various methods described above with reference to fig. 4.
A third sending module, for example, executing operation S410 described with reference to fig. 4 above, is configured to send a call request to the server, where the call request includes the hardware information of the client and the identification information of the called model.
The third receiving module, for example, performs operation S420 described with reference to fig. 4 above, and is configured to receive a target model from the server, where the target model is the model, among the plurality of models corresponding to the identification information, that is adapted to the hardware information.
According to the embodiment of the present disclosure, the apparatus 600 may further include a second registration module, configured to send a registration request to the server, for registering the model of the user side to the server.
According to the embodiment of the present disclosure, the apparatus 600 may further include a fourth sending module and a fourth receiving module. And the fourth sending module is used for sending the delegation request and delegating the server to process the data to be processed. And the fourth receiving module is used for receiving the processing result.
According to the embodiment of the present disclosure, the call request further includes check information, the data of the target model is divided into a first part and a second part, and the apparatus 600 may further include a caching module configured to cache at most the data of the first part and not cache the data of the second part.
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to the embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or the same in any other reasonable manner of integrating or packaging a circuit, or in any one of three implementations of software, hardware, and firmware, or in any suitable combination of any several of them. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, a plurality of modules among the first receiving module 510, the determining module 520, the first transmitting module 530, the first registering module, the first determining sub-module, the second determining sub-module, the obtaining module, the second receiving module, the processing module, the second transmitting module, and the rejecting module may be combined into one module to be implemented, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first receiving module 510, the determining module 520, the first sending module 530, the first registering module, the first determining submodule, the second determining submodule, the obtaining module, the second receiving module, the processing module, the second sending module, and the rejecting module may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware, and firmware, or by a suitable combination of any of them. Alternatively, at least one of the first receiving module 510, the determining module 520, the first sending module 530, the first registering module, the first determining sub-module, the second determining sub-module, the obtaining module, the second receiving module, the processing module, the second sending module and the rejecting module may be at least partially implemented as a computer program module which, when executed, may perform a corresponding function.
For another example, a plurality of the third sending module 610, the third receiving module 620, the second registering module, the fourth sending module, the fourth receiving module, and the caching module may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to the embodiment of the present disclosure, at least one of the third sending module 610, the third receiving module 620, the second registering module, the fourth sending module, the fourth receiving module, and the caching module may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware, and firmware, or by a suitable combination of any several of them. Alternatively, at least one of the third transmitting module 610, the third receiving module 620, the second registering module, the fourth transmitting module, the fourth receiving module, and the caching module may be at least partially implemented as a computer program module, which may perform a corresponding function when executed.
FIG. 7 schematically illustrates a block diagram of a computer system suitable for implementing the above-described method according to an embodiment of the present disclosure. The computer system illustrated in FIG. 7 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure.
As shown in fig. 7, a computer system 700 according to an embodiment of the present disclosure includes a processor 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
In the RAM 703, various programs and data necessary for the operation of the system 700 are stored. The processor 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. The processor 701 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. It is noted that the programs may also be stored in one or more memories other than the ROM 702 and RAM 703. The processor 701 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the system 700 may also include an input/output (I/O) interface 705, the input/output (I/O) interface 705 also being connected to the bus 704. The system 700 may also include one or more of the following components connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the processor 701, performs the above-described functions defined in the system of the embodiment of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, a computer-readable storage medium may include the ROM 702 and/or the RAM 703 described above and/or one or more memories other than the ROM 702 and the RAM 703 in accordance with embodiments of the present disclosure.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that various combinations and/or combinations of features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or combinations are not expressly recited in the present disclosure. In particular, various combinations and/or combinations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (11)

1. A model management method is applied to a server side and comprises the following steps:
receiving a calling request from a user side, wherein the calling request comprises hardware information of the user side and identification information of a called model;
determining a target model adapted to the hardware information from a plurality of models corresponding to the identification information; and
sending the target model to the user side.
2. The method of claim 1, wherein prior to receiving a call request, the method further comprises:
and responding to the registration request, respectively converting the models to be registered into a plurality of models adapted to different hardware, and distributing identification information for the models to be registered.
3. The method of claim 1, wherein the determining a target model adapted to the hardware information from a plurality of models corresponding to the identification information comprises:
when the hardware information indicates that the user side has a plurality of pieces of hardware, determining one piece of hardware from the plurality of pieces of hardware as target hardware based on processing costs and loads of the plurality of pieces of hardware;
and determining a target model adapted to the target hardware from a plurality of models corresponding to the identification information.
4. The method of claim 1, further comprising:
obtaining a delegation request;
receiving data to be processed based on the delegation request;
processing the data to be processed through the target model to obtain a processing result;
and sending the processing result.
5. The method of claim 1, wherein the invocation request further includes verification information, the method further comprising:
and refusing to send the target model under the condition that the verification of the verification information fails.
6. A model management method is applied to a user side, and comprises the following steps:
sending a calling request to a server side, wherein the calling request comprises hardware information of the user side and identification information of a called model; and
receiving a target model from the server side, wherein the target model comprises a model adapted to the hardware information among the plurality of models corresponding to the identification information.
7. The method of claim 6, further comprising:
and sending a registration request to the server, wherein the registration request is used for registering the model of the user side to the server.
8. The method of claim 6, further comprising:
sending a delegation request for delegating the server side to process data to be processed;
and receiving a processing result.
9. The method of claim 6, wherein the invocation request further includes verification information, the data of the target model is divided into a first portion and a second portion, the method further comprising:
at most buffering the first portion of data and not buffering the second portion of data.
10. An electronic device, comprising:
one or more processors;
a memory for storing one or more computer programs,
wherein the one or more computer programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 9.
11. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 9.
CN201910483980.8A 2019-06-04 2019-06-04 Model management method, electronic device, and medium Pending CN112036558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910483980.8A CN112036558A (en) 2019-06-04 2019-06-04 Model management method, electronic device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910483980.8A CN112036558A (en) 2019-06-04 2019-06-04 Model management method, electronic device, and medium

Publications (1)

Publication Number Publication Date
CN112036558A 2020-12-04

Family

ID=73576429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910483980.8A Pending CN112036558A (en) 2019-06-04 2019-06-04 Model management method, electronic device, and medium

Country Status (1)

Country Link
CN (1) CN112036558A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114969636A (en) * 2021-03-23 2022-08-30 华为技术有限公司 Model recommendation method and device and computer equipment
CN114969636B (en) * 2021-03-23 2023-10-03 华为技术有限公司 Model recommendation method and device and computer equipment
WO2022262856A1 (en) * 2021-06-18 2022-12-22 青岛小鸟看看科技有限公司 Model loading method and apparatus for head-mounted display device, and head-mounted display device
US11599338B2 (en) 2021-06-18 2023-03-07 Qingdao Pico Technology Co., Ltd. Model loading method and apparatus for head-mounted display device, and head-mounted display device
CN113568740A (en) * 2021-07-16 2021-10-29 开放智能机器(上海)有限公司 Model aggregation method, system, device and medium based on federal learning
CN117576545A (en) * 2024-01-16 2024-02-20 成都同步新创科技股份有限公司 Multi-algorithm full-matching access adapter access method
CN117576545B (en) * 2024-01-16 2024-04-05 成都同步新创科技股份有限公司 Multi-algorithm full-matching access adapter access method

Similar Documents

Publication Publication Date Title
CN112036558A (en) Model management method, electronic device, and medium
US10860398B2 (en) Adapting legacy endpoints to modern APIs
US20190372900A1 (en) Providing access to application program interfaces and internet of thing devices
CN109298926B (en) Method and device for entering resource transfer party into resource transfer platform and electronic equipment
US20220043898A1 (en) Methods and apparatuses for acquiring information
CN113434241A (en) Page skipping method and device
CA3059719A1 (en) Payment processing method, device, medium and electronic device
CN112686528A (en) Method, apparatus, server and medium for allocating customer service resources
CN111913920A (en) Electronic business card generating method, device, computer system and computer readable medium
CN113132400B (en) Business processing method, device, computer system and storage medium
CN115170321A (en) Method and device for processing batch transaction data
CN112965916B (en) Page testing method, page testing device, electronic equipment and readable storage medium
CN111580883B (en) Application program starting method, device, computer system and medium
CN112882895A (en) Health examination method, device, computer system and readable storage medium
CN107045452B (en) Virtual machine scheduling method and device
CN113542506B (en) Outbound method, device, equipment and medium
CN113296911B (en) Cluster calling method, cluster calling device, electronic equipment and readable storage medium
CN114586007B (en) Automatic assistant architecture for maintaining privacy of application content
CN112988604A (en) Object testing method, testing system, electronic device and readable storage medium
CN109840073B (en) Method and device for realizing business process
CN111580882A (en) Application program starting method, device, computer system and medium
US20230033818A1 (en) Edge function-guided artifical intelligence request routing
CN114782135A (en) Information processing method and device, electronic equipment and computer readable storage medium
CN115357653A (en) Data processing method and device, electronic equipment and storage medium
CN114816736A (en) Service processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination