CN110764838A - Service model loading method and system, electronic equipment and storage medium - Google Patents


Info

Publication number
CN110764838A
CN110764838A (application CN201910889057.4A)
Authority
CN
China
Prior art keywords
model
server
target
block data
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910889057.4A
Other languages
Chinese (zh)
Other versions
CN110764838B (en)
Inventor
邹亚劼
林乐彬
Current Assignee
Beijing Sankuai Network Technology Co.,Ltd.
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN201910889057.4A priority Critical patent/CN110764838B/en
Publication of CN110764838A publication Critical patent/CN110764838A/en
Application granted granted Critical
Publication of CN110764838B publication Critical patent/CN110764838B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44521Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44568Immediately runnable code
    • G06F9/44578Preparing or optimising for loading

Abstract

The application discloses a method and a system for loading a service model. The method comprises: in response to a model push request at a model management layer, notifying a target model server to load the model block data carried in the request; after the model block data is loaded successfully, storing at the target model server a first mapping relation among the model name, the region information and the model block data in the request; recording at the model management layer a second mapping relation among the target model server, the model name and the region information; and issuing the second mapping relation to a routing layer. The routing layer determines the target model server corresponding to a model calculation request according to the second mapping relation and sends the address of that server to the client, and the target model server responds to the model calculation request by providing the target model service to the client using the first mapping relation. The invention can load model data by region and provide the model service of the corresponding region, thereby shortening the model loading time.

Description

Service model loading method and system, electronic equipment and storage medium
Technical Field
Embodiments of the present disclosure relate to the field of communications technologies, and in particular, to a method and a system for loading a service model, an electronic device, and a computer-readable storage medium.
Background
In early model estimation services, every machine loads the full model in order to provide service, so the service is limited by machine memory: once the memory occupied by models reaches its upper limit, a previously loaded old model must be taken down before the next new model can be loaded, which seriously slows model iteration. Against this background, distributed model services were proposed.
The current model distributed services are mainly divided into vertical splitting and horizontal splitting.
In vertical splitting, the model is split mainly according to the business it belongs to; for example, the model service is split by interface, so a model with a given number of interfaces is served by that many sets of model services. A single machine then only needs to hold the models of its current service, which reduces memory usage and allows more models to be served.
In horizontal splitting, the model itself is not split; instead, a routing service must track which machines have loaded which models, so that an upstream request can be routed correctly to the machine where the model resides.
In both kinds of distributed model service, to guarantee that model calculations remain consistent while new and old models are updated and iterated, the model data on a machine node must be fully loaded before the model service can be provided. This makes the model loading time too long and leaves a large interval between pushing a model and providing the model service.
Disclosure of Invention
The present disclosure provides a method and a system for loading a service model, so as to solve the problem in the related art that the model service can be provided only after all model data are loaded, which results in long model loading times.
In order to solve the above problem, in a first aspect, an embodiment of the present disclosure provides a method for loading a service model, including:
receiving a model push request at a model management layer, wherein the model push request comprises region information, model block data and a model name, and the model block data is one of a plurality of model block data generated by blocking the model data with the model name according to regions;
notifying a target model server to load the model block data in response to the model push request at the model management layer;
loading the model block data on the target model server, and after the model block data is loaded successfully, storing a first mapping relation among the model name, the region information and the model block data on the target model server;
receiving, at the model management layer, a notification from the target model server indicating that the loading of the model block data was successful;
recording, at the model management layer, a second mapping relation among the target model server, the model name and the region information, and sending the second mapping relation from the model management layer to a routing layer;
receiving, at the routing layer, a model computation request from a client, the model computation request comprising: the name of the target model and the positioning information of the client;
determining the target model server corresponding to the target model name and the positioning information according to the second mapping relation in the routing layer;
sending, at the routing layer, address information of the target model server to the client in response to the model computation request;
receiving the model calculation request of the client at the target model server, and responding to the model calculation request at the target model server to provide target model service for the client according to the first mapping relation.
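The first-aspect steps above can be condensed into a minimal in-memory sketch. All class and method names below are illustrative stand-ins, not part of the patent, and the dot-product "computation" merely represents whatever calculation a loaded model block performs:

```python
class ModelServer:
    def __init__(self, address):
        self.address = address
        # First mapping relation: (model name, region) -> model block data.
        self.first_mapping = {}

    def load_block(self, model_name, region, block_data):
        # Loading succeeds; store the first mapping relation on this server.
        self.first_mapping[(model_name, region)] = block_data
        return True

    def compute(self, model_name, region, features):
        # Look up the loaded block via the first mapping and "compute".
        block = self.first_mapping[(model_name, region)]
        return sum(w * x for w, x in zip(block, features))


class RoutingLayer:
    def __init__(self):
        # Second mapping relation: (model name, region) -> target server.
        self.second_mapping = {}

    def resolve(self, model_name, region):
        # Return the address of the server holding the requested block.
        return self.second_mapping[(model_name, region)].address


class ModelManagementLayer:
    def __init__(self, routing_layer):
        self.routing_layer = routing_layer

    def push_model(self, server, model_name, region, block_data):
        # Notify the target server to load; on success, record the second
        # mapping relation and issue it to the routing layer.
        if server.load_block(model_name, region, block_data):
            self.routing_layer.second_mapping[(model_name, region)] = server
```

A client would then call `resolve("model2", "Beijing")` on the routing layer, receive the server's address, and send its model calculation request there.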
In a second aspect, an embodiment of the present disclosure provides a loading system for a service model, including:
the system comprises a model management layer, a model server and a routing layer;
the model management layer is used for receiving a model pushing request, wherein the model pushing request comprises region information, model block data and a model name, and the model block data is one of a plurality of model block data generated by blocking the model data with the model name according to regions;
the model management layer is used for responding to the model pushing request and informing a target model server of loading the model block data;
the target model server is used for loading the model block data and storing a first mapping relation among the model name, the region information and the model block data after the model block data is loaded successfully;
the model management layer is used for receiving a notice from the target model server, wherein the notice represents that the model block data is successfully loaded;
the model management layer is used for recording a second mapping relation among the target model server, the model name and the region information and sending the second mapping relation to the routing layer;
the routing layer is configured to receive a model computation request from a client, where the model computation request includes: the name of the target model and the positioning information of the client;
the routing layer is configured to determine the target model server corresponding to the target model name and the positioning information according to the second mapping relationship;
the routing layer is used for responding to the model calculation request and sending the address information of the target model server to the client;
and the target model server is used for receiving the model calculation request of the client and responding to the model calculation request to provide target model service for the client according to the first mapping relation.
In a third aspect, an embodiment of the present disclosure further discloses an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method for loading the service model according to the embodiment of the present disclosure when executing the computer program.
In a fourth aspect, the disclosed embodiments provide a computer-readable storage medium on which a computer program is stored which, when executed by a processor, implements the steps of the loading method of the service model disclosed in the embodiments of the present disclosure.
According to the loading method of the service model, the model data is loaded in blocks by region. After the model block data of a region is loaded successfully, a first mapping relation among the model name, the region information and the model block data is stored on the target model server that loaded the block, and the model management layer records a second mapping relation among that target model server, the model name and the region information and sends it to the routing layer. After receiving a model calculation request from a client, the routing layer can look up in the second mapping relation, according to the location and model name in the request, the target model server loaded with the requested model block. When the target model server responds to the model calculation request, it can directly find the corresponding model block through the first mapping relation and perform the model calculation, without waiting for the entire model data of that model name to finish loading, and can thus provide the model service to clients in that region. This greatly reduces the model loading duration, and therefore the time from pushing a new model to providing the model service. Moreover, for a model server loaded with model block data, recording the second mapping relation between the model server, the model name and the region of the model block data and sending it to the routing layer ensures the consistency of the externally provided service once the model block is loaded successfully.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a block diagram of the architecture of a loading system for a service model according to one embodiment of the present disclosure;
FIG. 2 is a flow diagram of a method of loading a service model according to one embodiment of the present disclosure;
FIG. 3 is an interaction diagram of a loading system for a service model according to another embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the interaction between a model server and a model management layer, client, according to one embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the interaction between the routing layer and the model management platform of one embodiment of the present disclosure;
FIG. 6 is a block diagram of a loading system for a service model according to another embodiment of the present disclosure;
FIG. 7 schematically shows a block diagram of a computing processing device for performing a method according to the present disclosure; and
fig. 8 schematically shows a storage unit for holding or carrying program code implementing a method according to the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
As shown in fig. 1, a loading system of a service model disclosed in an embodiment of the present disclosure includes: a model management layer (i.e., the model management platform in fig. 1), a routing layer (router), and a model service layer (including multiple model servers).
The loading system of the service model in the embodiment of the present disclosure may be implemented based on an RPC (Remote Procedure Call) framework and the service governance facilities inside an enterprise.
At present, an in-house RPC framework can integrate a service governance function: when an RPC call is initiated, the machine list obtained is already the list of available remote machines, because the service governance layer automatically removes a remote machine that has gone down, without the caller being aware of it.
The loading system of the service model of the embodiment of the present disclosure is developed on top of this existing RPC framework, so no new service governance layer needs to be built; only a model management layer needs to be added, as a software module newly developed on top of the existing RPC framework.
An upper layer interactive object of the model management layer is a client, and the client can send a model pushing request to the model management layer;
the lower interactive object of the model management layer comprises a routing layer and a model service layer, and the model management layer can push the service model requested to be pushed by the client to one model server in the model service layer according to the model pushing request, generate a loaded model-machine list for the successfully pushed service model and send the loaded model-machine list to the routing layer.
The model management layer receives a model push request from a client and selects an IP machine (the model service layer may comprise a plurality of servers, any of which can load a service model and is therefore called a model server; the selected IP machine is one of these servers). When executing the model push, the model management layer records the mapping between the model and the machine's IP and sends it to that IP machine to notify it to load the model; after the model push succeeds, the loaded model-machine list is delivered to (fetched by) the routing layer (router).
The routing layer is based on a customized routing strategy of the RPC framework client (for example, Dubbo is a high-performance, lightweight open-source Java RPC framework, and MTThrift is a cross-language service deployment framework). It no longer cares about machine information at the service level, only about model-machine information, and periodically pulls the loaded model-machine list from the model management layer.
the model server is an IP machine and is used for providing the service loaded with the model, and the service is a service at a software layer, such as a meal ordering service.
The model server receives the model loading notification of the model management layer and realizes the loading of the model; and after the loading is successful, informing the model management layer of the information of the loaded model in a model registration mode.
In addition, the routing layer also receives a model calculation request of the client, selects a model server according to the loaded model-machine list and the routing strategy which are pulled from the model management layer, and returns the IP of the model server to the client. The routing layer may also report (report) information of the requested model (e.g., model name, time of request for the model, number of requests for the model) to the model management layer based on the model calculation request.
Using the IP sent by the routing layer, the client initiates a model calculation request to the model server at that IP; the model server responds to the model calculation request using the loaded model file data and returns the calculation result to the client.
The loading system based on the service model shown in fig. 1, referring to fig. 2, shows a flowchart of steps of a loading method of the service model according to an embodiment of the present disclosure. The loading method of the service model can comprise the following steps:
step 101, receiving a model push request at a model management layer;
In some embodiments, the model data is so large that the memory of a single machine can hardly accommodate it. The model data to be loaded is therefore partitioned by region in advance to generate a plurality of model block data. Each model block carries information about its region, different model blocks derived from the same model data belong to different regions, and each model block can execute independent calculation logic. Thus, when the model is pushed, the model data can be blocked according to region information and then loaded block by block, region by region.
Wherein, the client can send a model push request to the model management layer. The model pushing request comprises model block data, region information to which the model block data belongs, and a model name of an original model to which the model block data belongs;
the model block data is one of a plurality of model block data generated by blocking model data having the model name according to a region.
The following description takes as an example that the region information of the model block data to be loaded is Beijing and the model name is model2.
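As a sketch of the region-based blocking described above, the following hypothetical helper splits model data into per-region blocks; the `(region, payload)` record format is an assumption made for illustration only:

```python
from collections import defaultdict

def block_by_region(model_name, rows):
    """Split model data into per-region model block data.

    `rows` is an iterable of (region, payload) pairs. Each returned block
    carries the model name and its region, so a model push request can be
    built directly from it.
    """
    blocks = defaultdict(list)
    for region, payload in rows:
        blocks[region].append(payload)
    return [
        {"model_name": model_name, "region": region, "block_data": data}
        for region, data in blocks.items()
    ]
```

Each block can then be pushed and loaded independently, which is what allows a region's service to start before the other regions finish loading.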
Step 102, responding to the model pushing request at the model management layer, and informing a target model server to load the model block data;
the number of machines (i.e. model servers) that can load the service model may be multiple, and the model management layer may select a target model server from the available model servers to notify it to load the model block data of the region. The following description will take an example in which the target model server is server 1 having IP address IP 1.
Step 103, receiving a notice from the target model server at the model management layer, wherein the notice represents that the model block data loading is successful;
wherein after the target model server completes loading the model block data, the target model server may send a notification to the model management layer indicating that the loading was successful.
The notification includes the region information and the model name to which the successfully loaded model block belongs.
And 104, recording a second mapping relation among the target model server, the model name and the region information in the model management layer, and sending the second mapping relation to a routing layer in the model management layer.
When the model block is loaded successfully, the model management layer can record, for example, the mapping relation IP1-model2-Beijing, and send this second mapping relation to the routing layer.
The second mapping relation may be delivered by having the routing layer monitor the model management layer and actively pull the relation from it.
According to the loading method of the service model, the model data are loaded in blocks according to the regions, so that after the model block data of a certain region are loaded successfully, model service can be provided for the client side of the region without waiting for the completion of the whole model data, the model loading time is greatly reduced, and the time from pushing a new model to providing the model service is greatly reduced; and for the model server loaded with the model block data, recording a second mapping relation between the model server and the model name of the model block data and the region of the model block data, and sending the second mapping relation to the routing layer, so that the consistency of providing services to the outside after the model block is successfully loaded is ensured.
In one embodiment of the present disclosure, the model push request further includes: the number of queries per second (QPS) that the model block data needs to sustain.
Then, in the embodiment of the present disclosure, the model management layer may implement the scheme of step 102 through the following steps S21 to S23:
s21, acquiring the operation parameters of the available model server at the model management layer;
the operating parameters may include operating parameters such as a CPU and/or a memory.
Because the RPC framework is integrated with the service administration function, the model management layer can directly obtain the list of available remote machines and obtain the operating parameters of each model server in the list of remote machines.
S22, selecting a target model server from the available model servers at the model management layer according to the operation parameters and the requests per second;
the model management layer can intelligently select an available model server capable of meeting the QPS from the multiple available model servers according to the cpu and memory dimension measurement of each available model server.
In one embodiment of the present disclosure, when the number of available model servers satisfying the QPS is multiple, information of the multiple available model servers may be output to the client, and a user selects which model server to adopt as a target model server for loading the model block data; alternatively, the model management layer may automatically select one available model server from a plurality of available model servers that satisfy the QPS as the target model server.
S23, responding to the model pushing request at the model management layer, and informing a target model server to load the model block data.
In one example, the model name of the model block to be pushed, the storage address of the model block, and the information of the region to which the model block belongs may be added to the preloading node of the IP node of the recommended target model server in the ZK of the model management layer.
In the embodiment of the disclosure, a model server for loading a model does not need to be manually selected, but a target model server of the model to be pushed is flexibly selected in a model management layer according to the operating parameters of an available model server and the QPS of the model block to be pushed, so that intelligent block loading of the model can be realized according to the operating condition of the model server.
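A minimal sketch of the S21-S23 selection logic follows. The capacity test (spare QPS plus a CPU-load cutoff) is an illustrative stand-in for the CPU/memory dimension measurement mentioned above, and all names and thresholds are hypothetical:

```python
def select_target_server(servers, required_qps):
    """Pick a target model server able to absorb the block's required QPS.

    `servers` maps a server IP to its operating parameters. A server
    qualifies if it has enough spare QPS and its CPU load is below an
    assumed 0.8 cutoff; the least-loaded qualifying server is chosen
    automatically (returns None if no server qualifies).
    """
    candidates = [
        ip for ip, p in servers.items()
        if p["spare_qps"] >= required_qps and p["cpu_load"] < 0.8
    ]
    return min(candidates, key=lambda ip: servers[ip]["cpu_load"], default=None)
```

As described above, when several servers qualify, an implementation could instead return the whole candidate list and let the user pick one.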
The above steps performed at the model management layer are further described below in conjunction with the schematic diagram of the loading system of the service model shown in FIG. 3.
After receiving the model push request, the model management layer may store the model block data (e.g., the model block of model2 belonging to Beijing) to a cloud server (e.g., the Meituan cloud shown in fig. 3), so as to obtain the cloud address of the model block (here, /hdfs/model2/);
the model management server obtains the machine information (e.g., IP, operating parameters, etc.) of the available model servers from the machine service tree service, and then, according to the method of intelligently selecting an available model server of the above-described embodiment, intelligently selects a target model server, e.g., server 1 having IP 1.
In addition, as can be seen from FIG. 3, the model management layer maintains records of which model blocks each model server has loaded via ZK (ZooKeeper, an open-source distributed coordination service written in Java).
In particular, the model service layer may include a plurality of model servers, although not necessarily every model server is available.
As shown in fig. 3, each model server in the model service layer has an IP node (which may be the IP address of the model server) registered on ZK. Each IP node has a preload node and a load node. Under the preload node is the list of models that this IP machine needs to load, together with the path address on the Meituan cloud where each model is stored. Under the load node is the list of models that this IP's model server has successfully loaded.
After S22, when the model management layer selects a target model server, a piece of record information may be added to the preloading node of the model management layer corresponding to the target model server, where the record information includes the model name, the region information, and the storage address of the model block data in the cloud server.
As shown in FIG. 3, after selecting the target model server, the model management layer may update the preload node under the corresponding ZK machine node; here, the model management layer may add, under the preload node of IP1 in ZK, the record "model2: /hdfs/model2/: Beijing".
Then, when the model management layer performs step 102 (or performs S23), the record information in the preloaded node corresponding to the target model server may be sent to the target model server at the model management layer in response to the model push request.
That is, the model load notification may be sent at the model management layer through the preload node under the corresponding IP node of ZK, informing the server 1 of IP1 to load the model block of model2 in Beijing.
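The preload/load node layout just described can be mimicked with a plain in-memory structure; this is a hypothetical stand-in for the ZK node tree of Fig. 3, not a real ZooKeeper client:

```python
# One entry per model-server IP, each with "preload" and "load" child nodes.
zk = {"IP1": {"preload": [], "load": []}}

def push_via_preload(zk, ip, model_name, region, cloud_path):
    """Notify a server to load a block by writing a record to its preload
    node; a watching server would react to this change."""
    record = {"model": model_name, "region": region, "path": cloud_path}
    zk[ip]["preload"].append(record)
    return record

def report_loaded(zk, ip, model_name, region):
    """After a successful load, the model/region pair is recorded under
    the load node, forming the basis of the second mapping relation."""
    zk[ip]["load"].append((model_name, region))
```

In a real deployment these would be ZooKeeper znode writes, with the model server watching its own preload node for changes.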
In one embodiment of the disclosure, the target model server monitors whether the data in its corresponding preload node in the model management layer changes; if the target model server detects that the data in the preload node has changed, the model management layer, in response to the model push request, sends the record information in that preload node to the target model server.
That is, any model server may maintain a heartbeat with ZK in the model management layer to monitor whether the data of the preload node under its own IP node has changed. In the above example, server 1 can detect through a watch event on ZK that the data in the preload node of IP1 has changed (because the above record was added), and server 1 then pulls the record information in the preload node of the IP1 node (here, two records, for model1 and model2) from the ZK of the model management layer.
In an embodiment of the present disclosure, as shown in fig. 2, after step 102 and before step 103, the method according to an embodiment of the present disclosure may further include:
step 1021, loading the model block data in the target model server;
step 1022, after the model block data is successfully loaded, saving a first mapping relationship between the model name and the region information and the model block data in the target model server.
In one example, FIG. 4 shows a schematic diagram of the interaction between a model server and a model management layer, client.
Taking server 1 as the model server for example: by listening (listener) on ZK for the change event triggered on the preload node of the IP1 node, server 1 can pull the record data in that preload node, check whether the corresponding model block data already exists locally, and if not, obtain it from the cloud according to the address information in the preload node.
After pulling the record data in the preload node of the IP1 node, server 1 may modify the local model list manifest file, i.e., add to the manifest file the record information of model blocks not yet present in it.
Server 1 may scan the local model manifest file periodically; if the list obtained by the current polling scan differs from that obtained by the previous scan, the model has changed. In this example, although the pulled record data includes records for both model1 and model2, the record model1-Beijing already exists in the manifest file, so when the manifest file is modified only the record model2-Beijing-model block data is newly added. Thus, when the loading step is executed, only the model block data of model2 belonging to the Beijing region, which the client requested to load, is actually loaded.
Server 1 then keeps the loaded model block data of model2 belonging to the Beijing region in memory, thereby storing in memory the first mapping relation among the model name, the region information and the model block data (here, model2-Beijing-model block data).
It should be noted that the models in the model manifest file on the model server side are not necessarily model blocks that have already been loaded successfully; the file may also include model blocks not yet loaded. The model block data in the first mapping relation maintained in memory, however, is always model block data that the model server has completely loaded.
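The manifest-scanning behavior above can be sketched as follows. `fetch_block` is a hypothetical stand-in for pulling block data from its cloud address; only entries absent from the in-memory first mapping are actually loaded:

```python
def scan_and_load(manifest, first_mapping, fetch_block):
    """Load only manifest entries whose blocks are not yet in memory.

    `manifest` is a list of (model_name, region, cloud_address) records,
    as in the local model list manifest file; `first_mapping` is the
    in-memory (model_name, region) -> block data store.
    """
    for model_name, region, address in manifest:
        if (model_name, region) not in first_mapping:
            first_mapping[(model_name, region)] = fetch_block(address)
    return first_mapping
```

This matches the example above: with model1-Beijing already loaded, a scan fetches only the model2 block.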
In the embodiment of the present disclosure, the target model server stores the first mapping relationship of the loaded model block, so that when the target model server responds to the model calculation request, the corresponding model block is searched for performing the model calculation.
In an embodiment of the present disclosure, the model management layer executes step 103, i.e., receives from the target model server the notification indicating that the model block data was loaded successfully. As shown in fig. 4, after the server 1 successfully loads the beijing model block of model2, the server 1 may send the model name and the region of the successfully loaded model block to the model management layer through a ZK heartbeat event, so that the model management layer receives the heartbeat event (i.e., the notification indicating that the model block was loaded successfully). As shown in fig. 3, the model management layer receives the heartbeat event from the server 1 with IP 1.
Then, when the model management layer executes step 104, that is, when recording the second mapping relationship between the target model server and the model name and the region information, as shown in fig. 3, the model management layer may add a record of "model 2-beijing" to the load node of the IP1 node of ZK, so as to generate the second mapping relationship of IP1-model 2-beijing.
In the model management layer, the second mapping relationship corresponds to the record data of the load node of each IP node in the ZK.
The loaded model block-machine list, i.e., the second mapping relationship, may be cached at the model management layer and exposed to the routing layer for query.
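How the model management layer might append such a record on receiving a heartbeat can be sketched as below. This is an illustrative sketch only: the nested-dict layout stands in for ZK's per-IP load nodes, and the function name is an assumption.

```python
def record_second_mapping(load_nodes, server_ip, model_name, region):
    """Append a "model-region" record under the server's load node,
    mimicking the load node of each IP node in ZK."""
    records = load_nodes.setdefault(server_ip, [])
    record = f"{model_name}-{region}"
    if record not in records:  # repeated heartbeats should not duplicate records
        records.append(record)
    return load_nodes

# Matching fig. 3: after the heartbeat from IP1, the load node of the
# IP1 node gains the record "model2-beijing", i.e. IP1-model2-beijing.
load_nodes = {}
record_second_mapping(load_nodes, "IP1", "model2", "beijing")
```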
In one embodiment of the present disclosure, the method may further include: periodically pulling, at the routing layer, the recorded second mapping relationship from the model management layer; and storing, at the routing layer, the second mapping relationship in the routing layer.
As shown in fig. 3, the model block-machine list, i.e., the data of the load node of each IP node of ZK (corresponding to the second mapping relationship), may be cached in the model management layer; to deliver the second mapping relationship to the routing layer, the routing layer may periodically pull the data. Specifically, the model management layer may provide a Thrift interface (Thrift is a cross-language service framework) to the routing layer, through which the routing layer may pull the data (model name, region) of the load node of each IP node from the model management layer.
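A minimal sketch of such a periodic pull loop is shown below. The actual transport would be the Thrift interface; here it is abstracted as a plain callable, and all names are assumptions for illustration.

```python
import threading

def start_periodic_pull(fetch_second_mapping, cache, interval_seconds=30.0):
    """Periodically pull the second mapping relationship from the model
    management layer (abstracted as `fetch_second_mapping`, standing in
    for the Thrift call) and refresh the routing layer's local cache."""
    def loop():
        cache.clear()
        cache.update(fetch_second_mapping())  # first pull runs immediately
        timer = threading.Timer(interval_seconds, loop)
        timer.daemon = True  # do not keep the process alive for the timer
        timer.start()

    loop()
    return cache

# Usage: the routing layer's cache is filled on the first (synchronous) pull.
cache = start_periodic_pull(lambda: {("model2", "beijing"): ["IP1"]},
                            {}, interval_seconds=3600)
```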
To prevent the situation in which a crash of the model management layer, followed by a restart of the routing layer, loses all mapping relationship data and leaves the service unable to start, the embodiment of the present disclosure adopts a highly available disaster recovery scheme: the routing layer backs up the second mapping relationship, i.e., the loading routing table of the model blocks. Thus, even if the model management layer goes down because of a version rollback or other reasons, the loading system of the service model of the embodiment of the present disclosure removes any strong dependence on the ZK of the model management layer and on the cloud server.
As shown in fig. 5, a schematic diagram of the interaction between the routing layer and the model management platform (i.e., the model management layer) of one embodiment of the present disclosure is shown.
The routing layer may not only regularly obtain the second mapping relationship (corresponding to model-region-IP in fig. 5) from the model management layer, but may also monitor whether the load node of the ZK of the model management layer has changed, so as to detect changes of the second mapping relationship and pull the latest second mapping relationship (model name-region-IP of the model block) in a timely manner.
In one example, a ModelListener can be used at the routing layer to interact with the model management layer, listening for changes and periodically pulling the latest model-region-IP list; ZK or a message queue (MQ) can be used for listening, and pulling is performed via RPC calls.
As shown in fig. 5, a model-region-IP list may be maintained in the routing layer; when a change of the list is detected, the latest pulled model-region-IP list maintained in memory is written to a local disk file, preventing the situation in which all data is lost and the service becomes unavailable if the loading system of the service model goes down and the routing layer restarts.
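The local-disk backup described above can be sketched as a simple serialize-and-restore pair. This is an assumed sketch: the disclosure does not specify a file format, so JSON and the function names are illustrative choices.

```python
import json
import os
import tempfile

def backup_routing_table(routing_table, path):
    """Write the in-memory model-region-IP list to a local disk file so
    the routing layer can restart without the model management layer."""
    serializable = {f"{m}|{r}": ips for (m, r), ips in routing_table.items()}
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(serializable, f)
    os.replace(tmp_path, path)  # atomic swap avoids a torn file on crash

def restore_routing_table(path):
    """Reload the backed-up routing table after a restart."""
    with open(path) as f:
        serializable = json.load(f)
    return {tuple(key.split("|")): ips for key, ips in serializable.items()}

# Round trip: back up, then restore as a restarted routing layer would.
table = {("model2", "beijing"): ["IP1", "IP3"]}
path = os.path.join(tempfile.mkdtemp(), "model_region_ip.json")
backup_routing_table(table, path)
restored = restore_routing_table(path)
```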
The routing layer of this embodiment adds routing rules on top of the existing RPC framework. Because the RPC framework already provides service governance functions for initiating RPC calls — handling machine (model server) downtime, port disabling, automatic removal of problem machines, and the like — the routing function for model calculation requests can be realized simply by embedding a load balancing policy (Balance) in the routing layer.
In one embodiment of the present disclosure, as shown in fig. 2, after step 104, the method according to an embodiment of the present disclosure may further include:
step 105, receiving a model calculation request from a client at the routing layer, wherein the model calculation request comprises: the name of the target model and the positioning information of the client;
the positioning information may be latitude and longitude information. For example, a client located at longitude m1 dimension m2 (where "longitude m1 dimension m 2" is located in a certain cell of the sunny region in beijing), requests service computation by using a model2, for example, the service computation request is a meal ordering service, and then the model computation request sent by the client to the routing layer includes model name information of the model2 and location information of the client (longitude m1 dimension m 2).
Step 106, determining the target model server corresponding to the target model name and the positioning information according to the second mapping relation in the routing layer;
in one embodiment of the present disclosure, step 106 may be implemented by: searching, at the routing layer, for at least one model server corresponding to the target model name and the positioning information according to the second mapping relationship; and selecting, at the routing layer, the target model server from the at least one model server.
As shown in fig. 5, at the routing layer, machines not hosting the model may be filtered out of the in-memory model-region-IP list using the parameters of the model calculation request (i.e., the target model name and the positioning information); specifically, the IPs corresponding to the target region (beijing) to which the positioning information belongs and to the model name model2 may be looked up in the model-region-IP list. For example, the model servers loaded with the beijing model block data of model2 are the three model servers with IP1, IP3 and IP5.
In one embodiment of the present disclosure, a target model server may be selected from the at least one model server in conjunction with a remote list of available IPs and a load balancing policy.
For example, as shown in fig. 5, before the routing layer receives the model computation request, a remote available IP list (e.g., IP1, IP3, IP4, IP6) has been obtained through an RPC framework, and then when a target model server is selected, an available model server that is loaded with the needed model block data and is not down, here IP1 and IP3, can be selected from IP1, IP3, IP5 in combination with the remote available IP list.
Then, a target model server is selected from the two model servers of IP1 and IP3 according to a load balancing policy, for example, a model server with an IP address of IP1 is selected.
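The routing step above — look up candidates in the second mapping relationship, intersect with the remote available IP list, then apply a load-balancing policy — can be sketched as follows. The function and parameter names are illustrative assumptions, and the pluggable `pick` callable stands in for whatever Balance strategy the RPC framework supplies.

```python
import random

def select_target_server(second_mapping, available_ips, model_name, region,
                         pick=random.choice):
    """Route a model calculation request: find servers that have loaded the
    requested model block, drop unavailable machines, then apply the
    load-balancing policy (`pick`) to the survivors."""
    candidates = second_mapping.get((model_name, region), [])
    alive = [ip for ip in candidates if ip in available_ips]
    if not alive:
        return None  # no server can currently serve this model block
    return pick(alive)

# Matching the example: IP1/IP3/IP5 host the block, but IP5 is not in the
# remote available IP list, so IP1 and IP3 survive the filter.
second_mapping = {("model2", "beijing"): ["IP1", "IP3", "IP5"]}
available_ips = {"IP1", "IP3", "IP4", "IP6"}
target = select_target_server(second_mapping, available_ips,
                              "model2", "beijing", pick=lambda ips: ips[0])
```

With a deterministic `pick` the example selects IP1; in practice `pick` would be the framework's load-balancing strategy.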
Step 107, responding to the model calculation request at the routing layer, and sending the address information of the target model server to the client.
Here, the routing layer may output IP1 to the client.
In the embodiment of the present disclosure, the routing layer may utilize the second mapping relationship issued by the model management layer to accurately route the model calculation request to an available target model server capable of providing the model service.
Step 108, receiving the model calculation request from the client at the target model server, and providing a target model service to the client according to the first mapping relation at the target model server in response to the model calculation request.
Each model block loaded on the target model server constitutes a model service, so the target model server can determine from the first mapping relationship which model block service — i.e., the target model service — to provide to the client.
According to the loading method of the service model of the embodiment of the present disclosure, the model data is loaded in blocks by region. After the model block data of a certain region is loaded successfully, the target model server that loaded it stores the first mapping relationship between the model name plus region information and the model block data, and the routing layer receives the second mapping relationship — recorded by the model management layer — between the model name plus region information and the target model server that reported the successful load. After receiving a model calculation request from a client, the routing layer can then search the second mapping relationship, according to the location and model name in the request, for a target model server loaded with the requested model block service. When the target model server responds to the request, it can directly look up the corresponding model block through the first mapping relationship and perform the model calculation, without waiting for the entire model data of that model name to finish loading, and can thus serve clients in that region. This greatly reduces the model loading duration, and hence the time from the pushing of a new model to the provision of the model service. Moreover, for each model server that has loaded model block data, recording the second mapping relationship between that server and the model name and region of the data, and sending it to the routing layer, ensures the consistency of the externally provided service once the model block has been loaded successfully.
In one embodiment of the present disclosure, when step 108 is executed, it may be implemented by S301 to S303:
S301: in response to the model calculation request, the target model server searches for the target model block data corresponding to the target model name and the positioning information according to the first mapping relationship;
Since the routing layer has routed the client's model calculation request to the machine at IP1, the client sends the model calculation request to the server 1 at IP 1.
As shown in fig. 4, the server 1 may receive a model calculation request.
The server 1 can parse the model calculation request to obtain the target model name and the positioning information — here, model2 and "longitude m1, latitude m2", respectively.
The memory of the server 1 stores the first mapping relationship of the loaded model blocks (the mapping between the model name plus the region information of each model block and the model block data), so the server 1 can search the in-memory model name-region-model block data for the target model block data corresponding to beijing and model2.
S302, calculating the model calculation request at the target model server according to the target model block data to generate a calculation result;
the model calculation request can also carry data to be calculated, so that the data to be calculated can be calculated by using the target model block data to generate a calculation result.
S303, responding to the model calculation request at the target model server, and sending the calculation result to the client.
The above process is one implementation of providing the service model and loading the service model for the client.
Conversely, if no target model block data corresponding to beijing and model2 is found in the in-memory model name-region-model block data of the server 1, the server 1 returns to the client an error flag indicating that the target model block data does not exist.
In the embodiment of the present disclosure, the model server may calculate the model calculation request of a certain region by using the locally stored model block data of the certain region, and provide the model service.
In a take-out (food delivery) scenario, service calculation is closely tied to region, so the method of the embodiment of the present disclosure can provide the model service as soon as the model block data of a region has been loaded successfully, greatly reducing the model loading time.
In one embodiment of the present disclosure, the model block to which the model calculation request of the client relates may be one or more, and when the model calculation request relates to a plurality of model blocks, the model calculation request further includes an association relationship between the plurality of model blocks, that is, a logical relationship of the plurality of model blocks.
For example, the model calculation request indicates that the result 1 is obtained by performing calculation using the model block 1, the result 2 is obtained by performing calculation on the result 1 using the model block 2, the result 3 is obtained by performing calculation on the result 2 using the model block 3, and the result 3 is finally required by the client. In addition, the three model blocks may belong to the same model or different models.
Of course, the association relationship between the plurality of model blocks is not limited to the above example. When the model calculation result is returned, the calculation results of the plurality of model blocks are combined according to the association relationship to generate a final calculation result, which is returned to the client.
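The chained association in the example above — block 1 yields result 1, block 2 computes on result 1, block 3 on result 2 — can be sketched as a simple fold over the blocks. The sequential-chain form is only one possible association relationship, as the passage notes, and the names and toy blocks here are assumptions.

```python
def compute_chained(blocks, first_input):
    """Apply model blocks in their association order, feeding each block's
    result into the next, and return the final combined result."""
    result = first_input
    for block in blocks:
        result = block(result)
    return result

# Toy chain: block1 -> result1, block2(result1) -> result2, block3(result2) -> result3.
block1 = lambda x: x + 1
block2 = lambda x: x * 3
block3 = lambda x: x - 2
final = compute_chained([block1, block2, block3], 4)  # ((4 + 1) * 3) - 2
```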
As shown in fig. 6, the loading system of a service model disclosed in this embodiment includes:
the model management layer 61; a model service layer including at least one model server (shown here as the target model server 62); and a routing layer 63;
the model management layer 61 is configured to receive a model push request, where the model push request includes region information, model block data, and a model name, where the model block data is one of a plurality of model block data generated by blocking model data with the model name according to a region;
the model management layer 61 is configured to respond to the model push request, and notify the target model server 62 to load the model block data;
the target model server 62 is configured to load the model block data, and after the model block data is successfully loaded, store a first mapping relationship between the model name and the region information and between the model block data;
the model management layer 61 is configured to receive a notification from the target model server 62 indicating that the loading of the model block data is successful;
the model management layer 61 is configured to record a second mapping relationship between the target model server 62 and the model name and the region information, and send the second mapping relationship to the routing layer 63;
the routing layer 63 is configured to receive a model computation request from a client, where the model computation request includes: the name of the target model and the positioning information of the client;
the routing layer 63 is configured to determine, according to the second mapping relationship, the target model server 62 corresponding to the target model name and the positioning information;
the routing layer 63 is configured to send address information of the target model server 62 to the client in response to the model calculation request;
the target model server 62 is configured to receive the model computation request of the client, and in response to the model computation request, provide a target model service to the client according to the first mapping relationship.
In one embodiment of the present disclosure, the model push request further includes: the number of requests per second that the model block data needs to carry;
the model management layer 61 is used for acquiring the operating parameters of the available model servers;
the model management layer 61 is configured to select a target model server 62 from the available model servers according to the operation parameters and the number of requests per second;
and the model management layer 61 is configured to notify the target model server 62 to load the model block data in response to the model push request.
In one embodiment of the present disclosure,
the routing layer 63 is configured to periodically pull the second mapping relationship of the record from the model management layer 61;
the routing layer 63 is configured to store the second mapping relationship in the routing layer 63.
In one embodiment of the present disclosure,
the routing layer 63 is configured to search for at least one model server corresponding to the target model name and the positioning information according to the second mapping relationship;
the routing layer 63 is configured to select the target model server 62 from the at least one model server.
In one embodiment of the present disclosure,
the target model server 62 is configured to, in response to the model calculation request, search for target model block data corresponding to the target model name and the positioning information according to the first mapping relationship;
the target model server 62 is configured to calculate the model calculation request according to the target model block data, and generate a calculation result;
and the target model server 62 is configured to respond to the model calculation request and send the calculation result to the client.
The system for loading a service model disclosed in the embodiment of the present disclosure is configured to implement each step of the method for loading a service model described in the above embodiment of the present disclosure, and for specific implementation of each module of the system, reference is made to the corresponding step, which is not described herein again.
According to the loading system of the service model of the embodiment of the present disclosure, the model data is loaded in blocks by region. After the model block data of a certain region is loaded successfully, the target model server that loaded it stores the first mapping relationship between the model name plus region information and the model block data, and the routing layer receives the second mapping relationship — recorded by the model management layer — between the model name plus region information and the target model server that reported the successful load. After receiving a model calculation request from a client, the routing layer can then search the second mapping relationship, according to the location and model name in the request, for a target model server loaded with the requested model block service. When the target model server responds to the request, it can directly look up the corresponding model block through the first mapping relationship and perform the model calculation, without waiting for the entire model data of that model name to finish loading, and can thus serve clients in that region. This greatly reduces the model loading duration, and hence the time from the pushing of a new model to the provision of the model service. Moreover, for each model server that has loaded model block data, recording the second mapping relationship between that server and the model name and region of the data, and sending it to the routing layer, ensures the consistency of the externally provided service once the model block has been loaded successfully.
Correspondingly, the present disclosure also discloses an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the steps of the method for loading the service model according to any one of the above embodiments of the present disclosure are implemented. The electronic device can be a PC, a mobile terminal, a personal digital assistant, a tablet computer and the like.
The present disclosure also discloses a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method for loading a service model according to any of the above embodiments of the present disclosure.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The method and the system for loading the service model provided by the present disclosure are described in detail above, and a specific example is applied in the text to explain the principle and the implementation of the present disclosure, and the description of the above embodiment is only used to help understanding the method and the core idea of the present disclosure; meanwhile, for a person skilled in the art, based on the idea of the present disclosure, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present disclosure should not be construed as a limitation to the present disclosure.
The above-described system embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Various component embodiments of the disclosure may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components in a computing processing device according to embodiments of the present disclosure. The present disclosure may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present disclosure may be stored on a computer-readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
For example, FIG. 7 illustrates a computing processing device that may implement methods in accordance with the present disclosure. The computing processing device conventionally includes a processor 1010 and a computer program product or computer-readable medium in the form of a memory 1020. The memory 1020 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. The memory 1020 has a storage space 1030 for program code 1031 for performing any of the method steps of the above-described method. For example, the storage space 1030 for program code may include respective program code 1031 for implementing various steps in the above method, respectively. The program code can be read from or written to one or more computer program products. These computer program products comprise a program code carrier such as a hard disk, a Compact Disc (CD), a memory card or a floppy disk. Such a computer program product is typically a portable or fixed storage unit as described with reference to fig. 8. The memory unit may have memory segments, memory spaces, etc. arranged similarly to memory 1020 in the computing processing device of fig. 7. The program code may be compressed, for example, in a suitable form. Typically, the memory unit comprises computer readable code 1031', i.e. code that can be read by a processor, such as 1010, for example, which when executed by a computing processing device causes the computing processing device to perform the steps of the method described above.
Reference herein to "one embodiment," "an embodiment," or "one or more embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Moreover, it is noted that instances of the word "in one embodiment" are not necessarily all referring to the same embodiment.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the disclosure may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The disclosure may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Claims (10)

1. A method for loading a service model, the method comprising:
receiving a model push request at a model management layer, wherein the model push request comprises region information, model block data and a model name, and the model block data is one of a plurality of model block data generated by blocking the model data with the model name according to regions;
notifying a target model server to load the model block data in response to the model push request at the model management layer;
loading the model block data on the target model server, and after the model block data is loaded successfully, storing a first mapping relation among the model name, the region information and the model block data on the target model server;
receiving, at the model management layer, a notification from the target model server indicating that the loading of the model block data was successful;
recording a second mapping relation among the target model server, the model name and the region information in the model management layer, and sending the second mapping relation to a routing layer in the model management layer;
receiving, at the routing layer, a model computation request from a client, the model computation request comprising: the name of the target model and the positioning information of the client;
determining the target model server corresponding to the target model name and the positioning information according to the second mapping relation in the routing layer;
sending, at the routing layer, address information of the target model server to the client in response to the model computation request;
receiving the model calculation request of the client at the target model server, and responding to the model calculation request at the target model server to provide target model service for the client according to the first mapping relation.
2. The method of claim 1, wherein the model push request further comprises: the number of requests per second that the model block data needs to carry;
responding to the model pushing request at the model management layer, and informing a target model server to load the model block data, wherein the method comprises the following steps:
obtaining the operating parameters of the available model servers at the model management layer;
selecting a target model server from the available model servers at the model management layer according to the operating parameters and the requests per second;
and responding to the model pushing request at the model management layer, and informing a target model server to load the model block data.
3. The method of claim 1, further comprising:
periodically pulling the second mapping of records from the model management layer at the routing layer;
storing, at the routing layer, the second mapping to the routing layer.
4. The method according to claim 1, wherein said determining, at the routing layer, the target model server corresponding to the target model name and the positioning information according to the second mapping relationship comprises:
at least one model server corresponding to the target model name and the positioning information is searched in the routing layer according to the second mapping relation;
and selecting the target model server from the at least one model server at the routing layer.
5. The method of claim 1, wherein providing, at the target model server, a target model service to the client according to the first mapping relation in response to the model calculation request comprises:
searching, at the target model server, for target model block data corresponding to the target model name and the positioning information according to the first mapping relation in response to the model calculation request;
computing, at the target model server, the model calculation request according to the target model block data to generate a calculation result; and
sending, at the target model server, the calculation result to the client in response to the model calculation request.
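The server-side flow of claim 5 (look up the block via the first mapping relation, compute, return the result) can be sketched as follows. The block format and the linear-scoring computation are placeholders; the patent does not specify what the model computes:

```python
class ModelServer:
    """Sketch of a model server holding region-partitioned model blocks."""

    def __init__(self):
        # first mapping relation: (model name, region) -> model block data
        self._first_mapping = {}

    def load_block(self, model_name, region, block_data):
        """Store a block after a successful load, per claim 1."""
        self._first_mapping[(model_name, region)] = block_data

    def handle_calculation_request(self, request):
        """Find the target block, compute, and return the result."""
        key = (request["target_model_name"], request["region"])
        block = self._first_mapping.get(key)
        if block is None:
            return {"error": "model block not loaded"}
        # Illustrative computation: linear score over the request features.
        score = sum(w * x for w, x in zip(block["weights"], request["features"]))
        return {"score": score}

server = ModelServer()
server.load_block("rank_model", "beijing", {"weights": [1.0, 2.0]})
result = server.handle_calculation_request(
    {"target_model_name": "rank_model", "region": "beijing", "features": [3.0, 4.0]}
)
print(result)  # {'score': 11.0}
```

Because each server holds only the blocks for its assigned regions, the first mapping relation is what lets one server answer for several (model, region) pairs without loading whole models.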
6. A service model loading system, comprising: a model management layer, a model server, and a routing layer; wherein:
the model management layer is configured to receive a model push request, wherein the model push request comprises region information, model block data, and a model name, and the model block data is one of a plurality of pieces of model block data generated by partitioning, by region, the model data having the model name;
the model management layer is configured to notify a target model server to load the model block data in response to the model push request;
the target model server is configured to load the model block data and, after the model block data is loaded successfully, store a first mapping relation among the model name, the region information, and the model block data;
the model management layer is configured to receive a notification from the target model server, wherein the notification indicates that the model block data has been loaded successfully;
the model management layer is configured to record a second mapping relation among the target model server, the model name, and the region information, and to send the second mapping relation to the routing layer;
the routing layer is configured to receive a model calculation request from a client, wherein the model calculation request comprises a target model name and positioning information of the client;
the routing layer is configured to determine the target model server corresponding to the target model name and the positioning information according to the second mapping relation;
the routing layer is configured to send address information of the target model server to the client in response to the model calculation request; and
the target model server is configured to receive the model calculation request from the client and, in response to the model calculation request, provide a target model service to the client according to the first mapping relation.
7. The system of claim 6, wherein the model push request further comprises a number of requests per second that the model block data is required to sustain;
the model management layer is configured to obtain operating parameters of available model servers;
the model management layer is configured to select the target model server from the available model servers according to the operating parameters and the requests per second; and
the model management layer is configured to notify the target model server to load the model block data in response to the model push request.
8. The system of claim 6, wherein:
the routing layer is configured to periodically pull the recorded second mapping relation from the model management layer; and
the routing layer is configured to store the second mapping relation at the routing layer.
9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the service model loading method of any one of claims 1 to 5.
10. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the steps of the service model loading method of any one of claims 1 to 5.
CN201910889057.4A 2019-09-19 2019-09-19 Service model loading method and system, electronic equipment and storage medium Active CN110764838B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910889057.4A CN110764838B (en) 2019-09-19 2019-09-19 Service model loading method and system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110764838A true CN110764838A (en) 2020-02-07
CN110764838B CN110764838B (en) 2020-11-10

Family

ID=69329962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910889057.4A Active CN110764838B (en) 2019-09-19 2019-09-19 Service model loading method and system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110764838B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080174485A1 (en) * 2007-01-24 2008-07-24 Carani Sherry L Tracking System and Method with Asset Tool Bar for Polling, Message, Historic Report, Location, Map and Geo Fence Features
US20100106674A1 (en) * 2009-04-30 2010-04-29 Mclean Donald John Method and system for integrated analysis
CN106708935A (en) * 2016-11-16 2017-05-24 四川省亚丁胡杨人力资源集团有限公司 Intelligent community based service information management system
CN109034159A (en) * 2018-05-28 2018-12-18 北京捷通华声科技股份有限公司 image information extracting method and device
CN109800039A (en) * 2018-11-28 2019-05-24 北京三快在线科技有限公司 A kind of user interface presentation method, apparatus, electronic equipment and storage medium
CN109976821A (en) * 2017-12-14 2019-07-05 广东欧珀移动通信有限公司 Application program loading method, device, terminal and storage medium
US20190272002A1 (en) * 2018-03-01 2019-09-05 At&T Intellectual Property I, L.P Workload prediction based cpu frequency scaling

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112905204A (en) * 2021-02-23 2021-06-04 杭州推啊网络科技有限公司 Updating method and system of Tensorflow model
CN112905204B (en) * 2021-02-23 2024-05-07 杭州推啊网络科技有限公司 Tensorflow model updating method and system
CN113010224A (en) * 2021-03-03 2021-06-22 南方电网数字电网研究院有限公司 Front-end micro-service method, device, computer equipment and storage medium
CN113010224B (en) * 2021-03-03 2024-01-30 南方电网数字平台科技(广东)有限公司 Front-end micro-servitization method, front-end micro-servitization device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN110764838B (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN108683516B (en) Application instance upgrading method, device and system
CN107016480B (en) Task scheduling method, device and system
CN105262608B (en) Monitoring method and device for network service
CN111475483B (en) Database migration method and device and computing equipment
US9942087B2 (en) Application resiliency using APIs
CN106888245B (en) Data processing method, device and system
CN105760240A (en) Distributed task processing method and device
CN110764838B (en) Service model loading method and system, electronic equipment and storage medium
KR20150111952A (en) Method and system for using a recursive event listener on a node in hierarchical data structure
CN108427619B (en) Log management method and device, computing equipment and storage medium
CN105262835A (en) Data storage method and device of multiple machine rooms
CN107040576A (en) Information-pushing method and device, communication system
CN103842964A (en) System and method for supporting accurate load balancing in a transactional middleware machine environment
CN114363334B (en) Cloud system, network configuration method, device and equipment of cloud desktop virtual machine
CN110534136B (en) Recording method and device
CN113238849A (en) Timed task processing method and device, storage medium and electronic equipment
CN106550002B (en) paas cloud hosting system and method
CN111756800A (en) Method and system for processing burst flow
CN109087107B (en) Real-time monitoring method and system based on distributed memory database and electronic equipment
CN111198853A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN111831503A (en) Monitoring method based on monitoring agent and monitoring agent device
US11277300B2 (en) Method and apparatus for outputting information
CN112910995B (en) Resource allocation method and device based on multi-cloud environment, electronic equipment and medium
CN111290873B (en) Fault processing method and device
CN115225645A (en) Service updating method, device, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210408

Address after: 100020 building 6, yard 6, Wangjing East Road, Chaoyang District, Beijing

Patentee after: Sankuai Cloud Online (Beijing) Technology Co.,Ltd.

Address before: 100083 2106-030, 9 North Fourth Ring Road, Haidian District, Beijing.

Patentee before: BEIJING SANKUAI ONLINE TECHNOLOGY Co.,Ltd.

CP01 Change in the name or title of a patent holder

Address after: 100020 building 6, yard 6, Wangjing East Road, Chaoyang District, Beijing

Patentee after: Beijing Sankuai Network Technology Co.,Ltd.

Address before: 100020 building 6, yard 6, Wangjing East Road, Chaoyang District, Beijing

Patentee before: Sankuai Cloud Online (Beijing) Technology Co.,Ltd.