CN112559065A - Method and device for loading model in clustering mode - Google Patents

Method and device for loading a model in clustering mode

Info

Publication number
CN112559065A
Authority
CN
China
Prior art keywords
cluster
model
determining
index data
message queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910920240.6A
Other languages
Chinese (zh)
Inventor
赵玉峰 (Zhao Yufeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201910920240.6A
Publication of CN112559065A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44521Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading

Abstract

The invention discloses a method and a device for loading models in clusters, and relates to the field of computer technology. One embodiment of the method comprises: for each cluster, acquiring index data of each device in the cluster; determining a performance index of the cluster from the index data of its devices; and determining the list of models loaded by each cluster according to the clusters' performance indexes, so as to adjust the models each cluster loads. The method requires no manual intervention, dynamically adjusts each cluster's loaded models according to cluster index data, and supports user-defined indexes, greatly improving the availability, standardization, and flexibility of the system.

Description

Method and device for loading model in clustering mode
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for loading a model in a clustering manner.
Background
In the field of artificial intelligence, model files are usually loaded into a computer's local memory to enable fast model prediction. In a SaaS (Software-as-a-Service) platform, a single system must serve a large number of users, and in most scenarios each user needs an independent business model to provide its prediction service. Because the memory of a single computer is limited, it cannot keep up with the memory consumption of an ever-growing number of models. A common solution is to manually select, with reference to a few fixed indexes, the cluster that loads each new model when a user is created, and to route traffic to the correct cluster through a mapping between models and clusters.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
1. manually selecting clusters greatly increases operation and maintenance costs and reduces system availability;
2. cluster indexes such as traffic and memory change dynamically, and the prior art cannot adjust clusters dynamically;
3. different business systems use different cluster-selection algorithms and algorithm input parameters, while the prior art fixes the input parameters, lacking flexibility and standardization.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for loading models in clusters, which require no manual intervention, can dynamically adjust the models loaded by each cluster according to cluster index data, and support user-defined indexes, thereby greatly improving system availability, standardization, and flexibility.
To achieve the above object, according to one aspect of the embodiments of the present invention, there is provided a method for loading models in clusters, including:
for each cluster, acquiring index data of each device in the cluster;
determining the performance index of the cluster according to the index data of each device in the cluster;
and determining a model list loaded by each cluster according to the performance indexes of the clusters so as to adjust the model loaded by each cluster.
Optionally, after the index data of each device in the cluster is obtained, the method further includes:
determining whether a message queue corresponding to the device exists in a preset cache queue; if it exists, saving the device's index data into the message queue; otherwise, creating a message queue corresponding to the device in the cache queue and then saving the device's index data into it;
before the step of determining the performance index of the cluster according to the index data of each device in the cluster, the method further includes: acquiring the device's index data from the message queue.
Optionally, after the step of acquiring the device's index data from the message queue, the method further includes: if the acquisition succeeds, updating the last pull time of the message queue; otherwise, judging whether the difference between the current time and the message queue's last pull time exceeds a preset time threshold, and if so, deleting the message queue from the cache queue and deleting the device's entry from the model-cluster mapping.
Optionally, after the step of determining the loaded model list of each cluster according to the performance index of each cluster, the method further includes:
and updating the mapping relation between the model and the cluster according to the model list loaded by each cluster, and pushing the updated mapping relation to each cluster.
Optionally, the method of the embodiment of the present invention further includes: when a new model needs to be loaded, determining the cluster for loading the new model according to the performance index of each cluster, or determining the cluster for loading the new model according to a preset cluster selection algorithm.
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for loading models in clusters, including:
the acquisition module is used for respectively acquiring index data of each device in each cluster;
the determining module is used for determining the performance index of the cluster according to the index data of each device in the cluster;
and the adjusting module is used for determining the model list loaded by each cluster according to the performance index of each cluster so as to adjust the model loaded by each cluster.
Optionally, the obtaining module is further configured to: after the index data of each device in the cluster is acquired respectively,
determining whether a message queue corresponding to the device exists in a preset cache queue; if it exists, saving the device's index data into the message queue; otherwise, creating a message queue corresponding to the device in the cache queue and then saving the device's index data into it;
the determination module is further to: and before the step of determining the performance index of the cluster according to the index data of each device in the cluster, acquiring the index data of the device from the message queue.
Optionally, the determining module is further configured to: after the step of acquiring the device's index data from the message queue, if the acquisition succeeds, update the last pull time of the message queue; otherwise, judge whether the difference between the current time and the message queue's last pull time exceeds a preset time threshold, and if so, delete the message queue from the cache queue and delete the device's entry from the model-cluster mapping.
Optionally, the adjusting module is further configured to: after the step of determining the list of models loaded by each of said clusters based on the performance indicators of the respective clusters,
updating the mapping relation between the model and the cluster according to the model list loaded by each cluster; and pushing the updated mapping relation to each cluster.
Optionally, the apparatus of the embodiment of the present invention further includes a selecting module, configured to: when a new model needs to be loaded, determining the cluster for loading the new model according to the performance index of each cluster, or determining the cluster for loading the new model according to a preset cluster selection algorithm.
According to a third aspect of the embodiments of the present invention, there is provided a system for loading models in clusters, including:
a collector for: for each cluster, acquiring index data of each device in the cluster; determining the performance index of the cluster according to the index data of each device in the cluster; determining a model list loaded by each cluster according to the performance indexes of the clusters so as to adjust the model loaded by each cluster;
a selector for: when a new model needs to be loaded, determining a cluster for loading the new model according to the performance index of each cluster, or determining a cluster for loading the new model according to a preset cluster selection algorithm;
a memory, configured to store: a data-processing algorithm for determining a cluster's performance index from the index data of its devices, the performance indexes of the clusters, the mapping between models and clusters, and a preset cluster-selection algorithm.
According to a fourth aspect of the embodiments of the present invention, there is provided an electronic device for loading models in clusters, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method provided by the first aspect of the embodiments of the present invention.
According to a fifth aspect of embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method provided by the first aspect of embodiments of the present invention.
One embodiment of the above invention has the following advantages or benefits: by determining each cluster's performance index from the index data of its devices and then determining the list of models each cluster loads, the loaded models can be adjusted dynamically without manual intervention, user-defined indexes are supported, and the availability, standardization, and flexibility of the system are greatly improved. A user-defined cluster-selection algorithm and dynamic-balance strategy can improve these properties further.
Further effects of the above optional embodiments are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a method for loading models in clusters according to an embodiment of the invention;
FIG. 2 is a schematic diagram of the main modules of an apparatus for loading models in clusters according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the main components of a system for loading models in clusters according to an embodiment of the present invention;
FIG. 4 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 5 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
According to one aspect of the embodiment of the invention, a method for loading a model in a clustering manner is provided.
Fig. 1 is a schematic diagram of the main flow of a method for loading models in clusters according to an embodiment of the present invention. As shown in fig. 1, the method includes: step S101, step S102, and step S103.
In step S101, for each cluster, index data of each device in the cluster is acquired.
A cluster is a group of independent computer devices interconnected by a high-speed network; a cluster typically contains multiple such devices.
The index data refers to the values of the selected indexes. Which indexes to collect can be decided according to the actual situation, for example memory usage, CPU usage, or the number of requests a node receives. This embodiment supports user-defined indexes; compared with the fixed index parameters of the prior art, user-defined indexes greatly improve the standardization and flexibility of the system.
And S102, determining the performance index of the cluster according to the index data of each device in the cluster.
The algorithm that derives a cluster's performance index from its devices' index data can be chosen according to the application scenario. For example, for cluster A, the average value of index a over all devices in cluster A can be computed with an averaging algorithm, or the sum of index a over all devices with a summation algorithm. This embodiment supports user-defined data-processing algorithms, greatly improving the standardization and flexibility of the system.
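As a concrete illustration of the averaging and summation strategies above, the following sketch aggregates per-device index data into a cluster performance index. The index names and the aggregator assigned to each index are illustrative assumptions, not taken from the patent.

```python
from statistics import mean

# Illustrative merge strategy per index; in the patent these are
# user-configurable, so the names here are assumptions.
AGGREGATORS = {
    "memory_usage": mean,    # averaging strategy
    "request_count": sum,    # summation strategy
}

def cluster_performance(device_metrics):
    """Aggregate a list of per-device metric dicts into one cluster index."""
    result = {}
    for name, aggregate in AGGREGATORS.items():
        values = [m[name] for m in device_metrics if name in m]
        if values:
            result[name] = aggregate(values)
    return result

devices = [
    {"memory_usage": 0.5, "request_count": 100},
    {"memory_usage": 0.7, "request_count": 300},
]
perf = cluster_performance(devices)
print(perf["request_count"])  # 400
```

A summation strategy suits additive indexes such as request counts, while averaging suits utilization ratios; the table of aggregators is the natural place to plug in a user-defined strategy.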
Step S103, determining a model list loaded by each cluster according to the performance indexes of each cluster so as to adjust the model loaded by each cluster.
Illustratively, for cluster A, the number of models cluster A loads is determined from its average index-a value. Alternatively, all models are distributed according to the clusters' performance indexes: clusters with higher performance load more models, or models with heavier computational load, while clusters with lower performance load fewer or lighter models. In practice, a dynamic-balance strategy can be preset, so that each cluster's model list is determined from the preset strategy together with the clusters' performance indexes. The dynamic-balance strategy can be user-defined.
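One possible dynamic-balance strategy of the kind described above, sketched under the assumptions that each cluster has a single scalar performance score and that all models impose a similar load: models are assigned in proportion to the score, so stronger clusters load more models.

```python
def balance_models(models, cluster_scores):
    """Assign models to clusters in proportion to each cluster's score.

    models: list of model identifiers.
    cluster_scores: dict mapping cluster name -> performance score (> 0).
    """
    total = sum(cluster_scores.values())
    # Visit stronger clusters first so rounding leftovers land on them.
    clusters = sorted(cluster_scores, key=cluster_scores.get, reverse=True)
    assignment = {c: [] for c in clusters}
    i = 0
    for cluster in clusters:
        share = round(len(models) * cluster_scores[cluster] / total)
        assignment[cluster] = models[i:i + share]
        i += share
    if i < len(models):  # leftovers from rounding go to the best cluster
        assignment[clusters[0]].extend(models[i:])
    return assignment

models = [f"model-{n}" for n in range(6)]
print(balance_models(models, {"cluster-a": 2.0, "cluster-b": 1.0}))
# cluster-a receives 4 models, cluster-b receives 2
```

A real strategy could weight models by measured call volume or memory footprint instead of counting them equally.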
The invention determines each cluster's model list from the index data of the devices in the cluster, can dynamically adjust each cluster's loaded models without manual intervention, supports user-defined indexes, and greatly improves the availability, standardization, and flexibility of the system.
In some embodiments, after the index data of each device in the cluster is obtained, the method further includes: determining whether a message queue corresponding to the device exists in a preset cache queue; if it exists, saving the device's index data into the message queue; otherwise, creating a message queue corresponding to the device in the cache queue and then saving the index data into it. Before the step of determining the cluster's performance index from the devices' index data, the method further includes: acquiring the device's index data from the message queue.
Illustratively, each device's message queue is named with "cluster name + device unique identifier (e.g., IP)". For device b, after its index data is acquired, it is judged whether a message queue named "cluster name + unique identifier of device b" exists in the cache queue. If so, the newly acquired index data of device b is saved to that queue; otherwise such a queue is created first, and the index data is then saved to it.
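The create-or-append logic just described can be sketched as follows. The class and field names are hypothetical, and a production version would use a concurrent queue rather than a plain dict of deques.

```python
from collections import deque

class MetricCache:
    """Cache queue holding one message queue per device (sketch)."""

    def __init__(self):
        self.queues = {}  # queue name -> deque of index-data payloads

    def save(self, cluster_name, device_id, index_data):
        # Queue name follows the "cluster name + device identifier" rule.
        key = f"{cluster_name}:{device_id}"
        if key not in self.queues:
            self.queues[key] = deque()  # create the queue on first sight
        self.queues[key].append(index_data)  # then store the index data
        return key

cache = MetricCache()
cache.save("cluster-a", "10.0.0.1", {"cpu": 0.4})
cache.save("cluster-a", "10.0.0.1", {"cpu": 0.5})
print(len(cache.queues["cluster-a:10.0.0.1"]))  # 2
```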
Using a cache queue to store the index data of the devices in a cluster enables message buffering and asynchronous processing, improving system performance.
Further, after the step of acquiring the device's index data from the message queue, the method may further include judging whether the acquisition succeeded. If it succeeded, the last pull time of the message queue is updated; otherwise, it is judged whether the difference between the current time and the queue's last pull time exceeds a preset time threshold, and if so, the message queue is deleted from the cache queue and the device's entry is deleted from the model-cluster mapping.
Illustratively, a loop pulls the index data of each device from the cache queue, and the last pull time of a device's message queue is updated after each pull. For device c, its index data is acquired from the message queue named "cluster name + unique identifier of device c" in the cache queue, and the last pull time of that queue is then updated. If no index data can be acquired from the queue and the timeout is exceeded, device c is judged to be no longer providing service; the queue is deleted from the cache queue, and the mapping corresponding to device c is deleted from the model-cluster mapping.
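The pull-and-evict step above can be sketched like this. The data shapes are assumptions: the cache is a dict of deques, and the model-cluster mapping is keyed by the same queue name.

```python
import time
from collections import deque

TIMEOUT_SECONDS = 60  # assumed time threshold

def consume(queues, last_pull, model_mapping, key, now=None):
    """Pull one item from a device's queue; evict the device on timeout."""
    now = time.time() if now is None else now
    queue = queues.get(key)
    if queue:  # data available: pull it and refresh the last pull time
        last_pull[key] = now
        return queue.popleft()
    # No data: evict only if the last successful pull is too old.
    if now - last_pull.get(key, now) > TIMEOUT_SECONDS:
        queues.pop(key, None)         # delete the message queue
        model_mapping.pop(key, None)  # delete the device's model mapping
    return None

queues = {"cluster-a:10.0.0.1": deque([{"cpu": 0.4}])}
last_pull = {}
mapping = {"cluster-a:10.0.0.1": ["model-1"]}
print(consume(queues, last_pull, mapping, "cluster-a:10.0.0.1", now=1000.0))
# {'cpu': 0.4}
```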
This timeout check avoids loading models on devices that no longer provide service, improving the effect of loading models in clusters.
Optionally, after the step of determining the loaded model list of each cluster according to the performance index of each cluster, the method further includes: and updating the mapping relation between the model and the cluster according to the model list loaded by each cluster, and pushing the updated mapping relation to each cluster.
In this example, the mapping between each cluster and its loaded models is preset. After each cluster's model list is determined from the clusters' performance indexes, the model-cluster mapping is updated according to the determined lists. The updated mapping is pushed to each cluster, so that the devices in a cluster can update their loaded models in time.
Optionally, the method of the embodiment of the present invention further includes: when a new model needs to be loaded, determining the cluster for loading the new model according to the performance index of each cluster, or determining the cluster for loading the new model according to a preset cluster selection algorithm. In this example, the new model refers to the model that is not currently loaded in each cluster.
Illustratively, a cluster black/white list is preset: the blacklist indicates clusters that may no longer load new models, and the whitelist indicates clusters that may. Alternatively, the cluster that loads a new model can be chosen from the clusters' performance indexes; for example, the cluster with higher performance loads the new model.
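Combining the black/white list with the performance-based choice gives a cluster selector along these lines. Treating the performance index as a single comparable score is a simplifying assumption; a user-defined selection algorithm would replace the `max` rule.

```python
def select_cluster(cluster_scores, blacklist=(), whitelist=None):
    """Pick the cluster that will load a new model.

    cluster_scores: dict cluster name -> performance score (higher is better).
    blacklist: clusters that may no longer load new models.
    whitelist: if given, only these clusters may load new models.
    """
    candidates = {
        name: score
        for name, score in cluster_scores.items()
        if name not in blacklist and (whitelist is None or name in whitelist)
    }
    if not candidates:
        raise RuntimeError("no cluster is eligible to load the new model")
    return max(candidates, key=candidates.get)

scores = {"cluster-a": 0.9, "cluster-b": 0.5, "cluster-c": 0.7}
print(select_cluster(scores, blacklist=("cluster-a",)))  # cluster-c
```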
By customizing the cluster-selection algorithm, the cluster used for loading a new model can be determined, further improving the availability, standardization, and flexibility of the system.
According to a second aspect of the embodiments of the present invention, an apparatus for clustering a loading model is provided.
Fig. 2 is a schematic diagram of the main modules of an apparatus for loading models in clusters according to an embodiment of the present invention. As shown in fig. 2, the apparatus 200 for loading models in clusters includes:
an obtaining module 201, configured to obtain, for each cluster, index data of each device in the cluster;
a determining module 202, configured to determine a performance index of the cluster according to index data of each device in the cluster;
and an adjusting module 203, configured to determine the model list loaded by each cluster according to the performance indexes of the clusters, so as to adjust the models each cluster loads.
Optionally, the obtaining module is further configured to: after the index data of each device in the cluster is acquired respectively,
determining whether a message queue corresponding to the device exists in a preset cache queue; if it exists, saving the device's index data into the message queue; otherwise, creating a message queue corresponding to the device in the cache queue and then saving the device's index data into it;
the determination module is further to: and before the step of determining the performance index of the cluster according to the index data of each device in the cluster, acquiring the index data of the device from the message queue.
Optionally, the determining module is further configured to: before the step of acquiring the device's index data from the message queue, confirm that the difference between the current time and the message queue's last pull time does not exceed a preset time threshold; after the step of acquiring the device's index data from the message queue, update the last pull time of the message queue; and,
if the difference between the current time and the message queue's last pull time exceeds the preset time threshold, delete the message queue from the cache queue and delete the device's entry from the model-cluster mapping.
Optionally, the adjusting module is further configured to: after the step of determining the list of models loaded by each of said clusters based on the performance indicators of the respective clusters,
updating the mapping relation between the model and the cluster according to the model list loaded by each cluster; and pushing the updated mapping relation to each cluster.
Optionally, the apparatus of the embodiment of the present invention further includes a selecting module, configured to: when a new model needs to be loaded, determining the cluster for loading the new model according to the performance index of each cluster, or determining the cluster for loading the new model according to a preset cluster selection algorithm.
According to a third aspect of the embodiments of the present invention, there is provided a system for implementing the method provided by the first aspect of the embodiments of the present invention.
Fig. 3 is a schematic diagram of the main components of a system for loading models in clusters according to an embodiment of the present invention. As shown in fig. 3, the system 300 for loading models in clusters includes:
a collector 301 for: for each cluster, acquiring index data of each device in the cluster; determining the performance index of the cluster according to the index data of each device in the cluster; determining a model list loaded by each cluster according to the performance indexes of the clusters so as to adjust the model loaded by each cluster;
a selector 303 for: when a new model needs to be loaded, determining a cluster for loading the new model according to the performance index of each cluster, or determining a cluster for loading the new model according to a preset cluster selection algorithm;
a memory 302, configured to store: a data-processing algorithm for determining a cluster's performance index from the index data of its devices, the performance indexes of the clusters, the mapping between models and clusters, and a preset cluster-selection algorithm.
In fig. 3, the model prediction service provider corresponds to the devices in a cluster. Basic data-collection nodes (collecting memory and CPU usage) and query-rate nodes (QpsNode, counting the requests a node receives) inside a model prediction service provider are chained into a call-chain tree, and a user can develop a custom index-data collection node by inheriting the DefaultNode abstract class and implementing its entry method. After all index data have been collected, the message-push node PushMsgNode sends the index data gathered by the call chain to the collector 301 via RPC (Remote Procedure Call).
The model-loading node LoadModelNode of the model prediction service provider obtains the latest model-cluster mapping from the data storage 302 and loads or updates models accordingly. The whole business logic runs as timed tasks.
The collector 301 mainly contains a Disruptor cache queue and a Graph balancing tree. The Disruptor cache queue decouples producers and consumers; the Graph balancing tree (a data structure) acts as the consumer of the collected indexes and completes their preprocessing, processing, and storage, and a custom DynamicNode can dynamically adjust the cluster load at runtime.
When the collector 301 receives an index data set sent by a model prediction service provider, it checks whether a message queue named with the client's "cluster name + IP address" exists in the Disruptor cache queue. If it exists, the index data are stored directly in that queue; if not, such a queue is created, the index data are stored in it, and the cluster-to-IP mapping is saved in the Root node of the Graph balancing tree.
The Root node of the Graph balancing tree executes its business logic as a timed task. It loops over the message queues in the Disruptor cache queue, consuming messages from each. If a message is pulled, the queue's last pull time is updated; if not, it checks whether (current time − last pull time of the queue) exceeds the set time threshold. If it does, the cluster-to-IP mapping is deleted, along with the message queue named "cluster name + IP address". Finally, the left, middle, and right child nodes Group, Storage, and Dynamic are called in turn.
The Group node aggregates messages by cluster and, according to the basic parameters of the indexes obtained from the data storage, calls its left and right child nodes Average and Add in turn: Average computes the results for indexes whose merge strategy under the current Group is averaging, and Add computes the results for those whose merge strategy is summation.
The Storage node updates the per-cluster processed index data into the data storage. The data-update strategies include overwrite, append without clearing, and append with timed clearing. Each index has a corresponding storage strategy, which can be customized.
The Dynamic node updates the model-cluster mapping at runtime according to the basic parameters of the indexes obtained from the data storage, dynamically balancing key index data such as model call volume and memory usage. A user can implement a custom dynamic-balance strategy by inheriting the DynamicNode class and overriding its execute method.
The selector 303 provides basic functions such as fetching the collected indexes and persisting selection results through a template pattern; a user completes the cluster-selection business logic by implementing an abstract method. The selector 303 can also work with a model training platform: after a new model finishes training, it obtains the processed cluster indexes, the cluster black/white list, and related data from the data storage 302, runs the user-defined cluster-selection algorithm to choose the cluster that will load the new model, and updates the result into the model-cluster mapping in the data storage.
The data storage 302 can use a NoSQL (non-relational) database such as Redis (a key-value store) or ZooKeeper (a distributed, open-source coordination service for distributed applications). It mainly stores the basic parameters of the collected indexes (merge strategy, data-update strategy, data-clearing strategy, and so on), the processed index data, the model-cluster mapping, and the cluster black/white list. When the model-cluster mapping in the data storage 302 changes, the local cache of the model prediction service consumer is updated through an active push mechanism (Redis Pub/Sub or ZooKeeper message subscription).
The model prediction service consumer calls the model prediction services of different clusters according to the model-to-cluster mapping in its local cache; if one model has multiple service providers, load balancing is performed by round-robin polling. When the consumer calls the computing nodes of a model prediction service provider, the cluster contains multiple machines, so the consumer calls different machines in turn in round-robin fashion. This distributes requests evenly across all computing nodes in the cluster and prevents any single node from being overwhelmed.
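The round-robin scheme is straightforward to sketch; node names below are placeholders:

```python
import itertools

class RoundRobinBalancer:
    """Cycles through a cluster's nodes so requests spread evenly (sketch)."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def next_node(self):
        return next(self._cycle)

lb = RoundRobinBalancer(["node-1", "node-2", "node-3"])
calls = [lb.next_node() for _ in range(6)]   # each node receives exactly 2 calls
```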
The existing practice of manually selecting a model-loading cluster has many shortcomings in system availability, standardization, flexibility, and other respects. The invention fully considers the availability, extensibility, and high performance of cluster-based model management. First, the collector uses a mapping table and a timeout detection mechanism for cluster management, which is simple and easy to use; second, a large amount of business-coupled logic is abstracted away, making custom extension convenient for the user; finally, message caching and asynchronous processing are performed with a Disruptor buffer queue and a balanced tree, ensuring high system performance. The invention realizes a clustered model-loading method that supports linear scaling of the number of models without manual intervention, making cluster-based model management easy.
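The collector's mapping table plus timeout detection (also described in claim 3) can be sketched as follows. The structure is from the text; the names and the injectable clock are assumptions for testability:

```python
import time

class Collector:
    """Per-device message queues with timeout eviction (illustrative sketch).

    A device whose queue yields nothing for longer than `timeout` seconds
    is considered gone: its queue and mapping entry are deleted.
    """

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.queues = {}       # device -> list of pending metric messages
        self.last_pull = {}    # device -> time of last successful pull

    def push(self, device, metric):
        # create the device's message queue on first sight, then append
        self.queues.setdefault(device, []).append(metric)

    def pull(self, device):
        messages = self.queues.get(device)
        if messages:
            self.last_pull[device] = self.clock()
            drained, self.queues[device] = messages[:], []
            return drained
        # empty pull: evict the device if it has been silent too long
        if self.clock() - self.last_pull.get(device, self.clock()) > self.timeout:
            self.queues.pop(device, None)
            self.last_pull.pop(device, None)
        return []
```

In the full system, evicting a device would also remove the corresponding entries from the model-to-cluster mapping, as claim 3 specifies.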
According to a fourth aspect of the embodiments of the present invention, an electronic device for loading a model in a clustering mode is provided, comprising:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method provided by the first aspect of the embodiments of the present invention.
According to a fifth aspect of embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method provided by the first aspect of embodiments of the present invention.
Fig. 4 shows an exemplary system architecture 400 to which the method or apparatus for loading a model in a clustering mode of an embodiment of the invention may be applied.
As shown in fig. 4, the system architecture 400 may include terminal devices 401, 402, 403, a network 404, and a server 405. The network 404 serves as a medium for providing communication links between the terminal devices 401, 402, 403 and the server 405. Network 404 may include various types of connections, such as wired or wireless communication links, or fiber optic cables, among others.
A user may use terminal devices 401, 402, 403 to interact with a server 405 over a network 404 to receive or send messages or the like. The terminal devices 401, 402, 403 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 401, 402, 403 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 405 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 401, 402, 403. The backend management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (for example, target push information, product information — just an example) to the terminal device.
It should be noted that the method for loading a model in a clustering mode provided by the embodiment of the present invention is generally executed by the server 405; accordingly, the apparatus for loading a model in a clustering mode is generally disposed in the server 405.
It should be understood that the number of terminal devices, networks, and servers in fig. 4 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for implementing a terminal device of an embodiment of the present invention. The terminal device shown in fig. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU)501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as necessary, so that a computer program read therefrom is installed into the storage section 508 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 501.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprising: an acquisition module for respectively acquiring index data of each device in each cluster; a determining module for determining the performance index of the cluster according to the index data of each device in the cluster; and an adjusting module for determining the model list loaded by each cluster according to the performance index of each cluster, so as to adjust the models loaded by each cluster. The names of these modules do not, in some cases, constitute a limitation on the modules themselves; for example, the adjusting module may also be described as a "module for determining the model list loaded by each cluster according to the performance index of each cluster".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the device. The computer readable medium carries one or more programs which, when executed by the device, cause the device to: for each cluster, acquire index data of each device in the cluster; determine the performance index of the cluster according to the index data of each device in the cluster; and determine a model list loaded by each cluster according to the performance indexes of the clusters, so as to adjust the models loaded by each cluster.
According to the technical scheme of the embodiment of the invention, the performance index of a cluster is determined according to the index data of each device in the cluster, and the list of models loaded by each cluster is then determined, so that the models loaded by each cluster can be adjusted dynamically without manual intervention. Custom indexes are supported, greatly improving the availability, standardization, and flexibility of the system; a user-defined cluster selection algorithm and dynamic-balancing strategy can improve them further.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for loading a model in a clustering mode, comprising:
for each cluster, acquiring index data of each device in the cluster;
determining the performance index of the cluster according to the index data of each device in the cluster;
and determining a model list loaded by each cluster according to the performance indexes of the clusters so as to adjust the model loaded by each cluster.
2. The method of claim 1, wherein after respectively acquiring the index data of each device in the cluster, the method further comprises:
determining whether a message queue corresponding to the device exists in a preset cache queue; if so, storing the index data of the device into the message queue; otherwise, creating a message queue corresponding to the device in the cache queue, and then storing the index data of the device into the message queue;
before the step of determining the performance index of the cluster according to the index data of each device in the cluster, the method further comprises: acquiring the index data of the device from the message queue.
3. The method of claim 2, wherein after the step of acquiring the index data of the device from the message queue: if the acquisition succeeds, updating the last pull time of the message queue; otherwise, judging whether the time difference between the current time and the last pull time of the message queue is greater than a preset time threshold; and if so, deleting the message queue from the cache queue, and deleting the mapping corresponding to the device from the mapping relationship between models and clusters.
4. The method of claim 1, wherein after the step of determining a model list loaded by each of the clusters based on the performance indexes of the respective clusters, the method further comprises:
and updating the mapping relation between the model and the cluster according to the model list loaded by each cluster, and pushing the updated mapping relation to each cluster.
5. The method of claim 1, further comprising: when a new model needs to be loaded, determining the cluster for loading the new model according to the performance index of each cluster, or determining the cluster for loading the new model according to a preset cluster selection algorithm.
6. An apparatus for loading a model in a clustering mode, comprising:
the acquisition module is used for respectively acquiring index data of each device in each cluster;
the determining module is used for determining the performance index of the cluster according to the index data of each device in the cluster;
and the adjusting module is used for determining the model list loaded by each cluster according to the performance index of each cluster so as to adjust the model loaded by each cluster.
7. The apparatus of claim 6, wherein the acquisition module is further configured to: after respectively acquiring the index data of each device in the cluster,
determine whether a message queue corresponding to the device exists in a preset cache queue; if so, store the index data of the device into the message queue; otherwise, create a message queue corresponding to the device in the cache queue, and then store the index data of the device into the message queue;
the determination module is further to: and before the step of determining the performance index of the cluster according to the index data of each device in the cluster, acquiring the index data of the device from the message queue.
8. A system for loading a model in a clustering mode, comprising:
a collector for: for each cluster, acquiring index data of each device in the cluster; determining the performance index of the cluster according to the index data of each device in the cluster; determining a model list loaded by each cluster according to the performance indexes of the clusters so as to adjust the model loaded by each cluster;
a selector for: when a new model needs to be loaded, determining a cluster for loading the new model according to the performance index of each cluster, or determining a cluster for loading the new model according to a preset cluster selection algorithm;
a memory to: and storing a data processing algorithm for determining the performance index of the cluster according to the index data of each device in the cluster, the performance index of each cluster, the mapping relation between the model and the cluster, and a preset cluster selection algorithm.
9. An electronic device for loading a model in a clustering mode, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201910920240.6A 2019-09-26 2019-09-26 Method and device for loading model in clustering mode Pending CN112559065A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910920240.6A CN112559065A (en) 2019-09-26 2019-09-26 Method and device for loading model in clustering mode


Publications (1)

Publication Number Publication Date
CN112559065A true CN112559065A (en) 2021-03-26

Family

ID=75030332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910920240.6A Pending CN112559065A (en) 2019-09-26 2019-09-26 Method and device for loading model in clustering mode

Country Status (1)

Country Link
CN (1) CN112559065A (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101820384A (en) * 2010-02-05 2010-09-01 浪潮(北京)电子信息产业有限公司 Method and device for dynamically distributing cluster services
CN103902379A (en) * 2012-12-25 2014-07-02 中国移动通信集团公司 Task scheduling method and device and server cluster
CN105447110A (en) * 2015-11-16 2016-03-30 天津南大通用数据技术股份有限公司 Method for rapidly loading data in batches of database cluster and loading system
CN105608138A (en) * 2015-12-18 2016-05-25 贵州大学 System for optimizing parallel data loading performance of array databases
CN106407309A (en) * 2016-08-31 2017-02-15 天津南大通用数据技术股份有限公司 Cluster database data loading tool and method capable of supporting various data sources
CN107147547A (en) * 2017-07-10 2017-09-08 山东超越数控电子有限公司 A kind of cluster overall performance monitoring implementation method
CN107992354A (en) * 2017-11-14 2018-05-04 网易无尾熊(杭州)科技有限公司 For reducing the method and device of memory load
CN108108119A (en) * 2016-11-25 2018-06-01 中兴通讯股份有限公司 A kind of collocation method and device of expansible storage cluster things
US20180246982A1 (en) * 2015-10-20 2018-08-30 Viasat, Inc. Hint model updating using automated browsing clusters
CN108469989A (en) * 2018-03-13 2018-08-31 广州西麦科技股份有限公司 A kind of reaction type based on clustering performance scalable appearance method and system automatically
CN108683720A (en) * 2018-04-28 2018-10-19 金蝶软件(中国)有限公司 A kind of container cluster service configuration method and device
US20190171494A1 (en) * 2017-12-04 2019-06-06 Cisco Technology, Inc. Cost-optimal cluster configuration analytics package



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination