CN110728372A - Cluster design method and cluster architecture for dynamic loading of artificial intelligence model - Google Patents

Cluster design method and cluster architecture for dynamic loading of artificial intelligence model

Info

Publication number
CN110728372A
CN110728372A (application CN201910921147.7A)
Authority
CN
China
Prior art keywords
model
server
service
cluster
deployment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910921147.7A
Other languages
Chinese (zh)
Other versions
CN110728372B (en)
Inventor
顾嘉晟
李瀚清
王江
曾彦能
陈运文
纪达麒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daerguan Information Technology (Shanghai) Co Ltd
Original Assignee
Daerguan Information Technology (Shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Daerguan Information Technology (Shanghai) Co Ltd
Priority to CN201910921147.7A
Publication of CN110728372A
Application granted
Publication of CN110728372B
Active legal status
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/445: Program loading or initiating
    • G06F9/44521: Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to a cluster design method for dynamically loading artificial intelligence models, in the field of artificial intelligence. The method designs a model discovery and service discovery mechanism, and achieves automatic deployment of neural network models in a service cluster through a server deployment architecture that loads models automatically in a distributed fashion. This deployment architecture provides fully automated, highly concurrent, and highly available deployment of neural network models with automatic resource allocation, so that server resources are used to the fullest and the waste of compute and memory resources is greatly reduced.

Description

Cluster design method and cluster architecture for dynamic loading of artificial intelligence model
Technical Field
The invention relates to the field of artificial intelligence, in particular to an architectural design pattern for service clusters serving large-scale neural networks.
Background
Deep learning neural network algorithms are currently the mainstream in the field of artificial intelligence. Because of their deep, complex model structures and the nature of back-propagation training, the most advanced models in academia, such as BERT or GPT-2 in natural language processing, contain more than one billion parameters, yielding model files of several gigabytes.
In practical engineering, if a model is fine-tuned online each time, or reloaded from disk into memory for every prediction, a large amount of time is consumed and the online user experience suffers badly. Yet if all models are kept resident in server memory, memory overflows occur once the number of models grows too large. For example, suppose a cluster of servers with 16 GB of memory each must serve 20 deep learning models ranging from 1 GB to 4 GB in size. Deploying one machine per model in the conventional way wastes a great deal of computing resources and drives up server cost. Distributing the models randomly across machines to fill memory brings its own problems: models differ in popularity, most customers access only a small subset of them, and if the few heavily used models land on the same machine, its GPU or CPU becomes congested. Moreover, once the service is online, if the administrator finds that resources cannot meet user demand and wants to add machines, the usual remedy is to restart the whole service and reload models at random, which severely hurts the availability of the service.
Disclosure of Invention
The invention aims to overcome these defects of existing platform architectures. By designing a brand-new model and service discovery mechanism, it provides a novel server deployment architecture that loads models automatically in a distributed fashion. This deployment architecture achieves fully automated, highly concurrent, highly available deployment of neural network models in a service cluster with automatic resource allocation, so that server resources are used to the fullest and the waste of compute and memory resources is greatly reduced.
To this end, the technical scheme provided by the invention is as follows:
a cluster design method for dynamic loading of an artificial intelligence model is characterized in that the method designs a model and a service discovery mechanism, and realizes automatic deployment of a neural network model in a service cluster through server deployment architecture design of a distributed automatic loading model, and the method comprises the following processing steps:
the model discovery service is provided with a first dictionary, the first dictionary comprises the corresponding relation between a current server and a deployed model, when a user clicks and deploys a certain new model by one key on a foreground page, the model discovery service receives an instruction, and automatically deploys the new model on a low-pressure server by calculating the server pressure condition and the memory occupation condition of the server in a historical period;
the machine discovery service is used for storing a second dictionary, wherein the second dictionary comprises the current online server state, the state of each server in the service cluster is checked in a traversing mode every 10 seconds, the second dictionary is updated, and when a new server is found to be added into the current service cluster, the highest frequency usage model is obtained and loaded into a low-pressure server under the condition that the memory is not exceeded;
designing a service health check mechanism, wherein the service monitoring check mechanism checks the state of the machine discovery service every 10 seconds, and if the machine discovery service is down or a task is stuck, sending alarm information or automatically restarting the service is selected according to the actual condition;
during service health check, the pressure condition of each field in each server is analyzed simultaneously, if the pressure of individual servers in a service cluster is too high due to a large number of backlog tasks of a certain field, an idle server is automatically searched for loading the model, the model is unloaded on a high-pressure server, and the task is returned to a message queue for redistribution;
the message queues are used as a prediction task propagation medium, the message queues are used as a transmission medium for distributing tasks, prediction tasks of different models are separately stored in different queues, and when the prediction task queues of a certain model are too congested, the model discovery service automatically selects an idle server to load the model.
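The model discovery step above can be sketched as follows. This is a minimal illustration of the "first dictionary" and the lightly-loaded-server selection, not the patented implementation; all names (`ServerStats`, `deploy_model`) and the load metric are assumptions.

```python
# Hypothetical sketch of the model discovery service's first dictionary
# and its deployment decision: pick the least-loaded server that still
# has enough free memory for the new model.
from dataclasses import dataclass, field

@dataclass
class ServerStats:
    mem_total_gb: float
    mem_used_gb: float
    avg_load: float  # average load over a historical window

@dataclass
class ModelDiscoveryService:
    # first dictionary: server id -> list of deployed model names
    deployed: dict[str, list[str]] = field(default_factory=dict)
    stats: dict[str, ServerStats] = field(default_factory=dict)

    def deploy_model(self, model_name: str, model_size_gb: float) -> str:
        """Deploy on the least-loaded server with enough free memory."""
        candidates = [
            (s.avg_load, sid)
            for sid, s in self.stats.items()
            if s.mem_total_gb - s.mem_used_gb >= model_size_gb
        ]
        if not candidates:
            raise RuntimeError("no server has enough free memory")
        _, sid = min(candidates)
        self.deployed.setdefault(sid, []).append(model_name)
        self.stats[sid].mem_used_gb += model_size_gb
        return sid
```

A one-click deployment from the foreground page would then reduce to a single `deploy_model` call against this service.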
A cluster architecture for dynamically loading artificial intelligence models comprises a user input port, a preprocessing server, a message queue server, a hard disk, a service cluster and a Redis database. Several neural network models are stored on the hard disk; the service cluster comprises several model deployment servers, into which the neural network models are loaded to perform artificial intelligence computation. A user uploads or inputs data through the user input port; the preprocessing server converts it into code the machines can recognize and process, the message queue server distributes it to the different model deployment servers, the corresponding models process it, and the results are output and stored in the Redis database. The architecture is characterized in that it further comprises a model/machine discovery mechanism module, which resides on a single server in the service cluster in the form of a primary service and a standby service, and which comprises a model discovery service submodule, a machine discovery service submodule, a service health check mechanism submodule and a model unloading submodule:
the model discovery service submodule is deployed independently in the service cluster and internally exists as a primary service and a standby service; it holds a first dictionary recording the correspondence between current servers and the models deployed on them; when a user deploys a new model with one click on the foreground page, the submodule receives the instruction and automatically deploys the new model on a lightly loaded server, chosen by evaluating each server's load and memory occupancy over a recent historical period;
the machine discovery service submodule is deployed on the same server as the model discovery service submodule and stores a second dictionary recording the state of the servers currently online; every 10 seconds it traverses and checks the state of each model deployment server in the service cluster and updates the second dictionary, and when a new model deployment server joins the cluster, it loads the most frequently used models onto that lightly loaded server, without exceeding its memory;
the service health check mechanism submodule checks the state of the machine discovery service submodule every 10 seconds and, if that submodule is down or a task is stuck, either sends an alarm or automatically restarts the service, depending on the actual situation; during the health check, the load produced by each field on each model deployment server is analyzed, and if a backlog of tasks for some field drives the load on an individual model deployment server too high, the model unloading submodule automatically finds an idle model deployment server to load that model, unloads it from the overloaded server, and returns the pending tasks to the message queue for redistribution;
message queues serve as the transmission medium for prediction tasks: tasks for different models are stored in separate queues, and when the prediction queue of some model becomes too congested, the model discovery service submodule automatically selects an idle model deployment server to load that model.
In practical application, the cluster design method and cluster architecture for dynamic loading of artificial intelligence models disclosed by the invention yield the following technical effects:
1. One-click deployment of online neural network models, reducing human intervention and the occurrence of human error.
2. Once the service is online, the load of each individual server is detected automatically, and the model discovery mechanism automatically assigns idle servers to load heavily used models.
3. When a new server joins the cluster, it is discovered automatically, and models are loaded onto it dynamically according to current or historical usage frequency.
4. High concurrency and high availability of the neural network service.
Drawings
FIG. 1 is a schematic diagram of a cluster architecture for dynamic loading of an artificial intelligence model according to the present invention.
FIG. 2 is a flow chart of a cluster design method for dynamic loading of an artificial intelligence model according to the present invention.
Detailed Description
In the following, the cluster design method and cluster architecture for dynamic loading of an artificial intelligence model are described in further detail with reference to the drawings and specific embodiments, so that their structural composition and workflow can be understood clearly; this is not intended to limit the scope of the invention.
The invention first describes a cluster design method for dynamically loading an artificial intelligence model: the method designs a model discovery and service discovery mechanism and achieves automatic deployment of neural network models in a service cluster through the server deployment architecture of a distributed automatic model loader.
Concretely, the cluster design method of the invention comprises the following processing steps.
The model discovery service maintains a first dictionary recording the correspondence between current servers and the models deployed on them. When a user deploys a new model with one click on the foreground page, the model discovery service receives the instruction and automatically deploys the new model on a lightly loaded server, chosen by evaluating each server's load and memory occupancy over a recent historical period.
The machine discovery service stores a second dictionary recording the state of the servers currently online. Every 10 seconds it traverses and checks the state of each server in the service cluster and updates the second dictionary; when it finds a new server joining the cluster, it loads the most frequently used models onto that lightly loaded server, without exceeding its memory.
A service health check mechanism is designed which checks the state of the machine discovery service every 10 seconds. If the machine discovery service is down or a task is stuck, it either sends an alarm or automatically restarts the service, depending on the actual situation.
During the health check, the load produced by each field on each server is analyzed at the same time. If a backlog of tasks for some field drives the load on individual servers too high, an idle server is automatically found to load that model, the model is unloaded from the overloaded server, and the pending tasks are returned to the message queue for redistribution.
Message queues serve as the transmission medium for prediction tasks: tasks for different models are stored in separate queues, and when the prediction queue of some model becomes too congested, the model discovery service automatically selects an idle server to load that model.
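One pass of the machine discovery service described above can be sketched as follows. The `Cluster` class is a toy stand-in, and the greedy most-frequent-first loading is an assumption; the patent only states that the pass runs every 10 seconds and that newly joined servers receive the most frequently used models within their memory limit.

```python
# Hypothetical sketch of one 10-second pass of the machine discovery
# service: refresh the "second dictionary" of online servers, then load
# the most frequently used models onto any newly joined server while
# its memory allows.

class Cluster:
    """Toy stand-in for the real service cluster."""
    def __init__(self, servers, model_sizes):
        self.servers = servers          # server id -> free memory in GB
        self.model_sizes = model_sizes  # model name -> size in GB
        self.placement = {}             # server id -> list of models

    def server_ids(self):
        return list(self.servers)

    def probe(self, sid):
        return {"free_memory_gb": self.servers[sid]}

    def load(self, sid, model):
        self.placement.setdefault(sid, []).append(model)
        self.servers[sid] -= self.model_sizes[model]

def discovery_tick(cluster, second_dict, usage_counts):
    """One pass: refresh server states, populate newly joined servers."""
    online = {sid: cluster.probe(sid) for sid in cluster.server_ids()}
    new_servers = [sid for sid in online if sid not in second_dict]
    second_dict.clear()
    second_dict.update(online)
    for sid in new_servers:
        # most frequently used models first, while memory allows
        for model in sorted(usage_counts, key=usage_counts.get, reverse=True):
            if cluster.model_sizes[model] <= cluster.servers[sid]:
                cluster.load(sid, model)
```

In the architecture itself this pass would be wrapped in a loop sleeping 10 seconds between iterations.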
As shown in FIG. 1, the present invention also relates to a cluster architecture for dynamic loading of artificial intelligence models. In general, the cluster architecture of the invention comprises a user input port, a preprocessing server, a message queue server, a hard disk, a service cluster and a Redis database. Several neural network models are stored on the hard disk; the service cluster comprises several model deployment servers, into which the neural network models are loaded to perform artificial intelligence computation. A user uploads or inputs data through the user input port; the preprocessing server converts it into code the machines can recognize and process, the message queue server distributes it to the different model deployment servers, the corresponding models process it, and the results are output and stored in the Redis database. The innovation of the present application is that the cluster architecture further includes a model/machine discovery mechanism module, which resides on a separate server in the service cluster in the form of a primary service and a standby service, and which includes a model discovery service submodule, a machine discovery service submodule, a service health check mechanism submodule and a model unloading submodule.
The model discovery service submodule is deployed independently in the service cluster and internally exists as a primary service and a standby service. It holds a first dictionary recording the correspondence between current servers and the models deployed on them. When a user deploys a new model with one click on the foreground page, the submodule receives the instruction and automatically deploys the new model on a lightly loaded server, chosen by evaluating each server's load and memory occupancy over a recent historical period.
The machine discovery service submodule is deployed on the same server as the model discovery service submodule and stores a second dictionary recording the state of the servers currently online. Every 10 seconds it traverses and checks the state of each model deployment server in the service cluster and updates the second dictionary; when a new model deployment server joins the cluster, it loads the most frequently used models onto that lightly loaded server, without exceeding its memory.
The service health check mechanism submodule checks the state of the machine discovery service submodule every 10 seconds and, if that submodule is down or a task is stuck, either sends an alarm or automatically restarts the service, depending on the actual situation. During the health check, the load produced by each field on each model deployment server is analyzed; if a backlog of tasks for some field drives the load on a model deployment server too high, the model unloading submodule automatically finds an idle model deployment server to load that model, unloads it from the overloaded server, and returns the pending tasks to the message queue for redistribution.
The invention also uses message queues as the transmission medium for prediction tasks: tasks for different models are stored in separate queues, and when the prediction queue of some model becomes too congested, the model discovery service submodule automatically selects an idle model deployment server to load that model.
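The per-model queue routing and the congestion check described above can be sketched as follows. The congestion threshold and the class names are assumptions; the patent does not specify how "too congested" is measured.

```python
# Hypothetical sketch of per-model task queues with a congestion check:
# each model gets its own queue, and models whose queue length exceeds
# a (assumed) threshold are flagged so the discovery service can load
# another replica on an idle server.
from collections import deque

class QueueRouter:
    def __init__(self, congestion_threshold=100):
        self.queues = {}  # model name -> deque of pending tasks
        self.threshold = congestion_threshold

    def submit(self, model, task):
        self.queues.setdefault(model, deque()).append(task)

    def congested_models(self):
        """Models whose backlog exceeds the threshold."""
        return [m for m, q in self.queues.items() if len(q) > self.threshold]
```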
The composition of the cluster architecture and the dynamic model loading method implemented on it are described in detail below with a specific embodiment.
Example 1
This embodiment is a text mining task based on the above method and system. The requirement is as follows: a user uploads a batch of contract files and wants fields such as Party A information, Party B information, contract effective date, and specific contract terms to be extracted. Each field in the text corresponds to one deep learning neural network model. Assume the task has 40 fields, i.e. 40 neural network models. On the hardware side, 4 servers are required. The processing flow is shown in FIG. 2:
Step 1: the user clicks "model online" on the page. The model/machine discovery mechanism automatically detects the total number of models and the number of servers, and performs initial loading by computing memory occupancy. For example, models 1-10 are loaded into server A, models 11-20 into server B, models 21-30 into server C, and models 31-40 into server D.
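The initial loading in step 1 can be sketched as an even split of the model list across servers. This even-chunk strategy is an assumption matching the example (models 1-10 to A, 11-20 to B, and so on); the patent only says initial loading is computed from memory occupancy.

```python
# Hypothetical sketch of step 1's initial placement: divide the model
# list into contiguous chunks, one chunk per server, as in the example.
def initial_placement(models, servers):
    """Split the model list into len(servers) contiguous chunks."""
    per_server = len(models) // len(servers)
    placement = {}
    for i, server in enumerate(servers):
        placement[server] = models[i * per_server : (i + 1) * per_server]
    return placement
```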
Step 2: after the user uploads a batch of files, the preprocessing server performs a series of text processing steps to clean the text into a structure suitable for model input. The text is split into small tasks of roughly 5000 characters each and fed into the message queues in order, one queue per field. At this point there should be 40 queues in the message queue service, one per field, and every queue mines all documents uploaded by the user.
Step 3: each model deployment server pulls its tasks from the corresponding message queue and begins executing the mining task.
Step 4: the administrator notices that the load on the 4 servers is too high and decides to add 4 more. Within 10 seconds the machine discovery service discovers the 4 machines newly added to the cluster and, by analyzing the current total number of tasks per field, decides to load the field models with the most remaining tasks onto the new machines.
Step 5: suppose model 10 predicts more slowly than the other fields and machine A is overloaded. When the service health check mechanism detects this, it loads model 10 on a machine that does not yet hold it, which then pulls model-10 tasks from the message queue. At the same time, model 10 on machine A is unloaded; if a task is in progress, it is automatically returned to the message queue for other servers to consume.
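The rebalancing in steps 4-5 can be sketched as one operation: move the most backlogged field model from an overloaded server to an idle one, returning in-flight tasks to that model's queue. All structures and names here are hypothetical stand-ins for illustration.

```python
# Hypothetical sketch of steps 4-5: relocate the model with the largest
# task backlog from an overloaded server to an idle one, and requeue any
# in-flight tasks so other servers can consume them.
def rebalance(backlogs, overloaded, idle, placement, queues, in_flight):
    """Move the model with the largest backlog from `overloaded` to `idle`."""
    model = max(backlogs, key=backlogs.get)
    # load a replica on the idle server
    placement.setdefault(idle, []).append(model)
    # unload from the overloaded server
    if model in placement.get(overloaded, []):
        placement[overloaded].remove(model)
    # return in-flight tasks to the model's queue for redistribution
    queues.setdefault(model, []).extend(in_flight.pop((overloaded, model), []))
    return model
```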
The above is a specific application case of the dynamically loading cluster architecture of this patent, illustrating the practical value of the new system architecture and implementation. In summary, the scope of the invention also covers other modifications and alternatives apparent to those skilled in the art.

Claims (2)

1. A cluster design method for dynamic loading of artificial intelligence models, characterized in that the method designs a model discovery and service discovery mechanism and, through the server deployment architecture of a distributed automatic model loader, achieves automatic deployment of neural network models in a service cluster, the method comprising the following processing steps:
the model discovery service maintains a first dictionary recording the correspondence between current servers and the models deployed on them; when a user deploys a new model with one click on the foreground page, the model discovery service receives the instruction and automatically deploys the new model on a lightly loaded server, chosen by evaluating each server's load and memory occupancy over a recent historical period;
the machine discovery service stores a second dictionary recording the state of the servers currently online; every 10 seconds it traverses and checks the state of each server in the service cluster and updates the second dictionary, and when it finds a new server joining the cluster, it loads the most frequently used models onto that lightly loaded server, without exceeding its memory;
a service health check mechanism is designed which checks the state of the machine discovery service every 10 seconds; if the machine discovery service is down or a task is stuck, it either sends an alarm or automatically restarts the service, depending on the actual situation;
during the health check, the load produced by each field on each server is analyzed at the same time; if a backlog of tasks for some field drives the load on individual servers too high, an idle server is automatically found to load that model, the model is unloaded from the overloaded server, and the pending tasks are returned to the message queue for redistribution;
message queues serve as the transmission medium for prediction tasks: tasks for different models are stored in separate queues, and when the prediction queue of some model becomes too congested, the model discovery service automatically selects an idle server to load that model.
2. A cluster structure for dynamically loading artificial intelligence models comprises a user input port, a preprocessing server, a message queue server, a hard disk, a service cluster and a Redis database, wherein a plurality of neural network models are stored in the hard disk, the service cluster comprises a plurality of model deployment servers, the neural network models are loaded in the model deployment servers in the service cluster to carry out artificial intelligence operation, a user uploads or inputs signals through the user input port, the preprocessing server forwards codes which can be identified and processed by a machine, the codes are distributed to different model deployment servers through the message queue server, the corresponding models in the model deployment servers are used for processing, and processing results are output and stored in the Redis database, the cluster structure is characterized by further comprising a model/machine discovery mechanism module, the model/machine discovery mechanism module is stored in a single server in a service cluster and exists in a master-standby service mode in the server, and comprises a model discovery service submodule, a machine discovery service submodule, a service health check mechanism submodule and a model unloading submodule:
the model discovery service sub-module is independently deployed in a service cluster, the interior of the model discovery service sub-module exists in a main service and standby service mode, a first dictionary is arranged in the model discovery service sub-module, the first dictionary comprises the corresponding relation between a current server and a deployed model, a user clicks one key on a foreground page to deploy a certain new model, the model discovery service sub-module receives an instruction, and the new model is automatically deployed on a low-pressure server by calculating the server pressure condition and the memory occupation condition of the server in a historical period;
the machine discovery service submodule and the model discovery service submodule are deployed in the same server, a second dictionary is stored in the machine discovery service submodule, the second dictionary comprises the current online server state, the state of each model deployment server in a service cluster is checked in a traversing mode every 10 seconds, the second dictionary is updated, and when a new model deployment server is added into the current service cluster, the model used at the highest frequency is obtained and loaded into a low-pressure model deployment server under the condition that the memory does not exceed;
the service monitoring and checking mechanism submodule checks the state of the machine discovery service submodule every 10 seconds, if the machine discovery service submodule is down or a task is stuck, alarm information is sent or the service is restarted automatically according to the actual situation, during the service health check, the pressure situation of each field in each model deployment server is analyzed, if a large number of overstocked tasks of a certain field are found to cause the pressure of an individual model deployment server in a service cluster to be overlarge, the unloading model submodule automatically searches for an idle model deployment server to load the model, unloads the model on the high-pressure model deployment server and returns the task to a message queue for redistribution;
a message queue serves as the propagation and transmission medium for distributing prediction tasks: prediction tasks for different models are stored in separate queues, and when the prediction task queue of a given model becomes too congested, the model discovery service sub-module automatically selects an idle model deployment server to load that model.
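A minimal sketch of the one-queue-per-model arrangement, using Python's in-process `queue` module as a stand-in for a real message broker; the congestion threshold and the `scale_out` callback (which would ask the model discovery service to load the model on an idle server) are illustrative assumptions:

```python
import queue


class PredictionBroker:
    """Sketch: one queue per model as the task transmission medium.
    When a model's queue grows past a threshold, a scale-out callback
    is fired so the model can be loaded on an idle deployment server."""

    def __init__(self, scale_out, congestion_threshold=100):
        self.queues: dict[str, queue.Queue] = {}
        self.scale_out = scale_out          # called with the congested model
        self.threshold = congestion_threshold

    def submit(self, model: str, task) -> None:
        q = self.queues.setdefault(model, queue.Queue())
        q.put(task)
        if q.qsize() > self.threshold:
            # Queue too congested: request another server for this model.
            self.scale_out(model)

    def next_task(self, model: str):
        """Worker side: take the next prediction task for a model."""
        return self.queues[model].get()
```

In a deployed cluster the per-model queues would live in an external broker (e.g. RabbitMQ or Kafka topics) rather than in-process, and `scale_out` would debounce repeated triggers for the same model.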
CN201910921147.7A 2019-09-27 2019-09-27 Cluster design method and cluster system for dynamic loading of artificial intelligence model Active CN110728372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910921147.7A CN110728372B (en) 2019-09-27 2019-09-27 Cluster design method and cluster system for dynamic loading of artificial intelligence model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910921147.7A CN110728372B (en) 2019-09-27 2019-09-27 Cluster design method and cluster system for dynamic loading of artificial intelligence model

Publications (2)

Publication Number Publication Date
CN110728372A true CN110728372A (en) 2020-01-24
CN110728372B CN110728372B (en) 2023-04-25

Family

ID=69218463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910921147.7A Active CN110728372B (en) 2019-09-27 2019-09-27 Cluster design method and cluster system for dynamic loading of artificial intelligence model

Country Status (1)

Country Link
CN (1) CN110728372B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080312982A1 (en) * 2007-06-15 2008-12-18 International Business Machine Corporation Dynamic Creation of a Service Model
WO2016127756A1 (en) * 2015-02-15 2016-08-18 北京京东尚科信息技术有限公司 Flexible deployment method for cluster and management system
CN110149396A (en) * 2019-05-20 2019-08-20 华南理工大学 A kind of platform of internet of things construction method based on micro services framework

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Wenjie et al., "An autonomic computing model for server clusters", Journal of Computer Applications (《计算机应用》) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113791798A (en) * 2020-06-28 2021-12-14 北京沃东天骏信息技术有限公司 Model updating method and device, computer storage medium and electronic equipment
CN113190344A (en) * 2021-03-26 2021-07-30 中国科学院软件研究所 Method and device for dynamic reconfiguration and deployment of neural network for software-defined satellite
CN113190344B (en) * 2021-03-26 2023-12-15 中国科学院软件研究所 Method and device for dynamic reconfiguration deployment of neural network for software defined satellite

Also Published As

Publication number Publication date
CN110728372B (en) 2023-04-25

Similar Documents

Publication Publication Date Title
CN105808334B MapReduce short-job optimization system and method based on resource reuse
CN110427284B (en) Data processing method, distributed system, computer system, and medium
US8694638B2 (en) Selecting a host from a host cluster to run a virtual machine
US20160162309A1 (en) Virtual machine packing method using scarcity
CN111966453B (en) Load balancing method, system, equipment and storage medium
CN103810048A Automatic thread-count adjustment method and device for optimizing resource utilization
US20150295970A1 (en) Method and device for augmenting and releasing capacity of computing resources in real-time stream computing system
CN112559182B (en) Resource allocation method, device, equipment and storage medium
CN114861911B (en) Deep learning model training method, device, system, equipment and medium
CN103310460A (en) Image characteristic extraction method and system
CN105446653A (en) Data merging method and device
US20220137876A1 (en) Method and device for distributed data storage
CN110728372A (en) Cluster design method and cluster architecture for dynamic loading of artificial intelligence model
US10114438B2 (en) Dynamic power budgeting in a chassis
CN113961353A (en) Task processing method and distributed system for AI task
CN112860532B (en) Performance test method, device, equipment, medium and program product
US20200142803A1 (en) Hyper-converged infrastructure (hci) log system
CN107179998A Method and device for configuring a peripheral core buffer
CN114461407B (en) Data processing method, data processing device, distribution server, data processing system, and storage medium
CN116248689A (en) Capacity expansion method, device, equipment and medium for cloud native application
CN114416357A (en) Method and device for creating container group, electronic equipment and medium
US8607245B2 (en) Dynamic processor-set management
CN111090627B (en) Log storage method and device based on pooling, computer equipment and storage medium
CN110377398B (en) Resource management method and device, host equipment and storage medium
CN108984271A Load balancing method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant