CN110278279A - Offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism - Google Patents

Info

Publication number
CN110278279A
Authority
CN
China
Prior art keywords
task
resource
module
host node
resource allocation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910564917.7A
Other languages
Chinese (zh)
Inventor
张梦龙
裴宝山
李翔
祁洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suning Consumption Finance Co Ltd
Original Assignee
Suning Consumption Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Consumption Finance Co Ltd filed Critical Suning Consumption Finance Co Ltd
Priority to CN201910564917.7A priority Critical patent/CN110278279A/en
Publication of CN110278279A publication Critical patent/CN110278279A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/133 Protocols for remote procedure calls [RPC]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/484 Precedence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5021 Priority

Abstract

The present invention relates to an offline big data scheduling and development platform with a dynamic resource scheduling mechanism. The platform comprises a client, a resource allocation master node module, and several task execution managers. The client is connected to the resource allocation master node module and to each of the task execution managers. Each task execution manager is connected to the resource allocation master node module, communicates with it periodically to report the current resource usage of its server, and periodically updates its own resource usage information to a ZooKeeper node. The invention significantly reduces the single-point-of-failure risk of the master node. It also solves a problem of the old framework, in which the basic load of a worker node was monitoring the running state of the subprograms of a task, yet the node could not act on what it observed. If the machine on which a task runs fails, the task can be reassigned to another machine and restarted, which greatly improves the fault tolerance of the platform.

Description

Offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism
Technical field
The present invention relates to the technical field of resource scheduling, and in particular to an offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism.
Background art
With the rapid growth of Internet companies, the computing demand created by data-intensive applications keeps increasing. Considering factors such as server hardware utilization, operations and maintenance cost, and data sharing, companies want to deploy all types of computing workloads on one shared cluster, so that the workloads share the cluster's resources, the resources are used in a unified way, and each task is isolated by some resource isolation mechanism. Scheduling and development platforms emerged to meet this need. Traditional scheduling and development platforms, however, perform poorly in three respects: scalability, reliability, and resource utilization.
A traditional scheduling and development platform has a single master node (Master) that is responsible for both resource management and job control, and this node becomes the biggest bottleneck of the system. The node handles too many tasks and therefore consumes too many resources. When there are very many user programs, the memory overhead becomes very large, which potentially also increases the risk in distributing user programs. This is why the industry generally concludes that this architecture can support an upper limit of only 4000 node hosts.
On the slave nodes (Slave), using the bare number of tasks to represent resources is too simplistic, because it ignores CPU and memory usage. If two tasks that consume a lot of memory are scheduled together, a memory overflow can easily occur.
To solve these problems while allowing business scenarios to migrate seamlessly, we split the functions of the master node into two components: resource management and task scheduling. The resource manager (ResourceMaster) globally manages the allocation of computing resources to all applications, and the Executor of each application is responsible for the corresponding scheduling and coordination. This design greatly reduces the resource consumption of the Master and allows the programs that monitor the state of each subtask to run in a distributed manner. Resources are expressed in units of memory and CPU, which is more reasonable than the previous allocation by number of tasks.
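The difference between counting tasks and accounting for memory and CPU can be made concrete with a minimal sketch. The names `Resources`, `fits_by_slots` and `fits_by_resources`, as well as all the numbers, are hypothetical and are not taken from the patent; they only illustrate the reasoning in the background section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A resource amount in the units the text describes: memory and CPU."""
    memory_mb: int
    cpu_cores: int


def fits_by_slots(running_tasks: int, max_tasks: int) -> bool:
    """Old model: a node accepts work as long as it has a free task slot."""
    return running_tasks < max_tasks


def fits_by_resources(free: Resources, demand: Resources) -> bool:
    """New model: a node accepts work only if memory and CPU both fit."""
    return (demand.memory_mb <= free.memory_mb
            and demand.cpu_cores <= free.cpu_cores)


# Hypothetical node with 16 GB of memory, 8 cores and 4 task slots.
node_total = Resources(memory_mb=16384, cpu_cores=8)
big_task = Resources(memory_mb=12288, cpu_cores=2)

# One memory-heavy task is already running on the node.
free_after_first = Resources(
    memory_mb=node_total.memory_mb - big_task.memory_mb,
    cpu_cores=node_total.cpu_cores - big_task.cpu_cores,
)

# Slot counting would admit a second 12 GB task and risk a memory overflow.
slot_decision = fits_by_slots(running_tasks=1, max_tasks=4)
# Resource accounting rejects it, because only 4 GB remain.
resource_decision = fits_by_resources(free_after_first, big_task)
```

In this example the slot-based check accepts the second task while the resource-based check refuses it, which is the memory overflow scenario the background section warns about.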
Summary of the invention
The technical problem to be solved by the present invention is to provide an offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism.
To solve the above technical problem, the technical solution of the present invention provides an offline big data scheduling and development platform with a dynamic resource scheduling mechanism, whose innovation is that it comprises a client, a resource allocation master node module, and several task execution managers. The client is connected to the resource allocation master node module and to the several task execution managers respectively. Each task execution manager is connected to the resource allocation master node module, communicates with it periodically to report the current resource usage of its server, and periodically updates its own resource usage information to a ZooKeeper node.
Further, each task execution manager comprises several resource management containers, and each resource management container comprises one task executor module. The task executor module is connected to the resource allocation master node module and obtains the task ID from it, so that the task execution manager can use the task ID to monitor the task executor in real time and the user can view the running state of the task in real time in the web client. The task executor module is also connected to the client and exposes an interface to it; the interface is used to submit tasks, kill tasks, and check task status.
Further, the resource allocation master node module comprises a scheduler and a task manager. The scheduler module triggers the start of tasks at scheduled times, and the resource manager allocates the CPU, memory, and bandwidth resources of the servers in the cluster.
Further, the number of resource management containers in each task execution manager is determined by the policy of the scheduler.
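The patent does not say which policy the scheduler uses to decide how many resource management containers a task execution manager holds. As one possible illustration only, the sketch below assumes equally sized containers and lets the scarcest of CPU, memory and bandwidth decide the count. `Capacity`, `container_count` and the figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacity:
    """CPU, memory and bandwidth, the three resources named in the text."""
    cpu_cores: int
    memory_mb: int
    bandwidth_mbps: int


def container_count(server: Capacity, per_container: Capacity) -> int:
    """Number of equally sized containers that fit on one server.

    Assumed policy: the scarcest resource decides the count.
    """
    if min(per_container.cpu_cores, per_container.memory_mb,
           per_container.bandwidth_mbps) <= 0:
        raise ValueError("container sizes must be positive")
    return min(
        server.cpu_cores // per_container.cpu_cores,
        server.memory_mb // per_container.memory_mb,
        server.bandwidth_mbps // per_container.bandwidth_mbps,
    )


# Hypothetical server and container sizes.
server = Capacity(cpu_cores=16, memory_mb=65536, bandwidth_mbps=1000)
container = Capacity(cpu_cores=2, memory_mb=4096, bandwidth_mbps=100)
count = container_count(server, container)
```

Here CPU is the limiting resource: the server could hold 16 containers by memory and 10 by bandwidth, but only 8 by CPU.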
To solve the above technical problem, the technical solution of the present invention further provides an offline big data scheduling and development method with a dynamic resource scheduling mechanism, whose innovation is that it comprises the following steps:
(1) The client creates the corresponding tasks according to the customer's business requirements and adds them to a task flow; it also configures the alarm policy and event mechanism of the tasks and the scheduling time and priority of the task flow, and sends the configuration information of each task to the resource allocation master node module;
(2) For each task received from the client, the resource allocation master node module allocates a container carrying resource information, namely a resource management container, and stores a task ID for each task; the resource allocation master node module communicates with each task execution manager so that the task execution manager starts the corresponding task executor module;
(3) After the task executor module starts, it registers with the resource allocation master node module;
(4) After the task executor module has registered successfully with the resource allocation master node module, it applies to the resource allocation master node module for resources for the task and obtains them;
(5) After the resource allocation master node module receives the resource application of the task executor module, it assigns the task to the task executor module through its internal resource manager; once the task executor module has obtained the resources, it communicates with the corresponding task execution manager so that this task execution manager starts the user application, that is, begins executing the task;
(6) Each task executor module reports the execution state and progress of its task to the task execution manager over the RPC protocol, so that the task execution manager can monitor the running state of each task at any time and restart a task when it fails;
(7) After the task executor module has finished the current task, or the current task has failed, it is deregistered by the resource allocation master node module, and the task execution result is shown in the task log of the client; if the task failed, business personnel are notified according to the task alarm policy configured in step (1).
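The steps above can be read as a life cycle for each task executor (Executor): start, register, obtain resources, run, finish or fail, and deregister. The following minimal sketch models that life cycle as a state machine. The state names, the `TRANSITIONS` table and the `ExecutorLifecycle` class are hypothetical and only mirror the order of the steps; the patent does not prescribe any particular data structure.

```python
from enum import Enum, auto


class State(Enum):
    STARTED = auto()            # step (2): executor started by its manager
    REGISTERED = auto()         # step (3): registered with the master node
    RESOURCES_GRANTED = auto()  # steps (4)-(5): resources applied for and obtained
    RUNNING = auto()            # steps (5)-(6): user application running
    SUCCEEDED = auto()          # step (7): task finished
    FAILED = auto()             # step (7): task failed
    UNREGISTERED = auto()       # step (7): deregistered from the master node


# Allowed transitions, following the order of the method steps.
TRANSITIONS = {
    State.STARTED: {State.REGISTERED},
    State.REGISTERED: {State.RESOURCES_GRANTED},
    State.RESOURCES_GRANTED: {State.RUNNING},
    State.RUNNING: {State.SUCCEEDED, State.FAILED},
    State.SUCCEEDED: {State.UNREGISTERED},
    State.FAILED: {State.UNREGISTERED},
    State.UNREGISTERED: set(),
}


class ExecutorLifecycle:
    """Tracks one executor and rejects out-of-order transitions."""

    def __init__(self, task_id: str) -> None:
        self.task_id = task_id
        self.state = State.STARTED
        self.history = [State.STARTED]

    def advance(self, new_state: State) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state
        self.history.append(new_state)


# Walk one hypothetical task through a failing run.
lifecycle = ExecutorLifecycle("task-001")
for step in (State.REGISTERED, State.RESOURCES_GRANTED, State.RUNNING,
             State.FAILED, State.UNREGISTERED):
    lifecycle.advance(step)
```

Making the transitions explicit is a design choice of this sketch: an executor that tries to run before it has registered and obtained resources is rejected with a `ValueError` instead of silently skipping a step.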
Further, in step (4), the task executor module applies for and obtains resources from the resource allocation master node module by polling, and the task executor module and the resource allocation master node module communicate over the RPC protocol, the RPC communication being implemented with Apache Thrift.
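Polling for resources can be sketched as a simple retry loop. The patent does not publish the Thrift interface definition, so no Thrift code is shown here; `FakeMaster` is a hypothetical stand-in for the RPC client, and `poll_for_resources`, its parameters and the grant format are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def poll_for_resources(
    request: Callable[[], Optional[dict]],
    interval_s: float = 1.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call `request` repeatedly until it returns a grant or attempts run out."""
    for attempt in range(max_attempts):
        grant = request()
        if grant is not None:
            return grant
        # Wait before asking again, except after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return None


class FakeMaster:
    """Stand-in for the RPC client; grants resources on the third request."""

    def __init__(self) -> None:
        self.calls = 0

    def request_resources(self) -> Optional[dict]:
        self.calls += 1
        if self.calls < 3:
            return None  # Nothing available yet; the caller should poll again.
        return {"memory_mb": 2048, "cpu_cores": 1}


master = FakeMaster()
grant = poll_for_resources(master.request_resources, interval_s=0.0)
```

In a real deployment `request` would presumably wrap a call on a client generated by Apache Thrift; the loop itself does not depend on the transport.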
Further, in step (5), the process by which the task execution manager starts the user application comprises: the task execution manager configures the running environment for the user program, the running environment including environment variables and binary programs; the program start command is written into an executable file; and the user application is started by running that executable file.
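The start-up procedure described here (prepare environment variables, write the start command into an executable file, run the file) can be sketched as follows. The sketch assumes a POSIX system with `sh` available; the function names, the script name `launch_task.sh` and the `TASK_ID` variable are hypothetical.

```python
import os
import shlex
import stat
import subprocess
import tempfile
from typing import Dict, List


def write_launch_script(path: str, env: Dict[str, str], command: List[str]) -> None:
    """Write a shell script that exports `env` and then runs `command`."""
    lines = ["#!/bin/sh"]
    for key, value in env.items():
        lines.append(f"export {key}={shlex.quote(value)}")
    lines.append("exec " + " ".join(shlex.quote(part) for part in command))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    # Mark the script executable for its owner.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)


def launch(path: str) -> subprocess.CompletedProcess:
    """Run the script and capture its output.

    The script is invoked through `sh` so that the sketch also works where
    the temporary directory is mounted without execute permission.
    """
    return subprocess.run(["sh", path], capture_output=True, text=True, check=False)


with tempfile.TemporaryDirectory() as workdir:
    script = os.path.join(workdir, "launch_task.sh")
    # Hypothetical user program: a shell one-liner that reads TASK_ID.
    write_launch_script(
        script,
        {"TASK_ID": "task-001"},
        ["sh", "-c", 'echo "running $TASK_ID"'],
    )
    result = launch(script)
```

Writing the command into a file before running it keeps the exact environment and command line of each task on disk, which can help when a failed task has to be inspected or restarted.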
Further, in step (6), while the user program is running, the task executor module shows the running state of the application to the user at any time over the RPC protocol.
Compared with the prior art, the present invention has the following beneficial effects:
(1) ResourceMaster splits the functions of the original master node into two components: resource management and task scheduling/monitoring. The new resource manager globally manages the allocation of computing resources to all applications, and the Executor of each application is responsible for the corresponding scheduling and coordination. Separating these functions significantly reduces the single-point-of-failure risk of the master node.
(2) WorkerManager has a more focused function: it is responsible only for maintaining the state of the program containers and for keeping a heartbeat with ResourceMaster. The Executor is responsible for all the work within the life cycle of one task, similar to the Slave node in the old framework. Note that although each task (not each kind of task) has its own Executor, the Executor can run on a machine other than the one hosting ResourceMaster. This solves the problem of the old framework, in which the basic load of a worker node was monitoring the running state of the subprograms of a task, yet the node could not act on what it observed. If the machine on which a task runs fails, the task can be reassigned to another machine and restarted, which greatly improves the fault tolerance of the platform.
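The fault tolerance described in effect (2) can be illustrated with a minimal sketch that treats a worker as failed once its heartbeat is stale and moves its tasks to the live worker with the most free memory. The heartbeat timeout, the fixed per-task memory demand, the "most free memory" rule and all names are assumptions of this sketch; the patent only states that tasks can be reassigned to other machines and restarted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Worker:
    name: str
    free_memory_mb: int
    last_heartbeat: float
    tasks: List[str] = field(default_factory=list)


def reassign_from_dead_workers(
    workers: Dict[str, Worker], now: float, timeout_s: float, task_memory_mb: int
) -> Dict[str, str]:
    """Move tasks off workers whose heartbeat is stale.

    Returns a mapping of task ID to the name of its new worker.
    """
    dead = [w for w in workers.values() if now - w.last_heartbeat > timeout_s]
    alive = [w for w in workers.values() if now - w.last_heartbeat <= timeout_s]
    moves: Dict[str, str] = {}
    for worker in dead:
        for task_id in list(worker.tasks):
            candidates = [w for w in alive if w.free_memory_mb >= task_memory_mb]
            if not candidates:
                continue  # No capacity anywhere; the task stays pending.
            target = max(candidates, key=lambda w: w.free_memory_mb)
            target.tasks.append(task_id)
            target.free_memory_mb -= task_memory_mb
            worker.tasks.remove(task_id)
            moves[task_id] = target.name
    return moves


# Hypothetical cluster: worker "b" stopped sending heartbeats 60 seconds ago.
workers = {
    "a": Worker("a", free_memory_mb=8192, last_heartbeat=100.0),
    "b": Worker("b", free_memory_mb=0, last_heartbeat=40.0, tasks=["t1", "t2"]),
    "c": Worker("c", free_memory_mb=6000, last_heartbeat=98.0),
}
moves = reassign_from_dead_workers(workers, now=100.0, timeout_s=30.0,
                                   task_memory_mb=4096)
```

In this run the first task goes to worker "a" and, because "a" then has less free memory than "c", the second task goes to "c".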
Description of the drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some of the embodiments recorded in the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a system structure diagram of the offline big data scheduling and development platform with a dynamic resource scheduling mechanism of the present invention.
Fig. 2 is a system structure diagram of task execution manager 1 in Fig. 1.
Detailed description of the embodiments
The technical solution of the present invention is described clearly and completely below through specific embodiments.
The present invention provides an offline big data scheduling and development platform with a dynamic resource scheduling mechanism. Its structure is shown in Fig. 1: it comprises a client, a resource allocation master node module, and several task execution managers, denoted task execution manager 1, task execution manager 2, and so on. The client is connected to the resource allocation master node module and to the several task execution managers respectively. Each task execution manager is connected to the resource allocation master node module, communicates with it periodically to report the current resource usage of its server, and periodically updates its own resource usage information to a ZooKeeper node.
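The periodic resource report can be sketched as a small JSON payload written under a per-server path. To stay self-contained, the sketch uses an in-memory stand-in instead of a real ZooKeeper client; the path layout `/scheduler/workers/<server>`, the field names and the sample values are hypothetical and not taken from the patent.

```python
import json
from typing import Dict


def build_heartbeat(server: str, cpu_used_pct: float, memory_used_mb: int,
                    memory_total_mb: int, timestamp: float) -> bytes:
    """Serialize one resource usage report as UTF-8 JSON."""
    payload = {
        "server": server,
        "cpu_used_pct": cpu_used_pct,
        "memory_used_mb": memory_used_mb,
        "memory_free_mb": memory_total_mb - memory_used_mb,
        "timestamp": timestamp,
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")


class InMemoryZNodeStore:
    """Stand-in for a ZooKeeper client; maps znode paths to byte values."""

    def __init__(self) -> None:
        self.nodes: Dict[str, bytes] = {}

    def set(self, path: str, data: bytes) -> None:
        self.nodes[path] = data


def report_usage(store: InMemoryZNodeStore, server: str, cpu_used_pct: float,
                 memory_used_mb: int, memory_total_mb: int,
                 timestamp: float) -> str:
    """Write the latest usage report for `server` and return the path used."""
    path = f"/scheduler/workers/{server}"  # hypothetical path layout
    store.set(path, build_heartbeat(server, cpu_used_pct, memory_used_mb,
                                    memory_total_mb, timestamp))
    return path


store = InMemoryZNodeStore()
path = report_usage(store, "worker-1", 37.5, 6144, 16384, 1561593600.0)
latest = json.loads(store.nodes[path].decode("utf-8"))
```

With a real ZooKeeper ensemble, one design option (not stated in the patent) would be to keep each report in an ephemeral znode, so that the entry disappears when the session of the task execution manager ends.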
The several task execution managers of the present invention are several servers managed by the resource allocation master node module. Each server plays the same role, so the present invention takes task execution manager 1 in Fig. 1 as an example. As shown in Fig. 2, task execution manager 1 comprises several resource management containers, and each resource management container comprises one task executor module. The task executor module is connected to the resource allocation master node module and obtains the task ID from it, so that the task execution manager can use the task ID to monitor the task executor in real time and the user can view the running state of the task in real time in the web client. The task executor module is also connected to the client and exposes an interface to it; the interface is used to submit tasks, kill tasks, and check task status.
The resource allocation master node module of the present invention comprises a scheduler and a task manager. The scheduler module triggers the start of tasks at scheduled times, and the resource manager allocates the CPU, memory, and bandwidth resources of the servers in the cluster. The number of resource management containers in each task execution manager is determined by the policy of the scheduler.
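Timed triggering combined with the priority that the client configures for a task flow can be sketched with a heap ordered by scheduled time. The class `TimedScheduler`, the convention that a larger number means a higher priority, and the sample task flows are assumptions made for illustration; the patent does not describe the scheduler's internal data structures.

```python
import heapq
from typing import List, Tuple


class TimedScheduler:
    """Releases task flows once their scheduled time arrives, highest priority first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, int, str]] = []
        self._seq = 0  # Tie-breaker that preserves insertion order.

    def add(self, flow_id: str, run_at: float, priority: int) -> None:
        # Smaller tuples pop first, so the priority is negated.
        heapq.heappush(self._heap, (run_at, -priority, self._seq, flow_id))
        self._seq += 1

    def due(self, now: float) -> List[str]:
        """Pop every flow scheduled at or before `now`."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap))
        # Among the due flows, order by priority, then time, then insertion.
        ready.sort(key=lambda item: (item[1], item[0], item[2]))
        return [item[3] for item in ready]


scheduler = TimedScheduler()
scheduler.add("etl_daily", run_at=100.0, priority=1)
scheduler.add("report", run_at=100.0, priority=5)
scheduler.add("backfill", run_at=90.0, priority=1)
scheduler.add("late", run_at=200.0, priority=9)

first_batch = scheduler.due(now=100.0)
second_batch = scheduler.due(now=100.0)
third_batch = scheduler.due(now=200.0)
```

At time 100 the high-priority "report" flow is released ahead of the two priority-1 flows, and "late" stays queued until its own scheduled time even though its priority is the highest.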
The technical solution of the present invention further provides an offline big data scheduling and development method with a dynamic resource scheduling mechanism, which comprises the following steps:
(1) The client creates the corresponding tasks according to the customer's business requirements and adds them to a task flow; it also configures the alarm policy and event mechanism of the tasks and the scheduling time and priority of the task flow, and sends the configuration information of each task to the resource allocation master node module;
(2) For each task received from the client, the resource allocation master node module allocates a container carrying resource information, namely a resource management container, and stores a task ID for each task; the resource allocation master node module communicates with each task execution manager so that the task execution manager starts the corresponding task executor module;
(3) After the task executor module starts, it registers with the resource allocation master node module;
(4) After the task executor module has registered successfully with the resource allocation master node module, it applies to the resource allocation master node module for resources for the task and obtains them; the task executor module applies for and obtains the resources by polling, and the task executor module and the resource allocation master node module communicate over the RPC protocol, the RPC communication being implemented with Apache Thrift;
(5) After the resource allocation master node module receives the resource application of the task executor module, it assigns the task to the task executor module through its internal resource manager; once the task executor module has obtained the resources, it communicates with the corresponding task execution manager so that this task execution manager starts the user application, that is, begins executing the task; the process by which the task execution manager starts the user application comprises: the task execution manager configures the running environment for the user program, the running environment including environment variables and binary programs; the program start command is written into an executable file; and the user application is started by running that executable file;
(6) Each task executor module reports the execution state and progress of its task to the task execution manager over the RPC protocol, so that the task execution manager can monitor the running state of each task at any time and restart a task when it fails; while the user program is running, the task executor module shows the running state of the application to the user at any time over the RPC protocol;
(7) After the task executor module has finished the current task, or the current task has failed, it is deregistered by the resource allocation master node module, and the task execution result is shown in the task log of the client; if the task failed, business personnel are notified according to the task alarm policy configured in step (1).
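Steps (6) and (7) combine status reporting, restart on failure, and alerting. A minimal sketch of how a task execution manager might react to incoming reports is shown below. The cap on restarts (`max_restarts`), the status strings and the callback-based design are assumptions of this sketch; the patent only states that failed tasks are restarted and that business personnel are notified according to the configured alarm policy.

```python
from typing import Callable, Dict, List, Tuple


class TaskMonitor:
    """Tracks reported task states and reacts to failures."""

    def __init__(self, restart: Callable[[str], None],
                 alert: Callable[[str], None], max_restarts: int = 2) -> None:
        self._restart = restart
        self._alert = alert
        self.max_restarts = max_restarts  # assumed limit, not from the patent
        self.latest: Dict[str, Tuple[str, float]] = {}
        self.restarts: Dict[str, int] = {}

    def on_report(self, task_id: str, status: str, progress: float) -> None:
        """Handle one status report sent by a task executor."""
        self.latest[task_id] = (status, progress)
        if status != "FAILED":
            return
        attempts = self.restarts.get(task_id, 0)
        if attempts < self.max_restarts:
            # Restart the task while the retry budget lasts.
            self.restarts[task_id] = attempts + 1
            self._restart(task_id)
        else:
            # Budget exhausted: hand over to the alarm policy.
            self._alert(task_id)


restarted: List[str] = []
alerted: List[str] = []
monitor = TaskMonitor(restart=restarted.append, alert=alerted.append,
                      max_restarts=2)

monitor.on_report("task-001", "RUNNING", 0.4)
for _ in range(3):
    monitor.on_report("task-001", "FAILED", 0.4)
```

With a budget of two restarts, the first two failure reports trigger a restart and the third one raises an alert.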
The embodiments described above merely describe preferred implementations of the present invention and do not limit its concept or scope. Any variations and improvements that those of ordinary skill in the art make to the technical solution of the present invention without departing from its design concept shall fall within the protection scope of the present invention. The technical content for which protection is sought is fully set out in the claims.

Claims (8)

1. An offline big data scheduling and development platform with a dynamic resource scheduling mechanism, characterized in that it comprises a client, a resource allocation master node module, and several task execution managers; the client is connected to the resource allocation master node module and to the several task execution managers respectively; each task execution manager is connected to the resource allocation master node module, communicates with it periodically to report the current resource usage of its server, and periodically updates its own resource usage information to a ZooKeeper node.
2. The offline big data scheduling and development platform with a dynamic resource scheduling mechanism according to claim 1, characterized in that each task execution manager comprises several resource management containers, and each resource management container comprises one task executor module; the task executor module is connected to the resource allocation master node module and obtains the task ID from it, so that the task execution manager can use the task ID to monitor the task executor in real time and the user can view the running state of the task in real time in the web client; the task executor module is also connected to the client and exposes an interface to it, the interface being used to submit tasks, kill tasks, and check task status.
3. The offline big data scheduling and development platform with a dynamic resource scheduling mechanism according to claim 1, characterized in that the resource allocation master node module comprises a scheduler and a task manager; the scheduler module triggers the start of tasks at scheduled times, and the resource manager allocates the CPU, memory, and bandwidth resources of the servers in the cluster.
4. The offline big data scheduling and development platform with a dynamic resource scheduling mechanism according to claim 2, characterized in that the number of resource management containers in each task execution manager is determined by the policy of the scheduler.
5. An offline big data scheduling and development method with a dynamic resource scheduling mechanism, characterized in that it comprises the following steps:
(1) The client creates the corresponding tasks according to the customer's business requirements and adds them to a task flow; it also configures the alarm policy and event mechanism of the tasks and the scheduling time and priority of the task flow, and sends the configuration information of each task to the resource allocation master node module;
(2) For each task received from the client, the resource allocation master node module allocates a container carrying resource information, namely a resource management container, and stores a task ID for each task; the resource allocation master node module communicates with each task execution manager so that the task execution manager starts the corresponding task executor module;
(3) After the task executor module starts, it registers with the resource allocation master node module;
(4) After the task executor module has registered successfully with the resource allocation master node module, it applies to the resource allocation master node module for resources for the task and obtains them;
(5) After the resource allocation master node module receives the resource application of the task executor module, it assigns the task to the task executor module through its internal resource manager; once the task executor module has obtained the resources, it communicates with the corresponding task execution manager so that this task execution manager starts the user application, that is, begins executing the task;
(6) Each task executor module reports the execution state and progress of its task to the task execution manager over the RPC protocol, so that the task execution manager can monitor the running state of each task at any time and restart a task when it fails;
(7) After the task executor module has finished the current task, or the current task has failed, it is deregistered by the resource allocation master node module, and the task execution result is shown in the task log of the client; if the task failed, business personnel are notified according to the task alarm policy configured in step (1).
6. The offline big data scheduling and development method with a dynamic resource scheduling mechanism according to claim 5, characterized in that in step (4) the task executor module applies for and obtains resources from the resource allocation master node module by polling, and the task executor module and the resource allocation master node module communicate over the RPC protocol, the RPC communication being implemented with Apache Thrift.
7. The offline big data scheduling and development method with a dynamic resource scheduling mechanism according to claim 5, characterized in that in step (5) the process by which the task execution manager starts the user application comprises: the task execution manager configures the running environment for the user program, the running environment including environment variables and binary programs; the program start command is written into an executable file; and the user application is started by running that executable file.
8. The offline big data scheduling and development method with a dynamic resource scheduling mechanism according to claim 5, characterized in that in step (6), while the user program is running, the task executor module shows the running state of the application to the user at any time over the RPC protocol.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910564917.7A CN110278279A (en) 2019-06-27 2019-06-27 Offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910564917.7A CN110278279A (en) 2019-06-27 2019-06-27 Offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism

Publications (1)

Publication Number Publication Date
CN110278279A (en) 2019-09-24

Family

ID=67963508

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910564917.7A Pending CN110278279A (en) 2019-06-27 2019-06-27 Offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism

Country Status (1)

Country Link
CN (1) CN110278279A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125645A (en) * 2019-11-15 2020-05-08 至本医疗科技(上海)有限公司 Executive program processing method, system, device, computer equipment and medium
CN112910703A (en) * 2021-02-01 2021-06-04 中金云金融(北京)大数据科技股份有限公司 Offline task management platform
CN113220480A (en) * 2021-04-29 2021-08-06 西安易联趣网络科技有限责任公司 Distributed data task cross-cloud scheduling system and method
CN117271102A (en) * 2023-11-23 2023-12-22 山东省工业技术研究院 Task scheduling system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040201604A1 (en) * 2000-06-19 2004-10-14 International Business Machines Corporation System and method for developing and administering web applications and services from a workflow, enterprise, and mail-enabled web application server and platform
US20110270855A1 (en) * 2010-02-04 2011-11-03 Network State, LLC State operating system
CN103944769A (en) * 2014-05-05 2014-07-23 江苏物联网研究发展中心 RPC (Remote Procedure Call) protocol based cluster resource unified management system
CN105703940A (en) * 2015-12-10 2016-06-22 中国电力科学研究院 Multistage dispatching distributed parallel computing-oriented monitoring system and monitoring method
CN106022664A (en) * 2016-07-08 2016-10-12 大连大学 Big data analysis based network intelligent power saving monitoring method
CN108388472A (en) * 2018-03-01 2018-08-10 吉林大学 A kind of elastic task scheduling system and method based on Docker clusters
CN108762932A (en) * 2018-05-31 2018-11-06 安徽四创电子股份有限公司 A kind of cluster task scheduling system and processing method
CN109408210A (en) * 2018-09-27 2019-03-01 北京车和家信息技术有限公司 Distributed timing task management method and system

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125645A (en) * 2019-11-15 2020-05-08 至本医疗科技(上海)有限公司 Executive program processing method, system, device, computer equipment and medium
CN111125645B (en) * 2019-11-15 2023-05-16 至本医疗科技(上海)有限公司 Method, system, device, computer equipment and medium for processing execution program
CN112910703A (en) * 2021-02-01 2021-06-04 中金云金融(北京)大数据科技股份有限公司 Offline task management platform
CN113220480A (en) * 2021-04-29 2021-08-06 西安易联趣网络科技有限责任公司 Distributed data task cross-cloud scheduling system and method
CN113220480B (en) * 2021-04-29 2023-03-10 西安易联趣网络科技有限责任公司 Distributed data task cross-cloud scheduling system and method
CN117271102A (en) * 2023-11-23 2023-12-22 山东省工业技术研究院 Task scheduling system
CN117271102B (en) * 2023-11-23 2024-03-19 山东省工业技术研究院 Task scheduling system

Similar Documents

Publication Publication Date Title
CN110278279A (en) Offline big data scheduling and development platform and method with a dynamic resource scheduling mechanism
US10700948B2 (en) Service-oriented modular system architecture
CN109828831B (en) Artificial intelligence cloud platform
US20190377604A1 (en) Scalable function as a service platform
US10761829B2 (en) Rolling version update deployment utilizing dynamic node allocation
US10412158B2 (en) Dynamic allocation of stateful nodes for healing and load balancing
US20180331897A1 (en) Method and device for training model in distributed system
CN105659562B (en) It is a kind of for hold barrier method and data processing system and include for holds hinder computer usable code storage equipment
EP0706685B1 (en) An integrated plant environment system having a PROGRAM-TO-PROGRAM COMMUNICATION SERVER and method
US20050149908A1 (en) Graphical development of fully executable transactional workflow applications with adaptive high-performance capacity
Sengupta et al. Scheduling multi-tenant cloud workloads on accelerator-based systems
CN103780655A (en) Message transmission interface task and resource scheduling system and method
CN106354563A (en) Distributed computing system for 3D (three-dimensional reconstruction) and 3D reconstruction method
CN101751288A (en) Method, device and system applying process scheduler
CN110958311A (en) YARN-based shared cluster elastic expansion system and method
CN103324479B (en) The middleware System Framework that under loose environment, distributed big data calculate
CN108268314A (en) A kind of method of multithreading task concurrent processing
US10521272B1 (en) Testing in grid computing systems
Khan et al. Scheduling in Desktop Grid Systems: Theoretical Evaluation of Policies & Frameworks
da Rosa Righi et al. Towards cloud-based asynchronous elasticity for iterative HPC applications
CN111274018A (en) Distributed training method based on DL framework
CN109450913A (en) A kind of multinode registration dispatching method based on strategy
CN113515356B (en) Lightweight distributed resource management and task scheduler and method
Xu et al. Dirigo: Self-scaling Stateful Actors For Serverless Real-time Data Processing
US8402465B2 (en) System tool placement in a multiprocessor computer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190924
