CN112948113A - Cluster resource management scheduling method, device, equipment and readable storage medium - Google Patents

Cluster resource management scheduling method, device, equipment and readable storage medium

Info

Publication number
CN112948113A
Authority
CN
China
Prior art keywords
resource
task
cluster
offline
maximum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110224312.0A
Other languages
Chinese (zh)
Inventor
龙骏锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Weimeng Enterprise Development Co ltd
Original Assignee
Shanghai Weimeng Enterprise Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Weimeng Enterprise Development Co ltd filed Critical Shanghai Weimeng Enterprise Development Co ltd
Priority to CN202110224312.0A priority Critical patent/CN112948113A/en
Publication of CN112948113A publication Critical patent/CN112948113A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a cluster resource management scheduling method that divides the cluster's resources into a plurality of resource pools, so that tasks with different resource pool attributes run in different resource pools. This isolates tasks from one another and ensures that tasks with different resource pool attributes can all proceed smoothly. The method also sets a task queue and task priorities, which controls the total task concurrency, ensures that high-priority tasks are executed first, and improves resource utilization. Finally, the method sets a maximum resource usage amount for each task and submits the task to its resource pool only when the available resource amount is greater than or equal to that maximum, which avoids overloading the cluster and improves cluster reliability. The application also provides a cluster resource management scheduling device, equipment and a readable storage medium, whose technical effects correspond to those of the method.

Description

Cluster resource management scheduling method, device, equipment and readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a readable storage medium for managing and scheduling cluster resources.
Background
In task execution systems without resource management, the use of cluster resources is generally controlled only through task concurrency. This approach has several disadvantages: the resources of the cluster as a whole are not monitored, so total resource consumption is unknown and the number and configuration of machines needed for the current task volume cannot be estimated; when total resources run short, the shortage is not visible, so machine load becomes too high; and the resource usage of individual tasks is not monitored, so no resource metrics are available and task optimization is difficult.
Therefore, a management scheme for cluster resources is currently lacking. Because the usage of cluster resources cannot be known, either resources are underutilized or machines are overloaded.
Disclosure of Invention
The present application aims to provide a method, an apparatus, a device and a readable storage medium for managing and scheduling cluster resources, so as to solve the problem that resources are underutilized or machines are overloaded because the usage of cluster resources cannot currently be known. The specific scheme is as follows:
In a first aspect, the present application provides a cluster resource management scheduling method, including:
dividing cluster resources into a plurality of resource pools by using a resource scheduler;
setting the maximum resource usage amount, priority and resource pool attribute of an offline task;
when the task queue is not full, adding the offline task to the task queue according to the priority;
for each offline task in the task queue, querying, through a resource manager, whether the available resource amount is greater than or equal to the maximum resource usage amount;
and if so, submitting the offline task to the corresponding resource pool according to the resource pool attribute.
Preferably, dividing the cluster resources into a plurality of resource pools by using the resource scheduler includes:
dividing the cluster resources into a plurality of resource pools according to task type or working environment by using the resource scheduler.
Preferably, after dividing the cluster resources into a plurality of resource pools by using the resource scheduler, the method further includes:
setting parameters of each resource pool, the parameters including any one or more of: minimum available resources, maximum available resources, resource allocation weight, maximum number of applications that can run, and authorized users.
Preferably, submitting the offline task to the corresponding resource pool according to the resource pool attribute includes:
encapsulating the task script, the running command, and the resource request determined according to the maximum resource usage amount of the offline task into a jar package, and submitting the jar package to the resource pool corresponding to the resource pool attribute.
Preferably, after submitting the offline task to the corresponding resource pool according to the resource pool attribute, the method further includes:
generating a running log while running the offline task, the running log including any one or more of: the execution state of the offline task, the working state of the computing node running the offline task, and the resource usage of the offline task.
Preferably, after generating the running log while running the offline task, the method further includes:
adjusting the maximum resource usage amount of the offline task according to the running log.
Preferably, after submitting the offline task to the corresponding resource pool according to the resource pool attribute, the method further includes:
while running the offline task, querying the cluster resource utilization through the resource manager;
and dynamically adding or removing computing nodes according to the cluster resource utilization.
In a second aspect, the present application provides a cluster resource management scheduling apparatus, including:
a resource division module: configured to divide cluster resources into a plurality of resource pools by using a resource scheduler;
a task setting module: configured to set the maximum resource usage amount, priority and resource pool attribute of an offline task;
a task adding module: configured to add the offline task to the task queue according to the priority when the task queue is not full;
a condition judgment module: configured to query, through a resource manager and for each offline task in the task queue, whether the available resource amount is greater than or equal to the maximum resource usage amount;
a task submitting module: configured to submit the offline task to the corresponding resource pool according to the resource pool attribute when the available resource amount is greater than or equal to the maximum resource usage amount.
In a third aspect, the present application provides a cluster resource management scheduling device, including:
a memory: for storing a computer program;
a processor: for executing said computer program to implement the cluster resource management scheduling method as described above.
In a fourth aspect, the present application provides a readable storage medium, on which a computer program is stored, which, when being executed by a processor, is adapted to implement the cluster resource management scheduling method as described above.
The cluster resource management scheduling method provided by the application can divide cluster resources into a plurality of resource pools by using a resource scheduler; setting the maximum resource usage amount, priority and resource pool attribute of the offline task; when the task queue is not full, adding the offline task to the task queue according to the priority; for each offline task in the task queue, whether the available resource amount is larger than or equal to the maximum resource usage amount is inquired through a resource manager; and if so, submitting the offline task to the corresponding resource pool according to the resource pool attribute.
Therefore, the method divides the cluster's resources into a plurality of resource pools, so that offline tasks with different resource pool attributes run in different resource pools. This isolates tasks from one another and ensures that tasks with different resource pool attributes can all proceed smoothly. The method also sets a task queue and task priorities, which controls the total task concurrency, ensures that high-priority tasks are executed first, and improves resource utilization. Finally, the method sets a maximum resource usage amount for each task and submits the task to its resource pool only when the available resource amount is greater than or equal to that maximum, which avoids overloading the cluster and improves cluster reliability.
In addition, the present application also provides a cluster resource management scheduling device, a device and a readable storage medium, whose technical effects correspond to those of the above method, and are not described herein again.
Drawings
To explain the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a first embodiment of the cluster resource management scheduling method provided in the present application;
Fig. 2 is a flowchart of a second embodiment of the cluster resource management scheduling method provided in the present application;
Fig. 3 is a schematic process diagram of the second embodiment of the cluster resource management scheduling method provided in the present application;
Fig. 4 is a functional block diagram of an embodiment of the cluster resource management scheduling apparatus provided in the present application.
Detailed Description
The core of the application is to provide a cluster resource management scheduling method, apparatus, device and readable storage medium. The cluster's resources are divided into a plurality of resource pools, so that offline tasks with different resource pool attributes run in different resource pools and are isolated from one another. A task queue and task priorities control the total task concurrency and ensure that high-priority tasks are executed first. A maximum resource usage amount is set for each task, and the task is submitted to its resource pool only when the available resource amount is greater than or equal to that maximum, which avoids overloading the cluster.
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a first embodiment of a cluster resource management scheduling method provided in the present application is described below, where the first embodiment includes:
S101, dividing cluster resources into a plurality of resource pools by using a resource scheduler.
Specifically, the cluster resources (such as memory and CPU) may be divided according to the working environment or the task type. Divided by working environment, they may form, for example, a test resource pool, a run-over task resource pool, and a formal task resource pool; divided by task type, they may form, for example, a cleaning task resource pool and an import/export task resource pool.
Dividing by working environment prevents mutual interference: for example, online testing uses the test resource pool and does not occupy the formal task resource pool. Dividing by task type helps limit the proportion of total resources that each type of task can occupy, and prevents one type of task from occupying so much that it blocks tasks of other types.
The resource scheduler may specifically be the Fair Scheduler, which can divide the entire cluster's resources into a plurality of resource pools and set parameters for each pool, such as the minimum available resources, the maximum available resources, the resource allocation weight, the maximum number of applications that can run, and the authorized users. An authorized user is a user who can submit and manage applications. The resource allocation weight is the proportion of all available resources that a resource pool can obtain when tasks are waiting: if the available resources are 100% and the resource allocation weight of resource pool A is 30%, then resource pool A can request at most 30% of the available resources.
That is, each resource pool obtains a different proportion of the available resources according to its resource allocation weight. Note that the resource allocation weight is only an empirical value, and the resources actually occupied by a resource pool are flexible. For example, suppose a cluster has two resource pools, A and B, with resource allocation weights of 30% and 70% respectively. A task can run in only one resource pool. If task 1 runs in resource pool A, it can normally use only the resources of resource pool A; but if resource pool B is idle, resource pool A can borrow part of resource pool B's resources for task 1 and return them after the task completes.
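The weight and borrowing behaviour described above can be illustrated with a small calculation. The following is a minimal sketch, not the Fair Scheduler's actual algorithm: the function names `fair_shares` and `usable_by` are hypothetical, resources are reduced to a single number, and per-pool minimum and maximum limits are ignored.

```python
def fair_shares(total, weights):
    """Split `total` across pools in proportion to their weights."""
    weight_sum = sum(weights.values())
    return {pool: total * w / weight_sum for pool, w in weights.items()}


def usable_by(pool, shares, usage):
    """Own share plus whatever the other pools currently leave idle.

    `usage` maps a pool name to the amount it is using right now.
    This models the borrowing described in the text in the simplest
    possible way (an assumption, not the scheduler's real policy).
    """
    idle_elsewhere = sum(
        max(shares[p] - usage.get(p, 0), 0) for p in shares if p != pool
    )
    return shares[pool] + idle_elsewhere


shares = fair_shares(100, {"A": 30, "B": 70})
print(shares)
print(usable_by("A", shares, {"B": 0}))   # pool B idle
print(usable_by("A", shares, {"B": 70}))  # pool B fully busy
```

With pool B idle, pool A can reach the whole cluster; with pool B fully busy, pool A is limited to its own 30% share.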
S102, setting the maximum resource usage amount, the priority and the resource pool attribute of the offline task.
For offline tasks, this embodiment provides a task queue and priorities. The task queue controls the total task concurrency, and the task priority ensures that important tasks run first when tasks are backlogged.
In addition, this embodiment sets the maximum resource usage amount and the resource pool attribute of each offline task. The resource pool attribute determines which resource pool, or which kind of resource pool, the offline task is assigned to. The maximum resource usage amount limits the resources the offline task may occupy while running; this parameter is adjustable and can be increased or decreased as actually required.
S103, when the task queue is not full, adding the offline tasks to the task queue according to the priority.
With the total concurrency fixed, all offline tasks are sorted by priority from high to low, and the N highest-priority offline tasks are added to the task queue, where N is the total concurrency of the cluster.
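Step S103 can be sketched as filling a bounded queue from a list of pending tasks. This is a minimal illustration under assumptions: the `OfflineTask` fields, the convention that a larger `priority` number means a more important task, and the function name `fill_queue` are all hypothetical and not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OfflineTask:
    name: str
    priority: int        # assumed: larger number = more important
    max_memory_mb: int   # maximum resource usage amount (memory only here)
    pool: str            # resource pool attribute


def fill_queue(pending, queue, capacity):
    """Move the highest-priority pending tasks into the free queue slots.

    Returns (new_queue, still_pending). `capacity` plays the role of N,
    the total concurrency of the cluster.
    """
    free = capacity - len(queue)
    if free <= 0:
        return list(queue), list(pending)  # queue is full, nothing is added
    ranked = sorted(pending, key=lambda t: t.priority, reverse=True)
    return list(queue) + ranked[:free], ranked[free:]


pending = [
    OfflineTask("log_cleaning", 1, 2048, "cleaning"),
    OfflineTask("order_import", 5, 4096, "io"),
    OfflineTask("daily_report", 3, 1024, "io"),
]
queue, pending = fill_queue(pending, [], capacity=2)
print([t.name for t in queue])
```

Because Python's sort is stable, tasks with equal priority keep their original submission order.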
S104, inquiring whether the available resource amount is more than or equal to the maximum resource usage amount through a resource manager for each offline task in the task queue; if yes, the process proceeds to S105.
The resource manager may specifically be YARN, which handles cluster resource requests and task distribution. Of course, Mesos may also be selected as the resource manager of this embodiment; Mesos can integrate with components such as MapReduce and Spark, and is compatible with the offline development tasks in common use.
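The check in S104 can be sketched as a comparison between what the resource manager reports as available and the task's maximum resource usage amount. The sketch below assumes YARN and a response shaped like the ResourceManager's cluster metrics REST endpoint (`clusterMetrics.availableMB` and `clusterMetrics.availableVirtualCores`); the field names should be verified against the YARN version in use, the sample payload is made up, and the helper names are hypothetical.

```python
def available_from_metrics(payload):
    """Extract available memory and vcores from a cluster metrics response."""
    metrics = payload["clusterMetrics"]
    return {
        "memory_mb": metrics["availableMB"],
        "vcores": metrics["availableVirtualCores"],
    }


def can_submit(available, max_usage):
    """True when every requested resource fits into what is available."""
    return all(available.get(res, 0) >= need for res, need in max_usage.items())


# Made-up sample payload; a real one would come from the resource manager.
sample = {"clusterMetrics": {"availableMB": 8192, "availableVirtualCores": 4}}
available = available_from_metrics(sample)

print(can_submit(available, {"memory_mb": 4096, "vcores": 2}))
print(can_submit(available, {"memory_mb": 16384, "vcores": 2}))
```

The comparison uses "greater than or equal to", matching the condition stated in S104, so a task whose maximum exactly equals the available amount is still admitted.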
S105, submitting the offline task to the corresponding resource pool according to the resource pool attribute.
Specifically, the task script, the running command, and the resource request determined according to the maximum resource usage amount of the offline task are encapsulated into a jar package, and the jar package is submitted to the resource pool corresponding to the resource pool attribute.
In the process of running the off-line task, the resource occupation condition of the off-line task can be monitored, and the maximum resource usage amount of the off-line task can be adjusted subsequently according to the resource occupation condition; the overall resource occupancy rate of the cluster can be monitored, and accordingly, the number of computing nodes in the cluster can be increased or decreased.
The cluster resource management scheduling method provided by this embodiment divides the cluster's resources into a plurality of resource pools, so that offline tasks with different resource pool attributes run in different resource pools. This isolates tasks from one another and ensures that tasks with different resource pool attributes can all proceed smoothly. The method also sets a task queue and task priorities, which controls the total task concurrency, ensures that high-priority tasks are executed first, and improves resource utilization. Finally, the method sets a maximum resource usage amount for each task and submits the task to its resource pool only when the available resource amount is greater than or equal to that maximum, which avoids overloading the cluster and improves cluster reliability.
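Steps S104 and S105 together amount to one pass over the task queue. The following is a minimal sketch with an in-memory stand-in for the resource manager; `FakeResourceManager`, its methods, and the dictionary task layout are hypothetical. The application does not say whether a smaller task may be submitted while a larger, earlier task is still waiting; this sketch assumes it may.

```python
class FakeResourceManager:
    """In-memory stand-in for a real resource manager (memory only)."""

    def __init__(self, available_mb):
        self.available_mb = available_mb
        self.submitted = []

    def query_available_mb(self):
        return self.available_mb

    def submit(self, pool, task_name, max_memory_mb):
        # Assume the task may occupy up to its declared maximum.
        self.available_mb -= max_memory_mb
        self.submitted.append((pool, task_name))


def schedule_pass(queue, rm):
    """Submit every queued task that currently fits; return those left waiting."""
    waiting = []
    for task in queue:  # queue is already ordered by priority
        if rm.query_available_mb() >= task["max_memory_mb"]:
            rm.submit(task["pool"], task["name"], task["max_memory_mb"])
        else:
            waiting.append(task)
    return waiting


rm = FakeResourceManager(available_mb=8192)
queue = [
    {"name": "order_import", "pool": "io", "max_memory_mb": 4096},
    {"name": "log_cleaning", "pool": "cleaning", "max_memory_mb": 6144},
    {"name": "daily_report", "pool": "io", "max_memory_mb": 2048},
]
waiting = schedule_pass(queue, rm)
print(rm.submitted, [t["name"] for t in waiting])
```

A stricter variant could stop at the first task that does not fit, which preserves priority order at the cost of leaving resources idle.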
A second embodiment of the cluster resource management scheduling method provided by the present application is described in detail below. The second embodiment is based on the first embodiment and extends it to a certain extent.
Referring to fig. 2 and 3, the second embodiment specifically includes:
S201, dividing cluster resources into a plurality of resource pools according to task type by using the Fair Scheduler as the resource scheduler, and setting the minimum available resources, maximum available resources, resource allocation weight, maximum number of applications that can run, and authorized users of each resource pool.
A plurality of resource pools are set up so that tasks of different types run in different resource pools without affecting one another. For example, basic tasks such as import/export tasks and log cleaning tasks are placed in two separate resource pools, so that each kind of task obtains its resources and executes as early as possible.
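The per-pool parameters listed in S201 and the mapping from task type to resource pool can be modelled as plain data. This is a minimal sketch under assumptions: the class `ResourcePool`, the pool names, the user names, and all numeric values are hypothetical, and resources are reduced to memory only. It is a data model for illustration, not the scheduler's own configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    name: str
    min_memory_mb: int       # minimum available resources
    max_memory_mb: int       # maximum available resources
    weight: float            # resource allocation weight
    max_running_apps: int    # maximum number of applications that can run
    authorized_users: set = field(default_factory=set)

    def validate(self):
        if not 0 <= self.min_memory_mb <= self.max_memory_mb:
            raise ValueError(f"{self.name}: minimum must be between 0 and the maximum")
        if self.weight <= 0 or self.max_running_apps <= 0:
            raise ValueError(f"{self.name}: weight and app limit must be positive")


# Hypothetical pools and task-type mapping, for illustration only.
POOLS = {
    "io": ResourcePool("io", 4096, 32768, 0.7, 20, {"etl"}),
    "cleaning": ResourcePool("cleaning", 2048, 16384, 0.3, 10, {"etl", "ops"}),
}
TASK_TYPE_TO_POOL = {"import_export": "io", "log_cleaning": "cleaning"}


def pool_for(task_type, user):
    """Return the pool for a task type, checking the pool and the user."""
    pool = POOLS[TASK_TYPE_TO_POOL[task_type]]
    pool.validate()
    if user not in pool.authorized_users:
        raise PermissionError(f"{user} may not submit to pool {pool.name}")
    return pool


print(pool_for("log_cleaning", "ops").name)
```

Keeping the task-type mapping separate from the pool definitions means a task type can be moved to another pool without touching the pool limits.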
S202, developing an offline task, compiling a task script, and setting the maximum resource usage amount, priority and task type of the offline task.
S203, when the task queue is not full, adding the offline task to the task queue according to the priority.
Specifically, a developed offline task is added to the task queue when a trigger condition is met; the trigger may be manual or timed. A state machine determines whether the trigger condition is met, and if so, the task with the highest priority is selected from the task pool and added to the task queue.
S204, for each offline task in the task queue, querying, through the resource manager YARN, whether the available resource amount is greater than or equal to the maximum resource usage amount; if yes, the process proceeds to S205.
That is, before a task runs, it enters the task queue according to its priority, and the environment it needs to run is then checked, for example: whether the cluster's available resources are sufficient, whether the task's condition is normal, and whether the target resource pool can meet the task's requirements. A task enters the submission stage only after all necessary conditions are met.
S205, when the available resource amount is greater than or equal to the maximum resource usage amount, encapsulating the task script, the running command of the offline task, and the resource request determined according to the maximum resource usage amount into a jar package, and submitting the jar package to the resource pool corresponding to the task type.
The jar package is used to generate a YARN application master (app master), which is submitted through a YARN client to the specified YARN resource pool for execution.
S206, while the offline task is running, querying the cluster resource utilization through the resource manager YARN, and generating a running log.
The running log includes any one or more of: error information (such as a Hive task running out of memory), the execution state of the offline task, the working state of the computing node running the offline task, and the resource usage of the offline task (for tasks such as Hive, the numbers of map and reduce tasks).
Specifically, the task's app master executes the task script on the designated computing node while monitoring the task execution state and the node working state in real time. It synchronizes the task execution state with the server node to drive task state transitions, and generates the task log and a resource usage summary, which are stored on an NFS system for later error lookup and task analysis.
Specifically, the task's resource usage (memory, CPU, the numbers of map and reduce tasks, the total size of files read, and so on) is queried from YARN using the task's YARN app id and recorded in the system, providing a basis for task resource queries and optimization.
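Recording per-task usage from a YARN application report can be sketched as below. The sketch assumes a response shaped like the ResourceManager's application REST resource, with the fields `id`, `state`, `elapsedTime` (milliseconds), `memorySeconds` and `vcoreSeconds`; these names should be checked against the YARN version in use. The sample report, the app id, and the function name `summarize_app` are made up, and map/reduce counts and file read totals are not covered here.

```python
def summarize_app(report):
    """Reduce a YARN application report to a small usage summary."""
    app = report["app"]
    elapsed_s = app["elapsedTime"] / 1000  # elapsedTime is in milliseconds
    return {
        "app_id": app["id"],
        "state": app["state"],
        # memorySeconds is MB allocated multiplied by seconds held
        "avg_memory_mb": app["memorySeconds"] / elapsed_s if elapsed_s else 0.0,
        # vcoreSeconds is vcores allocated multiplied by seconds held
        "avg_vcores": app["vcoreSeconds"] / elapsed_s if elapsed_s else 0.0,
    }


# Made-up sample report, for illustration only.
sample_report = {
    "app": {
        "id": "application_1614000000000_0001",
        "state": "FINISHED",
        "elapsedTime": 600000,
        "memorySeconds": 2457600,
        "vcoreSeconds": 1200,
    }
}
print(summarize_app(sample_report))
```

Dividing the aggregate counters by the elapsed time gives an average, which is easier to compare against a task's configured maximum than the raw totals.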
S207, adjusting the maximum resource usage amount of the offline task according to the running log.
S208, dynamically adding or removing computing nodes according to the cluster resource utilization.
In this embodiment, two aspects of optimization are implemented according to the monitoring data in the task running process, namely, task optimization and cluster resource optimization, as follows:
Task optimization: running a task generates log information, such as the amount of resources used by each app. The resource usage of each task can therefore be viewed in the system, tasks that use excessive resources can be identified, and those tasks can be optimized; for example, a Hive task can reduce its resource usage by rewriting its SQL.
Cluster resource optimization: the resource monitoring interface provides the recent overall cluster resource utilization, such as the cluster load in each time period, from which it can be judged whether the cluster's ratio of memory to CPU is reasonable and whether resources are insufficient or excessive. If resources are underused, computing nodes can be taken offline; if resources are insufficient, computing nodes can be added. Resources are fully utilized by properly adjusting the ratio of CPU to memory used in task operation and by increasing or decreasing the number of computing nodes to match the demand of offline tasks. Moreover, bringing resources online or offline is transparent to the scheduling platform: it does not affect running tasks and does not require restarting any service of the scheduling system.
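The node adjustment in S208 can be sketched as a threshold rule on recent utilization samples. The application does not specify thresholds or an averaging window; the `low` and `high` values and the function name `scaling_decision` below are assumptions chosen only for illustration.

```python
def scaling_decision(utilization_samples, low=0.4, high=0.8):
    """Suggest a node change from recent utilization samples (0.0 to 1.0).

    The thresholds are illustrative defaults, not values from the source.
    """
    if not utilization_samples:
        return "hold"  # no data, change nothing
    average = sum(utilization_samples) / len(utilization_samples)
    if average > high:
        return "add_nodes"
    if average < low:
        return "remove_nodes"
    return "hold"


print(scaling_decision([0.90, 0.85, 0.95]))
print(scaling_decision([0.20, 0.30]))
print(scaling_decision([0.50, 0.60]))
```

Averaging over several samples, rather than reacting to a single reading, avoids adding and removing nodes in response to short load spikes.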
It can be seen that the cluster resource management scheduling method of this embodiment uses YARN as the resource manager and performs resource isolation and scheduling based on the Fair Scheduler, thereby providing functions such as cluster resource monitoring and per-task resource monitoring. Specifically, the total amount of resources a task may use is set during task development; at submission time, the available resources are determined by querying the resource manager, and the task is submitted if they are sufficient. The resources the task actually uses are monitored, so tasks with excessive resource consumption are found in time and metrics are available for subsequent task optimization, which reduces cost.
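Adjusting a task's maximum resource usage amount from its running logs (S207) can be sketched as taking the observed peak and adding headroom. The application does not state how the new limit is computed; the headroom factor, the rounding step, and the function name `suggest_limit` below are assumptions.

```python
import math


def suggest_limit(peak_mb_history, current_limit_mb, headroom=1.2, step_mb=512):
    """Suggest a new memory limit from the peaks observed in past runs.

    Takes the largest observed peak, adds headroom, and rounds up to a
    multiple of `step_mb`. Keeps the current limit when no history exists.
    """
    if not peak_mb_history:
        return current_limit_mb
    target = max(peak_mb_history) * headroom
    return math.ceil(target / step_mb) * step_mb


# A task configured with 8192 MB whose runs peaked near 2 GB.
print(suggest_limit([1800, 2100, 1950], current_limit_mb=8192))
```

The same rule raises the limit for a task whose observed peaks approach its current maximum, so it works in both directions.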
A cluster resource management scheduling apparatus provided in an embodiment of the present application is introduced below; the apparatus described below and the cluster resource management scheduling method described above may be cross-referenced.
As shown in Fig. 4, the cluster resource management scheduling apparatus of this embodiment includes:
the resource division module 401: configured to divide cluster resources into a plurality of resource pools by using a resource scheduler;
the task setting module 402: configured to set the maximum resource usage amount, priority and resource pool attribute of an offline task;
the task adding module 403: configured to add the offline task to the task queue according to the priority when the task queue is not full;
the condition judgment module 404: configured to query, through a resource manager and for each offline task in the task queue, whether the available resource amount is greater than or equal to the maximum resource usage amount;
the task submitting module 405: configured to submit the offline task to the corresponding resource pool according to the resource pool attribute when the available resource amount is greater than or equal to the maximum resource usage amount.
The cluster resource management scheduling apparatus of this embodiment is used to implement the foregoing cluster resource management scheduling method, and therefore a specific implementation manner of the apparatus may be found in the foregoing embodiment parts of the cluster resource management scheduling method, for example, the resource dividing module 401, the task setting module 402, the task adding module 403, the condition determining module 404, and the task submitting module 405 are respectively used to implement steps S101, S102, S103, S104, and S105 in the foregoing cluster resource management scheduling method. Therefore, specific embodiments thereof may be referred to in the description of the corresponding respective partial embodiments, and will not be described herein.
In addition, since the cluster resource management scheduling apparatus of this embodiment is used to implement the foregoing cluster resource management scheduling method, its role corresponds to that of the foregoing method, and details are not described here.
In addition, the present application further provides a cluster resource management scheduling device, including:
a memory: configured to store a computer program;
a processor: configured to execute the computer program to implement the cluster resource management scheduling method described above.
Finally, the present application provides a readable storage medium storing a computer program which, when executed by a processor, implements the cluster resource management scheduling method described above.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts may be cross-referenced among the embodiments. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant details can be found in the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The solutions provided in the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of these examples is intended only to help in understanding the method and its core idea. A person skilled in the art may, following the idea of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A cluster resource management scheduling method, characterized by comprising:
dividing cluster resources into a plurality of resource pools by using a resource scheduler;
setting a maximum resource usage amount, a priority, and a resource pool attribute of an offline task;
when a task queue is not full, adding the offline task to the task queue according to the priority;
for each offline task in the task queue, querying, through a resource manager, whether an available resource amount is greater than or equal to the maximum resource usage amount;
and if so, submitting the offline task to the corresponding resource pool according to the resource pool attribute.
2. The method of claim 1, wherein the dividing cluster resources into a plurality of resource pools by using a resource scheduler comprises:
dividing the cluster resources into a plurality of resource pools according to task type or working environment by using the resource scheduler.
3. The method of claim 2, further comprising, after the dividing cluster resources into a plurality of resource pools by using a resource scheduler:
setting parameters of each resource pool, wherein the parameters comprise any one or more of the following: minimum available resources, maximum available resources, resource allocation weight, maximum number of runnable applications, and authorized users.
4. The method of claim 1, wherein the submitting the offline task to the corresponding resource pool according to the resource pool attribute comprises:
encapsulating a task script, a running command, and a resource request determined according to the maximum resource usage amount of the offline task into a JAR package, and submitting the JAR package to the resource pool corresponding to the resource pool attribute.
5. The method of claim 4, further comprising, after the submitting the offline task to the corresponding resource pool according to the resource pool attribute:
generating a running log while running the offline task, wherein the running log comprises any one or more of the following: the execution state of the offline task, the working state of the computing node running the offline task, and the resource usage of the offline task.
6. The method of claim 5, further comprising, after the generating a running log while running the offline task:
adjusting the maximum resource usage amount of the offline task according to the running log.
7. The method of claim 4, further comprising, after the submitting the offline task to the corresponding resource pool according to the resource pool attribute:
while running the offline task, querying the cluster resource utilization rate through the resource manager;
and dynamically adding or removing computing nodes according to the cluster resource utilization rate.
8. A cluster resource management scheduling apparatus, comprising:
a resource partitioning module: configured to divide cluster resources into a plurality of resource pools by using a resource scheduler;
a task setting module: configured to set a maximum resource usage amount, a priority, and a resource pool attribute of an offline task;
a task adding module: configured to add the offline task to a task queue according to the priority when the task queue is not full;
a condition determining module: configured to query, through a resource manager, for each offline task in the task queue, whether an available resource amount is greater than or equal to the maximum resource usage amount;
a task submission module: configured to submit the offline task to the corresponding resource pool according to the resource pool attribute when the available resource amount is greater than or equal to the maximum resource usage amount.
9. A cluster resource management scheduling device, comprising:
a memory: configured to store a computer program;
a processor: configured to execute the computer program to implement the cluster resource management scheduling method according to any one of claims 1 to 7.
10. A readable storage medium storing a computer program which, when executed by a processor, implements the cluster resource management scheduling method according to any one of claims 1 to 7.
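The two feedback steps recited in claims 6 and 7 (adjusting the maximum resource usage amount from the running log, and adding or removing computing nodes according to cluster utilization) can be sketched as follows. The claims do not specify any formula or threshold, so the 120 percent headroom factor, the 0.3 and 0.8 utilization thresholds, and the function names `adjust_max_usage` and `node_delta` are purely hypothetical values chosen for illustration.

```python
def adjust_max_usage(current_max, observed_peaks, headroom_percent=120):
    """Suggest a new maximum resource usage amount for an offline task.

    observed_peaks: peak usage values (abstract units) that would be read
    from the running log of earlier runs. The headroom factor is an
    assumption; the source only says the amount is adjusted from the log.
    """
    if not observed_peaks:
        return current_max  # no log data yet: keep the configured value
    peak = max(observed_peaks)
    # Integer ceiling of peak * headroom_percent / 100, never below 1.
    return max(1, (peak * headroom_percent + 99) // 100)


def node_delta(utilization, low=0.3, high=0.8):
    """Return +1 to add a computing node, -1 to remove one, 0 to hold.

    utilization: cluster resource utilization in [0, 1], as it would be
    reported by the resource manager. Thresholds are hypothetical.
    """
    if utilization > high:
        return 1
    if utilization < low:
        return -1
    return 0
```

For example, a task configured with a maximum of 10 units whose log shows peaks of 4 and 5 units would be lowered to 6 units under these assumed settings, which frees capacity in its resource pool for other queued tasks.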
CN202110224312.0A 2021-03-01 2021-03-01 Cluster resource management scheduling method, device, equipment and readable storage medium Pending CN112948113A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110224312.0A CN112948113A (en) 2021-03-01 2021-03-01 Cluster resource management scheduling method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN112948113A true CN112948113A (en) 2021-06-11

Family

ID=76246837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110224312.0A Pending CN112948113A (en) 2021-03-01 2021-03-01 Cluster resource management scheduling method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN112948113A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106027596A (en) * 2016-04-27 2016-10-12 乐视控股(北京)有限公司 Task distributing method and device
US20180300174A1 (en) * 2017-04-17 2018-10-18 Microsoft Technology Licensing, Llc Efficient queue management for cluster scheduling
CN110109752A (en) * 2019-04-12 2019-08-09 平安普惠企业管理有限公司 A kind of method for allocating tasks, device, electronic equipment and storage medium
CN110888732A (en) * 2018-09-10 2020-03-17 中国移动通信集团黑龙江有限公司 Resource allocation method, equipment, device and computer readable storage medium
CN111338791A (en) * 2020-02-12 2020-06-26 平安科技(深圳)有限公司 Method, device and equipment for scheduling cluster queue resources and storage medium
CN111580951A (en) * 2019-02-15 2020-08-25 杭州海康威视数字技术股份有限公司 Task allocation method and resource management platform

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
万川梅,谢正兰: "《深入云计算:Hadoop应用开发实战详解》", 30 June 2013, 中国铁道出版社 *
张鑫: "《深入云计算:Hadoop源码分析(修订版)》", 31 August 2014, 中国铁道出版社 *
谢能付: "《智能农业》", 31 May 2020, 中国铁道出版社 *
韦鹏程,黄思行: "《基于大数据深入解析MapReduce架构设计与实现原理研究》", 30 November 2019, 中国原子能出版社 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114090267A (en) * 2021-12-09 2022-02-25 云知声智能科技股份有限公司 Resource allocation method, device, equipment and medium based on dynamic resource view
CN114500401A (en) * 2022-01-21 2022-05-13 上海金融期货信息技术有限公司 Resource scheduling method and system for dealing with burst traffic
CN114500401B (en) * 2022-01-21 2023-11-14 上海金融期货信息技术有限公司 Resource scheduling method and system for coping with burst traffic
WO2023143057A1 (en) * 2022-01-27 2023-08-03 华为技术有限公司 Resource flow method, apparatus and device
CN114880118A (en) * 2022-05-05 2022-08-09 北京达佳互联信息技术有限公司 Resource calling method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112948113A (en) Cluster resource management scheduling method, device, equipment and readable storage medium
KR101953906B1 (en) Apparatus for scheduling task
US7945913B2 (en) Method, system and computer program product for optimizing allocation of resources on partitions of a data processing system
Abad et al. Package-aware scheduling of faas functions
CN111104208B (en) Process scheduling management method, device, computer equipment and storage medium
CN108897627B (en) Docker dynamic scheduling method for typical container
CN103593242A (en) Resource sharing control system based on Yarn frame
CN108205469B (en) MapReduce-based resource allocation method and server
CN108900626B (en) Data storage method, device and system in cloud environment
TWI786564B (en) Task scheduling method and apparatus, storage media and computer equipment
US10313265B1 (en) System and methods for sharing memory subsystem resources among datacenter applications
US8782659B2 (en) Allocation of processing tasks between processing resources
CN103425536A (en) Test resource management method oriented towards distributed system performance tests
CN111666131A (en) Load balancing distribution method and device, computer equipment and storage medium
CN116541134B (en) Method and device for deploying containers in multi-architecture cluster
WO2024120205A1 (en) Method and apparatus for optimizing application performance, electronic device, and storage medium
TW201818244A (en) Method, apparatus and system for allocating resources of application clusters under cloud environment
CN111124687A (en) CPU resource reservation method, device and related equipment
US20230305880A1 (en) Cluster distributed resource scheduling method, apparatus and device, and storage medium
US20210374319A1 (en) Dynamic allocation of computing resources for electronic design automation operations
CN116820729A (en) Offline task scheduling method and device and electronic equipment
CN117519929A (en) Example scheduling method and device, storage medium and electronic equipment
CN111506400A (en) Computing resource allocation system, method, device and computer equipment
CN111143063A (en) Task resource reservation method and device
CN111580937B (en) Automatic virtual machine scheduling method for Feiteng multi-core/many-core hybrid cluster

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210611