CN110012062B

CN110012062B - Multi-computer-room task scheduling method and device and storage medium

Info

Publication number: CN110012062B
Application number: CN201910134018.3A
Authority: CN
Inventors: 卢明樊; 宗志远
Original assignee: Beijing QIYI Century Science and Technology Co Ltd
Current assignee: Beijing QIYI Century Science and Technology Co Ltd
Priority date: 2019-02-22
Filing date: 2019-02-22
Publication date: 2022-02-08
Anticipated expiration: 2039-02-22
Also published as: CN110012062A

Abstract

The invention provides a multi-computer-room task scheduling method, device, and storage medium, applied to a system comprising a plurality of clusters, a scheduler, and a memory. The method is executed in the scheduler and comprises: acquiring the data state of a pre-task from the memory; determining, according to the data production state of the pre-task, whether to start a task to be executed; when it is determined that the task to be executed is to be started, determining an execution cluster for executing the task and the execution duration required to execute it, wherein the execution cluster is an idle cluster among the plurality of clusters; and completing the task to be executed according to the execution cluster and the execution duration by using a preset execution strategy, wherein, when a plurality of clusters are used to execute the task, the execution strategy determines different idle clusters as the execution clusters. Cluster resources are thereby scheduled and allocated across machine rooms, which improves the robustness and safe operation capability of the big data system and optimizes resource allocation.

Description

Multi-computer-room task scheduling method and device and storage medium

Technical Field

The invention belongs to the field of data processing, and particularly relates to a multi-computer-room task scheduling method, a multi-computer-room task scheduling device and a storage medium.

Background

In the prior art, cluster resources are scheduled within a single machine room: suitable cluster resources are allocated to execute a task, and when the machine room has no idle cluster resources, the task is placed in a queue until idle and suitable cluster resources become available. Once the machine room fails, its cluster resources can no longer execute tasks and the tasks cannot continue, so a single machine room has low reliability and no disaster tolerance capability.

At present, cluster resources distributed in multiple machine rooms can be used to execute tasks in a big data system. However, because communication functions and permissions between machine rooms are limited, the resources and task states in other machine rooms cannot be known, and flexible allocation of cluster resources among multiple machine rooms is difficult to achieve. The cluster resources in multiple machine rooms therefore cannot be used to overcome the low reliability and missing disaster tolerance of a single machine room, and the robustness and disaster tolerance of the big data system are difficult to guarantee.

Disclosure of Invention

In view of the above, the present invention provides a method, an apparatus, and a storage medium for scheduling multi-computer-room tasks, so as to solve the problem in the prior art that a reasonable allocation of resources cannot be achieved under the condition of multiple computer rooms.

According to a first aspect of the present invention, there is provided a method for scheduling a task in multiple computer rooms, applied to a system including multiple clusters, a scheduler, and a storage, where the multiple clusters are deployed in different computer rooms, each cluster is respectively in communication with the storage, the storage is in communication with the scheduler, and the method is performed in the scheduler, where the method includes:

acquiring the data state of a pre-task in the memory;

determining whether to start a task to be executed according to the data production state of the pre-task;

under the condition that the task to be executed is determined to be started, determining an execution cluster for executing the task to be executed and an execution time length required by executing the task to be executed, wherein the execution cluster is an idle cluster in the plurality of clusters;

and according to the execution cluster and the execution duration, completing the task to be executed by using a preset execution strategy, wherein the execution strategy is used for determining different idle clusters as the execution cluster when a plurality of clusters for executing the task to be executed are available.

According to a second aspect of the present invention, there is provided a multi-room task scheduling device, applied to a system including a plurality of clusters, a scheduler, and a storage, wherein the plurality of clusters are deployed in different rooms, each cluster is respectively in communication with the storage, the storage is in communication with the scheduler, the device is disposed in the scheduler, and the device includes:

the acquisition module is used for acquiring the data state of the pre-task in the memory;

the detection module is used for determining whether to start the task to be executed according to the data production state of the pre-task;

the cluster determining module is used for determining an execution cluster for executing the task to be executed and an execution time length required for executing the task to be executed under the condition that the task to be executed is determined to be started, wherein the execution cluster is an idle cluster in the plurality of clusters;

and the task execution module is used for completing the tasks to be executed by utilizing a preset execution strategy according to the execution clusters and the execution duration, wherein the execution strategy is used for determining different idle clusters as the execution clusters when the clusters for executing the tasks to be executed are multiple.

According to a third aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the multi-room task scheduling method according to the first aspect.

Compared with the prior art, the invention has the following advantages:

The scheduler determines whether to start the task to be executed according to the data state of the pre-task. After determining to start the task, it allocates an idle cluster in the system to the task according to the resource idle condition and sets the execution duration accordingly. Through data interaction between the scheduler and the multiple machine rooms, the states of the clusters in those machine rooms can be queried, so that cluster resources can be scheduled and allocated across machine rooms and a task is not left unfinished because a single machine room fails. If the task to be executed is executed by a plurality of clusters, those clusters are determined to be different idle clusters, so that when a single cluster fails, the task fails, or a cluster is congested, other suitable resources can be scheduled to complete the task. This improves the safe operation capability of the system and optimizes resource allocation.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

fig. 1 is a flowchart illustrating steps of a multi-computer-room task scheduling method according to an embodiment of the present invention;

fig. 2 is a flowchart illustrating specific steps of a multi-computer-room task scheduling method according to an embodiment of the present invention;

fig. 3 is a flowchart illustrating specific steps of a multi-computer-room task scheduling method according to an embodiment of the present invention;

fig. 4 is a flowchart illustrating specific steps of a multi-computer-room task scheduling method according to an embodiment of the present invention;

fig. 5 is a block diagram of a multi-computer-room task scheduling device according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

Fig. 1 is a flowchart of the steps of a multi-machine-room task scheduling method provided in an embodiment of the present invention. As shown in Fig. 1, the method is applied to a system including a plurality of clusters, a scheduler, and a memory; the plurality of clusters are deployed in different machine rooms, each cluster communicates with the memory, and the memory communicates with the scheduler. The method is executed by the scheduler and may include:

Step 101, acquiring the data state of the pre-task in the memory.

In a specific application, processing a large amount of data for a target item of a big data system (e.g., cleaning cheating data or computing report index statistics) generally involves a plurality of execution steps (tasks), and the operations on these steps (development, submission, and maintenance) may belong to different teams. Different teams correspond to different cluster resources and have different permissions, so team A cannot directly check the execution status of team B's task, and the cluster resources executing team A's operations cannot exchange data with the cluster resources executing team B's operations. A scheduler is therefore required in the big data system to schedule and coordinate: through the scheduler, a team learns the completion state of its partner's step, that is, the pre-task, and can then start its own task, that is, the task to be executed.

It should be noted that the scheduler may obtain the cluster resource state in each machine room, and the state of the tasks executed in each machine room, through the data interface of each machine room. Illustratively, HDFS (Hadoop Distributed File System) is used as the memory and stores done files that identify the completion state of tasks. After each task is executed, a corresponding done file is generated and stored in the memory to indicate that the data production state of the task is the completion state. For example, the scheduler may be set to periodically query the HDFS for the done file of the pre-task (that is, to acquire the data state of the pre-task). When the scheduler detects that the pre-task has generated its done file, it determines to start the task to be executed; otherwise, it waits for the next period and queries again to determine whether the done file has been generated and whether to start the task to be executed.
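The periodic done-file query described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `wait_for_done_file`, the `exists` callback (which would wrap whatever HDFS client is in use), the polling interval, and the `max_polls` cap are all assumptions introduced for the example.

```python
import time
from typing import Callable


def wait_for_done_file(
    done_path: str,
    exists: Callable[[str], bool],
    poll_interval_s: float = 60.0,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll for the pre-task's done file; return True once it is found.

    `exists` is a hypothetical callback that checks the storage (e.g. HDFS)
    for the given path. `max_polls` is an assumed safety cap, not part of
    the described method.
    """
    for attempt in range(max_polls):
        if exists(done_path):
            return True  # pre-task has finished producing its data
        if attempt < max_polls - 1:
            sleep(poll_interval_s)  # wait for the next polling period
    return False
```

Passing `exists` and `sleep` as parameters keeps the sketch independent of any particular storage client and makes the polling logic easy to exercise without real waiting.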

It should be noted that a machine room in the present invention refers to a location where an enterprise places the hardware devices used for data storage and computation, such as a particular machine room or other machine rooms built elsewhere in the country; for the same enterprise, data and computing resources may be distributed in different machine rooms throughout the country. Each enterprise has corresponding business teams, whose personnel are usually divided by function within the enterprise. A single big data analysis task may require several teams to participate, with one team providing partial results, such as blacklist data or abstract statistical features, for other teams to use. A scheduler is therefore required to schedule the system's resources according to the resource utilization of the different machine rooms and the upstream and downstream relations between tasks.

Step 102, determining whether to start the task to be executed according to the data production state of the pre-task.

The task to be executed is the data production task that currently needs to be executed, and it runs on the data produced by the pre-task. In the system including the plurality of clusters, the scheduler, and the memory, the scheduler acquires the state information of the clusters and the data production state of the tasks from the memory, thereby realizing global resource allocation in the system. It should be noted that the scheduler may be a functional module, or a virtual device integrated on a hardware device, that implements the technical solution provided by the invention, or it may be a physical hardware device that implements the method steps of the technical solution; the invention does not limit this.

Step 103, determining an execution cluster for executing the task to be executed and an execution time length required for executing the task to be executed under the condition that the task to be executed is determined to be started.

Wherein the executing cluster is a free cluster of the plurality of clusters.

In a specific application scenario, the scheduler schedules resources reasonably: for example, it determines a suitable execution cluster and a corresponding execution duration according to the actual execution situation of each current data production task. Further, when the data production task fails, queues, or runs slowly on its execution cluster, the scheduler schedules another cluster that can execute the task, so that data production for the task to be executed can continue. A cluster here refers to distributed computing resources, and the clusters are distributed in different machine rooms, so the execution cluster is allocated reasonably while scheduling the task to be executed in order to optimize resource allocation.

Step 104, completing the task to be executed by using a preset execution strategy according to the execution cluster and the execution duration.

The execution strategy is used for determining different idle clusters as execution clusters when the clusters for executing the tasks to be executed are multiple.

In a specific application, the execution cluster determined in step 103 is used to execute the task to be executed within the execution duration determined in step 103, so as to complete the task. If the task to be executed is completed within the execution duration, corresponding data state data, such as a done file, may for example be generated in the HDFS to indicate that subsequent tasks may proceed; otherwise, another idle cluster, different from the previously determined execution cluster, is used to continue the task. For example, if cluster A is the execution cluster and the task to be executed is not completed within the preset execution duration, the cause may be a fault in cluster A. When the scheduler then allocates a new execution cluster to the task, cluster A is not used again even if it is idle. This avoids failing to complete the task within the execution duration again because a faulty cluster resource is reused, and so protects the completion rate of the task to be executed.

It should be noted that the scheduler in the technical solution provided by the present invention is arranged above the scheduling systems of the individual machine rooms (each of which schedules a single machine room). It can integrate and uniformly schedule the cluster resources in different machine rooms, so that global resource allocation is realized across the whole big data system and the poor stability and disaster tolerance of a single machine room are overcome.

In summary, the multi-computer-room task scheduling method provided by the invention acquires the data state of the pre-task from the memory; determines, according to the data production state of the pre-task, whether to start a task to be executed; when it is determined that the task is to be started, determines an execution cluster for executing the task and the execution duration required to execute it, wherein the execution cluster is an idle cluster among the plurality of clusters; and completes the task to be executed according to the execution cluster and the execution duration by using a preset execution strategy, wherein, when a plurality of clusters are used to execute the task, the execution strategy determines different idle clusters as the execution clusters. Cluster resources are thereby scheduled and allocated across machine rooms, which improves the robustness and safe operation capability of the big data system and optimizes resource allocation.

Optionally, fig. 2 is a flowchart of specific steps of a multi-computer-room task scheduling method provided in an embodiment of the present invention, and as shown in fig. 2, the determining whether to start a task to be executed according to a data production state of a pre-task in step 102 may include:

step 1021, acquiring the data production state of the pre-task according to the execution state data of the pre-task in the system.

The execution state data is used to indicate the current execution state of the pre-task, and may be, for example, an execution state file or an execution state parameter.

In a specific application, the done file described in step 101 may serve as the execution state file, which is not described again here. Alternatively, the current execution state of the pre-task can be determined from the execution state parameters of the respective tasks stored in the memory. For example, after team A's step is completed, a piece of status data is inserted into a MySQL database (the memory), for example a status field whose value is 0 or 1: when status is 1, the execution state of the task is the completion state; when status is 0, the execution state of the task is an unfinished state. The scheduler can periodically query the MySQL database to determine whether status is 1, and thereby determine whether to start the task to be executed.
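The status-flag variant can be illustrated with a short sketch. To keep the example self-contained, Python's built-in `sqlite3` stands in for the MySQL database; the table name `task_status`, its columns, and the task identifier `team_a_step` are hypothetical and not taken from the source.

```python
import sqlite3


def pre_task_completed(conn: sqlite3.Connection, task_id: str) -> bool:
    """Return True when the stored status flag of the pre-task equals 1."""
    row = conn.execute(
        "SELECT status FROM task_status WHERE task_id = ?", (task_id,)
    ).fetchone()
    # A missing row is treated as "not completed" (an assumption of this sketch).
    return row is not None and row[0] == 1


# In-memory database standing in for the MySQL instance used as the memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_status (task_id TEXT PRIMARY KEY, status INTEGER)")
conn.execute("INSERT INTO task_status VALUES ('team_a_step', 0)")
before = pre_task_completed(conn, "team_a_step")  # status is 0: unfinished

# Team A's step finishes and sets status to 1.
conn.execute("UPDATE task_status SET status = 1 WHERE task_id = 'team_a_step'")
after = pre_task_completed(conn, "team_a_step")  # status is 1: completed
```

A scheduler would call a check like `pre_task_completed` once per polling period and start the task to be executed only after it returns true.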

In a specific application, the pre-task is the partner of the task to be executed, and the task to be executed calls the data produced by the pre-task. The task to be executed can therefore be started only when the data production state of its pre-task is the completion state, in which case step 1022 is performed. Otherwise, when the data production state of the pre-task is an unfinished state, a sleep mechanism may be introduced: the data production state of the pre-task is queried again after a period of time, and the task to be executed is started once the completion condition is met.

Step 1022, starting the task to be executed when the data production state of the pre-task is the completion state.

That is to say, after the pre-task has completed data production, the task to be executed may be started and placed in the data production task pool. For example, the task to be executed may be placed in a message queue, so that once it has been allocated to an idle cluster in the following steps, the idle cluster obtains the corresponding task content directly from the message queue (for example, through an address pointer contained in the message) and then executes the task.
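A minimal sketch of the task pool idea follows, assuming an in-process `queue.Queue` in place of a real message queue and a dictionary in place of the location the address pointer refers to. The names `task_store`, `task_pool`, `start_task`, and `fetch_next_task` are hypothetical.

```python
import queue

# Hypothetical store of task contents, keyed by address.
task_store: dict = {}
# In-process queue standing in for the message queue (data production task pool).
task_pool: "queue.Queue[dict]" = queue.Queue()


def start_task(task_id: str, content: dict) -> None:
    """Scheduler side: store the task content and enqueue a pointer to it."""
    address = f"tasks/{task_id}"
    task_store[address] = content
    task_pool.put({"task_id": task_id, "address": address})


def fetch_next_task() -> dict:
    """Cluster side: take the next message and resolve its address pointer."""
    message = task_pool.get_nowait()
    return task_store[message["address"]]
```

Keeping only a pointer in the message means the queue stays small while the cluster fetches the full task content when it is ready to execute.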

Step 1023, suspending the start of the task to be executed when the data production state of the pre-task is an unfinished state.

For example, when the data production state of the pre-task is an unfinished state, the data of the partner (the pre-task) of the task to be executed is not ready. The scheduler must wait for the pre-task to continue executing until its data production state is the completion state before starting the task to be executed. While waiting, the scheduler puts the task to be executed in a sleep state and periodically queries the data production state of the pre-task.

Optionally, fig. 3 is a flowchart of specific steps of a multi-computer-room task scheduling method provided in an embodiment of the present invention, and as shown in fig. 3, in the case that it is determined that a task to be executed is started, determining an execution cluster for executing the task to be executed and an execution duration required for executing the task to be executed in step 103 may include:

step 1031, acquiring resource idle condition in the system according to the state information of the cluster in the system.

For example, when scheduling tasks across multiple machine rooms, the scheduler needs to monitor the state of each cluster in real time. Each cluster reports its own state (idle or not) to the memory of the system, and the scheduler can query the state information in the memory at a preset time interval to determine the real-time state of the cluster resources in the multiple machine rooms, together with information such as the data production state of each task. On this basis it determines an execution cluster for the task to be executed and sets the longest execution duration of the task as the execution duration, after which execution of the task begins.

Step 1032, screening out an idle cluster among the alternative clusters to serve as the execution cluster, according to the resource idle condition and the alternative clusters preset for the task to be executed.

In a specific application, since a cluster is a machine providing distributed computing resources, there may be some dedicated clusters in the system, such as dedicated clusters for performing online tasks, and some dedicated clusters for performing corresponding testing tasks. In order to reduce the scheduling failure rate, when the system is initially established, the resource configuration of the task is initialized, a plurality of clusters capable of supporting task execution are selected as alternative clusters, and then the scheduler performs cluster resource scheduling according to the alternative clusters and information of whether the alternative clusters are in an idle state or not. That is, one idle cluster in the alternative clusters is used as an execution cluster to execute the task to be executed.
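The screening in step 1032 can be sketched as a simple filter over the alternative clusters. The function name `pick_execution_cluster`, the `idle` mapping, and the `excluded` parameter are assumptions made for this illustration; `excluded` anticipates the rule that a cluster already tried for the same task is not reused.

```python
from typing import Iterable, Mapping, Optional


def pick_execution_cluster(
    candidates: Iterable[str],
    idle: Mapping[str, bool],
    excluded: Iterable[str] = (),
) -> Optional[str]:
    """Return the first alternative cluster that is idle and not excluded.

    `candidates` are the clusters preset for the task; `idle` maps each
    cluster name to its reported idle state. Returns None when no
    alternative cluster qualifies.
    """
    skip = set(excluded)
    for name in candidates:
        if name not in skip and idle.get(name, False):
            return name
    return None
```

Picking the first match is an arbitrary choice for the sketch; a real scheduler might rank idle candidates by machine room or load instead.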

Step 1033, setting the execution duration according to the task to be executed.

For example, the setting of the execution duration may be determined based on the size of the data amount included in the task to be executed, and may also be correspondingly adjusted in combination with the size of the computing resource of the candidate cluster set for the task to be executed in the above step. The execution time length can be used as the maximum execution time length required by the execution cluster, and after the maximum execution time length is finished, the execution cluster is not used for executing the task to be executed, so that the problem that the cluster with a fault is repeatedly used for executing the same task to be executed is avoided.
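The source does not give a formula for the execution duration, only that it depends on the data volume and may be adjusted for the computing resources of the alternative cluster. One possible sketch, under the assumption that duration is estimated as data volume divided by cluster throughput, padded by a safety factor and bounded below, is shown here. The parameter names and default values are hypothetical.

```python
def execution_duration_s(
    data_bytes: int,
    cluster_bytes_per_s: float,
    safety_factor: float = 1.5,
    floor_s: float = 60.0,
) -> float:
    """Estimate a maximum execution duration in seconds.

    Assumed model: data volume / cluster throughput, multiplied by a
    safety factor, and never shorter than `floor_s`.
    """
    if cluster_bytes_per_s <= 0:
        raise ValueError("cluster_bytes_per_s must be positive")
    return max(floor_s, safety_factor * data_bytes / cluster_bytes_per_s)
```

Under this model, a larger cluster shortens the duration for the same task, which matches the idea that the duration is adjusted to the computing resources of the chosen cluster.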

Optionally, fig. 4 is a flowchart of specific steps of a multi-computer-room task scheduling method provided in an embodiment of the present invention, and as shown in fig. 4, the step 104 of completing the task to be executed by using a preset execution policy according to the execution cluster and the execution duration may include:

Step 1041, starting to execute the task to be executed with the execution cluster within the execution duration.

Step 1042, if the task to be executed is completed within the execution duration, determining that the task to be executed is completed.

Step 1043, if the task to be executed is not completed within the execution duration, reacquiring the resource idle condition in the system, so as to determine a new execution cluster and a new execution duration according to the resource idle condition and the task to be executed.

Step 1044, continuing to execute the task to be executed within the new execution duration by using the new execution cluster, until the task to be executed is completed.

In a specific application, the principle is that an idle cluster executes the task to be executed within the execution duration and that the same resource is not reused. After the execution duration (the maximum execution duration) determined in step 1033 has elapsed, it is determined whether the task to be executed has been completed. If it has, scheduling of the task ends. Otherwise, the scheduler queries the resource idle condition in the system again to determine a new idle cluster (step 1043), resubmits the task to be executed on the new idle cluster (the same task is not submitted to the same cluster more than once), and sets a new execution duration accordingly. The new execution cluster then continues executing the task within the new execution duration until the task is completed.

It should be noted that the execution clusters successively determined for a task to be executed are different from one another, while the execution durations may be the same or different; both are determined according to the actual resource idle condition in the system and the task to be executed. For example, when the scheduler starts a task to be executed, it determines cluster 1 (one of the alternative clusters) as the execution cluster and sets an execution duration of 300 s according to the actual task content (the data volume of the task). If the task has not been completed after 300 s, the resource idle condition in the system is acquired again. Cluster 1 is found to be idle still, but the same task is not submitted to the same cluster more than once, and cluster 2 among the alternative clusters is idle at this time, so cluster 2 is selected as the execution cluster. Since the remaining content of the task is 70% of the original task content, the execution duration may be set to 500 s, and cluster 2 continues executing the remaining 70% of the task. With this resource scheduling method, when a single cluster fails, the task fails, or a cluster is congested, the scheduler can allocate other suitable idle resources to the task to be executed, readjust the execution duration, and continue executing the task until it is completed.
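Steps 1041 to 1044 and the cluster 1 / cluster 2 example can be sketched as a loop that never resubmits a task to a cluster it has already tried. This is an illustration under stated assumptions: `get_idle`, `run_on`, and `duration_for` are hypothetical callbacks, and returning `None` when no untried idle alternative cluster remains is a choice of this sketch, since the source does not say what happens in that case.

```python
from typing import Callable, Iterable, List, Mapping, Optional


def run_with_failover(
    candidates: Iterable[str],
    get_idle: Callable[[], Mapping[str, bool]],
    run_on: Callable[[str, float], bool],
    duration_for: Callable[[str], float],
) -> Optional[List[str]]:
    """Try idle alternative clusters one at a time without reusing any.

    `run_on(cluster, duration)` is assumed to return True when the task
    completes within `duration` seconds. Returns the clusters tried in
    order (the last one completed the task), or None if the idle
    alternatives are exhausted.
    """
    candidate_list = list(candidates)
    tried: List[str] = []
    while True:
        idle = get_idle()  # re-query the resource idle condition (step 1043)
        cluster = next(
            (c for c in candidate_list if c not in tried and idle.get(c, False)),
            None,
        )
        if cluster is None:
            return None  # no untried idle alternative cluster is left
        tried.append(cluster)
        if run_on(cluster, duration_for(cluster)):
            return tried  # task completed within the execution duration
```

With two idle alternative clusters, durations of 300 s and 500 s, and a first attempt that times out, the loop tries cluster 1, skips it on the next round even though it is still idle, and finishes on cluster 2.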

Fig. 5 is a block diagram of a multi-machine-room task scheduling apparatus according to an embodiment of the present invention, as shown in fig. 5, the apparatus is applied to a system including a plurality of clusters, a scheduler, and a memory, the plurality of clusters are deployed in different machine rooms, each cluster is respectively in communication with the memory, the memory is in communication with the scheduler, and the apparatus 500 is disposed in the scheduler, where the apparatus 500 includes:

The acquisition module 510 is configured to acquire the data state of the pre-task in the memory.

The detecting module 520 is configured to determine whether to start the task to be executed according to the data production state of the pre-task.

The cluster determining module 530 is configured to determine, when it is determined that the to-be-executed task is started, an execution cluster for executing the to-be-executed task and an execution time length required for executing the to-be-executed task, where the execution cluster is an idle cluster among the multiple clusters.

The task execution module 540 is configured to complete the task to be executed according to the execution cluster and the execution duration by using a preset execution policy, where the execution policy is used to determine different idle clusters as the execution clusters when a plurality of clusters are used to execute the task to be executed.

Optionally, the detecting module 520 includes:

the state acquisition submodule is used for acquiring the data production state of the pre-task according to the execution state data of the pre-task in the system, where the execution state data is used for representing the current execution state of the pre-task.

The task state determining submodule is used for starting the task to be executed under the condition that the data production state of the preposed task is a finished state; or, when the data production state of the pre-task is an uncompleted state, suspending the task to be executed.

Optionally, the cluster determining module 530 includes:

the resource acquisition submodule is used for acquiring the resource idle condition in the system according to the state information of the cluster in the system;

the cluster screening submodule is used for screening out an idle cluster in the alternative clusters to serve as an execution cluster according to the resource idle condition and the alternative clusters preset for the task to be executed;

and the time length setting submodule is used for setting the execution time length according to the data volume of the task to be executed.

Optionally, the task execution module 540 includes:

the task execution submodule is used for starting to execute the task to be executed by utilizing the execution cluster within the execution duration;

and the completion determining submodule is used for determining that the task to be executed has been completed if its execution finishes within the execution duration.

Optionally, the task execution module further includes:

the resource acquisition submodule is used for re-acquiring the resource idle condition in the system if the task to be executed is not completed within the execution duration, so as to determine a new execution cluster and a new execution duration according to the resource idle condition and the task to be executed;

and the data execution submodule is used for continuing to execute the task to be executed within the new execution duration by utilizing the new execution cluster until the task to be executed is completed.
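The execute, time out and reschedule behaviour of the task execution module can be illustrated with the following sketch. It is only one possible reading of the text above: the callables `plan_next` and `run_for` and the `max_rounds` bound are hypothetical and stand in for the resource acquisition and data execution submodules; the source itself does not limit the number of rescheduling rounds.

```python
from typing import Callable, List, Tuple

Plan = Tuple[str, int]  # (execution cluster, execution duration in minutes)


def execute_with_rescheduling(
    plan_next: Callable[[List[str]], Plan],
    run_for: Callable[[str, int], bool],
    max_rounds: int = 10,
) -> List[str]:
    """Run a task until it completes, rescheduling after each timed-out round.

    plan_next(used) re-reads the idle resources and returns a new
    (cluster, duration); run_for(cluster, duration) returns True if the task
    finished within the duration. Returns the clusters used, in order.
    """
    used: List[str] = []
    for _ in range(max_rounds):
        cluster, duration = plan_next(used)
        used.append(cluster)
        if run_for(cluster, duration):
            return used  # completed within the execution duration
    raise RuntimeError("task did not complete within max_rounds rounds")
```

In this sketch `plan_next` receives the clusters already tried, so that a caller can prefer a different idle cluster in each round.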

In addition, an embodiment of the present invention further provides a terminal, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the foregoing multi-machine-room task scheduling method embodiment and can achieve the same technical effect. To avoid repetition, the details are not described again here.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements each process of the above-mentioned multi-machine-room task scheduling method embodiment and can achieve the same technical effect. To avoid repetition, the details are not described again here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

Because the above device embodiment is basically similar to the method embodiment, it is described relatively briefly; for the relevant points, refer to the corresponding parts of the description of the method embodiment.

The embodiments in the present specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another.

As a person skilled in the art will readily appreciate, the above embodiments may be combined in any manner, and any such combination is therefore also an embodiment of the present invention; for reasons of space, these combinations are not described one by one herein.

The multi-room task scheduling methods provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The structure required to construct a system incorporating aspects of the present invention will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.

The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the multi-room task scheduling method according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering. These words may be interpreted as names.

Claims

1. A multi-machine room task scheduling method is applied to a system comprising a plurality of clusters, a scheduler and a memory, wherein the clusters are deployed in different machine rooms, each cluster is respectively communicated with the memory, the memory is communicated with the scheduler, the method is executed on the scheduler, and the method comprises the following steps:

acquiring the data state of a preposed task in the memory;

according to the execution cluster and the execution duration, completing the task to be executed by using a preset execution strategy, wherein the execution strategy is used for determining different idle clusters as the execution cluster when the number of clusters for executing the task to be executed is multiple;

if the task to be executed is not completed within the execution duration, re-determining a new execution cluster, wherein the new execution cluster is an idle cluster different from the previously determined execution cluster, and the new execution cluster and the previously determined execution cluster are deployed in different machine rooms;

and continuing to execute the task to be executed by using the new execution cluster until the task to be executed is completed.

2. The method according to claim 1, wherein the determining whether to start the task to be executed according to the data production state of the pre-task comprises:

acquiring the data production state of the preposed task according to the execution state data of the preposed task in the system;

starting the task to be executed under the condition that the data production state of the preposed task is a finished state; or, when the data production state of the pre-task is an incomplete state, suspending the task to be executed.

3. The method according to claim 1, wherein in a case that it is determined that the task to be executed is started, determining an execution cluster for executing the task to be executed and an execution time length required for executing the task to be executed comprises:

acquiring the resource idle condition in the system according to the state information of the cluster in the system;

screening out an idle cluster in the alternative clusters to serve as the execution cluster according to the resource idle condition and the alternative clusters preset for the task to be executed;

and setting the execution time length according to the data volume of the task to be executed.

4. The method according to claim 3, wherein the completing the task to be executed according to the execution cluster and the execution duration by using a preset execution policy comprises:

within the execution duration, starting to execute the task to be executed by utilizing the execution cluster;

and if the task to be executed is completed within the execution duration, determining that the task to be executed has been completed.

5. The method of claim 4, further comprising:

if the task to be executed is not completed within the execution duration, re-acquiring the resource idle condition in the system, and determining a new execution cluster and a new execution duration according to the resource idle condition and the task to be executed;

and utilizing the new execution cluster to continue executing the task to be executed within the new execution duration until the task to be executed is completed.

6. A multi-machine room task scheduling device is applied to a system comprising a plurality of clusters, a scheduler and a memory, wherein the clusters are deployed in different machine rooms, each cluster is respectively communicated with the memory, the memory is communicated with the scheduler, the device is arranged in the scheduler, and the device comprises:

the task execution module is used for completing the tasks to be executed according to the execution clusters and the execution duration by using a preset execution strategy, wherein the execution strategy is used for determining different idle clusters as the execution clusters when the number of the clusters for executing the tasks to be executed is multiple;

7. The apparatus of claim 6, wherein the detection module comprises:

the state acquisition submodule is used for acquiring the data production state of the preposed task according to the execution state data of the preposed task in the system;

the task state determining submodule is used for starting the task to be executed under the condition that the data production state of the preposed task is a finished state; or, when the data production state of the pre-task is an incomplete state, suspending the task to be executed.

8. The apparatus of claim 6, wherein the cluster determining module comprises:

the cluster screening submodule is used for screening out an idle cluster in the alternative clusters to serve as the execution cluster according to the resource idle condition and the alternative clusters preset for the task to be executed;

9. The apparatus of claim 8, wherein the task execution module comprises:

the task execution sub-module is used for starting to execute the task to be executed by utilizing the execution cluster within the execution duration;

10. The apparatus of claim 9, wherein the task execution module further comprises:

the resource acquisition submodule is used for re-acquiring the resource idle condition in the system if the task to be executed is not completed within the execution duration, so as to determine a new execution cluster and a new execution duration according to the resource idle condition and the task to be executed;

11. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the multi-room task scheduling method according to any one of claims 1 to 5.