CN115309592A

CN115309592A - Resource scheduling method and device, computer equipment and storage medium

Info

Publication number: CN115309592A
Application number: CN202211065521.6A
Authority: CN
Inventors: 张鑫春; 师锐; 辛朝晖; 李亚坤; 宋浩祥
Original assignee: Beijing Volcano Engine Technology Co Ltd
Current assignee: Beijing Volcano Engine Technology Co Ltd
Priority date: 2022-09-01
Filing date: 2022-09-01
Publication date: 2022-11-08

Abstract

The present disclosure provides a resource scheduling method and apparatus, a computer device, and a storage medium. The method comprises: receiving a job request; in response to the job request, allocating a first logical machine room queue to the target job based on the job type of the target job, by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model; in response to an abnormality in at least one physical cluster queue associated under the first logical machine room queue, determining a second logical machine room queue under the queue namespace queue based on the job type of the target job; and executing the target job by using the computing resources in a physical cluster queue associated under the second logical machine room queue. Embodiments of the disclosure thereby allow the target job to be migrated automatically, within the scope of the queue namespace queue, from the abnormal first logical machine room queue to a normal second logical machine room queue, which provides automatic cross-machine-room disaster recovery.

Description

Resource scheduling method and device, computer equipment and storage medium

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a resource scheduling method, an apparatus, a computer device, and a storage medium.

Background

With the development of internet technology, more and more fields begin to use big data for calculation and analysis, and decision support is provided for data operation and the like.

In big data computing scenarios, computing tasks that involve large amounts of data must be allocated computing queues with abundant resources to meet their computing requirements. In practice, network or device abnormalities often occur; a computing queue affected by such an abnormality becomes unusable, so the computing task may fail to run normally.

Disclosure of Invention

The embodiment of the disclosure at least provides a resource scheduling method, a resource scheduling device, computer equipment and a storage medium.

In a first aspect, an embodiment of the present disclosure provides a resource scheduling method, including:

receiving a job request, wherein the job request is used for requesting to execute a target job;

in response to the job request, allocating a corresponding first logical machine room queue to the target job based on the job type of the target job, by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model; wherein each queue namespace queue is associated with a plurality of logical machine room queues, and at least one physical cluster queue is associated under each logical machine room queue;

in response to the occurrence of an abnormality in at least one physical cluster queue associated under the first logical machine room queue, determining a second logical machine room queue under the queue namespace queue for executing the target job based on the job type of the target job;

and executing the target job by using the computing resources in a physical cluster queue associated under the second logical machine room queue.

In an optional implementation, after receiving the job request, the method further includes:

in response to no physical cluster queue having a target computing resource being associated under the first logical machine room queue, creating a first physical cluster queue under the first logical machine room queue; the target computing resource is used for executing the target job;

establishing an association relationship between the first physical cluster queue and a second physical cluster queue having the target computing resource; the second physical cluster queue is a physical cluster queue in other logical machine room queues;

and scheduling, in the first physical cluster queue and based on the association relationship, the target computing resource of the second physical cluster queue to execute the target job.
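The three steps above (create a local queue, associate it with a remote queue that has the resource, schedule through the association) can be illustrated with a minimal Python sketch. The names `ClusterQueue`, `linked_to`, `ensure_queue_with_resources`, and the CPU-core counter are hypothetical and chosen only for illustration; the disclosure does not prescribe any particular data layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ClusterQueue:
    name: str
    room: str
    cpu_cores: int = 0                            # assumed resource counter
    linked_to: Optional["ClusterQueue"] = None    # association to a queue in another room


def ensure_queue_with_resources(
    rooms: Dict[str, Dict[str, ClusterQueue]],
    first_room: str,
    needed_cores: int,
) -> Optional[ClusterQueue]:
    """Return a queue under first_room through which the job can be scheduled."""
    # A local queue that already has the target resource needs no association.
    for queue in rooms[first_room].values():
        if queue.cpu_cores >= needed_cores:
            return queue
    # Otherwise look for a "second" physical cluster queue in another room.
    for room, queues in rooms.items():
        if room == first_room:
            continue
        for donor in queues.values():
            if donor.cpu_cores >= needed_cores:
                # Create the "first" queue locally and associate it with the donor.
                proxy = ClusterQueue(
                    name=f"{first_room}-proxy-{donor.name}",
                    room=first_room,
                    linked_to=donor,
                )
                rooms[first_room][proxy.name] = proxy
                return proxy
    return None


rooms = {
    "dc-a": {},
    "dc-b": {"c9": ClusterQueue("c9", "dc-b", cpu_cores=64)},
}
queue = ensure_queue_with_resources(rooms, "dc-a", needed_cores=32)
```

In this sketch the job is still submitted to a queue under the first logical machine room queue, while the resources it consumes belong to the associated queue in another room.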

In an optional implementation manner, the executing, by using a computing resource in a physical cluster queue associated under the second logical machine room queue, the target job includes:

determining a fourth physical cluster queue based on the number of computing resources respectively corresponding to each physical cluster queue associated under the second logical machine room queue in response to the number of computing resources corresponding to a third physical cluster queue associated under the second logical machine room queue currently executing the target job being less than the number of computing resources required by the target job;

executing the target job using the computing resources in the fourth physical cluster queue; or merging the fourth physical cluster queue and the third physical cluster queue to obtain a fifth physical cluster queue, and executing the target operation by using the computing resource in the fifth physical cluster queue.

In an optional implementation manner, the allocating, based on the job type of the target job, a corresponding first logical room queue for the target job by using a hierarchical relationship among a queue namespace queue, a logical room queue, and a physical cluster queue in a three-tier virtual queue model includes:

based on the job type of the target job, determining, for the target job, candidate logical machine room queues having computing resources and the computing resource configuration information corresponding to each candidate logical machine room queue, by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model;

and screening a first logic machine room queue with the computing resource configuration information meeting preset conditions from the candidate logic machine room queues based on the computing resource configuration information corresponding to the candidate logic machine room queues.

In an optional embodiment, the job request includes the amount of computing resources required to execute the target job;

the allocating candidate logical room queues with computing resources for the target job includes:

based on the number of computing resources, allocating a queue namespace queue matched with the number of computing resources for executing the target job to the target job;

based on the job type of the target job, selecting a candidate logical room queue from the queue namespace queues having computing resources for executing the target job.

In an optional embodiment, the allocating, based on the job type of the target job, a corresponding first logical machine room queue for the target job by using a hierarchical relationship among a queue namespace queue, a logical machine room queue, and a physical cluster queue in a three-layer virtual queue model includes:

based on the job type of the target job, determining, for the target job, candidate logical machine room queues having computing resources for executing the target job and the number of physical cluster queues associated under each candidate logical machine room queue, by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model;

and determining a candidate logical machine room queue whose number of associated physical cluster queues meets a set threshold as the first logical machine room queue for executing the target job.

In an optional embodiment, the job request includes identification information of a user;

the allocating, based on the job type of the target job, a corresponding first logical machine room queue to the target job by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model includes:

verifying the user permission of the user based on the identification information of the user;

in response to the user's permission verification passing, allocating, based on the job type of the target job, a corresponding first logical machine room queue to the target job by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model.

In an optional implementation manner, after the response to an exception occurring in at least one of the cluster queues associated under the first logical machine room queue, the method further includes:

receiving a job scheduling request; the job scheduling request comprises queue identification information to be scheduled;

determining a target queue for executing the target job based on the queue identification information;

and executing the target job by utilizing the computing resource corresponding to the target queue based on the computing resource configured in the target queue.

In an optional embodiment, the determining a target queue for executing the target job based on the queue identification information includes:

in the case where the queue identification information includes queue namespace queue identification information, determining a target queue namespace queue corresponding to the queue namespace queue identification information; and determining any physical cluster queue in any logical machine room queue associated under the target queue namespace queue as the target queue for executing the target job;

in the case where the queue identification information includes queue namespace queue identification information and logical machine room queue identification information, determining, under the target queue namespace queue corresponding to the queue namespace queue identification information, a target logical machine room queue corresponding to the logical machine room queue identification information; and determining a target physical cluster queue in the target logical machine room queue as the target queue for executing the target job; the target physical cluster queue is a physical cluster queue for which the user has permission;

in the case where the queue identification information includes queue namespace queue identification information, logical machine room queue identification information, and physical cluster queue identification information, determining a target physical cluster queue corresponding to the physical cluster queue identification information, under the target logical machine room queue corresponding to the logical machine room queue identification information, within the target queue namespace queue corresponding to the queue namespace queue identification information; and determining the target physical cluster queue as the target queue for executing the target job.
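The three cases above amount to resolving a partially specified queue path. The following Python sketch is one possible reading, under stated assumptions: the model is represented as nested dictionaries (namespace, then room, then a list of cluster names), "any physical cluster queue" is taken to mean the first one found, and `permitted` is a hypothetical set of cluster names the user may use. None of these names come from the disclosure.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumed layout: queue_namespace -> data_center -> list of queue_name values.
Model = Dict[str, Dict[str, List[str]]]


def resolve_target_queue(
    model: Model,
    queue_namespace: str,
    data_center: Optional[str] = None,
    queue_name: Optional[str] = None,
    permitted: Optional[Set[str]] = None,
) -> Optional[Tuple[str, str, str]]:
    rooms = model[queue_namespace]
    if data_center is None:
        # Case 1: namespace only. Any cluster in any room qualifies.
        for room, clusters in rooms.items():
            if clusters:
                return (queue_namespace, room, clusters[0])
        return None
    clusters = rooms[data_center]
    if queue_name is None:
        # Case 2: namespace + room. Pick a cluster the user has permission for.
        for cluster in clusters:
            if permitted is None or cluster in permitted:
                return (queue_namespace, data_center, cluster)
        return None
    # Case 3: all three identifiers given. The named cluster must exist.
    if queue_name in clusters:
        return (queue_namespace, data_center, queue_name)
    return None


model: Model = {"ns1": {"dc-a": ["c1", "c2"], "dc-b": ["c3"]}}
by_namespace = resolve_target_queue(model, "ns1")
by_room = resolve_target_queue(model, "ns1", "dc-a", permitted={"c2"})
by_cluster = resolve_target_queue(model, "ns1", "dc-b", "c3")
```

The fewer identifiers the scheduling request supplies, the more freedom the scheduler has in choosing where the target job runs.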

In a second aspect, an embodiment of the present disclosure further provides a resource scheduling apparatus, including:

a first receiving module, configured to receive a job request, where the job request is used to request execution of a target job;

an allocation module, configured to, in response to the job request, allocate a corresponding first logical machine room queue to the target job based on the job type of the target job, by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model; each queue namespace queue is associated with a plurality of logical machine room queues, and at least one physical cluster queue is associated under each logical machine room queue;

a first determining module, configured to determine, in response to an occurrence of an exception in at least one physical cluster queue associated with the first logical machine room queue, a second logical machine room queue for executing the target job under the queue namespace queue based on a job type of the target job;

and a first execution module, configured to execute the target job by using the computing resources in a physical cluster queue associated under the second logical machine room queue.

In a third aspect, an embodiment of the present disclosure further provides a computer device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the computer device is running, the machine-readable instructions when executed by the processor performing the steps of the first aspect described above, or any possible implementation of the first aspect.

In a fourth aspect, the disclosed embodiments further provide a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor to perform the steps in the first aspect or any one of the possible implementation manners of the first aspect.

The resource scheduling method provided by the embodiment of the disclosure uses three layers of virtual queues, namely a queue namespace queue, a logical machine room queue, and a physical cluster queue. When at least one physical cluster queue under the first logical machine room queue currently executing a target job is abnormal, the method can automatically determine a second logical machine room queue for executing the target job, so that the target job is automatically migrated from the abnormal first logical machine room queue to a normal second logical machine room queue within the scope of the queue namespace queue. The method therefore provides automatic disaster recovery across machine rooms. In addition, the target job can use the computing resources in every logical machine room queue within the scope of the queue namespace queue, which improves resource utilization compared with a scheme in which the target job can use only the computing resources of a designated computing queue.

In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings required by the embodiments are briefly described below. The drawings are incorporated in and form a part of the specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. It should be understood that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art can derive other related drawings from them without inventive effort.

Fig. 1 shows a flowchart of a resource scheduling method provided by an embodiment of the present disclosure;

Fig. 2 shows a schematic diagram of the hierarchical structure of a three-layer virtual queue model provided by an embodiment of the present disclosure;

Fig. 3 shows a schematic diagram of the user permission levels corresponding to a three-layer virtual queue model provided by an embodiment of the present disclosure;

Fig. 4 shows a schematic diagram of the hierarchical structure of another three-layer virtual queue model provided by an embodiment of the present disclosure;

Fig. 5 shows a flowchart of a resource scheduling apparatus provided by an embodiment of the present disclosure;

Fig. 6 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making any creative effort, shall fall within the protection scope of the disclosure.

In big data scenarios, computing tasks are typically scheduled to a designated computing cluster, or to the computing cluster with the most abundant computing resources. When a network or device abnormality occurs, the affected computing queue cannot be used, and a user must manually reschedule the computing task to a normal computing cluster. In this mode, automatic cross-machine-room disaster recovery cannot be achieved, and the computing service is seriously affected.

Based on the above, the present disclosure provides a resource scheduling method that uses three layers of virtual queues, namely a queue namespace queue, a logical machine room queue, and a physical cluster queue. When at least one physical cluster queue under the first logical machine room queue currently executing a target job is abnormal, the method can automatically determine a second logical machine room queue for executing the target job, thereby automatically migrating the target job from the abnormal first logical machine room queue to a normal second logical machine room queue within the scope of the queue namespace queue and providing automatic disaster recovery across machine rooms. In addition, the target job can use the computing resources in every logical machine room queue within the scope of the queue namespace queue, which improves resource utilization compared with a scheme in which the target job can use only the computing resources of a designated computing queue.

The defects of the above solutions were identified by the inventors through practice and careful study. Therefore, both the discovery of the above problems and the solutions proposed below by the present disclosure should be regarded as contributions made by the inventors in the course of the present disclosure.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

To facilitate understanding of this embodiment, first, a resource scheduling method disclosed in the embodiments of the present disclosure is described in detail, and an execution subject of the resource scheduling method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability.

The resource scheduling method provided by the embodiment of the disclosure is mainly applicable to big data scenes, such as machine learning, data statistical analysis and other scenes.

Referring to fig. 1, a flowchart of a resource scheduling method provided in the embodiment of the present disclosure is shown, where the method includes S101 to S104, where:

S101: receiving a job request, the job request being used to request execution of a target job.

In the disclosed embodiments, the received job request may be a request submitted by a user for execution of the target job. The job request may include at least one target job. When the job request includes a plurality of target jobs, they may all belong to one user, or they may belong to multiple users, e.g., the individual members of a multi-member team.

S102: in response to the job request, allocating a corresponding first logical machine room queue to the target job based on the job type of the target job, by using the hierarchical relationship among a queue namespace queue, logical machine room queues, and physical cluster queues in a three-layer virtual queue model; each queue namespace queue is associated with a plurality of logical machine room queues, and at least one physical cluster queue is associated under each logical machine room queue.

In the embodiment of the present disclosure, the computing resources for executing jobs may be organized as a three-layer virtual queue model. The three layers of virtual queues may include a queue namespace queue, a logical machine room queue, and a physical cluster queue. In the schematic diagram of the hierarchical structure of the three-layer virtual queue model shown in fig. 2, an upper-layer virtual queue contains the virtual queues of the layer below it. Specifically, a queue namespace queue (also referred to as a Region virtual queue) may correspond to a plurality of logical machine room queues (also referred to as data centers), and each logical machine room queue may correspond to at least one physical cluster queue (also referred to as a Cluster). That is, each queue namespace queue may be a set of multiple logical machine room queues, and each logical machine room queue may be a set of at least one physical cluster queue. Each layer of virtual queue may be given corresponding queue identification information, such as a queue name. For example, the queue namespace queue identification information may be represented by queue_namespace; the logical machine room queue identification information may be represented by data_center; and the physical cluster queue identification information may be represented by queue_name. The queue namespace queue identification information may be a unique identifier used to distinguish different Region virtual queues.
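The containment hierarchy described above can be pictured with a small data-structure sketch in Python. The field names `queue_namespace`, `data_center`, and `queue_name` follow the identifiers mentioned in the text; the class names, the `cpu_cores` and `memory_gb` counters, and the helper methods are assumptions added for illustration and are not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PhysicalClusterQueue:
    queue_name: str        # identifier of the physical cluster queue
    cpu_cores: int = 0     # assumed resource counter
    memory_gb: int = 0     # assumed resource counter


@dataclass
class LogicalRoomQueue:
    data_center: str       # identifier of the logical machine room queue
    clusters: Dict[str, PhysicalClusterQueue] = field(default_factory=dict)

    def add_cluster(self, cluster: PhysicalClusterQueue) -> None:
        self.clusters[cluster.queue_name] = cluster


@dataclass
class QueueNamespaceQueue:
    queue_namespace: str   # unique identifier of the Region virtual queue
    rooms: Dict[str, LogicalRoomQueue] = field(default_factory=dict)

    def add_room(self, room: LogicalRoomQueue) -> None:
        self.rooms[room.data_center] = room

    def total_cpu(self) -> int:
        # Jobs in a namespace may draw on every room and cluster beneath it.
        return sum(
            cluster.cpu_cores
            for room in self.rooms.values()
            for cluster in room.clusters.values()
        )


ns = QueueNamespaceQueue("ns-analytics")
room_a = LogicalRoomQueue("dc-a")
room_a.add_cluster(PhysicalClusterQueue("cluster-1", cpu_cores=64, memory_gb=256))
room_a.add_cluster(PhysicalClusterQueue("cluster-2", cpu_cores=32, memory_gb=128))
room_b = LogicalRoomQueue("dc-b")
room_b.add_cluster(PhysicalClusterQueue("cluster-3", cpu_cores=48, memory_gb=192))
ns.add_room(room_a)
ns.add_room(room_b)
```

Keying each layer by its identifier makes a fully qualified queue addressable as the triple (queue_namespace, data_center, queue_name).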

In particular implementations, each logical room queue may have configured therein computing resources, storage resources, job data, and the like for executing jobs. The computing resource may include, for example, a Central Processing Unit (CPU); the storage resource may include, for example, a Memory (Memory). Job data may refer to data on which a target job is executed.

The queue namespace queue can manage its logical machine room queues through a corresponding cross-machine-room route. For example, within the same queue namespace queue, the cross-machine-room route can transfer a job from logical machine room queue 1 to logical machine room queue 2; it can also read the job data stored in logical machine room queue 2 in order to execute a job in logical machine room queue 1, and so on.

Different target jobs may correspond to different job types, and job data required for executing the target jobs may be different in different job types.

The logical room queues may determine the correlation between the job type of the target job and the job data stored in the logical room queues through configured room routes, that is, different logical room queues may process target jobs of different job types.

The physical cluster queues can adjust the amount of computing resources and/or storage resources among the physical cluster queues in the same logical machine room queue through the configured cluster routes.

In order to ensure that the target operation can be smoothly executed when the logic room queues are abnormal, the operation data required by the same target operation can be stored in a plurality of logic room queues. In an embodiment, in the process of determining the first logical room queue for executing the target job, according to the job type of the target job, the candidate logical room queues having the computing resources and the computing resource configuration information corresponding to each candidate logical room queue may be allocated to the target job by using a hierarchical relationship among the queue namespace queue, the logical room queues, and the physical cluster queues in the three-layer virtual queue model.

Here, the storage path of the job data required by the target job may be resolved at the computing engine layer according to the job type of the target job, and an Application Programming Interface (API) provided by the storage system may then be called to obtain the distribution of the job data. According to the distribution of the job data and the hierarchical relationship of the virtual queues in the three-layer virtual queue model, the candidate logical machine room queues configured with computing resources for executing the target job, and the computing resource configuration information corresponding to each candidate logical machine room queue, can be determined. The computing resource configuration information here may include information such as the size of the job data and the number of job data items configured.

Here, the preset condition may be that the number of job data items configured is greater than or equal to a set threshold. In a specific implementation, a candidate logical machine room queue whose number of configured job data items is greater than or equal to the set threshold may be selected as the first logical machine room queue.
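The screening step can be sketched as a simple threshold filter. In the Python sketch below, the mapping from room name to the number of configured job data items stands in for the computing resource configuration information; the function names and the tie-break (prefer the room with the most job data) are assumptions, since the text only requires that the count meet the threshold.

```python
from typing import Dict, List, Optional


def screen_candidates(job_data_counts: Dict[str, int], threshold: int) -> List[str]:
    """Keep candidate rooms whose configured job-data count meets the threshold."""
    return [room for room, count in job_data_counts.items() if count >= threshold]


def pick_first_room(job_data_counts: Dict[str, int], threshold: int) -> Optional[str]:
    """Assumed tie-break: among qualifying rooms, prefer the highest count."""
    qualifying = screen_candidates(job_data_counts, threshold)
    if not qualifying:
        return None
    return max(qualifying, key=lambda room: job_data_counts[room])


counts = {"dc-a": 5, "dc-b": 2, "dc-c": 8}
qualifying_rooms = screen_candidates(counts, threshold=5)
first_room = pick_first_room(counts, threshold=5)
```

With these sample counts, rooms dc-a and dc-c pass the filter and dc-c is chosen under the assumed tie-break.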

To ensure utilization of computing resources, in one embodiment, the job request may include an amount of computing resources required to execute the target job. The amount of computing resources herein may include an amount of computing resources and/or storage resources. In allocating a queue namespace queue for executing the target job for the target job, the queue namespace queue for executing the target job may be allocated for the target job based on the number of computing resources.

In specific implementation, a user may apply for queue namespace queue identification information from a resource management platform, and the resource management platform may allocate a queue namespace queue corresponding to the number of computing resources required for a target job to the user. The queue namespace queue assigned to the user may execute each target job submitted by the user. That is, each target job submitted by a user may share all of the computing resources in the queue namespace queue.

After determining the queue namespace queue for executing the target job, a candidate logical room queue having computing resources for executing the target job may be selected from the queue namespace queue based on the job type of the target job.

In a specific implementation, the more physical cluster queues are set in a logical machine room queue, the more computing resources that logical machine room queue has and the faster it can execute a target job. Therefore, in an embodiment, based on the job type of the target job, the hierarchical relationship among the queue namespace queue, the logical machine room queues, and the physical cluster queues in the three-layer virtual queue model may be used to determine, for the target job, candidate logical machine room queues having computing resources for executing the target job, together with the number of physical cluster queues associated under each candidate logical machine room queue. A candidate logical machine room queue whose number of associated physical cluster queues meets the set threshold is then determined as the first logical machine room queue for executing the target job.

Here, the determined candidate logical room queues may include a logical room queue configured with job data for executing the target job. And then selecting the candidate logical room queues of which the number accords with the set threshold value as a first logical room queue according to the number of the physical cluster queues associated under each candidate logical room queue.
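A minimal sketch of this selection, assuming each logical machine room queue records which job types it holds data for and which physical cluster queues are associated under it. `RoomInfo`, its fields, and the interpretation of "meets the set threshold" as "at least `min_clusters`" are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class RoomInfo:
    job_types: Set[str]   # job types whose job data this room holds (assumed field)
    clusters: List[str]   # names of the associated physical cluster queues


def rooms_meeting_cluster_threshold(
    rooms: Dict[str, RoomInfo], job_type: str, min_clusters: int
) -> List[str]:
    """Candidates hold data for the job type; keep those with enough clusters."""
    return [
        name
        for name, info in rooms.items()
        if job_type in info.job_types and len(info.clusters) >= min_clusters
    ]


rooms = {
    "dc-a": RoomInfo({"etl", "training"}, ["c1", "c2", "c3"]),
    "dc-b": RoomInfo({"etl"}, ["c4"]),
    "dc-c": RoomInfo({"training"}, ["c5", "c6"]),
}
etl_rooms = rooms_meeting_cluster_threshold(rooms, "etl", min_clusters=2)
training_rooms = rooms_meeting_cluster_threshold(rooms, "training", min_clusters=2)
```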

As previously described, the target job may include a plurality of target jobs for a plurality of users. In the user permission hierarchy diagram corresponding to the three-layer virtual queue model shown in fig. 3, the permissions of using the virtual queue owned by different users may be different. In order to manage the authority of each user, a corresponding Resource Record (RR) may be set for each queue namespace queue, where the Resource Record describes the user authority, virtual queue attribute information (e.g., virtual queue identification information), and the like of each logical machine room queue under the queue namespace queue. Each queue namespace queue, the logical room queues under the queue namespace queue, and the physical cluster queues under each logical room queue may be constructed in the form of a tree queue structure. Each parent node may have full authority of the child node and may be able to use all of the computing resources of the child node.

In one embodiment, the job request may include identification information of the user. In the process of allocating the corresponding first logical machine room queue to the target job, the user permission of the user may be verified based on the identification information of the user. In response to the user's permission verification passing, a first logical machine room queue matching the user permission is allocated to the target job based on the job type of the target job and the user permission, by using the hierarchical relationship among the queue namespace queue, the logical machine room queues, and the physical cluster queues in the three-layer virtual queue model.

A user whose permission verification passes can execute the corresponding target job by using the first logical machine room queue matching the user permission. For example, if the user permission is at the level of the logical machine room queue, the user may use the computing resources of every physical cluster queue in the first logical machine room queue. If the user permission is at the level of the physical cluster queue, the user may use only the computing resources of the target physical cluster queue in the first logical machine room queue.
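Because the queues form a tree in which a permission on a parent node covers its children, the check can be sketched as a prefix match on queue paths. In the Python sketch below, a grant is a tuple such as `("ns1", "dc-a")` or `("ns1", "dc-a", "c2")`; this encoding and the function names are assumptions for illustration, not the format of the resource record described in the text.

```python
from typing import Iterable, List, Tuple

QueuePath = Tuple[str, ...]  # (queue_namespace[, data_center[, queue_name]])


def has_permission(grants: Iterable[QueuePath], target: QueuePath) -> bool:
    """A grant on a node covers that node and everything beneath it."""
    return any(target[: len(grant)] == grant for grant in grants)


def usable_clusters(
    grants: Iterable[QueuePath], namespace: str, room: str, clusters: List[str]
) -> List[str]:
    """List the physical cluster queues in a room that the user may use."""
    grants = list(grants)
    return [c for c in clusters if has_permission(grants, (namespace, room, c))]


room_level = [("ns1", "dc-a")]           # permission on a logical machine room queue
cluster_level = [("ns1", "dc-a", "c2")]  # permission on one physical cluster queue

room_user_clusters = usable_clusters(room_level, "ns1", "dc-a", ["c1", "c2"])
cluster_user_clusters = usable_clusters(cluster_level, "ns1", "dc-a", ["c1", "c2"])
```

A room-level grant yields every cluster in that room, while a cluster-level grant yields only the named cluster, matching the two examples in the preceding paragraph.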

S103: and in response to the occurrence of an exception in at least one physical cluster queue associated under the first logical machine room queue, determining a second logical machine room queue under the queue namespace queue for executing the target job based on the job type of the target job.

Here, the abnormality in at least one physical cluster queue under the first logical machine room queue includes, but is not limited to, a network abnormality or a hardware failure. Where the number of abnormal physical cluster queues in the first logical machine room queue is less than the total number of physical cluster queues in that queue, in one embodiment the target job in an abnormal physical cluster queue may be migrated to another physical cluster queue in the first logical machine room queue that is not abnormal. For example, as shown in fig. 4, when physical cluster queue A in a logical machine room queue is abnormal, the target job in physical cluster queue A may be migrated to physical cluster queue B in the same logical machine room queue. That is, when at least one physical cluster queue under the first logical machine room queue is abnormal, the second logical machine room queue determined based on the job type of the target job may be the same logical machine room queue as the first logical machine room queue. In another embodiment, the target job in the abnormal physical cluster queue may instead be migrated to a different second logical machine room queue that is not abnormal; in that case, the second logical machine room queue is a logical machine room queue different from the first.

When the number of abnormal physical cluster queues equals the total number of physical cluster queues in the first logical machine room queue, the target jobs in those abnormal physical cluster queues can be migrated to another second logical machine room queue that has no exception. In this case the second logical machine room queue is a different logical machine room queue from the first logical machine room queue.
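The two cases above (partial failure and total failure of the first logical machine room queue) can be illustrated with a minimal Python sketch. The class names, the `job_types` field, and the policy of preferring a healthy queue in the same machine room before looking elsewhere are assumptions made for illustration; the disclosure also allows migrating to a different logical machine room queue on partial failure.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalClusterQueue:
    name: str
    healthy: bool = True  # False models a network or hardware exception


@dataclass
class LogicalRoomQueue:
    name: str
    job_types: set  # hypothetical: job types this machine room can run
    clusters: list = field(default_factory=list)

    def healthy_clusters(self):
        return [c for c in self.clusters if c.healthy]


def pick_second_room(namespace_rooms, first_room, job_type):
    """Return (room, cluster) for the target job after an exception, or None."""
    # Partial failure: a healthy physical cluster queue remains in the first
    # room, so the "second" room is the first room itself (assumed preference).
    survivors = first_room.healthy_clusters()
    if survivors:
        return first_room, survivors[0]
    # Total failure: search the other rooms of the same queue namespace queue
    # for one that supports the job type and still has a healthy queue.
    for room in namespace_rooms:
        if room is first_room or job_type not in room.job_types:
            continue
        survivors = room.healthy_clusters()
        if survivors:
            return room, survivors[0]
    return None
```

Restricting the search to `namespace_rooms` mirrors the requirement that the second logical machine room queue lies under the same queue namespace queue as the first.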

S104: Execute the target job using the computing resources in the physical cluster queue associated under the second logical machine room queue.

In the embodiment of the present disclosure, the computing resources of the virtual queues may become unbalanced while the target job is executed in the second logical machine room queue. In one implementation, in response to the number of computing resources of a third physical cluster queue, which is the queue under the second logical machine room queue currently executing the target job, being smaller than the number of computing resources required by the target job, a fourth physical cluster queue is determined based on the number of computing resources of each physical cluster queue in the second logical machine room queue.

Here, in the case where the number of computing resources corresponding to the third physical cluster queue under the second logical machine room queue currently executing the target job is smaller than the number of computing resources required by the target job, the third physical cluster queue may not be able to execute the target job in time. In this case, a fourth physical cluster queue whose number of computing resources meets the preset condition may be determined according to the number of computing resources corresponding to each physical cluster queue in the second logical machine room queue. The preset condition may mean that the amount of computing resources is greater than or equal to a set threshold. The target job is then executed using the computing resources in the fourth physical cluster queue.

Alternatively, the fourth physical cluster queue and the third physical cluster queue may be merged to obtain a fifth physical cluster queue. Here, merging mainly means combining the computing resources of the two queues: the number of computing resources of the fifth physical cluster queue may be the sum of the number of computing resources of the fourth physical cluster queue and that of the third physical cluster queue. The target job is then executed using the computing resources of the fifth physical cluster queue.

Through the embodiment, the computing resources among the physical cluster queues can be balanced, so that the computing resources of the physical cluster queues can be fully utilized, and the fragmentation of the resources is prevented.
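The balancing step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name, the return shape, and the rule for choosing between switching to the fourth queue and merging it with the third are assumptions; the disclosure only states that either option may be used.

```python
def rebalance(capacities, third, required, threshold):
    """Decide where to run the target job inside the second machine room.

    capacities: dict mapping physical cluster queue name -> number of
                computing resources (hypothetical representation).
    third:      name of the queue currently executing the target job.
    required:   number of computing resources the target job needs.
    threshold:  preset condition; a candidate must have at least this many.
    """
    if capacities[third] >= required:
        return ("keep", third, capacities[third])

    # Candidates for the fourth queue: every other queue meeting the threshold.
    candidates = {n: c for n, c in capacities.items()
                  if n != third and c >= threshold}
    if not candidates:
        return None
    fourth = max(candidates, key=candidates.get)

    # Option 1: the fourth queue alone is large enough (assumed to be preferred).
    if capacities[fourth] >= required:
        return ("switch", fourth, capacities[fourth])

    # Option 2: merge third and fourth into a fifth queue whose size is the sum.
    merged = capacities[third] + capacities[fourth]
    if merged >= required:
        return ("merge", (third, fourth), merged)
    return None
```

For instance, with queues of size 4 and 6, a threshold of 5, and a job needing 8 resources, neither queue suffices alone, so the sketch merges them into a fifth queue of size 10.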

In an embodiment of the present disclosure, cross-machine-room communication can be reduced when no physical cluster queue having the target computing resource is configured under the determined first logical machine room queue. Specifically, after the job request is received, in response to no physical cluster queue having the target computing resource being associated under the first logical machine room queue that was determined under the queue namespace queue for executing the target job, a first physical cluster queue is created under the first logical machine room queue.

Here, the target computing resource is used to execute the target job and may specifically refer to job data. The newly created first physical cluster queue is likewise not associated with the target computing resource. To reduce cross-machine-room communication, an association may be established between the first physical cluster queue and a second physical cluster queue that has the target computing resource, where the second physical cluster queue is a physical cluster queue in another logical machine room queue. Based on this association, the first physical cluster queue can schedule the target computing resource of the second physical cluster queue and then execute the target job using that resource.
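The create-and-associate flow above can be sketched as follows. The `ClusterQueue` class, the `linked` attribute, the `"<room>-proxy"` naming, and the representation of machine rooms as a dictionary are hypothetical and serve only to make the steps concrete.

```python
class ClusterQueue:
    """Hypothetical physical cluster queue holding named computing resources."""

    def __init__(self, name, room, resources=()):
        self.name = name
        self.room = room
        self.resources = set(resources)
        self.linked = None  # association with a queue in another machine room

    def resolve(self, resource):
        """Return the queue that actually provides `resource`, or None."""
        if resource in self.resources:
            return self
        if self.linked is not None and resource in self.linked.resources:
            return self.linked
        return None


def ensure_queue_for(rooms, room, resource):
    """rooms: dict mapping logical machine room name -> list of ClusterQueue."""
    # Reuse a local queue if one already has the target computing resource.
    for queue in rooms.get(room, []):
        if resource in queue.resources:
            return queue
    # Otherwise create the "first" physical cluster queue under this room ...
    first = ClusterQueue(f"{room}-proxy", room)
    # ... and associate it with a "second" queue in another room that has it.
    for other_room, queues in rooms.items():
        if other_room == room:
            continue
        match = next((q for q in queues if resource in q.resources), None)
        if match is not None:
            first.linked = match
            break
    rooms.setdefault(room, []).append(first)
    return first
```

Under these assumptions, the job is submitted to the first queue and the association is followed only when the resource itself is needed.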

In the above implementation, in response to an exception in at least one physical cluster queue under the first logical machine room queue, the second logical machine room queue for executing the target job under the queue namespace queue can be determined automatically based on the job type of the target job, which improves the efficiency of cross-machine-room scheduling.

In the embodiment of the present disclosure, after an exception occurs in the first logical machine room queue, a job scheduling request may also be received as an alternative to automatically determining the second logical machine room queue for executing the target job. The job scheduling request may include identification information of the queue to be scheduled, which may be queue identification information specified by the user.

Based on the queue identification information, a target queue for executing the target job may be determined. Then, based on the computing resources configured in the target queue, the target job is executed using the computing resources corresponding to the target queue.

Here, as described above, each layer of virtual queue may set corresponding queue identification information. The queue namespace queue identification information may be uniquely determined queue identification information. The logical machine room queue identification information and the physical cluster queue identification information may support wildcard characters.

When the queue identification information includes only queue namespace queue identification information, the target queue namespace queue corresponding to that identification information may be determined, and any physical cluster queue in any logical machine room queue under the target queue namespace queue may be determined as the target queue for executing the target job.

For example, if the queue identification information is "ns1", the target queue namespace queue is "ns1", and any physical cluster queue in any logical machine room queue under "ns1" may be determined as the target queue for executing the target job.

When the queue identification information includes both queue namespace queue identification information and logical machine room queue identification information, a target logical machine room queue corresponding to the logical machine room queue identification information is determined under the target queue namespace queue corresponding to the queue namespace queue identification information. A target physical cluster queue in the target logical machine room queue is then determined as the target queue for executing the target job, where the target physical cluster queue is a physical cluster queue for which the user has permission.

For example, if the queue identification information is "ns1.dc1", the target queue namespace queue is "ns1", the target logical machine room queue under it is "dc1", and the target physical cluster queue in "dc1" is the target queue for executing the target job.

When the queue identification information includes queue namespace queue identification information, logical machine room queue identification information, and physical cluster queue identification information, the target physical cluster queue corresponding to the physical cluster queue identification information is determined under the target logical machine room queue corresponding to the logical machine room queue identification information, which in turn lies under the target queue namespace queue corresponding to the queue namespace queue identification information. The target physical cluster queue is determined as the target queue for executing the target job.

For example, if the queue identification information is "ns1.dc1.queueA", the target queue namespace queue is "ns1", the logical machine room queue under it is "dc1", and the target physical cluster queue "queueA" is the target queue for executing the target job.
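The three identifier forms described above can be resolved with a short sketch. The dotted identifier syntax, the nested-dictionary model, the `permitted` parameter, and the use of shell-style wildcards via `fnmatch` are assumptions for illustration; the disclosure states only that the machine room and cluster levels may support wildcard characters while the namespace identifier is unique.

```python
from fnmatch import fnmatchcase


def resolve_target_queues(queue_id, model, permitted=None):
    """Return full paths of candidate target queues for `queue_id`.

    model:     {namespace: {machine_room: [cluster_queue, ...]}} (hypothetical)
    permitted: optional set of full paths the user may use; None means no check.
    """
    parts = queue_id.split(".")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported queue identifier: {queue_id!r}")
    # Missing levels default to "*", i.e. any machine room / any cluster queue.
    namespace, room_pattern, cluster_pattern = (parts + ["*", "*"])[:3]

    rooms = model.get(namespace)  # the namespace must match exactly
    if rooms is None:
        return []

    matches = []
    for room, clusters in rooms.items():
        if not fnmatchcase(room, room_pattern):
            continue
        for cluster in clusters:
            if not fnmatchcase(cluster, cluster_pattern):
                continue
            path = f"{namespace}.{room}.{cluster}"
            if permitted is None or path in permitted:
                matches.append(path)
    return matches
```

Any element of the returned list may then serve as the target queue; how one is chosen among several matches is left open here, as it is in the description.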

It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.

Based on the same inventive concept, an embodiment of the present disclosure further provides a job execution apparatus corresponding to the resource scheduling method. Because the apparatus solves the problem on a principle similar to that of the resource scheduling method in the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated description is omitted.

Referring to fig. 5, a schematic diagram of an architecture of a job execution apparatus according to an embodiment of the present disclosure is shown, where the apparatus includes:

a first receiving module 501, configured to receive a job request, where the job request is used to request execution of a target job;

an allocating module 502, configured to, in response to the job request, allocate, based on the job type of the target job, a corresponding first logical machine room queue for the target job by using a hierarchical relationship among a queue namespace queue, a logical machine room queue, and a physical cluster queue in a three-layer virtual queue model; each queue namespace queue is associated with a plurality of logical machine room queues, and at least one physical cluster queue is associated under each logical machine room queue;

a first determining module 503, configured to determine, in response to an occurrence of an exception in at least one physical cluster queue associated under the first logical machine room queue, a second logical machine room queue for executing the target job under the queue namespace queue based on the job type of the target job;

a first executing module 504, configured to execute the target job by using the computing resource in the physical cluster queue associated under the second logical machine room queue.

In an optional implementation manner, after receiving the job request, the apparatus further includes:

a creating module, configured to create a first physical cluster queue under the first logical room queue in response to a physical cluster queue with a target computing resource not being associated under the first logical room queue; the target computing resource is used for executing the target job;

an establishing module, configured to establish an association relationship between the first physical cluster queue and a second physical cluster queue having the target computing resource; the second physical cluster queue is a physical cluster queue in other logical machine room queues;

a scheduling module, configured to schedule, in the first physical cluster queue and based on the association relationship, the target computing resource of the second physical cluster queue to execute the target job.

In an optional implementation manner, the first executing module 504 is specifically configured to:

In an optional implementation, the allocating module 502 is specifically configured to:

the allocating module 502 is specifically configured to:

based on the number of computing resources, allocating a queue namespace queue matched with the number of computing resources for executing the target job for the target job;

based on the job type of the target job, allocating candidate logical machine room queues with computing resources for executing the target job and the number of physical cluster queues associated under each candidate logical machine room queue for the target job by utilizing a hierarchical relationship among a queue namespace queue, a logical machine room queue and a physical cluster queue in a three-layer virtual queue model;

the allocating module 502 is specifically configured to:

In an optional implementation manner, after the response to an exception occurring in at least one of the cluster queues associated under the first logical machine room queue, the apparatus further includes:

a second receiving module, configured to receive a job scheduling request; the job scheduling request includes identification information of a queue to be scheduled;

a second determining module, configured to determine a target queue for executing the target job based on the queue identification information;

a second executing module, configured to execute the target job by using the computing resources corresponding to the target queue, based on the computing resources configured in the target queue.

In an optional implementation manner, the second determining module is specifically configured to:

The description of the processing flow of each module in the apparatus and the interaction flow between the modules may refer to the relevant description in the above method embodiments, and will not be described in detail here.

Based on the same technical concept, an embodiment of the present disclosure further provides a computer device. Referring to fig. 6, which is a schematic structural diagram of a computer device 600 provided in an embodiment of the present disclosure, the device includes a processor 601, a memory 602, and a bus 603. The memory 602 is used for storing execution instructions and includes an internal memory 6021 and an external memory 6022. The internal memory 6021 temporarily stores operation data of the processor 601 and data exchanged with the external memory 6022, such as a hard disk; the processor 601 exchanges data with the external memory 6022 through the internal memory 6021. When the computer device 600 operates, the processor 601 and the memory 602 communicate with each other through the bus 603, so that the processor 601 executes the following instructions:

in response to the job request, allocating, based on the job type of the target job, a corresponding first logical machine room queue for the target job by using a hierarchical relationship among a queue namespace queue, a logical machine room queue, and a physical cluster queue in a three-layer virtual queue model; each queue namespace queue is associated with a plurality of logical machine room queues, and at least one physical cluster queue is associated under each logical machine room queue;

in response to an exception occurring in at least one physical cluster queue associated under the first logical machine room queue, determining, based on the job type of the target job, a second logical machine room queue under the queue namespace queue for executing the target job;

The embodiments of the present disclosure also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the resource scheduling method in the foregoing method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.

The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the resource scheduling method in the foregoing method embodiments, which may be referred to specifically in the foregoing method embodiments, and are not described herein again.

The computer program product may be implemented by hardware, software, or a combination thereof. In one alternative embodiment, the computer program product is embodied as a computer storage medium; in another alternative embodiment, it is embodied as a software product, such as a Software Development Kit (SDK).

It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the system and the apparatus described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present disclosure in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Finally, it should be noted that the above embodiments are merely specific embodiments of the present disclosure, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure is described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person skilled in the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present disclosure, and they shall be construed as falling within its protection scope. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. A method for scheduling resources, comprising:

2. The method of claim 1, wherein after receiving a job request, the method further comprises:

in response to no physical cluster queue having a target computing resource being associated under the first logical machine room queue, creating a first physical cluster queue under the first logical machine room queue; the target computing resource is used for executing the target job;

3. The method of claim 1, wherein executing the target job using computing resources in a physical cluster queue associated below the second logical machine room queue comprises:

in response to the number of computing resources of a third physical cluster queue, which is associated under the second logical machine room queue and currently executing the target job, being smaller than the number of computing resources required by the target job, determining a fourth physical cluster queue based on the number of computing resources of each physical cluster queue associated under the second logical machine room queue;

4. The method of claim 1, wherein the allocating a corresponding first logical room queue for the target job based on the job type of the target job by using a hierarchical relationship among a queue namespace queue, a logical room queue, and a physical cluster queue in a three-tier virtual queue model comprises:

5. The method of claim 4, wherein the job request includes an amount of computing resources required to execute the target job;

6. The method of claim 1, wherein the allocating a corresponding first logical room queue for the target job based on the job type of the target job using a hierarchical relationship among a queue namespace queue, a logical room queue, and a physical cluster queue in a three-tier virtual queue model comprises:

7. The method according to claim 1, wherein the job request includes identification information of a user;

8. The method of claim 1, wherein in response to an exception occurring in at least one of the cluster queues associated under the first logical machine room queue, the method further comprises:

9. The method of claim 8, wherein determining a target queue for executing the target job based on the queue identification information comprises:

10. A resource scheduling apparatus, comprising:

an allocating module, configured to, in response to the job request, allocate, based on the job type of the target job, a corresponding first logical machine room queue for the target job by using a hierarchical relationship among a queue namespace queue, a logical machine room queue, and a physical cluster queue in a three-layer virtual queue model; each queue namespace queue is associated with a plurality of logical machine room queues, and at least one physical cluster queue is associated under each logical machine room queue;

11. A computer device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when a computer device is run, the machine-readable instructions when executed by the processor performing the steps of the resource scheduling method according to any one of claims 1 to 9.

12. A computer-readable storage medium, having stored thereon a computer program for performing, when being executed by a processor, the steps of the resource scheduling method according to any one of claims 1 to 9.