WO2017113074A1 - Resource allocation method, device, and system - Google Patents

Resource allocation method, device, and system

Publication number
WO2017113074A1
Authority
WO
WIPO (PCT)
Prior art keywords
resource
container
application controller
manager
pool
Prior art date
Application number
PCT/CN2015/099258
Other languages
French (fr)
Chinese (zh)
Inventor
梁殿鹏
赵彦荣
刘佳
党李飞
彭磊
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2015/099258 (WO2017113074A1)
Priority to CN201580084802.8A (CN108293041B)
Publication of WO2017113074A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • The embodiments of the present invention relate to the field of computers, and in particular to a resource allocation method, device, and system.
  • Yet Another Resource Negotiator (Yarn) is a Hadoop resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications, and it brings significant benefits in cluster utilization, unified resource management, and data sharing. Yarn was originally designed to address the evident shortcomings of MRv1 and to improve scalability, reliability, and cluster utilization. Yarn does this by splitting the two main functions of the Job Tracker (resource management and job scheduling/monitoring) into two separate services: a global resource manager (RM) and a job master (AM) for each job.
  • RM: Global Resource Manager
  • AM: Job Master
  • Each job in Yarn starts a separate AM, which removes the single point of failure and the scalability bottleneck of MRv1. However, this approach introduces a new problem: high job latency.
  • Each job must first request resources from the RM to start an AM. Only after the AM has requested resources from the RM and started the resource containers can the job actually begin. A Yarn job therefore has a long startup delay, which is unfavorable for small jobs, and because an AM must be requested for every job, more computing resources are consumed.
  • An embodiment of the present invention provides a resource allocation method, device, and system that reduce the waiting time of jobs.
  • The present application provides a method for allocating resource containers in a distributed system.
  • The distributed system includes a resource manager and a node manager.
  • The resource manager manages the node resources of the distributed system, and the node manager starts resource containers based on the node resources.
  • A resource container is used to execute the tasks of an application.
  • The method includes: the resource manager starts an application controller when a trigger condition is met and configures an initial specification of the resource pool managed by the application controller, where the initial specification of the resource pool indicates the number and specification of the resource containers that the application controller requests from the resource manager for the first time.
  • The resource manager receives the first resource request of the resource pool, sent by the application controller according to the initial specification of the resource pool; allocates an initial resource container for the resource pool according to the first resource request; and sends a first resource allocation message of the resource pool to the application controller, where the first resource allocation message includes information about the node on which the initial resource container allocated by the resource manager for the resource pool is located.
  • The resource manager proactively starts the application controller and configures the initial specification of the resource pool managed by the application controller, so that the application controller requests a certain number of resource containers from the resource manager in advance according to the initial specification of the resource pool. The resource containers are thus started in advance, which reduces the time that subsequent application jobs wait for resource containers to start.
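
The pre-start flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation claimed in the application: the class names `PoolSpec`, `ResourceManager`, and `ApplicationController`, the per-node slot counts, and the first-fit placement policy are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PoolSpec:
    """Initial specification of a resource pool (field names are illustrative)."""
    container_count: int
    memory_mb: int


@dataclass
class ResourceManager:
    # Free container slots per node; a stand-in for real node resource accounting.
    free_slots: Dict[str, int]

    def handle_first_request(self, spec: PoolSpec) -> List[str]:
        """Allocate the initial containers and return the node chosen for each."""
        nodes: List[str] = []
        for node in sorted(self.free_slots):  # hypothetical first-fit policy
            while self.free_slots[node] > 0 and len(nodes) < spec.container_count:
                self.free_slots[node] -= 1
                nodes.append(node)
        # The returned list plays the role of the first resource allocation message.
        return nodes


@dataclass
class ApplicationController:
    spec: PoolSpec
    pool: List[str] = field(default_factory=list)

    def prefill(self, rm: ResourceManager) -> None:
        """Send the first resource request and record where containers were placed."""
        self.pool = rm.handle_first_request(self.spec)


# The resource manager starts the controller with a configured initial specification,
# and the controller fills its pool before any client job arrives.
rm = ResourceManager(free_slots={"node-a": 2, "node-b": 2})
controller = ApplicationController(PoolSpec(container_count=3, memory_mb=1024))
controller.prefill(rm)
```

In this sketch the pool is populated before any client request exists, which is the property the text relies on to shorten job startup.
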
  • The resource manager starting the application controller when the trigger condition is met includes: the resource manager starts the application controller when it receives a request to pre-start the application controller or a request to pre-configure the resource pool, or when the system is initialized.
  • The application controller can be started in several ways: by an administrator according to user requirements, or by the resource manager according to system configuration.
  • The resource manager configuring the initial specification of the resource pool managed by the application controller includes: the resource manager configures the initial specification according to preset expected resource requirement information of applications; or the resource manager configures the initial specification according to collected usage information of the node resources of the distributed system.
  • For example, when many node resources are unused, the resource manager can configure a larger initial specification for the resource pool; when few node resources are unused, the resource manager can configure a smaller initial specification.
  • The initial specification of the resource pool can also be carried by the administrator in the startup command of the application controller, and the resource manager configures the application controller accordingly when it starts the application controller.
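
The two ways of choosing an initial specification can be illustrated with a short Python sketch. The function name `initial_pool_size`, the `share` parameter, and the 25% default are assumptions made for illustration; the application does not specify a formula.

```python
def initial_pool_size(unused_slots: int, expected_demand: int = 0,
                      share: float = 0.25) -> int:
    """Pick how many containers to pre-start for a resource pool."""
    if expected_demand > 0:
        # Option 1: size the pool from preset expected demand,
        # capped by the resources that are actually unused.
        return min(expected_demand, unused_slots)
    # Option 2: size the pool from collected usage information,
    # here a fixed (hypothetical) share of the unused resources.
    return int(unused_slots * share)
```

With these assumptions a mostly idle cluster yields a larger pool and a busy cluster a smaller one, matching the example in the text.
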
  • The present application provides a computer-readable medium comprising computer-executable instructions; when a processor of a computer executes the computer-executable instructions, the computer performs the method of the first aspect or any possible implementation of the first aspect.
  • The present application provides a computing device, including a processor, a memory, a bus, and a communication interface. The memory is configured to store execution instructions, and the processor is connected to the memory through the bus. When the computing device is running, the processor executes the execution instructions stored in the memory to cause the computing device to perform the method of the first aspect or any possible implementation of the first aspect.
  • The present application provides a method for allocating resource containers in a distributed system, where the distributed system includes a resource manager and a node manager, the resource manager is used to manage the node resources of the distributed system, the node manager is used to start resource containers based on the node resources, and a resource container is used to execute application tasks. The resource manager starts an application controller when a trigger condition is met, configures an initial specification of the resource pool managed by the application controller, and allocates an initial resource container for that resource pool according to the initial specification, and the initial resource container of the resource pool is started.
  • The method includes: the application controller receives a resource allocation request from a client, where the resource allocation request is used to request resource containers for an application running on the client and carries resource requirement information of the application; and, according to the resource requirement information of the application, the application controller selects an idle resource container from the resource pool and allocates it to the client.
  • Before the application controller receives the resource allocation request from the client, the method further includes: the application controller sends a first resource request to the resource manager according to the initial specification of the resource pool, where the first resource request carries a quantity of resources determined according to the initial specification of the resource pool; obtains the first resource allocation message of the resource pool sent by the resource manager, where the first resource allocation message includes information about the node on which the initial resource container allocated by the resource manager for the resource pool is located; and sends a startup request to the node manager to request the node manager to start the initial resource container of the resource pool.
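
As a small illustration of the last step, the node information in an allocation message can be grouped into per-node startup requests. The message shape (one node name per allocated container) and the function name `build_start_requests` are hypothetical; the application does not define a message format.

```python
from collections import Counter
from typing import Dict, List


def build_start_requests(allocated_nodes: List[str]) -> Dict[str, int]:
    """Group an allocation message by node: node name -> containers to start there."""
    return dict(Counter(allocated_nodes))


# Hypothetical allocation message: one entry per allocated container.
requests = build_start_requests(["node-a", "node-b", "node-a"])
```

Each entry in `requests` would correspond to one startup request sent to the node manager of that node.
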
  • Because the resource containers in the resource pool are requested from the resource manager in advance by the application controller, the application controller can allocate resource containers to the client promptly when it receives a resource allocation request from the client. This avoids the waiting time in the prior art, where the AM and the resource containers are started only after an application job is received.
  • The resource allocation request includes an identifier of the client. After the application controller selects an idle resource container from the resource pool and allocates it to the client, the method further includes: the application controller sends an indication message to each resource container allocated to the client, where the indication message carries the identifier of the client and indicates that the resource container is allocated to the client.
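
A minimal Python sketch of serving a client request from the pre-started pool is shown below. The `Container` fields, the memory-based matching rule, and the choice to return an empty list when the pool cannot satisfy the request are assumptions for illustration; recording `client_id` on the container stands in for the indication message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Container:
    container_id: str
    memory_mb: int
    client_id: Optional[str] = None  # None means the container is idle


def allocate(pool: List[Container], client_id: str,
             count: int, min_memory_mb: int) -> List[Container]:
    """Select idle containers that satisfy the request and assign them to the client."""
    candidates = [c for c in pool
                  if c.client_id is None and c.memory_mb >= min_memory_mb]
    if len(candidates) < count:
        # Not enough suitable idle containers; a real controller might
        # queue the request or ask the resource manager for more.
        return []
    chosen = candidates[:count]
    for container in chosen:
        # Stands in for the indication message carrying the client identifier.
        container.client_id = client_id
    return chosen


pool = [Container("c1", 1024), Container("c2", 2048), Container("c3", 2048)]
granted = allocate(pool, "client-1", count=2, min_memory_mb=2048)
```

Because the containers are already running, the only work at request time is the selection and the notification.
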
  • The method further includes: the application controller determines the number of idle resource containers remaining in the resource pool; if that number is less than a preset first threshold, the application controller sends a second resource request to the resource manager, receives a second resource allocation message of the resource pool from the resource manager, and, according to the second resource allocation message, sends a startup request to the node manager to request the node manager to start the new resource containers allocated for the resource pool.
  • The application controller may determine the number of idle resource containers remaining in the resource pool, and may do so periodically. When the number of remaining idle resource containers is less than the preset first threshold, the resource containers of the resource pool may be unable to satisfy subsequent allocations, so the application controller requests more resource containers from the resource manager to replenish the resource pool.
  • The method further includes: the application controller determines the number of idle resource containers remaining in the resource pool; if that number is greater than a preset second threshold, the application controller sends a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
  • The application controller may determine the number of idle resource containers remaining in the resource pool, and may do so periodically. When the number of remaining idle resource containers is greater than the preset second threshold, the application controller sends a resource release message to at least one of the idle resource containers, thereby releasing resources occupied by the resource pool and ensuring reasonable use of global resources.
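
The two threshold checks can be combined into one decision function, sketched below in Python. The `target` parameter (the pool size to return to) is an assumption; the text only says that containers are requested below the first threshold and that at least one is released above the second.

```python
def rebalance(idle_count: int, first_threshold: int,
              second_threshold: int, target: int) -> int:
    """Return how many containers to request (>0) or release (<0); 0 means no action."""
    if not first_threshold <= target <= second_threshold:
        raise ValueError("target must lie between the two thresholds")
    if idle_count < first_threshold:
        # Too few idle containers: send a second resource request.
        return target - idle_count
    if idle_count > second_threshold:
        # Too many idle containers: send resource release messages.
        return target - idle_count
    return 0
```

A controller could call such a function after each allocation or on a timer, which corresponds to the periodic check mentioned in the text.
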
  • The application controller maintains state information of each resource container in the resource pool, where the state information indicates whether the corresponding resource container is idle. The application controller selecting an idle resource container from the resource pool and allocating it to the client includes: the application controller selects an idle resource container in the resource pool according to the state information of each resource container in the resource pool, and allocates the selected resource container to the client.
  • The application controller may maintain a state information table that records the state information of each resource container in the resource pool, so that the application controller can quickly determine which resource containers are currently idle and allocate containers to clients according to the idle resource containers and the resource allocation requests.
  • In a sixth possible implementation of the fourth aspect, after the application controller selects an idle resource container from the resource pool and allocates it to the client, the method further includes: the application controller sets the state of each resource container allocated to the client to not idle.
  • After the application controller sets the state of each resource container allocated to the client to not idle, the method further includes: the application controller receives a status update message from each resource container allocated to the client, where the status update message indicates that the task assigned by the client is completed; and the application controller sets the state of each of those resource containers to idle.
  • the application controller dynamically updates the state information maintained by itself according to the change of the state of the resource container, thereby ensuring that the idle resource container in the resource pool can be accurately determined when there is a resource allocation request.
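
The state information table and its two transitions can be sketched in Python as follows. The class and method names are hypothetical, and a plain in-memory dictionary is assumed; the application does not prescribe a data structure.

```python
from enum import Enum
from typing import Dict, Iterable, List


class State(Enum):
    IDLE = "idle"
    NOT_IDLE = "not idle"


class StateTable:
    """Tracks whether each resource container in the pool is idle."""

    def __init__(self, container_ids: Iterable[str]) -> None:
        # Pre-started containers begin in the idle state.
        self._state: Dict[str, State] = {cid: State.IDLE for cid in container_ids}

    def idle(self) -> List[str]:
        """Return the identifiers of all currently idle containers."""
        return [cid for cid, state in self._state.items() if state is State.IDLE]

    def mark_allocated(self, container_id: str) -> None:
        """Called when a container is allocated to a client."""
        if self._state[container_id] is not State.IDLE:
            raise ValueError(f"{container_id} is already allocated")
        self._state[container_id] = State.NOT_IDLE

    def on_status_update(self, container_id: str) -> None:
        """Called when a container reports that its assigned task is complete."""
        self._state[container_id] = State.IDLE


table = StateTable(["c1", "c2"])
table.mark_allocated("c1")
idle_while_busy = table.idle()
table.on_status_update("c1")
idle_after_done = table.idle()
```

Keeping the table current on every allocation and every status update is what lets the controller answer "which containers are idle" without querying the containers themselves.
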
  • The resource allocation request further includes user permission information; the method further includes: the application controller verifies the user permission information against a preset user permission library and determines that the user permission library contains the user permission information.
  • The user permission library contains the user permission information of different users. If the user permission library does not contain the user permission information of the client, the resource allocation request of the client is rejected or not answered; if the user permission library contains the user permission information of the client, the subsequent resource allocation steps are performed.
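
The permission check can be sketched in Python as a membership test performed before allocation. Representing the user permission library as a set of strings and the return values `"allocate"` and `"rejected"` are assumptions made for illustration.

```python
from typing import Set


def handle_request(user_permission_info: str, permission_library: Set[str]) -> str:
    """Verify the requester against the permission library before allocating."""
    if user_permission_info not in permission_library:
        # Unknown user: reject (a controller could also simply not respond).
        return "rejected"
    # Known user: continue with the resource allocation steps.
    return "allocate"


# Hypothetical library contents.
library = {"alice:submit", "carol:submit"}
```
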
  • The present application provides a computer-readable medium comprising computer-executable instructions; when a processor of a computer executes the computer-executable instructions, the computer performs the method of the fourth aspect or any possible implementation of the fourth aspect.
  • The present application provides a computing device, including a processor, a memory, a bus, and a communication interface. The memory is configured to store execution instructions, and the processor is connected to the memory through the bus. When the computing device is running, the processor executes the execution instructions stored in the memory to cause the computing device to perform the method of the fourth aspect or any possible implementation of the fourth aspect.
  • The present application provides a resource container allocation apparatus in a distributed system, where the distributed system includes the apparatus and a node manager, the apparatus is configured to manage the node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and a resource container is used to execute application tasks. The apparatus includes: a startup unit, configured to start an application controller when a trigger condition is met and to configure an initial specification of the resource pool managed by the application controller; a receiving unit, configured to receive a first resource request of the resource pool sent by the application controller according to the initial specification of the resource pool; an allocating unit, configured to allocate an initial resource container for the resource pool according to the first resource request; and a sending unit, configured to send a first resource allocation message of the resource pool to the application controller, where the first resource allocation message includes information about the node on which the initial resource container allocated by the allocating unit for the resource pool is located.
  • The apparatus proactively starts the application controller and configures the initial specification of the resource pool managed by the application controller, so that the application controller requests a certain number of resource containers in advance according to the initial specification of the resource pool. The resource containers are thus started in advance, which reduces the time that subsequent application jobs wait for resource containers to start.
  • The startup unit being configured to start the application controller when the trigger condition is met includes: the startup unit is configured to start the application controller when a request to pre-start the application controller or a request to pre-configure the resource pool is received, or when the system is initialized.
  • The application controller can be started in several ways: by an administrator according to user requirements, or by the startup unit when the system is initialized.
  • The startup unit being configured to configure the initial specification of the resource pool managed by the application controller includes: the startup unit is configured to configure the initial specification according to preset expected resource requirement information of applications, or according to collected usage information of the node resources of the distributed system. For example, when many node resources are unused, the startup unit can configure a larger initial specification for the resource pool; when few node resources are unused, the startup unit can configure a smaller initial specification.
  • The initial specification of the resource pool may also be carried by the administrator in the startup command of the application controller, and the startup unit configures the application controller accordingly when it starts the application controller.
  • The present application provides a resource container allocation apparatus in a distributed system, where the distributed system includes a resource manager and a node manager, the resource manager is configured to manage the node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and a resource container is used to execute application tasks. The resource manager starts the apparatus when a trigger condition is met, configures an initial specification of the resource pool managed by the apparatus, and allocates an initial resource container for that resource pool according to the initial specification, and the initial resource container is started.
  • The apparatus includes: a receiving unit, configured to receive a resource allocation request from a client, where the resource allocation request is used to request resource containers for an application running on the client and carries resource requirement information of the application; and an allocating unit, configured to select, according to the resource requirement information of the application, an idle resource container from the resource pool and allocate it to the client.
  • The apparatus further includes a sending unit. Before the receiving unit receives the resource allocation request from the client, the sending unit is configured to send a first resource request to the resource manager according to the initial specification of the resource pool, where the first resource request carries a quantity of resources determined according to the initial specification of the resource pool; the receiving unit is further configured to receive the first resource allocation message sent by the resource manager, where the first resource allocation message includes information about the node on which the initial resource container allocated by the resource manager for the resource pool is located; and the sending unit is further configured to send a startup request to the node manager, requesting the node manager to start the initial resource container of the resource pool.
  • The allocating unit can allocate already-started resource containers to the client promptly, which avoids the waiting time in the prior art, where the AM and the resource containers are started only after an application job is received.
  • The apparatus further includes a sending unit, and the resource allocation request includes an identifier of the client. After the allocating unit selects an idle resource container from the resource pool and allocates it to the client, the sending unit is configured to send an indication message to each resource container allocated to the client, where the indication message carries the identifier of the client and indicates that the resource container is allocated to the client.
  • The apparatus further includes a determining unit and a sending unit. The determining unit is configured to determine the number of idle resource containers remaining in the resource pool. If that number is less than a preset first threshold, the sending unit is configured to send a second resource request to the resource manager; the receiving unit is further configured to receive a second resource allocation message of the resource pool sent by the resource manager; and the sending unit is further configured to send, according to the second resource allocation message, a startup request to the node manager, requesting the node manager to start the new resource containers allocated for the resource pool.
  • The determining unit may determine the number of idle resource containers remaining in the resource pool, and may do so periodically. When the number of remaining idle resource containers is less than the preset first threshold, the resource containers of the resource pool may be unable to satisfy subsequent allocations, so the sending unit requests more resource containers from the resource manager to replenish the resource pool.
  • The apparatus further includes a determining unit and a sending unit. The determining unit is configured to determine the number of idle resource containers remaining in the resource pool. If that number is greater than a preset second threshold, the sending unit is configured to send a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
  • The determining unit may determine the number of idle resource containers remaining in the resource pool, and may do so periodically. When the number of remaining idle resource containers is greater than the preset second threshold, the sending unit sends a resource release message to at least one of the idle resource containers, thereby releasing resources occupied by the resource pool and ensuring reasonable use of global resources.
  • The allocating unit is further configured to maintain state information of each resource container in the resource pool, where the state information indicates whether the corresponding resource container is idle. The allocating unit being configured to select an idle resource container from the resource pool includes: the allocating unit is configured to select an idle resource container in the resource pool according to the state information of each resource container in the resource pool, and to allocate the selected resource container to the client.
  • The allocating unit may maintain a state information table that records the state information of each resource container in the resource pool, so that the allocating unit can quickly determine which resource containers are currently idle and allocate containers to clients according to the idle resource containers and the resource allocation requests.
  • After the allocating unit selects an idle resource container from the resource pool and allocates it to the client, the allocating unit is further configured to set the state of each resource container allocated to the client to not idle.
  • After the allocating unit sets the state of each resource container allocated to the client to not idle, the receiving unit is further configured to receive a status update message from each resource container allocated to the client, where the status update message indicates that the task assigned by the client is completed; and the allocating unit is further configured to set the state of each of those resource containers to idle according to the status update message.
  • the allocation unit dynamically updates the state information maintained by itself according to the change of the state of the resource container, thereby ensuring that the idle resource container in the resource pool can be accurately determined when there is a resource allocation request.
  • The resource allocation request further includes user permission information; the receiving unit is further configured to verify the user permission information against a preset user permission library and to determine that the user permission library contains the user permission information.
  • The user permission library contains the user permission information of different users. If the user permission library does not contain the user permission information of the client, the resource allocation request of the client is rejected or not answered; if the user permission library contains the user permission information of the client, the subsequent resource allocation steps are performed.
  • The present application provides a resource container allocation system in a distributed system, where the distributed system includes a resource manager and a node manager. The resource manager is configured to start an application controller when a trigger condition is met, configure an initial specification of the resource pool managed by the application controller, receive a first resource request of the resource pool sent by the application controller according to the initial specification of the resource pool, allocate an initial resource container for the resource pool according to the first resource request, and send a first resource allocation message of the resource pool to the application controller. The application controller is configured to obtain the first resource allocation message of the resource pool sent by the resource manager and, according to the information in the first resource allocation message about the node on which the initial resource container allocated for the resource pool is located, request the node manager to start the initial resource container of the resource pool. The node manager is configured to start the initial resource container according to the request of the application controller.
  • The resource manager being configured to start the application controller when the trigger condition is met includes: the resource manager is configured to start the application controller when it receives a request to pre-start the application controller or a request to pre-configure the resource pool.
  • The resource manager being configured to configure the initial specification of the resource pool managed by the application controller includes: the resource manager is configured to configure the initial specification according to preset expected resource requirement information of applications, or according to collected usage information of the node resources of the distributed system.
  • The application controller is further configured to receive a resource allocation request from a client, where the resource allocation request is used to request resource containers for an application running on the client, and to select an idle resource container from the resource pool and allocate it to the client according to the resource requirement information of the application carried in the resource allocation request.
  • The resource allocation request includes an identifier of the client. After selecting an idle resource container from the resource pool and allocating it to the client, the application controller is further configured to send an indication message to each resource container allocated to the client, where the indication message carries the identifier of the client and indicates that the resource container is allocated to the client.
  • The application controller is further configured to: determine the number of idle resource containers remaining in the resource pool; if that number is less than a preset first threshold, send a second resource request to the resource manager, receive a second resource allocation message of the resource pool from the resource manager, and, according to the second resource allocation message, send a startup request to the node manager, requesting the node manager to start the new resource containers allocated for the resource pool.
  • The application controller is further configured to: determine the number of idle resource containers remaining in the resource pool; if that number is greater than a preset second threshold, send a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
  • in a seventh possible implementation manner of the ninth aspect, the application controller is further configured to maintain status information of each resource container in the resource pool, where the status information is used to indicate whether the corresponding resource container is idle.
  • when selecting an idle resource container from the resource pool to allocate to the client, the application controller is further configured to determine the idle resource containers in the resource pool according to the status information of each resource container in the resource pool.
  • after selecting an idle resource container from the resource pool and allocating it to the client, the application controller is further configured to set the status of each resource container allocated to the client to not idle.
  • after setting the status of each resource container allocated to the client to not idle, the application controller is further configured to: receive a status update message from each resource container allocated to the client, where the status update message is used to indicate that the task assigned by the client is completed; and set the status of each resource container allocated to the client to idle.
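  • The idle / not idle bookkeeping described above can be sketched as a small state table. This is a minimal sketch, not the patented implementation: the class name `ResourcePoolState`, its method names, and the use of string identifiers for containers and clients are all hypothetical.

```python
class ResourcePoolState:
    """Tracks, per resource container, whether it is idle (hypothetical sketch)."""

    def __init__(self, container_ids):
        # Every container that was started in advance begins as idle.
        self._idle = {cid: True for cid in container_ids}
        self._client_of = {}

    def idle_containers(self):
        # Determine the idle containers from the status information.
        return [cid for cid, idle in self._idle.items() if idle]

    def allocate(self, client_id, count):
        idle = self.idle_containers()
        if len(idle) < count:
            raise RuntimeError("not enough idle resource containers")
        chosen = idle[:count]
        for cid in chosen:
            # Mark as not idle; an indication message carrying client_id
            # would be sent to the container at this point.
            self._idle[cid] = False
            self._client_of[cid] = client_id
        return chosen

    def on_status_update(self, container_id):
        # The container reports that the client's task is completed,
        # so it becomes idle again and can be reused.
        self._idle[container_id] = True
        self._client_of.pop(container_id, None)
```

  • Because a container returns to the idle state instead of being shut down, the same started container can serve a later request without a new start-up.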
  • the resource allocation request further includes user rights information;
  • the application controller is further configured to verify the user rights information against a preset user rights library and to confirm that the user rights library contains the user rights information.
  • the ninth aspect is the system implementation corresponding to the first aspect and the fourth aspect; the feature descriptions in the first aspect or any possible implementation manner of the first aspect, and in the fourth aspect or any possible implementation manner of the fourth aspect, are applicable to the ninth aspect or any possible implementation manner of the ninth aspect, and details are not described herein again.
  • the application controller applies to the resource manager for resource containers in advance and starts the requested resource containers in advance.
  • resource containers can therefore be allocated promptly, avoiding the waiting time for starting a resource container; the resource containers in the resource pool are reused, avoiding the resource consumption of repeatedly starting and stopping resource containers; and the application controller directly manages the resource containers in the resource pool, which enables more flexible management.
  • FIG. 1 is a block diagram of an exemplary networked environment of a resource allocation system in accordance with an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a hardware of a computing device according to an embodiment of the invention.
  • FIG. 3 is a signaling diagram of a resource allocation method according to an embodiment of the present invention.
  • FIG. 4 is a signaling diagram of a resource allocation method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram showing the logical structure of a resource allocation apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram showing the logical structure of a resource allocation apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram showing the logical structure of a resource allocation apparatus according to an embodiment of the present invention.
  • Resource Manager is a global resource manager responsible for resource management and allocation of the entire yarn system.
  • a resource container is a resource abstraction in Yarn that encapsulates multiple types of resources on a node, such as memory, CPU, disk, and network.
  • the resource container is used to perform the tasks of the application.
  • the client is a device running an application for which resources are to be requested, and the type of the application may be MapReduce, Giraph, Storm, Spark, Tez/Impala, or Message Passing Interface (MPI).
  • the Node Manager is the resource and task manager on each node. On the one hand, it periodically reports the resource usage on the node and the running status of each resource container to the RM; on the other hand, it receives and processes requests, such as requests to start or stop resource containers, from the application controller.
  • the application controller is used to apply to the RM for the resource pool and to manage the resource pool.
  • the resource pool consists of resource containers that have already been started, and these resource containers are allocated to the client according to the resource allocation request of the client.
  • FIG. 1 shows a block diagram of an exemplary networked environment of a resource allocation system 100. As shown in FIG. 1, the system 100 includes a client 102, an application controller 104, a resource manager 112, and a plurality of nodes 106. Each node 106 includes a node manager 108 and at least one resource container 110, and the resource pool 114 contains at least one resource container 110.
  • the client 102 runs an application for which resources are to be requested, and the type of the application may be MapReduce, Giraph, Storm, Spark, Tez/Impala, or MPI.
  • the client 102 can be any type of computing device, which is not limited by the embodiment of the present invention.
  • the resource container 110 is a resource abstraction in a node, which can encapsulate multiple types of resources on a node, such as memory, CPU, disk, network, and the like.
  • the resource container 110 may also encapsulate only a part of the resources on a certain node, for example, only the memory and the CPU are encapsulated, which is not limited by the embodiment of the present invention.
  • the resource container 110 can run any type of task.
  • a MapReduce application can request a resource container 110 to initiate a map or reduce task
  • a Giraph application can request a resource container 110 to run a Giraph task.
  • Users can also implement a custom application type that runs a specific task through the resource container 110 to implement a completely new application framework.
  • the application controller 104 is configured to manage the resource container 110 in the resource pool 114.
  • the resource containers in the resource pool 114 are resource containers that have already been started; the application controller 104 applies to the resource manager 112 for them and starts them in advance. The tasks of a job can therefore be performed as soon as possible, saving the startup time of the resource container 110.
  • when the application controller 104 requests resources from the resource manager 112, the resources returned by the resource manager 112 to the application controller 104 are represented as resource containers 110.
  • the resource pool 114 shown in FIG. 1 includes multiple resource containers 110 on the plurality of nodes 106, but this does not mean that all of the nodes 106, or all of the resource containers 110 on those nodes, belong to the resource pool 114.
  • the resource pool 114 is composed of resource containers 110.
  • the application controller 104 is configured to receive a resource allocation request from an application on the client 102 and to allocate to the application the resource containers required by the application's job.
  • the client 102 assigns a resource container to each task in the application's job, and a task can only use the resources described in its resource container.
  • the resource manager 112 is a global resource manager responsible for resource management and allocation of the entire system. When receiving a resource request from the application controller 104, it can allocate resource containers to the application controller 104 according to the load of the entire system.
  • the client 102, the application controller 104, the resource manager 112, and the node manager 108 of each node 106 can communicate through a network, where the network can be the Internet, an intranet, a local area network (LAN), a wireless local area network (WLAN), a storage area network (SAN), etc., or a combination of the above.
  • FIG. 1 merely shows exemplary participants of the system 100 and their interrelationships. The depicted system 100 is therefore greatly simplified; the embodiments of the present invention describe it only in general terms, and its implementation is not limited in any way.
  • the client 102, the application controller 104, and the node 106 in FIG. 1 may be of any architecture, which is not limited by the embodiment of the present invention.
  • the application controller 104 and/or resource manager 112 shown in FIG. 1 can be implemented by the computing device 200 shown in FIG. 2.
  • computing device 200 includes a processor 202, a memory unit 204, an input/output interface 206, a communication interface 208, a bus 210, and a storage device 212.
  • the processor 202, the memory unit 204, the input/output interface 206, the communication interface 208, and the storage device 212 are communicatively connected to each other via the bus 210.
  • the processor 202 is a control center of the computing device 200 for executing related programs to implement the technical solutions provided by the embodiments of the present invention.
  • the processor 202 includes one or more central processing units (CPUs), such as the central processing unit 1 and the central processing unit 2 shown in FIG.
  • the computing device 200 can also include multiple processors 202, each of which can be a single core processor (including one CPU) or a multi-core processor (including multiple CPUs).
  • a component for performing a specific function, for example, the processor 202 or the memory unit 204, may be implemented by configuring a general-purpose component to perform the corresponding function, or may be implemented by a dedicated component that specifically performs that function.
  • the processor 202 can be a general-purpose central processing unit, a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits for executing related programs to implement the technical solutions provided by the present application.
  • Processor 202 can be coupled to one or more storage schemes via bus 210.
  • the storage scheme can include a memory unit 204 and a storage device 212.
  • the storage device 212 can be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • Memory unit 204 can be a random access memory.
  • the memory unit 204 can be integrated with the processor 202 or integrated inside the processor 202, or it can be one or more memory units independent of the processor 202.
  • Program code for execution by the processor 202 or a CPU internal to the processor 202 may be stored in the storage device 212 or the memory unit 204.
  • program code (e.g., an operating system, an application, a resource allocation module, or a communication module) stored in the storage device 212 is copied to the memory unit 204 for execution by the processor 202.
  • the storage device 212 can be a physical hard disk or a partition thereof (including small computer system interface storage or a global network block device volume), a network storage protocol (including a network file system such as NFS, or a cluster file system), a file-based virtual storage device (a virtual disk image), or a logical-volume-based storage device. It may include high speed random access memory (RAM), and may also include non-volatile memory, such as one or more disk memories, flash memories, or other non-volatile memory.
  • the storage device may further include remote storage separate from the one or more processors 202, such as a network disk accessed through the communication interface 208 and a communication network, which may be the Internet, an intranet, a local area network (LAN), a wireless local area network (WLAN), a storage area network (SAN), etc., or a combination of the above networks.
  • the operating system includes various software components and/or drivers that control and manage general system tasks (such as memory management, storage device control, and power management) and that facilitate communication between various hardware and software components.
  • the input/output interface 206 is for receiving input data and information, and outputting data such as operation results.
  • Communication interface 208 enables communication between computing device 200 and other devices or communication networks using a transceiver apparatus such as, but not limited to, a transceiver.
  • Bus 210 may include a path for communicating information between various components of computing device 200, such as processor 202, memory unit 204, input/output interface 206, communication interface 208, and storage device 212.
  • the bus 210 can use a wired connection or a wireless communication mode, which is not limited in this application.
  • although the computing device 200 shown in FIG. 2 shows only the processor 202, the memory unit 204, the input/output interface 206, the communication interface 208, the bus 210, and the storage device 212, those skilled in the art will appreciate that, in a specific implementation, the computing device 200 also includes other devices necessary to achieve proper operation.
  • the computing device 200 can be a general purpose computer or a special purpose computing device, including but not limited to a portable computer, a personal desktop computer, a network server, a tablet computer, a mobile phone, a personal digital assistant (PDA), or the like, or a combination of two or more of the above. The present application does not limit the specific implementation of the computing device 200.
  • computing device 200 of FIG. 2 is merely an example of one computing device 200, which may include more or fewer components than those shown in FIG. 2, or have different component configurations.
  • computing device 200 may also include hardware devices that implement other additional functions, depending on the particular needs.
  • computing device 200 may also only include the components necessary to implement embodiments of the present invention, and does not necessarily include all of the devices shown in FIG.
  • the various components shown in Figure 2 can be implemented in hardware, software, or a combination of hardware and software.
  • FIG. 2 and the foregoing description are applicable to various computing devices provided by the embodiments of the present invention, and are applicable to performing various resource allocation methods provided by the embodiments of the present invention.
  • the memory unit 204 of the computing device 200 includes a resource allocation module, and the processor 202 executes the resource allocation module program code to implement resource management and allocation.
  • the resource allocation module can be comprised of one or more operational instructions to cause computing device 200 to perform one or more method steps in accordance with the above description. The specific method steps are described in detail in the following sections of this application.
  • the distributed system includes a resource manager 112, a node manager 108, and at least two nodes 106.
  • the resource manager 112 is configured to manage the node resources of the distributed system, and the node manager 108 is configured to start the resource container 110 based on the node resources.
  • the resource container 110 encapsulates the resource of the node 106 for performing the task of the application.
  • the resource allocation process includes:
  • the resource manager 112 launches the application controller 104.
  • the resource manager 112 is configured to manage resources of the at least two nodes 106, to launch the application controller 104 when a trigger condition is met, and to configure an initial specification of the resource pool 114 managed by the application controller 104.
  • the resource manager 112 may start the application controller 104 when the system is initialized. Alternatively, the application controller 104 may be started dynamically by the user as required; for example, the resource manager 112 may start the application controller 104 upon receiving a request to pre-launch the application controller 104 or a request to pre-configure the resource pool 114. The application controller 104 may also be started by the resource manager 112 according to its own resource status; for example, when requests from a certain type of application are frequently received, the application controller 104 of the corresponding application type can be started. It should be understood that embodiments of the present invention do not limit the startup form of the application controller 104.
  • the resource manager 112 configures an initial specification of the resource pool 114 managed by the application controller 104, so that when the application controller 104 subsequently applies to the resource manager 112 for resources, it determines from this initial specification the number and specifications of the resource containers to request from the resource manager 112.
  • the initial specification of the resource pool 114 may be the number of resource containers initially included in the resource pool 114 and the specification of each resource container, where the specification of a resource container is the types of resources included in the resource container and the quantity of each type of resource. If the resource container specification is set in advance, the initial specification of the resource pool 114 may be the number of resource containers initially included in the resource pool 114. If resource container specifications are configured in advance and there are multiple specifications, with the resource types in each specification and the quantity of each type set in advance, the initial specification of the resource pool 114 may be the identifiers of the resource container specifications initially included in the resource pool 114 and the number of resource containers of each specification.
  • the initial specification of the resource pool 114 may also be the type of various resources initially included by the resource pool 114, as well as the number of each resource.
  • the embodiment of the present invention does not limit the form of the initial specification of the resource pool 114, and the representation of the number of resources may be different in different scenarios.
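  • To make the relationship between these forms concrete, the following Python sketch converts the "specification identifier plus count" form into the "resource type plus total quantity" form. The function name `total_resources`, the specification identifiers, and the resource keys `memory_mb` and `vcores` are assumptions chosen for this example and do not come from the embodiments.

```python
def total_resources(spec_counts, spec_table):
    """Aggregate per-specification container counts into resource totals.

    spec_counts: {specification identifier: number of containers}
    spec_table:  {specification identifier: {resource type: quantity}}
    """
    totals = {}
    for spec_id, count in spec_counts.items():
        for resource, amount in spec_table[spec_id].items():
            totals[resource] = totals.get(resource, 0) + amount * count
    return totals


# Hypothetical preset specifications, for illustration only.
SPEC_TABLE = {
    "small": {"memory_mb": 1024, "vcores": 1},
    "large": {"memory_mb": 4096, "vcores": 2},
}
```

  • For example, an initial specification of two "small" containers and one "large" container corresponds to 6144 MB of memory and 4 virtual cores in total under these assumed specifications.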
  • the resource manager 112 may configure the initial specification of the resource pool 114 managed by the application controller 104 according to preset expected resource requirement information of the application; the resource manager 112 may also configure the initial specification of the resource pool 114 managed by the application controller 104 according to collected usage information of the node resources of the distributed system.
  • the resource manager 112 may configure a larger initial specification for the resource pool 114 when there are sufficient unused node resources, and a smaller initial specification for the resource pool 114 when there are fewer unused node resources.
  • the initial specification of the resource pool 114 can be carried by the user in the startup command of the application controller 104, and the resource manager 112 configures it for the application controller 104 when the application controller 104 is launched.
  • the application controller 104 may have a one-to-one correspondence with the application. Because different applications have different requirements for resource containers, the types of resources and the quantity of each resource in the resource containers corresponding to different applications may differ, and the initial specifications of the resource pools 114 corresponding to different applications may also differ, which is not limited by the embodiment of the present invention.
  • the application controller 104 sends a first resource request to the resource manager 112.
  • the application controller 104 may also register with the resource manager 112 before sending the first resource request to the resource manager 112, so that the resource manager 112 can subsequently manage the application controller 104.
  • the application controller 104 can also register with a registration server, and the address information of the registration server is distributed to the client 102 so that the client 102 can subsequently query the address information of the application controller 104 through the registration server.
  • the registration server can store the mapping between the type of the application and the application controller 104.
  • the type of the application can be MapReduce, Giraph, Storm, Spark, Tez/Impala, or MPI.
  • for example, the registration server stores the correspondence between the Spark application and the application controller 104 corresponding to the Spark application. When the application running on the client 102 is Spark, the client 102 can find the address information of the application controller 104 corresponding to the Spark application according to this correspondence. It is also possible to set up a separate registration server for a certain type of application; for example, a dedicated registration server can be set up for Spark applications.
  • the embodiment of the present invention does not limit the specific implementation manner of the registration server.
  • each type of application may correspond to multiple application controllers 104, and the multiple application controllers 104 may be used by different users, thereby facilitating the expansion of the distributed system scale.
  • the correspondence between the client 102 and the application controller 104 may be maintained on the registration server, or multiple registration servers may be set up, with each registration server corresponding to a different user. It should be understood that the embodiment of the present invention does not limit the number of application controllers 104 corresponding to each type of application or the implementation manner of the registration server.
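  • One possible shape for such a registration server is a lookup table keyed by application type and, optionally, by user. The sketch below is hypothetical: the class name `RegistrationServer`, its methods, the address strings, and the fallback from a user-specific entry to a shared entry are assumptions, since the embodiments leave the implementation open.

```python
class RegistrationServer:
    """Maps an application type (and optionally a user) to the address
    of an application controller. Hypothetical sketch."""

    def __init__(self):
        self._controllers = {}

    def register(self, app_type, address, user=None):
        # An application controller registers its address for one
        # application type; user=None means "shared by all users".
        self._controllers[(app_type, user)] = address

    def lookup(self, app_type, user=None):
        # Prefer a controller dedicated to this user, then fall back to
        # the shared controller for the application type (assumption).
        dedicated = self._controllers.get((app_type, user))
        if dedicated is not None:
            return dedicated
        return self._controllers.get((app_type, None))
```

  • A client running a Spark application would call `lookup("Spark")` and send its resource allocation request to the returned address; a lookup for an unregistered application type returns `None`.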
  • the application controller 104 carries the quantity of resources determined according to the initial specification of the resource pool in the first resource request sent to the resource manager 112.
  • the number of resources may be the number of resource containers requested by the application controller 104 to the resource manager 112 and the specifications of each resource container, wherein the specification of the resource container refers to the kind of resources included in the resource container and the number of each resource.
  • the resources included in each resource container may include one or more of a processor resource, a memory resource, a network, a disk, and the like. It should be understood that, according to the type of application, the resource type in the corresponding requested resource container and the number of each resource may be different, which is not limited by the embodiment of the present invention.
  • if the specification of the resource container is set in advance, the quantity of resources carried in the first resource request sent by the application controller 104 to the resource manager 112 may be the number of resource containers. If resource container specifications are configured in advance and there are multiple specifications, with the resource types in each specification and the quantity of each type set in advance, the quantity of resources may be the identifiers of the resource container specifications that the application controller 104 requests from the resource manager 112 and the number of resource containers of each specification.
  • the amount of resources may also be the kind of various resources that the application controller 104 needs, as well as the number of each resource.
  • the embodiment of the present invention does not limit the form of the quantity of resources requested in the first resource request, and the representation of the quantity of resources may be different in different scenarios.
  • the first resource request may further carry node information and/or rack information in which the resource container requested by the application controller 104 is located.
  • the application controller 104 may preferentially request a host that has a shorter link to the application controller 104 as the node on which the requested resource containers run, thereby enabling more efficient control of the resource pool 114.
  • the resource manager 112 determines a node and allocates a resource container to the application controller 104 on the determined node.
  • the resource manager 112 is responsible for global resource management and allocation. After receiving the first resource request of the resource pool 114, which the application controller 104 sends according to the initial specification of the resource pool 114, the resource manager 112 allocates initial resource containers for the resource pool according to the first resource request. Specifically, after receiving the first resource request, the resource manager 112 first determines the candidate nodes and then, according to the quantity of resources requested in the first resource request of the resource pool 114, allocates the initial resource containers for the resource pool 114 on those nodes.
  • the resource manager 112 preferentially allocates resource containers to the application controller 104 from the host node and/or rack requested in the first resource request. If the requested host node and/or rack cannot currently satisfy the request, for example, because the requested host cannot satisfy the request due to its load, a resource container may be allocated for the application controller 104 in the rack where the node is located. If that rack cannot satisfy the request because of load balancing, a resource container may be allocated for the application controller 104 in a rack adjacent to that rack.
  • the resource manager 112 may allocate resource containers to the application controller 104 according to the load balancing of the nodes, or may preferentially allocate resources to the application controller 104 on a host or rack that has a short link to the application controller 104.
  • embodiments of the present invention do not limit the policy by which the resource manager 112 allocates resource containers to the application controller 104.
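  • As one illustrative policy among those permitted above, the fallback from the requested host to its rack and then to an adjacent rack can be sketched as follows. The function `choose_node`, the `rack_of` and `adjacent` mappings, and the `has_capacity` callback are hypothetical names introduced for this sketch; how load is measured is left to the caller.

```python
def choose_node(requested_host, rack_of, adjacent, has_capacity):
    """Pick a node for one resource container, preferring locality.

    rack_of:      {node: rack} for every node in the system
    adjacent:     {rack: [neighbouring racks]}
    has_capacity: callable(node) -> bool, True if the node can host
                  another resource container under the current load
    """
    # 1. Prefer the host named in the first resource request.
    if has_capacity(requested_host):
        return requested_host
    # 2. Otherwise try another node in the same rack.
    rack = rack_of[requested_host]
    for node, node_rack in rack_of.items():
        if node_rack == rack and node != requested_host and has_capacity(node):
            return node
    # 3. Otherwise try nodes in racks adjacent to that rack.
    for neighbour in adjacent.get(rack, []):
        for node, node_rack in rack_of.items():
            if node_rack == neighbour and has_capacity(node):
                return node
    # No suitable node was found; the request stays pending.
    return None
```

  • A load-balancing policy or a shortest-link policy could replace this function without changing the rest of the flow, which is why the selection is isolated in a single call here.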
  • the resource manager 112 sends a first resource allocation message to the application controller 104.
  • the resource manager 112 sends a first resource allocation message of the resource pool 114 to the application controller 104, where the first resource allocation message includes an initial resource allocated by the resource manager 112 to the resource pool 114 managed by the application controller 104. Information about the node where the container is located.
  • the first resource allocation message may carry the specifications of the resource containers that the resource manager 112 allocates to the application controller 104, where the specification of a resource container is the kinds of resources that the resource container contains and the quantity of each resource.
  • the information of the node where each resource container is located may be carried in the first resource allocation message. If the resource manager 112 allocates multiple resource containers to the application controller 104 on the same node, the first resource allocation message may also carry the number of resource containers allocated in each node.
  • the first resource allocation message may carry the allocation.
  • the resource manager 112 may not immediately return resources that meet the requirements to the application controller 104; instead, the application controller 104 may need to communicate with the resource manager 112 continuously, detect the resources that have been allocated, and pull them.
  • the application controller 104 launches a resource container.
  • the application controller 104 sends a start request to the node manager requesting the node manager to initiate the initial resource container of the resource pool 114.
  • after obtaining the resource containers allocated by the resource manager 112, the application controller 104 sends a start request to each node where a resource container is located, more specifically, to the node manager of each such node, to start the resource containers allocated by the resource manager 112.
  • the start request also carries the specification of the resource container, where the specification of a resource container is the types of resources included in the resource container and the quantity of each resource.
  • if the resource types in each specification and the quantity of resources of each type are set in advance, the start request carries the specification identifier of the resource container to be started.
  • if the resource manager 112 allocates multiple resource containers on the same node for the application controller 104, the multiple resource containers may be started by carrying them in a single start request.
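  • The batching of several resource containers on one node into a single start request can be sketched as a simple grouping step. The function name `build_start_requests` and the dictionary layout of a start request are assumptions for illustration; the embodiments do not define a message format.

```python
from collections import defaultdict


def build_start_requests(allocated):
    """Group allocated containers by node so that each node manager
    receives one start request.

    allocated: iterable of (node, container_id, spec_id) tuples taken
               from the resource allocation message (hypothetical layout).
    """
    per_node = defaultdict(list)
    for node, container_id, spec_id in allocated:
        # Each entry carries the specification identifier of the
        # container to be started.
        per_node[node].append({"container_id": container_id, "spec_id": spec_id})
    return [
        {"node": node, "containers": containers}
        for node, containers in per_node.items()
    ]
```

  • With three allocated containers, two of them on the same node, this yields two start requests instead of three.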
  • after receiving the start request, the node manager of the node where the resource container is located first performs resource localization, that is, it creates a working directory for the resource container and downloads the resources required for running the resource container (jar packages, executable files, etc.) from the distributed file system (Hadoop Distributed File System, HDFS), and then starts the resource container.
  • the started resource containers constitute the resource pool 114.
  • because the resource containers in the resource pool 114 are already started, a task can be executed quickly when it is received, avoiding the startup time before the task and thereby speeding up the execution of the application task.
  • the application controller 104 receives a resource allocation request from the client 102.
  • the client 102 runs an application, and the type of the application may be MapReduce, Giraph, Storm, Spark, Tez/Impala, or MPI.
  • when the application running on the client 102 requires resources for task processing, the client 102 sends a resource allocation request to the application controller 104.
  • the client 102 can query the address information of the application controller 104 corresponding to the application through a registration server, and send the resource allocation request to the application controller 104 indicated by the corresponding address information.
  • the registration server stores the corresponding information of the application controller 104 and the application, and different applications may have different application controllers 104.
  • Because the resource containers of different applications have different specifications, an application controller 104 and a resource pool 114 are configured for each type of application, so that resource containers are used efficiently.
  • the Spark application is used as an example.
  • The specific implementation process can be as follows: the user submits a Spark application through spark-submit.
  • The user program automatically creates a SparkContext.
  • The SparkContext is the interface that Spark exposes to external programs.
  • the schedulerBackend queries the registration server, searches for the address information of the application controller 104 corresponding to the Spark application, and sends the resource allocation request to the corresponding application controller 104.
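  • The registration-server lookup described above can be sketched as a simple mapping from application type to controller address. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class names RegistrationServer and ControllerAddress, the method names, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControllerAddress:
    """Address of an application controller (hypothetical structure)."""
    host: str
    port: int


class RegistrationServer:
    """Stores which application controller serves which application type."""

    def __init__(self):
        self._controllers = {}  # application type -> ControllerAddress

    def register(self, app_type, address):
        self._controllers[app_type] = address

    def lookup(self, app_type):
        # Different application types may map to different controllers.
        if app_type not in self._controllers:
            raise LookupError(f"no application controller for {app_type!r}")
        return self._controllers[app_type]


registry = RegistrationServer()
registry.register("spark", ControllerAddress("10.0.0.5", 7077))
registry.register("mapreduce", ControllerAddress("10.0.0.6", 7078))

# A Spark client resolves its controller before sending the allocation request.
spark_controller = registry.lookup("spark")
```

  • In this sketch an unknown application type raises an error; the source does not specify that case, so this behavior is an assumption.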
  • the resource allocation request carries resource requirement information of the application requested by the client 102 to the application controller 104.
  • the resource requirement information of the application in the resource allocation request may be the number of resource containers requested, and the specifications of each resource container.
  • the specification of a resource container is the kind of resource that the resource container contains, and the number of each resource.
  • the resource requirement information of the application may be the requested resource container number.
  • the resource requirement information of the application may be the requested resource.
  • the resource requirement information of the application in the resource allocation request may also be the kind of various resources requested by the client 102 to the application controller 104, and the number of each resource.
  • The resource allocation request further includes user rights information of the client 102. The application controller 104 verifies the user rights information of the client 102 against a preset user rights library, which contains the user rights information of different users. If the user rights library does not contain the user rights information of the client 102, the resource allocation request of the client 102 is rejected or left unanswered; if the user rights library contains the user rights information of the client 102, the following steps are performed.
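  • The contents of the resource allocation request and the rights check can be modeled roughly as follows. This is a sketch only: the field names (cpu_cores, memory_mb, user, and so on), the use of a user name as the rights information, and the set-based rights library are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerSpec:
    """Kinds of resources in one container and the amount of each."""
    cpu_cores: int
    memory_mb: int


@dataclass(frozen=True)
class ResourceAllocationRequest:
    client_id: str        # identifier of the requesting client
    user: str             # stands in for the user rights information
    container_count: int  # number of resource containers requested
    spec: ContainerSpec   # specification of each requested container


def handle_request(request, rights_library):
    """Return True if the request may proceed to allocation, False otherwise."""
    return request.user in rights_library


rights_library = {"alice", "bob"}  # hypothetical preset user rights library
spec = ContainerSpec(cpu_cores=2, memory_mb=4096)
accepted = handle_request(
    ResourceAllocationRequest("client-1", "alice", 3, spec), rights_library)
rejected = handle_request(
    ResourceAllocationRequest("client-2", "mallory", 1, spec), rights_library)
```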
  • the application controller 104 allocates a resource container to the client 102.
  • After receiving the resource allocation request from the client 102, the application controller 104 selects an idle resource container from the resource pool 114 and allocates it to the client 102 according to the resource requirement information of the application in the resource allocation request.
  • The application controller 104 first determines the idle resource containers in the resource pool 114 and then, according to the resource requirement information of the application carried in the resource allocation request, allocates a resource container to the client 102 from the idle resource containers.
  • the idle resource container refers to a resource container that is not currently assigned to perform an application task.
  • The resource containers in the resource pool 114 controlled by the application controller 104 can be allocated to multiple clients 102 at the same time, but one resource container can execute only the task assigned by one client 102 at a time. The application controller 104 therefore needs to determine the currently idle resource containers before allocating a resource container to the client 102, and then allocates the requested resources to the client 102 from the idle resource containers.
  • The application controller 104 can determine whether each resource container is currently idle by sending a query message to each resource container in the resource pool 114.
  • The application controller 104 maintains state information of each resource container in the resource pool 114. The state information indicates whether the resource container is idle: if the resource container is currently allocated and other tasks are running, the state of the resource container is not idle; if the resource container is not currently allocated and no other tasks are running, the state of the resource container is idle.
  • The application controller 104 may determine the idle resource containers in the resource pool 114 according to the state information of each resource container in the resource pool 114, and then, according to the resource requirement information of the application carried in the resource allocation request, allocate a resource container to the client 102 from the currently idle resource containers.
  • the application controller 104 selects an idle resource container from the resource pool 114 and allocates it to the client 102.
  • the application controller 104 selects an idle resource container in the resource pool 114 according to the state information of each resource container in the resource pool 114.
  • The selected idle resource containers in the resource pool 114 are allocated to the client 102.
  • the resource container allocated by the application controller 104 to the client 102 is referred to as a first resource container group, and the first resource container group includes at least one resource container.
  • If the application controller 104 maintains state information for each resource container in the resource pool 114, then when it allocates the resource containers in the first resource container group to the client 102, the application controller 104 also sets the state of each resource container in the first resource container group to not idle.
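  • The state-table variant of allocation described above can be sketched as follows. The class and method names are hypothetical, and the behavior when too few idle containers are available is an assumption, since the source does not describe that case.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    NOT_IDLE = "not idle"


class ResourcePool:
    """State table kept by the application controller for its resource pool."""

    def __init__(self, container_ids):
        # Started containers enter the pool in the idle state.
        self.states = {cid: State.IDLE for cid in container_ids}

    def idle_containers(self):
        return [cid for cid, state in self.states.items() if state is State.IDLE]

    def allocate(self, count):
        """Pick `count` idle containers, mark them not idle, return the group."""
        idle = self.idle_containers()
        if len(idle) < count:
            # Assumption: the source does not say what happens on a shortfall.
            raise RuntimeError("not enough idle resource containers")
        group = idle[:count]
        for cid in group:
            self.states[cid] = State.NOT_IDLE
        return group


pool = ResourcePool(["c1", "c2", "c3", "c4"])
first_group = pool.allocate(2)         # the first resource container group
remaining_idle = pool.idle_containers()
```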
  • The application controller 104 further determines the number of idle resource containers remaining in the resource pool 114. If that number is less than a preset first threshold, the application controller 104 sends a second resource request to the resource manager 112, receives the second resource allocation message of the resource pool 114 sent by the resource manager 112, and, according to that message, sends a startup request to the node manager, requesting the node manager to start the resource containers newly allocated to the resource pool 114.
  • the second resource request may carry the requested resource quantity, and the requested resource quantity may be a difference between the first threshold and the current idle resource container number, or may be positively correlated with the difference.
  • the form of the second resource request is similar to the form of the first resource request, and details are not described herein again.
  • the second resource allocation message is similar to the first resource allocation message, and details are not described herein again.
  • The application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 after allocating the resource containers in the first resource container group to the client 102, or according to a preset period. It should be appreciated that the time and manner in which the application controller 104 determines the number of idle resource containers remaining in the resource pool 114 are not limited by the embodiments of the present invention.
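  • The size of the second resource request can be illustrated with a small function. The function name and the factor parameter are hypothetical; a factor of 1.0 gives exactly the difference between the first threshold and the idle count, and a larger factor is one possible way to obtain a quantity positively correlated with that difference.

```python
import math


def containers_to_request(idle_count, first_threshold, factor=1.0):
    """Number of containers to ask the resource manager for.

    With factor=1.0 this is the difference between the first threshold and
    the current idle count; other positive factors scale that difference.
    """
    if idle_count >= first_threshold:
        return 0  # the pool is not below the threshold, nothing to request
    return math.ceil((first_threshold - idle_count) * factor)


needed = containers_to_request(idle_count=3, first_threshold=10)
```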
  • the application controller 104 sends an indication message to each resource container in the first resource container group, where the indication message carries the identifier of the client 102.
  • The resource allocation request sent by the client 102 to the application controller 104 further carries the identifier of the client 102, and the indication message sent by the application controller 104 to each resource container in the first resource container group indicates that the resource container is allocated to the client 102.
  • the identifier of the client 102 may be address information of the client 102 and a port number that communicates with the resource container in the first resource container group.
  • Each resource container in the first resource container group sends a registration message to the client 102.
  • the registration message carries the identification information of the resource container, and the identification information of the resource container is the identification information for the client to uniquely identify the resource container.
  • the identifier information is address information of a node where the resource container is located and information of the resource container at the node.
  • the registration message can also carry the specification of the resource container, that is, the kind of resources in the resource container and the number of each resource.
  • the registration message may carry the specification identifier of the resource container.
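  • The indication and registration exchange can be sketched with two message types. Everything named here (IndicationMessage, RegistrationMessage, the field names, and the list used in place of a network) is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndicationMessage:
    client_host: str  # address information of the client
    client_port: int  # port the client uses to talk to its containers


@dataclass(frozen=True)
class RegistrationMessage:
    node_address: str       # address of the node hosting the container
    container_on_node: str  # identifies the container within that node
    spec_id: str            # optional specification identifier


class ResourceContainer:
    def __init__(self, node_address, container_on_node, spec_id):
        self.node_address = node_address
        self.container_on_node = container_on_node
        self.spec_id = spec_id

    def on_indication(self, indication, network):
        """React to the controller's indication by registering with the client."""
        destination = (indication.client_host, indication.client_port)
        message = RegistrationMessage(
            self.node_address, self.container_on_node, self.spec_id)
        network.append((destination, message))


network = []  # stand-in for a real transport: a list of (destination, message)
group = [ResourceContainer("node-a", "container_01", "small"),
         ResourceContainer("node-b", "container_07", "small")]
indication = IndicationMessage("192.0.2.10", 50000)
for container in group:
    container.on_indication(indication, network)
```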
  • The client 102 assigns the application's job to the resource containers that the application controller 104 allocated to the client 102.
  • After receiving the registration messages of the resource containers in the first resource container group, the client 102 divides the application job into at least one task and sends each task to one resource container in the first resource container group.
  • The resource container allocated by the application controller 104 to the client 102 sends the execution result of the task to the client 102.
  • After a resource container in the first resource container group finishes executing the task assigned to it, it sends the execution result of the task to the client 102. It should be understood that the execution result here may be an actual task execution result; for a task that does not need to return a result, the execution result may be a task completion indication message.
  • the client 102 ends the connection with the resource container in the first resource container group.
  • After receiving the task execution results returned by the resource containers in the first resource container group, the client 102 sends an end-connection message to the resource containers of the first resource container group, ending its connection with them and releasing the resource containers.
  • the resource container in the first resource container group sends a status update message to the application controller 104.
  • After the client 102 releases the resource containers in the first resource container group, each of those resource containers sends a status update message to the application controller 104, indicating that it can receive a new task.
  • If the application controller 104 maintains state information of each resource container in the resource pool 114, the application controller 104 sets the state of the corresponding resource container to idle after receiving its status update message. When a new resource allocation request arrives, that resource container can be allocated to a client 102 as a candidate resource container.
  • The application controller 104 further determines the number of idle resource containers remaining in the resource pool 114. If that number is greater than a preset second threshold, the application controller 104 sends a resource release message to at least one of the idle resource containers; the resource release message is used to release the resources occupied by the at least one resource container.
  • the number of resource containers released may be the difference between the number of currently idle resource containers and the second threshold, or may be positively correlated with the difference.
  • The application controller 104 further maintains the idle time of each resource container in the resource pool 114.
  • The application controller 104 sends a resource release message to a resource container whose idle time is greater than a preset third threshold.
  • Alternatively, when the number of remaining idle resource containers is greater than the preset second threshold, the application controller 104 preferentially sends the resource release message to resource containers whose idle time is greater than the preset third threshold. Or, when the number of remaining idle resource containers is greater than the preset second threshold and the idle time of a resource container in the resource pool 114 is greater than the preset third threshold, the application controller 104 sends the resource release message to the resource containers whose idle time is greater than the preset third threshold.
  • The application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 after it receives the status update message of a resource container in the first resource container group and resets that resource container to the idle state, or according to a preset period. It should be appreciated that the time and manner in which the application controller 104 determines the number of idle resource containers remaining in the resource pool 114 are not limited by the embodiments of the present invention.
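  • One way to combine the second threshold (idle count) and the third threshold (idle time) when choosing containers to release is sketched below. The function name, the dictionary of idle times in seconds, and the longest-idle-first ordering are assumptions; the source allows several combinations of the two thresholds, and this shows only one of them.

```python
def select_for_release(idle_seconds, second_threshold, third_threshold):
    """Choose idle containers that should receive a resource release message.

    idle_seconds maps each idle container id to how long it has been idle.
    """
    excess = len(idle_seconds) - second_threshold
    if excess <= 0:
        return []  # the idle count does not exceed the second threshold
    # Only containers idle for longer than the third threshold are candidates.
    overdue = [cid for cid, t in idle_seconds.items() if t > third_threshold]
    # Assumption: the longest-idle containers are released first.
    overdue.sort(key=lambda cid: idle_seconds[cid], reverse=True)
    return overdue[:excess]


idle_seconds = {"c1": 30, "c2": 400, "c3": 900, "c4": 10}
to_release = select_for_release(idle_seconds, second_threshold=2, third_threshold=300)
```

  • In this variant the pool never releases more than the excess over the second threshold, so the number of released containers equals the difference only when enough containers have been idle long enough.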
  • the application controller 104 and the resource pool 114 are separately configured for each type of application, so that the rational use of resources can be realized.
  • The embodiments of the present invention do not limit this; the application controller 104 and the resource pool 114 can also be shared by multiple types of applications.
  • The application controller applies to the resource manager for resource containers in advance, and the requested resource containers are started in advance.
  • Resource containers can therefore be allocated promptly.
  • FIG. 4 is a signaling diagram of a resource allocation process according to an embodiment of the present invention. As shown in FIG. 4, the resource allocation process includes:
  • Steps 402-410 refer to steps 302-310 of the embodiment of FIG. 3, and details are not described herein again.
  • the client 102 sends an application job to the application controller 104.
  • the client runs an application of the user, and the type of the application may be MapReduce, Giraph, Storm, Spark, Tez/Impala, or MPI.
  • the application job is sent to the application controller through the client 102.
  • the client 102 can query the address information of the application controller 104 corresponding to the application through a registration server, and send the application job to the application controller 104 indicated by the corresponding address information.
  • the registration server stores the corresponding information of the application controller 104 and the application, and different applications may have different application controllers 104.
  • An application controller 104 and a resource pool are configured for each type of application, so that resource containers are used efficiently.
  • the application controller 104 allocates a resource container for the application job of the client 102.
  • After receiving the application job sent by the client 102, the application controller 104 first determines the idle resource containers in the resource pool 114, divides the application job into at least one task according to the scale of the application job, and allocates resource containers for the application job of the client 102 from the idle resource containers according to the number of tasks.
  • The application controller 104 also allocates resource containers of different specifications to different tasks according to the types of the tasks. For example, if the application type is MapReduce, a task may be of the Map type or the Reduce type. Map tasks and Reduce tasks may require different resource container resources, so the application controller 104 may allocate resource containers of different specifications to the different types of tasks.
  • Before allocating resource containers for the client 102, the application controller 104 needs to determine the currently idle resource containers, and then allocates resources for the application job of the client 102 from the idle resource containers.
  • The application controller 104 can determine whether each resource container is currently idle by sending a query message to each resource container in the resource pool 114.
  • the application controller 104 maintains state information of each resource container in the resource pool 114, the state information is used to indicate whether the resource container is in an idle state, and if the resource container is currently allocated and other tasks are running, The state of the resource container is not idle. If the resource container is not currently allocated and no other tasks are running, the state of the resource container is idle.
  • the application controller 104 can determine the resource containers that are free in the resource pool 114 based on the state information of each resource container in the resource pool 114.
  • the resource container allocated by the application controller 104 to the application task of the client 102 is referred to as a second resource container group, and the second resource container group includes at least one resource container.
  • If the application controller 104 maintains state information of each resource container in the resource pool 114, then when it allocates the resource containers in the second resource container group to the tasks of the application job of the client 102, the application controller 104 also sets the state of each resource container in the second resource container group to not idle.
  • The application controller 104 further determines the number of idle resource containers remaining in the resource pool 114. If that number is less than a preset first threshold, the application controller 104 sends a second resource request to the resource manager 112 and receives the second resource allocation message of the resource pool 114 sent by the resource manager 112. According to the second resource allocation message of the resource pool 114, the application controller 104 sends a startup request to the node manager, requesting the node manager to start the resource containers newly allocated to the resource pool 114.
  • The quantity of resources requested in the second resource request may be the difference between the first threshold and the current number of idle resource containers, or may be positively correlated with the difference.
  • The application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 after the resource containers in the second resource container group are allocated to the application job of the client 102, or according to a preset period. It should be appreciated that the time and manner in which the application controller 104 determines the number of idle resource containers remaining in the resource pool 114 are not limited by the embodiments of the present invention.
  • the application controller 104 delivers a task to each resource container in the second resource container group.
  • the application controller 104 divides the job of the client 102 into at least one task, and selects a resource container for each task, and then delivers the corresponding task to the resource container in the second resource container group.
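  • Selecting a resource container for each task according to the task type can be sketched as follows, using the Map and Reduce example. The function name, the specification labels (small, large), and the error raised when no matching idle container exists are assumptions made for illustration.

```python
def assign_tasks(tasks, idle_containers, spec_for_type):
    """Map each task to an idle container whose specification fits its type.

    tasks: list of (task_id, task_type) pairs.
    idle_containers: dict of container id -> specification label.
    spec_for_type: dict of task type -> required specification label.
    """
    available = dict(idle_containers)  # copy so the caller's table is untouched
    assignment = {}
    for task_id, task_type in tasks:
        wanted = spec_for_type[task_type]
        chosen = next(
            (cid for cid, spec in available.items() if spec == wanted), None)
        if chosen is None:
            # Assumption: the source does not describe the shortfall case.
            raise RuntimeError(f"no idle {wanted!r} container for task {task_id}")
        assignment[task_id] = chosen
        del available[chosen]  # one container runs one task at a time
    return assignment


tasks = [("m1", "map"), ("m2", "map"), ("r1", "reduce")]
idle = {"c1": "small", "c2": "large", "c3": "small"}
second_group = assign_tasks(tasks, idle, {"map": "small", "reduce": "large"})
```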
  • the application controller 104 receives the task execution result returned by each resource container in the second resource container group.
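  • After a resource container in the second resource container group finishes executing its task, the execution result of the task is returned to the application controller 104.
  • The execution result here may be an actual task execution result.
  • For a task that does not need to return a result, the execution result here may be a task completion indication message.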
  • If the application controller 104 maintains the state information of each resource container in the resource pool 114, then after receiving the task execution result returned by a resource container in the second resource container group, the application controller 104 sets the state of that resource container to idle. When a new application job arrives, that resource container can be assigned to the new application job as a candidate resource container.
  • The application controller 104 further determines the number of idle resource containers remaining in the resource pool 114. If that number is greater than a preset second threshold, the application controller 104 sends a resource release message to at least one of the idle resource containers; the resource release message is used to release the resources occupied by the at least one resource container.
  • The number of resource containers released may be the difference between the number of currently idle resource containers and the second threshold, or may be positively correlated with the difference.
  • The application controller 104 further maintains the idle time of each resource container in the resource pool 114.
  • The application controller 104 sends a resource release message to a resource container whose idle time is greater than a preset third threshold.
  • Alternatively, when the number of remaining idle resource containers is greater than the preset second threshold, the application controller 104 preferentially sends the resource release message to resource containers whose idle time is greater than the preset third threshold. Or, when the number of remaining idle resource containers is greater than the preset second threshold and the idle time of a resource container in the resource pool 114 is greater than the preset third threshold, the application controller 104 sends the resource release message to the resource containers whose idle time is greater than the preset third threshold.
  • The application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 after it receives the execution results returned by the resource containers in the second resource container group and resets those resource containers to the idle state, or according to a preset period. It should be appreciated that the time and manner in which the application controller 104 determines the number of idle resource containers remaining in the resource pool 114 are not limited by the embodiments of the present invention.
  • the application controller 104 returns the execution result of the application job of the client 102 to the client 102.
  • the application controller 104 may merge the execution results returned by each resource container in the second resource container group, and then send the merged execution result to the client 102.
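  • How results are merged depends on the application, and the source does not specify it. Purely as an illustration, the sketch below assumes each task returns a dictionary of partial counts (as in a word-count job) and merges them by summing; the function name and the data are hypothetical.

```python
from collections import Counter


def merge_results(partial_results):
    """Combine per-task count dictionaries into one result for the client."""
    merged = Counter()
    for partial in partial_results:
        merged.update(partial)  # adds the counts key by key
    return dict(merged)


# Results returned by three resource containers of the second group.
task_results = [{"spark": 2, "yarn": 1}, {"yarn": 3}, {"hdfs": 1, "spark": 1}]
job_result = merge_results(task_results)
```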
  • the application controller 104 and the resource pool 114 are separately configured for each type of application, so that the rational use of resources can be realized.
  • The embodiments of the present invention do not limit this; the application controller 104 and the resource pool 114 can also be shared by multiple types of applications.
  • The application controller applies to the resource manager for resource containers in advance, and the requested resource containers are started in advance.
  • Resource containers can therefore be allocated promptly.
  • The waiting time for resource container startup is avoided, as is the resource consumption of repeatedly starting and stopping the resource containers in the resource pool and of starting and stopping multiple application managers.
  • The application controller directly manages the resource containers in the resource pool, enabling more flexible management.
  • FIG. 5 is a schematic diagram of a logical structure of a resource container allocation apparatus 500 in a distributed system according to an embodiment of the present invention.
  • The distributed system includes the apparatus 500 and a node manager. The apparatus 500 is used to manage the node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and a resource container is used to perform a task of an application.
  • The apparatus 500 includes a starting unit 502, a receiving unit 504, an allocating unit 506, and a sending unit 508, wherein:
  • The starting unit 502 is configured to start the application controller when the trigger timing is met, and to configure an initial specification of the resource pool managed by the application controller.
  • The starting unit 502 can start the application controller upon receiving a request to start the application controller in advance or a request to pre-configure the resource pool.
  • The application controller can be started when the system is initialized, started dynamically by the user as required, or started by the resource manager according to the resource status of the user; the embodiments of the present invention do not limit how the application controller is started.
  • The starting unit 502 may configure the initial specification of the resource pool managed by the application controller according to preset expected resource requirement information of the application. The starting unit 502 may also configure the initial specification of the resource pool according to the node resource information of the distributed system collected by the apparatus 500. For example, when there are sufficient unused node resources, the starting unit 502 can configure a larger initial specification for the resource pool; when there are fewer unused node resources, the starting unit 502 can configure a smaller initial specification.
  • The initial specification of the resource pool may also be carried by the user in the startup command of the application controller, in which case the starting unit 502 configures it when starting the application controller.
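  • Sizing the initial resource pool from the unused node resources can be sketched as below. The source only states that more unused resources allow a larger pool; the specific rule here (reserve a fraction of the unused CPU cores and cap the pool size), the function name, and all parameter names and defaults are assumptions.

```python
def initial_pool_size(unused_cores, cores_per_container,
                      reserve_fraction=0.5, max_containers=64):
    """Number of containers to configure as the pool's initial specification."""
    if cores_per_container <= 0:
        raise ValueError("cores_per_container must be positive")
    # Hypothetical rule: dedicate only a fraction of the unused cores to the pool.
    usable_cores = int(unused_cores * reserve_fraction)
    return min(max_containers, usable_cores // cores_per_container)


large_pool = initial_pool_size(unused_cores=64, cores_per_container=2)
small_pool = initial_pool_size(unused_cores=8, cores_per_container=2)
```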
  • The starting unit 502 can be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. More specifically, the communication module and the resource allocation module in memory unit 204 can be executed by processor 202 to cause communication interface 208 to start the application controller by instruction.
  • the receiving unit 504 is configured to receive a first resource request of the resource pool that is sent by the application controller according to an initial specification of the resource pool.
  • the receiving unit 504 can be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. More specifically, the communication module in memory unit 204 can be executed by processor 202 to cause communication interface 208 to receive a resource request from a resource pool that the application controller sent according to the initial specification of the resource pool.
  • the allocating unit 506 is configured to allocate an initial resource container to the resource pool according to the first resource request.
  • The allocating unit 506 can be implemented by the processor 202 and the memory unit 204 shown in FIG. 2. More specifically, the resource allocation module in memory unit 204 can be executed by processor 202 to allocate node resources for the resource pool to the application controller.
  • The sending unit 508 is configured to send a first resource allocation message of the resource pool to the application controller, where the first resource allocation message includes information about the node where the initial resource container allocated to the resource pool by the allocating unit 506 is located.
  • the sending unit 508 can be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. 2. More specifically, the communication module in memory unit 204 can be executed by processor 202 to cause communication interface 208 to send a first resource allocation message for the resource pool to the application controller.
  • the embodiment of the present invention is an apparatus embodiment of the resource manager in the embodiment of FIG. 3 and FIG. 4, and the feature description of the embodiment of FIG. 3 and FIG. 4 is applicable to the embodiment of the present invention, and details are not described herein again.
  • FIG. 6 is a schematic diagram showing the logical structure of a resource container allocation apparatus 600 in a distributed system according to an embodiment of the present invention.
  • The distributed system includes a resource manager and a node manager. The resource manager is used to manage the node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and a resource container is used to execute a task of an application. When the trigger timing is met, the resource manager starts the apparatus 600, configures the initial specification of the resource pool managed by the apparatus 600, and allocates initial resource containers to the resource pool according to that initial specification; the initial resource containers in the resource pool are then started.
  • Apparatus 600 includes:
  • the receiving unit 602 is configured to receive a resource allocation request from the client, where the resource allocation request is used to request a resource container for the application running on the client, where the resource allocation request carries resource requirement information of the application.
  • the receiving unit 602 can be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. More specifically, the communication module in memory unit 204 can be executed by processor 202 to cause communication interface 208 to receive a resource allocation request from the client.
  • the allocating unit 604 is configured to select, from the resource pool, an idle resource container to be allocated to the client according to the resource requirement information of the application.
  • the allocating unit 604 can be implemented by the processor 202 and the memory unit shown in FIG. 2. More specifically, the resource allocation module in the memory unit 204 can be executed by the processor 202 to allocate the node resources of the resource pool to the application controller.
  • the apparatus 600 further includes a sending unit 606.
  • The sending unit 606 is configured to send the first resource request to the resource manager according to the initial specification of the resource pool, where the first resource request carries the initial specification of the resource pool.
  • The receiving unit 602 is further configured to receive a first resource allocation message of the resource pool sent by the resource manager, where the first resource allocation message includes information about the node where the initial resource container allocated by the resource manager for the resource pool is located; the sending unit 606 is further configured to send a startup request to the node manager, requesting the node manager to start the initial resource container of the resource pool.
  • the resource allocation request includes the identifier of the client. After the allocating unit 604 selects the idle resource container from the resource pool and allocates it to the client, the sending unit 606 is configured to send an indication message to each resource container allocated to the client. The indication message carries the identifier of the client and indicates that the resource container is allocated to the client.
  • the apparatus 600 further includes a determining unit 608, configured to determine the number of idle resource containers remaining in the resource pool. If the number of remaining idle resource containers is less than a preset first threshold, the sending unit 606 is configured to send a second resource request to the resource manager, and the receiving unit 602 is further configured to receive a second resource allocation message for the resource pool sent by the resource manager. The sending unit 606 is further configured to send, according to the second resource allocation message, a start request to the node manager, requesting the node manager to start the new resource container allocated for the resource pool.
  • if the number of remaining idle resource containers is greater than a preset second threshold, the sending unit 606 is configured to send a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
  • the allocating unit 604 is further configured to maintain state information of each resource container in the resource pool, where the state information indicates whether the corresponding resource container is idle. Selecting an idle resource container from the resource pool and allocating it to the client includes: the allocating unit 604 selects an idle resource container in the resource pool according to the state information of each resource container in the resource pool, and allocates the selected idle resource container to the client.
  • after the allocating unit 604 selects an idle resource container from the resource pool and allocates it to the client, it also sets the state of each resource container allocated to the client to not idle.
  • the receiving unit 602 is further configured to receive a status update message from each resource container allocated to the client, where the status update message indicates that the task assigned by the client is completed; the allocating unit 604 is further configured to set the state of each resource container allocated to the client to idle according to the status update message.
  • the resource allocation request further includes user rights information; the receiving unit 602 is further configured to verify user rights information according to the preset user rights library, where the user rights library includes user rights information.
  • the embodiment of the present invention is an apparatus embodiment of the application controller in the embodiment of FIG. 3, and the feature description of the embodiment of FIG. 3 is applicable to the embodiment of the present invention, and details are not described herein again.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the modules is only a logical function division, and other divisions are possible in an actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or module, and may be electrical, mechanical or otherwise.
  • the modules described as separate components may or may not be physically separated.
  • the components displayed as modules may or may not be physical modules, that is, may be located in one place, or may be distributed to multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist physically separately, or two or more modules may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of hardware plus software function modules.
  • the above-described integrated modules implemented in the form of software function modules can be stored in a computer readable storage medium.
  • the software functional modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, server, or network device, etc.) to perform some of the steps of the methods described in various embodiments of the present invention.
  • the foregoing storage medium includes any medium that can store program code, such as a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of the present invention provide a resource allocation method, device, and system, used to allocate a resource to an application. The method comprises: an application controller receives a resource allocation request from a client end, the resource allocation request carrying resource demand information of an application running on the client end, wherein the application controller is configured to manage a resource pool, the resource pool comprises at least one launched container, and the container in the resource pool is negotiated from a resource manager by the application controller and launched in advance; and the application controller selects, according to the resource demand information of the application, an idle resource container from the resource pool, and allocates the idle resource container to the client end. According to the method, an application controller can promptly allocate a resource of a resource container upon reception of a resource allocation request of an application, so that a waiting time for launching a resource container is eliminated. In addition, by reusing a resource container in a resource pool, a resource waste resulting from repeatedly creating and terminating a resource container is avoided.

Description

Method, device, and system for allocating resources
Technical field
The embodiments of the present invention relate to the field of computers, and in particular, to a resource allocation method.
Background
Since the rise of cloud computing, clusters have grown steadily in both scale and variety; a common example is the Hadoop distributed computing cluster for parallel programming tasks. Judging from industry trends in the use of distributed systems and the long-term development of the Hadoop framework, the Job Tracker/Task Tracker mechanism in MRv1, the first version of MapReduce, needs large-scale adjustment to fix its shortcomings in scalability, memory consumption, threading model, reliability, and performance.
Yet Another Resource Negotiator (Yarn) is a new Hadoop resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications, and its introduction brings clusters great benefits in utilization, unified resource management, and data sharing. Yarn was originally designed to fix the obvious shortcomings of MRv1 and to improve scalability, reliability, and cluster utilization. Yarn meets these requirements by splitting the two main functions of the Job Tracker (resource management and job scheduling/monitoring) into two independent services: a global Resource Manager (RM) and a per-job Application Master (AM).
Yarn starts a separate AM for each job, which resolves the single point of failure and the scaling bottleneck in MRv1. However, this approach introduces a new problem: high job latency. Each job must first request resources from the RM to start an AM, and only after that AM requests resources from the RM and starts resource containers can the job formally start. Yarn jobs therefore incur a long delay before they run, which is unfavorable for small jobs, and because an AM must be requested for every job, more computing resources are needed.
Summary of the invention
In view of this, embodiments of the present invention provide a resource allocation method, device, and system to reduce the waiting time of jobs.
In a first aspect, the present application provides a method for allocating resource containers in a distributed system. The distributed system includes a resource manager and a node manager. The resource manager manages node resources of the distributed system, the node manager starts resource containers based on the node resources, and the resource containers are used to execute tasks of applications. The method includes: the resource manager starts an application controller when a trigger condition is met and configures an initial specification of the resource pool managed by the application controller, where the initial specification of the resource pool indicates the number and specification of resource containers that the application controller requests from the resource manager for the first time; after receiving a first resource request for the resource pool sent by the application controller according to the initial specification of the resource pool, the resource manager allocates initial resource containers to the resource pool according to the first resource request and sends a first resource allocation message for the resource pool to the application controller, where the first resource allocation message includes information about the nodes on which the initial resource containers allocated by the resource manager to the resource pool are located.
The resource manager proactively starts the application controller and configures the initial specification of the resource pool that it manages, so that the application controller requests a certain number of resource containers from the resource manager in advance according to the initial specification and starts them ahead of time. This reduces the time that subsequent application jobs wait for resource containers to start.
With reference to the first aspect, in a first possible implementation of the first aspect, the resource manager starting the application controller when the trigger condition is met includes: the resource manager starts the application controller when it receives a request to pre-start the application controller or a request to pre-configure the resource pool, or when the system is initialized.
The application controller can be started at different times: it can be started by an administrator through an instruction, based on user requirements, or it can be started automatically by the resource manager during system initialization, according to the system configuration.
With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in a second possible implementation of the first aspect, the resource manager configuring the initial specification of the resource pool managed by the application controller includes: the resource manager configures the initial specification according to preset information about the expected resource requirements of applications; or the resource manager configures the initial specification according to collected usage information of the node resources of the distributed system. For example, when plenty of node resources are unused, the resource manager can configure a larger initial specification for the resource pool; when few node resources are unused, the resource manager can configure a smaller initial specification.
In addition, if the application controller is started by an administrator's instruction, the initial specification of the resource pool can be carried in the administrator's start instruction for the application controller, and the resource manager applies it to the application controller when starting it.
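The usage-based configuration described above can be illustrated with a minimal Python sketch. The function name `choose_initial_pool_size`, the 50% fraction, and the cap of 16 containers are hypothetical; the source states only that more unused node resources allow a larger initial specification and fewer allow a smaller one.

```python
def choose_initial_pool_size(free_slots: int, max_pool: int = 16,
                             fraction: float = 0.5) -> int:
    """Return how many containers the pool should request at first.

    Hypothetical policy: reserve a fixed fraction of the currently unused
    container slots, capped at max_pool. No free slots means an empty pool.
    """
    if free_slots <= 0:
        return 0
    return min(max_pool, int(free_slots * fraction))


# A lightly loaded cluster gets a larger pool than a busy one.
large = choose_initial_pool_size(free_slots=40)
small = choose_initial_pool_size(free_slots=6)
```

Any monotonic rule would fit the description equally well; the fraction and cap simply keep the pool from claiming every unused resource.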
In a second aspect, the present application provides a computer-readable medium including computer-executable instructions. When a processor of a computer executes the computer-executable instructions, the computer performs the method in the first aspect or any possible implementation of the first aspect.
In a third aspect, the present application provides a computing device including a processor, a memory, a bus, and a communication interface. The memory is configured to store execution instructions, and the processor is connected to the memory through the bus. When the computing device runs, the processor executes the execution instructions stored in the memory, so that the computing device performs the method in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, the present application provides a method for allocating resource containers in a distributed system. The distributed system includes a resource manager and a node manager. The resource manager is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and the resource containers are used to execute tasks of applications. The resource manager starts an application controller when a trigger condition is met, configures an initial specification of the resource pool managed by the application controller, and allocates initial resource containers to that resource pool according to the initial specification; the initial resource containers of the resource pool have already been started.
The method includes: the application controller receives a resource allocation request from a client, where the resource allocation request is used to request resource containers for an application running on the client and carries resource requirement information of the application; and the application controller selects idle resource containers from the resource pool according to the resource requirement information of the application and allocates them to the client.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, before the application controller receives the resource allocation request from the client, the method further includes: the application controller sends a first resource request to the resource manager according to the initial specification of the resource pool, where the first resource request carries a resource quantity determined according to the initial specification of the resource pool; the application controller obtains a first resource allocation message for the resource pool sent by the resource manager, where the first resource allocation message includes information about the nodes on which the initial resource containers allocated by the resource manager to the resource pool are located; and the application controller sends a start request to the node manager, requesting the node manager to start the initial resource containers of the resource pool.
Because the resource containers in the resource pool are requested from the resource manager in advance by the application controller, when the application controller receives a resource allocation request from a client, it can promptly allocate already-started resource containers to the client. This avoids the waiting time in the prior art for starting the AM and the resource containers after an application job is received.
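The pre-launch sequence described above (first resource request, first resource allocation message, start request) might be sketched as follows. All class and function names are hypothetical stand-ins, and the round-robin node placement is an assumption made only so that the example runs; the source does not specify a placement policy.

```python
from dataclasses import dataclass


@dataclass
class PoolSpec:
    count: int      # number of containers to request up front
    memory_mb: int  # size of each container


class FakeResourceManager:
    """Stand-in resource manager: places containers on nodes round-robin."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, count, memory_mb):
        # Plays the role of the first resource allocation message:
        # one node name per allocated container.
        return [self.nodes[i % len(self.nodes)] for i in range(count)]


class FakeNodeManager:
    """Stand-in node manager: records which containers were started."""

    def __init__(self):
        self.started = []

    def start_container(self, node, memory_mb):
        container_id = f"{node}-c{len(self.started)}"
        self.started.append(container_id)
        return container_id


def prelaunch_pool(spec, rm, nm):
    """Request containers per the initial specification, then start them."""
    nodes = rm.allocate(spec.count, spec.memory_mb)  # first resource request
    return [nm.start_container(n, spec.memory_mb) for n in nodes]


pool = prelaunch_pool(PoolSpec(count=3, memory_mb=1024),
                      FakeResourceManager(["n1", "n2"]), FakeNodeManager())
```

The containers in `pool` exist before any client request arrives, which is the point of the scheme: later allocations only pick from containers that are already running.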
With reference to the fourth aspect or any of the foregoing possible implementations of the fourth aspect, in a second possible implementation of the fourth aspect, the resource allocation request includes an identifier of the client. After the application controller selects idle resource containers from the resource pool and allocates them to the client, the method further includes: the application controller sends an indication message to each resource container allocated to the client, where the indication message carries the identifier of the client and indicates that the resource container is allocated to the client.
Carrying the identifier of the client in the indication message indicates that the corresponding resource container has been allocated to that client, so that the resource container sends a registration message to the client, and the client can assign tasks of the application to the resource containers registered with it.
With reference to the fourth aspect or any of the foregoing possible implementations of the fourth aspect, in a third possible implementation of the fourth aspect, the method further includes: the application controller determines the number of idle resource containers remaining in the resource pool; if that number is less than a preset first threshold, the application controller sends a second resource request to the resource manager, receives a second resource allocation message for the resource pool from the resource manager, and, according to the second resource allocation message, sends a start request to the node manager, requesting the node manager to start the additional resource containers allocated to the resource pool.
More specifically, the number of idle resource containers remaining in the resource pool can be determined after the application controller allocates resource containers in the pool to the client, or it can be determined periodically by the application controller. When that number is less than the preset first threshold, the resource containers in the pool may be unable to satisfy subsequent allocations, so the application controller requests more resource containers from the resource manager to replenish the pool.
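The replenishment check can be expressed as a small function. This is a sketch under assumptions: the source says only that a second resource request is sent when the idle count falls below the first threshold, so the `target_idle` level to which the pool is topped up is hypothetical, as is the function name.

```python
def containers_to_request(idle_count: int, first_threshold: int,
                          target_idle: int) -> int:
    """Return how many extra containers to request from the resource manager.

    A request is made only when the idle count is below the first threshold.
    Topping the pool back up to target_idle is an assumption of this sketch.
    """
    if idle_count < first_threshold:
        return max(0, target_idle - idle_count)
    return 0


# One idle container left, threshold of two: ask for enough to reach five.
needed = containers_to_request(idle_count=1, first_threshold=2, target_idle=5)
```

The same function works whether it is called right after an allocation or from a periodic timer, which are the two timings the text mentions.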
With reference to the fourth aspect or any of the foregoing possible implementations of the fourth aspect, in a fourth possible implementation of the fourth aspect, the method further includes: the application controller determines the number of idle resource containers remaining in the resource pool; if that number is greater than a preset second threshold, the application controller sends a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
More specifically, the number of idle resource containers remaining in the resource pool can be determined after the allocated resource containers finish executing the client's tasks, or it can be determined periodically by the application controller. When that number is greater than the preset second threshold, the application controller sends a resource release message to at least one of the idle resource containers. This releases resources occupied by the resource pool and ensures reasonable use of resources across the system.
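The release check mirrors the replenishment check. In the sketch below, the function name is hypothetical, and releasing exactly the excess above the second threshold is an assumption; the source requires only that at least one idle container receives a release message.

```python
from typing import List, Sequence


def containers_to_release(idle_ids: Sequence[str],
                          second_threshold: int) -> List[str]:
    """Return the idle containers that should receive a release message.

    Nothing is released unless the idle count exceeds the second threshold.
    Releasing the whole excess is an assumption of this sketch.
    """
    excess = len(idle_ids) - second_threshold
    return list(idle_ids[:excess]) if excess > 0 else []


# Four idle containers with a threshold of two: release two of them.
to_release = containers_to_release(["c1", "c2", "c3", "c4"], second_threshold=2)
```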
With reference to the fourth aspect or any of the foregoing possible implementations of the fourth aspect, in a fifth possible implementation of the fourth aspect, the application controller maintains state information for each resource container in the resource pool, where the state information indicates whether the corresponding resource container is idle. The application controller selecting idle resource containers from the resource pool and allocating them to the client includes: the application controller selects idle resource containers in the resource pool according to the state information of each resource container in the pool and allocates the selected idle resource containers to the client.
If a resource container has currently been allocated to execute other tasks, its state is not idle; if it has not currently been allocated, its state is idle. Specifically, the application controller can maintain a state information table that holds the state information of every resource container in the resource pool, so that the application controller can quickly determine which resource containers are currently idle and allocate containers to the client according to the idle resource containers and the resource allocation request.
With reference to the fourth aspect or any of the foregoing possible implementations of the fourth aspect, in a sixth possible implementation of the fourth aspect, after the application controller selects idle resource containers from the resource pool and allocates them to the client, the method further includes: the application controller sets the state of each resource container allocated to the client to not idle.
With reference to the fourth aspect or any of the foregoing possible implementations of the fourth aspect, in a seventh possible implementation of the fourth aspect, after the application controller sets the state of each resource container allocated to the client to not idle, the method further includes: the application controller receives a status update message from each resource container allocated to the client, where the status update message indicates that the task assigned by the client has been completed; and the application controller sets the state of each resource container allocated to the client to idle.
The application controller dynamically updates the state information it maintains as the states of the resource containers change, which ensures that the idle resource containers in the resource pool can be determined accurately whenever a resource allocation request arrives.
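The state information table and its two transitions (idle to not idle on allocation, not idle to idle on a status update) can be sketched as a small class. The class and method names are hypothetical, and treating the resource requirement as a simple container count is an assumption made for brevity.

```python
class ContainerPool:
    """Minimal state table: container id -> whether the container is idle."""

    def __init__(self, container_ids):
        self.idle = {cid: True for cid in container_ids}
        self.owner = {}  # container id -> client id, for allocated containers

    def allocate(self, client_id, needed):
        """Pick `needed` idle containers and mark them not idle."""
        free = [cid for cid, is_idle in self.idle.items() if is_idle]
        if len(free) < needed:
            return []  # not enough idle containers for this request
        chosen = free[:needed]
        for cid in chosen:
            self.idle[cid] = False
            self.owner[cid] = client_id
        return chosen

    def on_status_update(self, container_id):
        """A container reports that its task is done: mark it idle again."""
        self.idle[container_id] = True
        self.owner.pop(container_id, None)


pool = ContainerPool(["c1", "c2", "c3"])
first = pool.allocate("client-A", 2)   # c1 and c2 become not idle
pool.on_status_update("c1")            # c1 finishes and is reusable
second = pool.allocate("client-B", 2)  # reuses c1 together with c3
```

Reusing `c1` for the second client without restarting it is the container reuse that the abstract credits with avoiding repeated creation and termination.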
With reference to the fourth aspect or any of the foregoing possible implementations of the fourth aspect, in a seventh possible implementation of the fourth aspect, the resource allocation request further includes user rights information. The method further includes: the application controller verifies the user rights information against a preset user rights library, where the user rights library includes the user rights information.
The user rights library contains the user rights information of different users. If the user rights library does not contain the user rights information of the client, the application controller rejects or does not respond to the client's resource allocation request; if the user rights library contains the user rights information of the client 102, the subsequent resource allocation steps are performed.
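The rights check that gates allocation can be sketched as a membership test. The request layout (a dictionary with a `user_rights` key), the set-based library, and both function names are hypothetical; the source does not describe how the user rights information is encoded.

```python
from typing import Set


def verify_user_rights(request: dict, rights_library: Set[str]) -> bool:
    """Return True if the request's rights entry is present in the library."""
    return request.get("user_rights") in rights_library


def handle_allocation_request(request: dict, rights_library: Set[str]) -> str:
    """Reject unknown users; otherwise continue to the allocation steps."""
    if not verify_user_rights(request, rights_library):
        return "rejected"
    return "allocate"


library = {"alice-token", "bob-token"}
ok = handle_allocation_request({"user_rights": "alice-token"}, library)
denied = handle_allocation_request({"user_rights": "mallory-token"}, library)
```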
In a fifth aspect, the present application provides a computer-readable medium including computer-executable instructions. When a processor of a computer executes the computer-executable instructions, the computer performs the method in the fourth aspect or any possible implementation of the fourth aspect.
In a sixth aspect, the present application provides a computing device including a processor, a memory, a bus, and a communication interface. The memory is configured to store execution instructions, and the processor is connected to the memory through the bus. When the computing device runs, the processor executes the execution instructions stored in the memory, so that the computing device performs the method in the fourth aspect or any possible implementation of the fourth aspect.
In a seventh aspect, the present application provides an apparatus for allocating resource containers in a distributed system. The distributed system includes the apparatus and a node manager. The apparatus is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and the resource containers are used to execute tasks of applications. The apparatus includes: a starting unit, configured to start an application controller when a trigger condition is met and to configure an initial specification of the resource pool managed by the application controller; a receiving unit, configured to receive a first resource request for the resource pool sent by the application controller according to the initial specification of the resource pool; an allocating unit, configured to allocate initial resource containers to the resource pool according to the first resource request; and a sending unit, configured to send a first resource allocation message for the resource pool to the application controller, where the first resource allocation message includes information about the nodes on which the initial resource containers allocated by the allocating unit to the resource pool are located.
The apparatus proactively starts the application controller and configures the initial specification of the resource pool that it manages, so that the application controller requests a certain number of resource containers from the apparatus in advance according to the initial specification and starts them ahead of time. This reduces the time that subsequent application jobs wait for resource containers to start.
With reference to the seventh aspect, in a first possible implementation of the seventh aspect, the starting unit being configured to start the application controller when the trigger condition is met includes: the starting unit is configured to start the application controller when a request to pre-start the application controller or a request to pre-configure the resource pool is received, or when the system is initialized.
The application controller can be started at different times: it can be started by an administrator through an instruction, based on user requirements, or it can be started automatically by the starting unit during system initialization, according to the system configuration.
With reference to the seventh aspect or any of the foregoing possible implementations of the seventh aspect, in a second possible implementation of the seventh aspect, that the starting unit is configured to configure the initial specification of the resource pool managed by the application controller includes: the starting unit is configured to configure the initial specification according to preset expected resource requirement information of the application, or according to collected usage information of the node resources of the distributed system. For example, when plenty of node resources are unused, the starting unit can configure a larger initial specification for the resource pool; when few node resources are unused, the starting unit can configure a smaller initial specification.
In addition, if the application controller is started by an administrator's instruction, the administrator can carry the initial specification of the resource pool in the start instruction for the application controller, and the starting unit configures the application controller with that specification when starting it.
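The way the initial specification can follow either an administrator's instruction or the amount of unused node resources can be sketched as follows. This is only an illustrative sketch: the `PoolSpec` fields, the `share` fraction, and the per-container defaults are assumptions made for the example and do not appear in the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoolSpec:
    """Initial specification of a resource pool (hypothetical shape)."""
    containers: int   # number of resource containers to request in advance
    memory_mb: int    # memory per container
    vcores: int       # CPU cores per container


def configure_initial_spec(free_memory_mb: int, free_vcores: int,
                           memory_mb: int = 1024, vcores: int = 1,
                           share: float = 0.5,
                           admin_spec: Optional[PoolSpec] = None) -> PoolSpec:
    """Choose the initial pool specification.

    A specification carried in the administrator's start instruction wins.
    Otherwise the pool size scales with the unused node resources: more
    free resources give a larger pool, fewer give a smaller one.
    """
    if admin_spec is not None:
        return admin_spec
    # Assumption: the pool may claim at most `share` of the unused resources.
    by_memory = int(free_memory_mb * share) // memory_mb
    by_cpu = int(free_vcores * share) // vcores
    return PoolSpec(containers=max(0, min(by_memory, by_cpu)),
                    memory_mb=memory_mb, vcores=vcores)
```

With these assumed defaults, a cluster with 64 GB and 32 cores unused yields a larger pool than one with 4 GB and 8 cores unused, which mirrors the example given in the text.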
In an eighth aspect, this application provides an apparatus for allocating resource containers in a distributed system. The distributed system includes a resource manager and a node manager; the resource manager is configured to manage node resources of the distributed system, the node manager is configured to start a resource container based on the node resources, and the resource container is used to execute tasks of an application. The resource manager starts the apparatus when a trigger condition is met, configures an initial specification of a resource pool managed by the apparatus, and allocates an initial resource container to the resource pool managed by the apparatus according to the initial specification of the resource pool, where the initial resource container has been started;
The apparatus includes: a receiving unit, configured to receive a resource allocation request from a client, where the resource allocation request is used to request a resource container for an application running on the client and carries resource requirement information of the application; and an allocating unit, configured to select, according to the resource requirement information of the application, an idle resource container from the resource pool and allocate it to the client.
With reference to the eighth aspect, in a first possible implementation of the eighth aspect, the apparatus further includes a sending unit. Before the receiving unit receives the resource allocation request from the client, the sending unit is configured to send a first resource request to the resource manager according to the initial specification of the resource pool, where the first resource request carries a resource quantity determined according to the initial specification of the resource pool; the receiving unit is further configured to receive a first resource allocation message sent by the resource manager, where the first resource allocation message contains information about the node on which the initial resource container allocated to the resource pool by the resource manager is located; and the sending unit is further configured to send a start request to the node manager, requesting the node manager to start the initial resource container of the resource pool.
Because the apparatus requests the resource containers in the resource pool from the resource manager in advance, when the receiving unit receives a resource allocation request from a client, the allocating unit can promptly allocate already-started resource containers to the client. This avoids the wait, present in the prior art, for the AM and the resource containers to start after an application job is received.
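The advance request described above (first resource request, first resource allocation message, start request) can be sketched as a small in-memory model. All class and method names here are hypothetical stand-ins chosen for the example; they are not the interfaces of Yarn or of the apparatus itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResourceManager:
    """Toy resource manager that tracks free container slots per node."""
    free_slots: Dict[str, int]

    def allocate(self, count: int) -> List[str]:
        """Answer a resource request with the node of each granted container."""
        granted: List[str] = []
        for node in sorted(self.free_slots):
            while self.free_slots[node] > 0 and len(granted) < count:
                self.free_slots[node] -= 1
                granted.append(node)
        return granted


@dataclass
class NodeManager:
    """Toy node manager that starts containers on its own node."""
    node: str
    started: int = 0

    def start_container(self) -> str:
        self.started += 1
        return f"{self.node}-c{self.started}"


@dataclass
class ApplicationController:
    """Requests and starts the pool's containers before any client asks."""
    initial_size: int
    pool: List[str] = field(default_factory=list)

    def prewarm(self, rm: ResourceManager, nms: Dict[str, NodeManager]) -> None:
        # First resource request; the reply names the node of each container.
        nodes = rm.allocate(self.initial_size)
        for node in nodes:
            # Start request to the node manager of the node in the reply.
            self.pool.append(nms[node].start_container())
```

After `prewarm` returns, every container in `pool` is already running, so a later client request only needs to pick idle entries from the pool.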
With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in a second possible implementation of the eighth aspect, the apparatus further includes a sending unit, and the resource allocation request contains an identifier of the client. After the allocating unit selects idle resource containers from the resource pool and allocates them to the client, the sending unit is configured to send an indication message to each of the resource containers allocated to the client, where the indication message carries the identifier of the client and indicates that the resource container has been allocated to the client.
The sending unit sends an indication message carrying the client's identifier to each allocated resource container to indicate that the container has been allocated to that client. Each such resource container then sends a registration message to the client, so that the client can assign the application's tasks to the resource containers registered with it.
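The indication and registration exchange can be illustrated with the following sketch. The `Client` and `Container` classes, the `clients` lookup table, and the one-container-per-task pairing in `dispatch` are assumptions made for the example only.

```python
from typing import Dict, List, Optional


class Client:
    """Client that assigns tasks to the containers registered with it."""

    def __init__(self, client_id: str) -> None:
        self.client_id = client_id
        self.registered: List[str] = []

    def register(self, container_id: str) -> None:
        """Handle a registration message from an allocated container."""
        self.registered.append(container_id)

    def dispatch(self, tasks: List[str]) -> Dict[str, str]:
        """Pair each task with one registered container."""
        if len(tasks) > len(self.registered):
            raise RuntimeError("more tasks than registered containers")
        return dict(zip(tasks, self.registered))


class Container:
    """Started container waiting in the pool for an indication message."""

    def __init__(self, container_id: str, clients: Dict[str, Client]) -> None:
        self.container_id = container_id
        self.clients = clients              # stand-in for a network lookup by client ID
        self.owner: Optional[str] = None

    def on_indication(self, client_id: str) -> None:
        """Record the owning client, then register with that client."""
        self.owner = client_id
        self.clients[client_id].register(self.container_id)
```

The client never contacts the pool directly in this sketch; it learns which containers it owns only from the registration messages.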
With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in a third possible implementation of the eighth aspect, the apparatus further includes a determining unit and a sending unit. The determining unit is configured to determine the number of idle resource containers remaining in the resource pool. If that number is less than a preset first threshold, the sending unit is configured to send a second resource request to the resource manager; the receiving unit is further configured to receive a second resource allocation message for the resource pool sent by the resource manager; and the sending unit is further configured to send, according to the second resource allocation message, a start request to the node manager, requesting the node manager to start the newly added resource containers allocated to the resource pool.
More specifically, the determining unit can determine the number of idle resource containers remaining in the resource pool after the allocating unit allocates resource containers from the pool to the client, or it can do so periodically. When the number of remaining idle resource containers is less than the preset first threshold, the resource pool may be unable to satisfy subsequent allocations, so the sending unit requests more resource containers from the resource manager to replenish the pool.
With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in a fourth possible implementation of the eighth aspect, the apparatus further includes a determining unit and a sending unit. The determining unit is configured to determine the number of idle resource containers remaining in the resource pool. If that number is greater than a preset second threshold, the sending unit is configured to send a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
More specifically, the determining unit can determine the number of idle resource containers remaining in the resource pool after the allocated resource containers finish the client's tasks, or it can do so periodically. When the number of remaining idle resource containers is greater than the preset second threshold, the sending unit sends a resource release message to at least one of the idle resource containers. This releases resources held by the resource pool and keeps overall resource use reasonable.
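The two thresholds together act as a low and a high watermark on the number of idle containers. A minimal sketch of that decision follows; the function name and the choice of how many containers to request or release (enough to return to the nearest threshold) are assumptions, since the text only requires that a second resource request be sent or that at least one idle container be released.

```python
from typing import Tuple


def pool_action(idle: int, first_threshold: int, second_threshold: int) -> Tuple[str, int]:
    """Decide how to adjust the pool from the number of idle containers.

    Returns ("request", n) to ask the resource manager for n more containers,
    ("release", n) to send release messages to n idle containers,
    or ("none", 0) when the pool is within both thresholds.
    """
    if first_threshold > second_threshold:
        raise ValueError("first threshold must not exceed second threshold")
    if idle < first_threshold:
        # Too few idle containers: replenish the pool.
        return ("request", first_threshold - idle)
    if idle > second_threshold:
        # Too many idle containers: give resources back to the cluster.
        return ("release", idle - second_threshold)
    return ("none", 0)
```

The check can run after each allocation, after containers report that their tasks are done, or on a timer, matching the options described above.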
With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in a fifth possible implementation of the eighth aspect, the allocating unit is further configured to maintain state information of each resource container in the resource pool, where the state information indicates whether the corresponding resource container is idle. That the allocating unit is configured to select an idle resource container from the resource pool and allocate it to the client includes: the allocating unit is configured to select an idle resource container in the resource pool according to the state information of each resource container in the resource pool, and to allocate the selected idle resource container to the client.
If a resource container is currently allocated to execute other tasks, its state is not idle; if it is not currently allocated, its state is idle. Specifically, the allocating unit can maintain a state information table that holds the state information of every resource container in the resource pool, so that the allocating unit can quickly determine which resource containers are currently idle and allocate containers to the client according to the idle resource containers and the resource allocation request.
With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in a sixth possible implementation of the eighth aspect, after selecting idle resource containers from the resource pool and allocating them to the client, the allocating unit is further configured to set the state of each of the resource containers allocated to the client to not idle.
With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in a seventh possible implementation of the eighth aspect, after the allocating unit sets the state of each of the resource containers allocated to the client to not idle, the receiving unit is further configured to receive a state update message from each of the resource containers allocated to the client, where the state update message indicates that the task assigned by the client has been completed; and the allocating unit is further configured to set, according to the state update message, the state of each of the resource containers allocated to the client to idle.
The allocating unit dynamically updates the state information it maintains as the states of the resource containers change, so that when a resource allocation request arrives, it can accurately determine the idle resource containers in the resource pool.
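The state information table and its two transitions (idle to not idle on allocation, back to idle on a state update message) can be sketched as follows. The class name, method names, and the error raised when the pool runs short are illustrative assumptions.

```python
from enum import Enum
from typing import Dict, Iterable, List


class State(Enum):
    IDLE = "idle"
    BUSY = "not idle"


class StateTable:
    """State information table for the containers in one resource pool."""

    def __init__(self, container_ids: Iterable[str]) -> None:
        # Pre-started containers begin idle.
        self._state: Dict[str, State] = {cid: State.IDLE for cid in container_ids}

    def idle(self) -> List[str]:
        """Return the IDs of the containers that are currently idle."""
        return [cid for cid, state in self._state.items() if state is State.IDLE]

    def assign(self, count: int) -> List[str]:
        """Pick `count` idle containers and mark them as not idle."""
        chosen = self.idle()[:count]
        if len(chosen) < count:
            raise RuntimeError("not enough idle containers in the pool")
        for cid in chosen:
            self._state[cid] = State.BUSY
        return chosen

    def on_status_update(self, container_id: str) -> None:
        """Handle a state update message: the container finished its task."""
        self._state[container_id] = State.IDLE
```

Because a finished container returns to the idle set instead of being shut down, the same started container can serve several clients in turn.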
With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in an eighth possible implementation of the eighth aspect, the resource allocation request further contains user rights information; the receiving unit is further configured to verify the user rights information against a preset user rights library, where the user rights library contains the user rights information.
The user rights library contains the user rights information of different users. If the user rights library does not contain the client's user rights information, the client's resource allocation request is rejected or not answered; if the user rights library contains the client's user rights information, the subsequent resource allocation steps are performed.
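The rights check amounts to a membership test performed before any allocation. A minimal sketch, assuming the request is a dictionary with a hypothetical `user_rights` key and the library is a set of accepted values:

```python
from typing import Dict, Set


def handle_allocation_request(request: Dict[str, str], rights_library: Set[str]) -> str:
    """Return "allocate" if the request's user rights are in the library, else "reject"."""
    rights = request.get("user_rights")
    if rights is None or rights not in rights_library:
        # Unknown or missing rights: refuse (or silently drop) the request.
        return "reject"
    # Known rights: continue with the normal allocation steps.
    return "allocate"
```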
In a ninth aspect, this application provides a system for allocating resource containers in a distributed system, where the distributed system includes a resource manager and a node manager. The resource manager is configured to start an application controller when a trigger condition is met, configure an initial specification of a resource pool managed by the application controller, receive a first resource request for the resource pool that the application controller sends according to the initial specification of the resource pool, allocate an initial resource container to the resource pool according to the first resource request, and send a first resource allocation message for the resource pool to the application controller. The application controller is configured to obtain the first resource allocation message for the resource pool sent by the resource manager, and to request, according to the information in the first resource allocation message about the node on which the initial resource container allocated to the resource pool is located, the node manager to start the initial resource container of the resource pool. The node manager is configured to start the initial resource container at the request of the application controller.
With reference to the ninth aspect, in a first possible implementation of the ninth aspect, that the resource manager is configured to start the application controller when the trigger condition is met includes: the resource manager is configured to start the application controller upon receiving a request to pre-start the application controller or a request to pre-configure the resource pool.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a second possible implementation of the ninth aspect, that the resource manager is configured to configure the initial specification of the resource pool managed by the application controller includes: the resource manager is configured to configure the initial specification of the resource pool managed by the application controller according to preset expected resource requirement information of the application, or according to collected usage information of the node resources of the distributed system.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a third possible implementation of the ninth aspect, the application controller is further configured to receive a resource allocation request from a client, where the resource allocation request is used to request a resource container for an application running on the client, and to select, according to the resource requirement information of the application in the resource allocation request, an idle resource container from the resource pool and allocate it to the client.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a fourth possible implementation of the ninth aspect, the resource allocation request contains an identifier of the client. After selecting idle resource containers from the resource pool and allocating them to the client, the application controller is further configured to send an indication message to each of the resource containers allocated to the client, where the indication message carries the identifier of the client and indicates that the resource container has been allocated to the client.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a fifth possible implementation of the ninth aspect, the application controller is further configured to: determine the number of idle resource containers remaining in the resource pool; if that number is less than a preset first threshold, send a second resource request to the resource manager; receive a second resource allocation message for the resource pool from the resource manager; and send, according to the second resource allocation message, a start request to the node manager, requesting the node manager to start the newly added resource containers allocated to the resource pool.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a sixth possible implementation of the ninth aspect, the application controller is further configured to: determine the number of idle resource containers remaining in the resource pool; and if that number is greater than a preset second threshold, send a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a seventh possible implementation of the ninth aspect, the application controller is further configured to maintain state information of each resource container in the resource pool, where the state information indicates whether the corresponding resource container is idle. Before selecting an idle resource container from the resource pool and allocating it to the client, the application controller is further configured to determine the idle resource containers in the resource pool according to the state information of each resource container in the resource pool.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in an eighth possible implementation of the ninth aspect, after selecting idle resource containers from the resource pool and allocating them to the client, the application controller is further configured to set the state of each of the resource containers allocated to the client to not idle.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a ninth possible implementation of the ninth aspect, after setting the state of each of the resource containers allocated to the client to not idle, the application controller is further configured to: receive a state update message from each of the resource containers allocated to the client, where the state update message indicates that the task assigned by the client has been completed; and set the state of each of the resource containers allocated to the client to idle.
With reference to the ninth aspect or any of the foregoing possible implementations of the ninth aspect, in a tenth possible implementation of the ninth aspect, the resource allocation request further contains user rights information; the application controller is further configured to verify the user rights information against a preset user rights library, where the user rights library contains the user rights information.
The ninth aspect is the system implementation corresponding to the first aspect and the fourth aspect. The feature descriptions in the first aspect or any possible implementation of the first aspect, and in the fourth aspect or any possible implementation of the fourth aspect, apply to the ninth aspect or any possible implementation of the ninth aspect, and are not repeated here.
According to the technical solutions disclosed in the present invention, the application controller requests resource containers from the resource manager in advance and starts the requested resource containers in advance. When a resource allocation request from an application is received, resource containers can be allocated promptly, which avoids the wait for resource containers to start. Reusing the resource containers in the resource pool avoids the resource cost of repeatedly starting and shutting down resource containers, and having the application controller manage the resource containers in the resource pool directly makes management more flexible.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the following briefly introduces the drawings used in the description of the embodiments. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a block diagram of an exemplary networked environment of a resource allocation system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the hardware structure of a computing device according to an embodiment of the present invention;
FIG. 3 is a signaling diagram of a resource allocation method according to an embodiment of the present invention;
FIG. 4 is a signaling diagram of a resource allocation method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the logical structure of a resource allocation apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the logical structure of a resource allocation apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the logical structure of a resource allocation apparatus according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to the drawings.
Definition of terms:
A resource manager (Resource Manager, RM) is the global resource manager, responsible for resource management and allocation across the entire Yarn system.
A resource container is the resource abstraction in Yarn. It can encapsulate several types of resources on a node, such as memory, CPU, disk, and network. A resource container is used to execute tasks of an application.
A client is a device running an application that needs to request resources. The application can be of a type such as MapReduce, Giraph, Storm, Spark, Tez/Impala, or Message Passing Interface (MPI).
A node manager (Node Manager, NM) is the resource and task manager on each node. On the one hand, it periodically reports to the RM the resource usage of its node and the running status of each resource container; on the other hand, it receives and handles requests from the application controller, such as requests to start or stop resource containers.
An application controller is configured to request a resource pool from the RM and manage it. Started resource containers run in the resource pool, and the application controller allocates resource containers to a client according to the client's resource allocation request.
图1示出了一种资源分配系统100的示例性联网环境框图,如图1所示,系统100包含客户端102,应用控制器104,资源管理器112,以及多个节点106,其中每一个节点106包含一个节点管理器108和至少一个资源容器110,资源池114里包含至少一个资源容器110。1 shows an exemplary networked environment block diagram of a resource allocation system 100, as shown in FIG. 1, a system 100 including a client 102, an application controller 104, a resource manager 112, and a plurality of nodes 106, each of which The node 106 includes a node manager 108 and at least one resource container 110, and the resource pool 114 contains at least one resource container 110.
其中,客户端102上运行有待申请资源的应用,该应用的类型可以是MapReduce、Giraph、Storm、Spark、Tez/Impala或MPI等。客户端102可以为任意类型的计算设备,本发明实施例对此并不进行限定。The client 102 runs an application to be applied for, and the type of the application may be MapReduce, Giraph, Storm, Spark, Tez/Impala, or MPI. The client 102 can be any type of computing device, which is not limited by the embodiment of the present invention.
The resource container 110 is a resource abstraction within a node. It can encapsulate multiple types of resources on a node, such as memory, CPU, disk, and network. Optionally, the resource container 110 may encapsulate only some of the resources on a node, for example only memory and CPU; this is not limited in the embodiments of the present invention. The resource container 110 can run any type of task. For example, a MapReduce application can request a resource container 110 to start a map or reduce task, and a Giraph application can request a resource container 110 to run a Giraph task. A user can also implement a custom application type that runs specific tasks in resource containers 110, thereby implementing an entirely new application framework.
The application controller 104 is configured to manage the resource containers 110 in the resource pool 114. The resource containers in the resource pool 114 are already started: the application controller 104 requests them from the resource manager 112 and starts them in advance, so that when a job arrives its tasks can be executed as soon as possible, which saves the startup time of the resource containers 110. When the application controller 104 requests resources from the resource manager 112, the resources that the resource manager 112 returns to the application controller 104 are represented as resource containers 110.
It should be understood that different applications may have different requirements for resource containers, that is, they may need different types of resources in a resource container and different amounts of each type. Therefore, different applications may correspond to different application controllers, and each type of application may have multiple application controllers; this is not limited in the embodiments of the present invention. For ease of description, the embodiments of the present invention describe the resource request process of only one application.
It should be understood that the resource pool 114 shown in FIG. 1 includes multiple resource containers 110 on multiple nodes 106, but this does not mean that all of those nodes 106, or all of the resource containers 110 on those nodes 106, belong to the resource pool 114. The resource pool 114 is composed of resource containers 110.
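The relationship between resource containers and the resource pool described above can be pictured with a small data model. The following Python sketch is only illustrative: the class names, field names, and resource keys (`memory_mb`, `vcores`) are assumptions and do not come from the embodiments.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ResourceContainer:
    """Hypothetical model of a resource container 110 on one node."""
    node_id: str
    resources: Dict[str, int]  # resource type -> amount, e.g. {"vcores": 2}


@dataclass
class ResourcePool:
    """Hypothetical model of a resource pool 114: a set of started containers."""
    containers: List[ResourceContainer] = field(default_factory=list)

    def nodes(self) -> Set[str]:
        # A pool may span several nodes without owning every container on them.
        return {c.node_id for c in self.containers}


pool = ResourcePool([
    ResourceContainer("node-a", {"memory_mb": 2048, "vcores": 2}),
    ResourceContainer("node-a", {"memory_mb": 1024}),  # memory only
    ResourceContainer("node-b", {"memory_mb": 2048, "vcores": 2}),
])
```

The second container encapsulates only memory, reflecting the option that a container may cover just part of a node's resource types.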
The application controller 104 is configured to receive a resource allocation request from an application on the client 102 and to allocate, for a job of that application, the resource containers needed to execute the job. When the job is executed, the client 102 assigns one resource container to each task in the job, and the task can use only the resources described in that resource container.
The resource manager 112 is a global resource manager responsible for resource management and allocation across the entire system. When it receives a resource request from the application controller 104, it can allocate resource containers to the application controller 104 according to the load of the entire system.
The client 102, the application controller 104, the resource manager 112, and the node manager 108 of each node 106 can communicate over a network. The network may be the Internet, an intranet, a local area network (LAN), a wireless local area network (WLAN), a storage area network (SAN), or the like, or a combination of these networks.
It should be understood that FIG. 1 is intended only to introduce, by way of example, the participants of the system 100 and their relationships. The depicted system 100 is therefore greatly simplified; the embodiments of the present invention describe it only in general terms and do not limit its implementation in any way. The client 102, the application controller 104, and the nodes 106 in FIG. 1 may be of any architecture, which is not limited in the embodiments of the present invention.
The application controller 104 and/or the resource manager 112 shown in FIG. 1 can be implemented by the computing device 200 shown in FIG. 2.
FIG. 2 is a simplified schematic diagram of the logical structure of the computing device 200. As shown in FIG. 2, the computing device 200 includes a processor 202, a memory unit 204, an input/output interface 206, a communication interface 208, a bus 210, and a storage device 212. The processor 202, the memory unit 204, the input/output interface 206, the communication interface 208, and the storage device 212 are communicatively connected to one another through the bus 210.
The processor 202 is the control center of the computing device 200 and executes related programs to implement the technical solutions provided by the embodiments of the present invention. Optionally, the processor 202 includes one or more central processing units (CPUs), for example the central processing unit 1 and the central processing unit 2 shown in FIG. 2. Optionally, the computing device 200 may include multiple processors 202, each of which may be a single-core processor (containing one CPU) or a multi-core processor (containing multiple CPUs). Unless otherwise stated, in the present invention a component for performing a specific function, such as the processor 202 or the memory unit 204, may be implemented either by configuring a general-purpose component to perform that function or by a dedicated component that specifically performs it; this application does not limit this. The processor 202 may be a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes related programs to implement the technical solutions provided by this application.
The processor 202 may be connected to one or more storage solutions through the bus 210. The storage solutions may include the memory unit 204 and the storage device 212. The storage device 212 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory unit 204 may be a random access memory. The memory unit 204 may be integrated with the processor 202 or inside the processor 202, or may be one or more storage units independent of the processor 202.
Program code to be executed by the processor 202, or by a CPU inside the processor 202, may be stored in the storage device 212 or the memory unit 204. Optionally, program code stored in the storage device 212 (for example, an operating system, an application, a resource allocation module, or a communication module) is copied to the memory unit 204 for execution by the processor 202.
The storage device 212 may be a physical hard disk or a partition of one (including small computer system interface storage or a global network block device volume), storage accessed through a network storage protocol (including network or cluster file systems such as the Network File System, NFS), a file-based virtual storage device (a virtual disk image), or a logical-volume-based storage device. It may include high-speed random access memory (RAM) and may also include non-volatile memory, such as one or more disk memories, flash memory, or other non-volatile memory. In some embodiments, the storage device may further include remote storage separate from the one or more processors 202, such as a network disk accessed over a communication network through the communication interface 208. The communication network may be the Internet, an intranet, a local area network (LAN), a wireless local area network (WLAN), a storage area network (SAN), or the like, or a combination of these networks.
The operating system (for example Darwin, RTXC, LINUX, UNIX, OS X, WINDOWS, or an embedded operating system such as VxWorks) includes various software components and/or drivers for controlling and managing general system tasks (such as memory management, storage device control, and power management) and for facilitating communication between hardware and software components.
The input/output interface 206 is configured to receive input data and information and to output data such as operation results.
The communication interface 208 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the computing device 200 and other devices or communication networks.
The bus 210 may include a path that transfers information between the components of the computing device 200 (for example, the processor 202, the memory unit 204, the input/output interface 206, the communication interface 208, and the storage device 212). Optionally, the bus 210 may use a wired connection or wireless communication, which is not limited in this application.
It should be noted that although the computing device 200 shown in FIG. 2 shows only the processor 202, the memory unit 204, the input/output interface 206, the communication interface 208, the bus 210, and the storage device 212, a person skilled in the art should understand that, in a specific implementation, the computing device 200 also includes other components necessary for normal operation.
The computing device 200 may be a general-purpose computer or a special-purpose computing device, including but not limited to any electronic device such as a portable computer, a personal desktop computer, a network server, a tablet computer, a mobile phone, or a personal digital assistant (PDA), or a combination of two or more of these. This application does not limit the specific implementation form of the computing device 200.
In addition, the computing device 200 of FIG. 2 is only one example of a computing device 200. The computing device 200 may include more or fewer components than shown in FIG. 2, or may have a different component configuration. A person skilled in the art should understand that, depending on specific needs, the computing device 200 may also include hardware components that implement other additional functions. A person skilled in the art should also understand that the computing device 200 may include only the components necessary to implement the embodiments of the present invention, and need not include all of the components shown in FIG. 2. The components shown in FIG. 2 may be implemented in hardware, software, or a combination of hardware and software.
The hardware structure shown in FIG. 2 and the foregoing description apply to the various computing devices provided by the embodiments of the present invention and are suitable for performing the various resource allocation methods provided by the embodiments of the present invention.
As shown in FIG. 2, the memory unit 204 of the computing device 200 contains a resource allocation module, and the processor 202 executes the program code of the resource allocation module to manage and allocate resources.
The resource allocation module may consist of one or more operating instructions that cause the computing device 200 to perform one or more method steps according to the foregoing description. The specific method steps are described in detail in the following parts of this application.
FIG. 3 is a signaling diagram of a resource allocation process according to an embodiment of the present invention. The distributed system includes a resource manager 112, a node manager 108, and at least two nodes 106. The resource manager 112 is configured to manage the node resources of the distributed system. The node manager 108 is configured to start resource containers 110 based on the node resources. A resource container 110 encapsulates resources of a node 106 and is used to execute tasks of an application. As shown in FIG. 3, the resource allocation process includes the following steps:
302: The resource manager 112 starts the application controller 104.
Specifically, the resource manager 112 is configured to manage the resources of the at least two nodes 106, to start the application controller 104 when a trigger condition is met, and to configure the initial specification of the resource pool 114 managed by the application controller 104.
The resource manager 112 may start the application controller 104 when the system is initialized. Alternatively, the application controller 104 may be started dynamically later, as needed, by a user through a start instruction; for example, the resource manager 112 may start the application controller 104 when it receives a request to pre-start the application controller 104 or a request to preconfigure the resource pool 114. Alternatively, the resource manager 112 may start the application controller 104 later according to its own resource status; for example, when it frequently receives resource requests for a given type of application, it may start an application controller 104 for that application type. It should be understood that the embodiments of the present invention do not limit how the application controller 104 is started.
Specifically, when starting the application controller 104, the resource manager 112 configures the initial specification of the resource pool 114 managed by the application controller 104, so that when the application controller 104 later requests resources from the resource manager 112, it can determine from the initial specification the number and specification of the resource containers to request.
The initial specification of the resource pool 114 may be the number of resource containers that the resource pool 114 initially contains together with the specification of each resource container, where the specification of a resource container is the types of resources it contains and the amount of each type. If the resource container specification is set in advance, the initial specification of the resource pool 114 may be just the number of resource containers that the resource pool 114 initially contains. If resource container specifications are configured in advance and there are multiple specifications, each with preset resource types and amounts, the initial specification of the resource pool 114 may be the identifiers of the resource container specifications that the resource pool 114 initially contains and the number of resource containers of each specification.
The initial specification of the resource pool 114 may also be the types of resources that the resource pool 114 initially contains and the amount of each type.
It should be understood that the embodiments of the present invention do not limit the form of the initial specification of the resource pool 114; the resource quantity may be expressed differently in different scenarios.
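The alternative forms of the initial specification are interconvertible when container specifications are preconfigured. As a minimal sketch under that assumption, the following Python code expands a "specification identifier and count" form into the "resource type and amount" form. The catalogue `CONTAINER_SPECS`, its identifiers, and the resource keys are hypothetical and are not defined by the embodiments.

```python
from typing import Dict

# Hypothetical catalogue of preconfigured resource container specifications.
CONTAINER_SPECS: Dict[str, Dict[str, int]] = {
    "small": {"memory_mb": 1024, "vcores": 1},
    "large": {"memory_mb": 4096, "vcores": 4},
}


def total_resources(counts_by_spec_id: Dict[str, int]) -> Dict[str, int]:
    """Expand {spec identifier: container count} into {resource type: amount}."""
    totals: Dict[str, int] = {}
    for spec_id, count in counts_by_spec_id.items():
        for resource, amount in CONTAINER_SPECS[spec_id].items():
            totals[resource] = totals.get(resource, 0) + amount * count
    return totals


# Initial specification given as specification identifiers and counts.
initial_spec = {"small": 4, "large": 2}
pool_totals = total_resources(initial_spec)
```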
Specifically, the resource manager 112 may configure the initial specification of the resource pool 114 managed by the application controller 104 according to preset information about the expected resource requirements of the application. The resource manager 112 may also configure the initial specification according to collected information about the usage of node resources in the distributed system. For example, when plenty of node resources are unused, the resource manager 112 may configure a larger initial specification for the resource pool 114; when few node resources are unused, it may configure a smaller one. In addition, the initial specification of the resource pool 114 may be carried by the user in the start instruction for the application controller 104, and the resource manager 112 then configures the application controller 104 accordingly when starting it.
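One way to combine the expected demand of an application with the current amount of unused node resources is sketched below in Python. The embodiments do not prescribe a sizing rule; the function name, the use of virtual cores as the limiting resource, and the `share` parameter are all assumptions made for illustration.

```python
def initial_container_count(free_vcores: int,
                            vcores_per_container: int,
                            expected_demand: int,
                            share: float = 0.5) -> int:
    """Cap the expected demand by a share of the currently unused vcores.

    `share` is a hypothetical policy knob: the fraction of unused capacity
    that a single resource pool is allowed to claim at start-up.
    """
    affordable = int(free_vcores * share) // vcores_per_container
    return max(0, min(expected_demand, affordable))


plentiful = initial_container_count(1000, 2, expected_demand=40)
scarce = initial_container_count(8, 2, expected_demand=40)
```

With plentiful unused resources the pool is sized to the expected demand; with scarce resources the pool starts smaller.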
It should be understood that application controllers 104 may correspond one-to-one to applications. Because different applications have different requirements for resource containers, the resource types and the amount of each resource in the resource containers may differ between applications, and the initial specifications of the resource pools 114 corresponding to different applications may also differ. This is not limited in the embodiments of the present invention.
304: The application controller 104 sends a first resource request to the resource manager 112.
Before sending the first resource request to the resource manager 112, the application controller 104 may also register with the resource manager 112, so that the resource manager 112 can subsequently manage the application controller 104.
The application controller 104 may also register with a registration server. The address information of the registration server is published to the client 102, so that the client 102 can later query the address information of the application controller 104 through the registration server.
The registration server may store the correspondence between application types and application controllers 104. The application type may be MapReduce, Giraph, Storm, Spark, Tez/Impala, MPI, or the like. For example, if the registration server stores the correspondence between the Spark application and the application controller 104 for Spark, then when the application running on the client 102 is Spark, the client 102 can use that correspondence to find the address information of the application controller 104 for Spark. A separate registration server may also be set up for a particular application type; for example, a registration server may be dedicated to Spark applications. The embodiments of the present invention do not limit the specific implementation of the registration server.
Optionally, each type of application may correspond to multiple application controllers 104, which may be used by different users. This makes it easier to scale out the distributed system.
In a specific implementation, the correspondence between clients 102 and application controllers 104 may be maintained on the registration server, or multiple registration servers may be set up, each corresponding to a different user. It should be understood that the embodiments of the present invention do not limit the number of application controllers corresponding to each type of application or the implementation of the registration server.
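The registration and lookup described above can be illustrated with an in-memory Python sketch. The class, its methods, the optional per-user key, and the example addresses are hypothetical; the embodiments leave the registration server's implementation open.

```python
from typing import Dict, Optional, Tuple


class RegistrationServer:
    """Hypothetical in-memory registry of application controller addresses."""

    def __init__(self) -> None:
        # (application type, user or None) -> controller address
        self._addresses: Dict[Tuple[str, Optional[str]], str] = {}

    def register(self, app_type: str, address: str,
                 user: Optional[str] = None) -> None:
        self._addresses[(app_type, user)] = address

    def lookup(self, app_type: str, user: Optional[str] = None) -> Optional[str]:
        # Prefer a controller registered for this user, else the shared one.
        specific = self._addresses.get((app_type, user))
        if specific is not None:
            return specific
        return self._addresses.get((app_type, None))


registry = RegistrationServer()
registry.register("Spark", "10.0.0.5:8040")
registry.register("Spark", "10.0.0.6:8040", user="alice")
```

A client running a Spark application would call `registry.lookup("Spark")` and send its resource allocation request to the returned address.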
Specifically, the first resource request that the application controller 104 sends to the resource manager 112 carries a resource quantity determined according to the initial specification of the resource pool.
The resource quantity may be the number of resource containers that the application controller 104 requests from the resource manager 112 and the specification of each resource container, where the specification of a resource container means the types of resources it contains and the amount of each type. The resources in each resource container may include one or more of processor resources, memory resources, network, disk, and the like. It should be understood that, depending on the application type, the resource types and the amount of each resource in the requested resource containers may differ, which is not limited in the embodiments of the present invention.
If the resource container specification is configured in advance, that is, the resource types in a resource container and the amount of each type are preset, the resource quantity carried in the first resource request may be the number of resource containers. If resource container specifications are configured in advance and there are multiple specifications, each with preset resource types and amounts, the resource quantity carried in the first resource request may be the resource container specification identifiers and the number of resource containers of each specification.
The resource quantity may also be the types of resources that the application controller 104 needs and the amount of each type.
It should be understood that the embodiments of the present invention do not limit the form of the resource quantity requested in the first resource request; the resource quantity may be expressed differently in different scenarios.
In a specific implementation, the first resource request may also carry information about the nodes and/or racks on which the requested resource containers should be located. For example, the application controller 104 may prefer hosts with a short link to the application controller 104 as the nodes that run the requested resource containers, which allows more effective control of the resource pool 114.
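As an illustration of what a first resource request might carry, the Python sketch below models the "specification identifier and count" form together with optional node and rack preferences. The message layout and all field names are assumptions; the embodiments do not define a wire format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FirstResourceRequest:
    """Hypothetical message layout for the first resource request."""
    containers_by_spec: Dict[str, int]  # specification identifier -> count
    preferred_nodes: List[str] = field(default_factory=list)
    preferred_racks: List[str] = field(default_factory=list)

    def container_count(self) -> int:
        return sum(self.containers_by_spec.values())


request = FirstResourceRequest(
    containers_by_spec={"small": 4, "large": 2},
    preferred_nodes=["node-a"],
    preferred_racks=["rack-1"],
)
```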
306: The resource manager 112 determines nodes and allocates resource containers to the application controller 104 on the determined nodes.
The resource manager 112 is responsible for global resource management and allocation. After receiving the first resource request for the resource pool 114, which the application controller 104 sends according to the initial specification of the resource pool 114, the resource manager 112 allocates the initial resource containers to the resource pool according to the first resource request. Specifically, after receiving the first resource request, the resource manager 112 first determines the candidate nodes and then, according to the resource quantity requested in the first resource request, allocates the initial resource containers to the resource pool 114 on those candidate nodes.
Optionally, if the first resource request carries preferred host nodes and/or racks for the requested resource containers, the resource manager 112, after receiving the first resource request, preferentially allocates resource containers to the application controller 104 from those host nodes and/or racks. If a requested host node or rack cannot currently satisfy the request, for example because of the load on the requested host, the resource manager 112 may allocate resource containers to the application controller 104 elsewhere in the rack where that node is located. If the rack cannot satisfy the request because of load balancing, the resource manager 112 may allocate resource containers to the application controller 104 in a rack adjacent to that rack.
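The fallback order described above (preferred host, then the same rack, then an adjacent rack) can be sketched in Python as follows. This is only one possible reading: the function, the notion of free container slots per node, and the rack topology maps are hypothetical inputs, not structures named in the embodiments.

```python
from typing import Dict, List, Optional


def pick_node(preferred: str,
              free_slots: Dict[str, int],
              rack_of: Dict[str, str],
              adjacent_racks: Dict[str, List[str]]) -> Optional[str]:
    """Choose a node: the preferred host, then its rack, then adjacent racks."""
    if free_slots.get(preferred, 0) > 0:
        return preferred
    home_rack = rack_of[preferred]
    for rack in [home_rack] + adjacent_racks.get(home_rack, []):
        for node in sorted(free_slots):  # deterministic order for the sketch
            if rack_of[node] == rack and free_slots[node] > 0:
                return node
    return None  # nothing can satisfy the request right now


rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
adjacent_racks = {"r1": ["r2"]}
same_rack = pick_node("n1", {"n1": 0, "n2": 1, "n3": 5}, rack_of, adjacent_racks)
next_rack = pick_node("n1", {"n1": 0, "n2": 0, "n3": 5}, rack_of, adjacent_racks)
```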
In another possible implementation, if the first resource request does not carry preferred hosts or racks for the requested resource containers, the resource manager 112 may allocate resource containers to the application controller 104 according to the load balance across nodes, or may prefer hosts or racks with a short link to the application controller 104.
It should be understood that the embodiments of the present invention do not limit the policy by which the resource manager 112 allocates resource containers to the application controller 104.
308: The resource manager 112 sends a first resource allocation message to the application controller 104.
Specifically, the resource manager 112 sends the first resource allocation message for the resource pool 114 to the application controller 104. The first resource allocation message contains information about the nodes on which the initial resource containers are located, that is, the resource containers that the resource manager 112 has allocated to the resource pool 114 managed by the application controller 104.
Specifically, the first resource allocation message may carry the specification of each resource container that the resource manager 112 has allocated to the application controller 104 and information about the node on which each resource container is located. The specification of a resource container is the types of resources it contains and the amount of each type.
If the resource container specification is configured in advance by the system, that is, the resource types in a resource container and the amount of each type are preset, the first resource allocation message may carry information about the node on which each resource container is located. If the resource manager 112 allocates multiple resource containers to the application controller 104 on the same node, the first resource allocation message may also carry the number of resource containers allocated on each node.
If resource container specifications are configured in advance and there are multiple specifications, each with preset resource types and amounts, the first resource allocation message may carry the specification identifiers of the allocated resource containers and information about the node on which each resource container is located.
In a specific implementation, after receiving the first resource request from the application controller 104, the resource manager 112 may not immediately return resources that satisfy the request. Instead, the application controller 104 needs to keep communicating with the resource manager 112, polling for the allocated resources and pulling them for use.
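The repeated polling described above can be sketched as a simple loop. In this Python sketch, `fetch` stands in for whatever call the application controller uses to ask the resource manager for newly allocated containers; it, the container identifiers, and the bounded retry policy are assumptions.

```python
import time
from typing import Callable, List


def poll_allocations(fetch: Callable[[], List[str]],
                     wanted: int,
                     interval_s: float = 1.0,
                     max_rounds: int = 10) -> List[str]:
    """Poll until `wanted` containers have been received or rounds run out."""
    received: List[str] = []
    for _ in range(max_rounds):
        received.extend(fetch())  # may return an empty batch
        if len(received) >= wanted:
            break
        time.sleep(interval_s)
    return received


# Simulated resource manager that hands out containers over several rounds.
batches = iter([[], ["c1"], ["c2", "c3"], ["c4"]])
allocated = poll_allocations(lambda: next(batches), wanted=3, interval_s=0.0)
```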
310: The application controller 104 starts the resource containers.
The application controller 104 sends a start request to the node manager, requesting the node manager to start the initial resource containers of the resource pool 114.
Specifically, after obtaining from the resource manager 112 the resource containers allocated to it, the application controller 104 sends a start request to the node on which each resource container is located, and more specifically to the node manager of that node, to start the resource containers that the resource manager 112 has allocated to it.
The start request also carries the specification of the resource container, that is, the types of resources the resource container contains and the amount of each type.
If resource container specifications are configured in advance and there are multiple specifications, each with preset resource types and amounts, the start request also carries the specification identifier of the resource container to be started.
Optionally, if the resource manager 112 has allocated multiple resource containers to the application controller 104 on the same node, a single start request may carry the request to start all of those resource containers.
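Batching the start of several resource containers on one node into a single start request amounts to grouping the allocations by node. A minimal Python sketch follows; the `(node, specification identifier)` pair representation is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_start_requests(
        allocations: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (node id, spec id) allocations into one start request per node."""
    per_node: Dict[str, List[str]] = defaultdict(list)
    for node_id, spec_id in allocations:
        per_node[node_id].append(spec_id)
    return dict(per_node)


start_requests = group_start_requests(
    [("n1", "small"), ("n2", "large"), ("n1", "large")]
)
```

Each key of `start_requests` corresponds to one message sent to that node's node manager.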
After receiving the start request, the node manager of the node where the resource container is located first performs resource localization: it creates the working directory of the resource container and downloads the resources required to run the resource container (jar packages, executable files, and the like) from the Hadoop Distributed File System (HDFS). It then starts the resource container.
After the application controller 104 has started, through the node managers, the initial resource containers that the resource manager 112 allocated for the resource pool 114, the started resource containers constitute the resource pool 114.
Because the resource containers in the resource pool 114 are already started, a task can be executed quickly when it arrives. This avoids the container startup time before the task and thus speeds up execution of the application's tasks.
312: The application controller 104 receives a resource allocation request from the client 102.
An application runs on the client 102. The type of the application may be MapReduce, Giraph, Storm, Spark, Tez/Impala, MPI, or the like.
When the application running on the client 102 needs resources to process tasks, a resource allocation request is sent to the application controller 104 through the client 102.
In a specific implementation, the client 102 may query a registration server for the address information of the application controller 104 corresponding to the application, and send the resource allocation request to the application controller 104 indicated by that address information. The registration server stores the correspondence between application controllers 104 and applications, and different applications may have different application controllers 104.
Because different types of applications have different resource requirements for resource containers, and the specifications of their resource containers vary widely, configuring one application controller 104 and one resource pool 114 for each type of application enables resource containers to be used rationally.
Taking a Spark application as an example, the specific implementation may be as follows. The user submits a Spark application through spark-submit, and the user's program automatically creates a SparkContext, which is the external interface that Spark provides to the user for operating Spark. The SparkContext internally creates a DAG scheduler (DAGScheduler) and a task scheduler (TaskScheduler). The DAGScheduler schedules computation according to the dependencies in the application's DAG, and the TaskScheduler schedules the computation of tasks. The TaskScheduler creates a scheduler backend (SchedulerBackend). The SchedulerBackend queries the registration server for the address information of the application controller 104 corresponding to the Spark application, and sends the resource allocation request to that application controller 104.
The resource allocation request carries the resource requirement information of the application for which the client 102 requests resources from the application controller 104.
The resource requirement information of the application in the resource allocation request may be the number of resource containers requested and the specification of each resource container. The specification of a resource container is the types of resources that the resource container contains and the quantity of each type of resource.
If the specification of resource containers is configured in advance by the system, that is, the resource types contained in a resource container and the quantity of each type of resource are preset, the resource requirement information of the application may be the number of resource containers requested.
If the specifications of resource containers are configured in advance, and there are multiple specifications in each of which the resource types and the quantity of each type of resource are preset, the resource requirement information of the application may be the specification identifiers of the requested resource containers and the number of resource containers of each specification.
The resource requirement information of the application in the resource allocation request may also be the types of resources that the client 102 requests from the application controller 104 and the quantity of each type of resource.
Optionally, the resource allocation request also contains user permission information of the client 102. After receiving the resource allocation request, the application controller 104 verifies the user permission information of the client 102 against a preset user permission library, which contains the user permission information of different users. If the user permission library does not contain the user permission information of the client 102, the application controller 104 rejects or does not respond to the resource allocation request of the client 102. If the user permission library contains the user permission information of the client 102, the following steps are performed.
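The contents of a resource allocation request and the optional permission check can be sketched as follows. The class `ResourceAllocationRequest`, its field names, and the representation of the user permission library as a set of strings are assumptions made for illustration; the embodiment does not prescribe a concrete data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ResourceAllocationRequest:
    """Hypothetical shape of a resource allocation request."""
    client_id: str                       # identifier of the client
    user_permission: str                 # user permission information of the client
    # Specification identifier -> number of containers of that specification.
    spec_counts: Dict[str, int] = field(default_factory=dict)


def is_authorized(request: ResourceAllocationRequest,
                  permission_library: Set[str]) -> bool:
    """Accept the request only if its permission info is in the library."""
    return request.user_permission in permission_library


permission_library = {"user-alice", "user-bob"}
accepted = is_authorized(
    ResourceAllocationRequest("client-1", "user-alice", {"small": 2, "large": 1}),
    permission_library)
rejected = is_authorized(
    ResourceAllocationRequest("client-2", "user-mallory", {"small": 1}),
    permission_library)
```

A request that fails the check would be rejected or simply left unanswered, and only an accepted request proceeds to container allocation.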
314: The application controller 104 allocates resource containers to the client 102.
After receiving the resource allocation request from the client 102, the application controller 104 selects idle resource containers from the resource pool 114 according to the resource requirement information of the application in the request, and allocates them to the client 102.
Specifically, after receiving the resource allocation request from the client 102, the application controller 104 first determines the idle resource containers in the resource pool 114, and then allocates resource containers to the client 102 from among the idle resource containers according to the resource requirement information of the application carried in the request. An idle resource container is a resource container that is not currently allocated to execute a task of an application.
The resource containers in the resource pool 114 controlled by the application controller 104 can be allocated to multiple clients 102 at the same time, but one resource container can execute only the tasks assigned by one client 102 at a time. Therefore, before allocating resource containers to the client 102, the application controller 104 needs to first determine the currently idle resource containers, and then allocate the requested resources to the client 102 from among them.
The application controller 104 may determine whether each resource container in the resource pool 114 is currently idle by sending a query message to that resource container.
Preferably, the application controller 104 maintains state information for each resource container in the resource pool 114. The state information indicates whether the resource container is idle. If the resource container has been allocated and is running another task, its state is not idle; if the resource container has not been allocated and is not running another task, its state is idle.
After receiving the resource allocation request of the client 102, the application controller 104 may determine the idle resource containers in the resource pool 114 according to the state information of each resource container in the resource pool 114, and then allocate resource containers to the client 102 from the currently idle resource containers according to the resource requirement information of the application carried in the resource allocation request.
Selecting idle resource containers from the resource pool 114 and allocating them to the client 102 may specifically be as follows: the application controller 104 selects idle resource containers in the resource pool 114 according to the state information of each resource container in the resource pool 114, and allocates the selected idle resource containers to the client 102.
For ease of description, the resource containers that the application controller 104 allocates to the client 102 are referred to as the first resource container group, which contains at least one resource container.
If the application controller 104 maintains state information for each resource container in the resource pool 114, then after allocating the resource containers in the first resource container group to the client 102, the application controller 104 also sets the state of each resource container in the first resource container group to not idle.
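The state bookkeeping described above can be sketched with a small in-memory pool. The class `ResourcePool` and its methods are hypothetical names; in particular, returning an empty list when too few idle containers are available is an assumption of this sketch, since the embodiment does not state what happens in that case.

```python
from typing import Dict, List


class ResourcePool:
    """Tracks, per container, its specification and whether it is idle."""

    def __init__(self, container_specs: Dict[str, str]):
        self.specs = dict(container_specs)              # container id -> spec id
        self.idle = {cid: True for cid in container_specs}

    def allocate(self, spec_counts: Dict[str, int]) -> List[str]:
        """Pick idle containers matching the request and mark them not idle."""
        chosen: List[str] = []
        for spec_id, count in spec_counts.items():
            matches = [cid for cid, spec in self.specs.items()
                       if spec == spec_id and self.idle[cid]]
            if len(matches) < count:
                return []  # assumption: allocate nothing if the request cannot be met
            chosen.extend(matches[:count])
        for cid in chosen:
            self.idle[cid] = False  # allocated containers are no longer idle
        return chosen

    def mark_idle(self, container_id: str) -> None:
        """Called when a container reports that it can accept new tasks."""
        self.idle[container_id] = True


pool = ResourcePool({"c1": "small", "c2": "small", "c3": "large"})
first_group = pool.allocate({"small": 1, "large": 1})
second_try = pool.allocate({"large": 1})   # c3 is busy, so nothing is allocated
pool.mark_idle("c3")
third_try = pool.allocate({"large": 1})
```

Keeping the state in the controller avoids sending a query message to every container on each request, which is why the description calls this the preferred option.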
Optionally, the application controller 104 also determines the number of idle resource containers remaining in the resource pool 114. If that number is less than a preset first threshold, the application controller 104 sends a second resource request to the resource manager 112, receives a second resource allocation message for the resource pool 114 from the resource manager 112, and, according to that message, sends a start request to the node manager, requesting the node manager to start the new resource containers allocated for the resource pool 114. The second resource request may carry the quantity of resources requested, which may be the difference between the first threshold and the current number of idle resource containers, or may be positively correlated with that difference. The form of the second resource request is similar to that of the first resource request, and the second resource allocation message is similar to the first resource allocation message; details are not repeated here.
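The quantity carried in the second resource request can be computed as in the sketch below. The function name `containers_to_request` and the `scale` parameter, which models one possible "positively correlated" relationship, are assumptions introduced for illustration.

```python
import math


def containers_to_request(idle_count: int,
                          first_threshold: int,
                          scale: float = 1.0) -> int:
    """Number of extra containers to request when the pool runs low.

    With scale == 1.0 this is exactly the difference between the first
    threshold and the current idle count; a larger scale is one example
    of a quantity positively correlated with that difference.
    """
    shortfall = first_threshold - idle_count
    if shortfall <= 0:
        return 0  # enough idle containers remain; no second request is needed
    return math.ceil(scale * shortfall)


low_pool = containers_to_request(idle_count=2, first_threshold=5)
full_pool = containers_to_request(idle_count=5, first_threshold=5)
scaled = containers_to_request(idle_count=2, first_threshold=5, scale=1.5)
```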
In a specific implementation, the application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 after allocating the resource containers in the first resource container group to the client 102, or at a preset interval. It should be understood that the embodiments of the present invention do not limit when or how the application controller 104 determines the number of idle resource containers remaining in the resource pool 114.
316: The application controller 104 sends an indication message to each resource container in the first resource container group. The indication message carries the identifier of the client 102.
The resource allocation request that the client 102 sends to the application controller 104 also carries the identifier of the client 102. The indication message that the application controller 104 sends to each resource container in the first resource container group indicates that the resource container is allocated to the client 102. In a specific implementation, the identifier of the client 102 may be the address information of the client 102 and the port number used to communicate with the resource containers in the first resource container group.
318: Each resource container in the first resource container group sends a registration message to the client 102.
The registration message carries identification information of the resource container, which allows the client to uniquely identify the resource container. Optionally, the identification information is the address information of the node where the resource container is located and information about the resource container on that node.
The registration message may also carry the specification of the resource container, that is, the types of resources in the resource container and the quantity of each type of resource.
If the specifications of resource containers are configured in advance, and there are multiple specifications in each of which the resource types and the quantity of each type of resource are preset, the registration message may carry the specification identifier of the resource container.
320: The client 102 assigns the job of the application to the resource containers that the application controller 104 allocated to the client 102.
After receiving the registration messages from the resource containers in the first resource container group, the client 102 divides the job of the application into at least one task and sends each task to one resource container in the first resource container group.
322: The resource containers that the application controller 104 allocated to the client 102 send the task execution results to the client 102.
After a resource container in the first resource container group finishes executing the task assigned to it, it sends the execution result of the task to the client 102. It should be understood that the execution result here may be an actual task execution result; for a task that does not need to return a result, the execution result may be a task completion indication message.
324: The client 102 ends its connections with the resource containers in the first resource container group.
After receiving the task execution results returned by the resource containers in the first resource container group, the client 102 sends an end-connection message to those resource containers, ends its connections with them, and releases the resource container resources.
326: The resource containers in the first resource container group send status update messages to the application controller 104.
After the client 102 releases the resource container resources of the first resource container group, each resource container in the first resource container group sends a status update message to the application controller 104, indicating that it can accept new tasks.
If the application controller 104 maintains state information for each resource container in the resource pool 114, then after receiving a status update message from a resource container in the first resource container group, the application controller 104 sets the state of that resource container to idle. When a new resource allocation request arrives, the resource container can be allocated to a client 102 as a candidate resource container.
Optionally, the application controller 104 also determines the number of idle resource containers remaining in the resource pool 114. If that number is greater than a preset second threshold, the application controller 104 sends a resource release message to at least one of the idle resource containers. The resource release message is used to release the resources occupied by the at least one resource container. The number of resource containers released may be the difference between the current number of idle resource containers and the second threshold, or may be positively correlated with that difference.
Optionally, the application controller 104 also maintains the idle time of each resource container in the resource pool 114. When the idle time of a resource container in the resource pool 114 exceeds a preset third threshold, the application controller 104 sends a resource release message to that resource container.
Optionally, when the number of remaining idle resource containers is greater than the preset second threshold, the application controller 104 preferentially sends the resource release message to resource containers whose idle time exceeds the preset third threshold. Alternatively, when the number of remaining idle resource containers is greater than the preset second threshold and the resource pool 114 contains resource containers whose idle time exceeds the preset third threshold, the application controller 104 sends the resource release message to the resource containers whose idle time exceeds the third threshold.
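One way to combine the second and third thresholds when choosing which containers receive a resource release message is sketched below. The function name `select_containers_to_release` and the use of seconds as the unit of idle time are assumptions; the sketch implements the "preferentially release long-idle containers" variant and is not the only policy the description permits.

```python
from typing import Dict, List


def select_containers_to_release(idle_seconds: Dict[str, float],
                                 second_threshold: int,
                                 third_threshold: float) -> List[str]:
    """Choose idle containers to release when the pool has a surplus.

    idle_seconds maps each idle container to how long it has been idle.
    The surplus is the idle count minus the second threshold; containers
    idle for longer than the third threshold are released first.
    """
    surplus = len(idle_seconds) - second_threshold
    if surplus <= 0:
        return []
    expired = sorted((c for c, t in idle_seconds.items() if t > third_threshold),
                     key=lambda c: idle_seconds[c], reverse=True)
    others = [c for c, t in idle_seconds.items() if t <= third_threshold]
    return (expired + others)[:surplus]


idle = {"c1": 10.0, "c2": 400.0, "c3": 50.0, "c4": 900.0}
to_release = select_containers_to_release(idle, second_threshold=2,
                                          third_threshold=300.0)
```

With four idle containers and a second threshold of two, the two containers that have been idle for more than 300 seconds are selected, longest idle first.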
In a specific implementation, the application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 after receiving the status update messages from the resource containers in the first resource container group and resetting those resource containers to the idle state, or at a preset interval. It should be understood that the embodiments of the present invention do not limit when or how the application controller 104 determines the number of idle resource containers remaining in the resource pool 114.
It should be understood that, because each type of application has different requirements for resource container specifications, configuring a separate application controller 104 and resource pool 114 for each type of application in a specific implementation enables resources to be used rationally. However, the embodiments of the present invention are not limited to this, and multiple types of applications may share one application controller 104 and one resource pool 114.
According to the technical solution disclosed in the embodiments of the present invention, the application controller applies to the resource manager for resource containers in advance and starts the requested resource containers in advance. When a resource allocation request from an application is received, resource containers can be allocated promptly, which avoids the wait for resource containers to start. Reusing the resource containers in the resource pool avoids the resource cost of repeatedly starting and stopping resource containers, as well as the resource cost of starting and stopping multiple application managers. Because the application controller directly manages the resource containers in the resource pool, management is more flexible.
FIG. 4 is a signaling diagram of a resource allocation procedure according to an embodiment of the present invention. As shown in FIG. 4, the resource allocation procedure includes:
For steps 402 to 410, refer to steps 302 to 310 of the embodiment in FIG. 3; details are not repeated here.
412: The client 102 sends an application job to the application controller 104.
A user's application runs on the client. The type of the application may be MapReduce, Giraph, Storm, Spark, Tez/Impala, MPI, or the like.
When the application running on the client needs resources to process tasks, the application job is sent to the application controller 104 through the client 102.
In a specific implementation, the client 102 may query a registration server for the address information of the application controller 104 corresponding to the application, and send the application job to the application controller 104 indicated by that address information. The registration server stores the correspondence between application controllers 104 and applications, and different applications may have different application controllers 104.
Because different types of applications have different resource requirements for resource containers, and the specifications of their resource containers vary widely, configuring one application controller 104 and one resource pool for each type of application enables resource containers to be used rationally.
414: The application controller 104 allocates resource containers for the application job of the client 102.
In a specific implementation, after receiving the application job sent by the client 102, the application controller 104 first determines the idle resource containers in the resource pool 114, divides the application job into at least one task according to the size of the job, and, according to the number of tasks, allocates resource containers for the application job of the client 102 from among the idle resource containers.
If the resource pool 114 contains resource containers of multiple specifications, then, because different task types require different types of resources or different quantities of each resource, the application controller 104 also allocates resource containers of different specifications to different tasks according to the task type. For example, if the application type is MapReduce, a task may be of the Map type or the Reduce type. Map tasks and Reduce tasks may have different requirements for resource container resources, so the application controller 104 may allocate different types of resource containers to different types of tasks.
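Assigning containers of different specifications by task type can be sketched as follows. The function `assign_tasks`, the mapping from task type to specification identifier, and the choice to leave a task unassigned (`None`) when no matching idle container exists are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple


def assign_tasks(tasks: List[Tuple[str, str]],
                 spec_for_task_type: Dict[str, str],
                 idle_by_spec: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Map each (task_id, task_type) to an idle container of the right spec."""
    # Work on copies so the caller's view of idle containers is not modified.
    available = {spec: list(cids) for spec, cids in idle_by_spec.items()}
    assignment: Dict[str, Optional[str]] = {}
    for task_id, task_type in tasks:
        spec_id = spec_for_task_type[task_type]
        candidates = available.get(spec_id, [])
        # Assumption: a task stays unassigned if no idle container matches.
        assignment[task_id] = candidates.pop(0) if candidates else None
    return assignment


assignment = assign_tasks(
    tasks=[("t1", "map"), ("t2", "reduce"), ("t3", "map")],
    spec_for_task_type={"map": "small", "reduce": "large"},
    idle_by_spec={"small": ["c1"], "large": ["c2"]},
)
```

In this example the single `small` container goes to the first Map task, the `large` container goes to the Reduce task, and the second Map task waits until another `small` container becomes idle.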
The resource containers in the resource pool 114 controlled by the application controller 104 can be allocated to multiple clients 102 at the same time, but one resource container can execute only the tasks assigned by one client 102 at a time. Therefore, before allocating resource containers to the client 102, the application controller 104 needs to first determine the currently idle resource containers, and then allocate resources for the application job of the client 102 from among them.
The application controller 104 may determine whether each resource container in the resource pool 114 is currently idle by sending a query message to that resource container.
Preferably, the application controller 104 maintains state information for each resource container in the resource pool 114. The state information indicates whether the resource container is idle. If the resource container has been allocated and is running another task, its state is not idle; if the resource container has not been allocated and is not running another task, its state is idle.
The application controller 104 may determine the idle resource containers in the resource pool 114 according to the state information of each resource container in the resource pool 114.
For ease of description, the resource containers that the application controller 104 allocates for the application job of the client 102 are referred to as the second resource container group, which contains at least one resource container.
If the application controller 104 maintains state information for each resource container in the resource pool 114, then after allocating the resource containers in the second resource container group to the tasks of the application job of the client 102, the application controller 104 also sets the state of each resource container in the second resource container group to not idle.
Optionally, the application controller 104 also determines the number of idle resource containers remaining in the resource pool 114. If that number is less than a preset first threshold, the application controller 104 sends a second resource request to the resource manager 112 and receives a second resource allocation message for the resource pool 114 from the resource manager 112. According to that message, the application controller 104 sends a start request to the node manager, requesting the node manager to start the new resource containers allocated for the resource pool 114. The quantity of resources requested by the second resource request may be the difference between the first threshold and the current number of idle resource containers, or may be positively correlated with that difference.
In a specific implementation, the application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 after allocating the resource containers in the second resource container group to the application job of the client 102, or at a preset interval. It should be understood that the embodiments of the present invention do not limit when or how the application controller 104 determines the number of idle resource containers remaining in the resource pool 114.
416: The application controller 104 delivers a task to each resource container in the second resource container group.
After dividing the job of the client 102 into at least one task and selecting a resource container for each task, the application controller 104 delivers each task to the corresponding resource container in the second resource container group.
418: The application controller 104 receives the task execution result returned by each resource container in the second resource container group.
After a resource container in the second resource container group finishes executing the task delivered by the application controller 104, it returns the execution result of the task to the application controller 104. It should be understood that the execution result here may be an actual task execution result; for a task that does not need to return a result, the execution result may be a task completion indication message.
If the application controller 104 maintains state information for each resource container in the resource pool 114, then after receiving the task execution result returned by a resource container in the second resource container group, the application controller 104 sets the state of that resource container to idle. When a new application job arrives, the resource container can be allocated to the new application job as a candidate resource container.
Optionally, the application controller 104 further determines the number of idle resource containers remaining in the resource pool 114. If that number is greater than a preset second threshold, the application controller 104 sends a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container. The number of resource containers released may be equal to the difference between the current number of idle resource containers and the second threshold, or may be positively correlated with that difference.
Optionally, the application controller 104 further maintains the idle time of each resource container in the resource pool 114. When the idle time of a resource container in the resource pool 114 exceeds a preset third threshold, the application controller 104 sends a resource release message to that resource container.
Optionally, when the number of remaining idle resource containers is greater than the preset second threshold, the application controller 104 preferentially sends resource release messages to resource containers whose idle time is greater than the preset third threshold. Alternatively, the application controller 104 sends a resource release message to the resource containers whose idle time is greater than the preset third threshold only when both conditions hold: the number of remaining idle resource containers is greater than the preset second threshold, and the resource pool 114 contains resource containers whose idle time is greater than the preset third threshold.
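As a rough sketch of the first release policy above (release the surplus over the second threshold, preferring containers idle longer than the third threshold), one could write something like the following. The `IdleContainer` type and the function name are hypothetical, and ordering each group by longest idle time first is an assumption the text does not prescribe.

```python
from dataclasses import dataclass


@dataclass
class IdleContainer:
    container_id: str
    idle_seconds: float


def select_for_release(idle_containers, second_threshold, third_threshold):
    """Return the ids of idle containers that should receive a release message."""
    # Number to release: idle count minus the second threshold.
    surplus = len(idle_containers) - second_threshold
    if surplus <= 0:
        return []
    # Containers idle longer than the third threshold are released first.
    expired = [c for c in idle_containers if c.idle_seconds > third_threshold]
    fresh = [c for c in idle_containers if c.idle_seconds <= third_threshold]
    # Assumption: within each group, release the longest-idle containers first.
    expired.sort(key=lambda c: c.idle_seconds, reverse=True)
    fresh.sort(key=lambda c: c.idle_seconds, reverse=True)
    return [c.container_id for c in (expired + fresh)[:surplus]]
```

The second variant in the text would instead return only entries from `expired`, and only when `surplus` is positive.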
In a specific implementation, the application controller 104 may determine the number of idle resource containers remaining in the resource pool 114 either after it has received the execution result returned by each resource container in the second resource container group and reset those resource containers to the idle state, or at a preset interval. It should be understood that this embodiment of the present invention does not limit when or how the application controller 104 determines the number of idle resource containers remaining in the resource pool 114.
420: The application controller 104 returns the execution result of the application job of the client 102 to the client 102.
In a specific implementation, the application controller 104 may merge the execution results returned by the resource containers in the second resource container group and then send the merged execution result to the client 102.
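The merge step is not specified further; the following is only one possible sketch, assuming each task returns either a list of records or nothing (a completion indication), and that results are concatenated in task order. The function name and the data shapes are hypothetical.

```python
def merge_results(partial_results):
    """Merge per-task results into one job result.

    `partial_results` maps a task index to a list of records, or to None
    for a task that only reported completion (assumed representation).
    """
    merged = []
    for task_index in sorted(partial_results):
        part = partial_results[task_index]
        if part is not None:  # completion-only tasks contribute nothing
            merged.extend(part)
    return merged
```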
It should be understood that, because each type of application has different requirements for resource container specifications, configuring a separate application controller 104 and resource pool 114 for each type of application in a specific implementation allows resources to be used efficiently. However, this embodiment of the present invention is not limited thereto, and multiple types of applications may share one application controller 104 and resource pool 114.
According to the technical solution disclosed in this embodiment of the present invention, the application controller applies to the resource manager for resource containers in advance and starts the requested resource containers in advance. When an application job is received from a client, resource containers can therefore be allocated promptly, avoiding the wait for resource container startup. Reusing the resource containers in the resource pool avoids the resource consumption of repeatedly starting and stopping resource containers, as well as that of starting and stopping multiple application managers. Because the application controller directly manages the resource containers in the resource pool, management is more flexible.
FIG. 5 is a schematic diagram of the logical structure of an apparatus 500 for allocating resource containers in a distributed system according to an embodiment of the present invention. The distributed system includes the apparatus 500 and a node manager. The apparatus 500 is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and the resource containers are configured to execute tasks of an application. As shown in FIG. 5, the apparatus 500 includes a startup unit 502, a receiving unit 504, an allocating unit 506, and a sending unit 508, where:
The startup unit 502 is configured to start an application controller when a trigger condition is met, and to configure an initial specification of the resource pool managed by the application controller.
The startup unit 502 may start the application controller upon receiving a request to start the application controller in advance or a request to configure the resource pool in advance. The startup unit 502 may start the application controller at system initialization; it may also start it later, dynamically, in response to a startup instruction issued by a user as needed; or the resource manager may start it later according to its own resource status. This embodiment of the present invention does not limit how the application controller is started.
Specifically, the startup unit 502 may configure the initial specification of the resource pool managed by the application controller according to preset expected resource requirement information of the application. The startup unit 502 may also configure the initial specification according to usage information of the node resources of the distributed system collected by the apparatus 500. For example, when plenty of node resources are unused, the startup unit 502 may configure a larger initial specification for the resource pool; when few node resources are unused, it may configure a smaller one. Alternatively, the initial specification of the resource pool may be carried in the user's startup instruction for the application controller, in which case the startup unit 502 configures the application controller with it when starting the application controller.
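To make the sizing idea concrete, here is a small sketch of how an initial pool size might be derived from unused node resources and an optional expected demand. The function, its parameters, and in particular the `max_share` cap on how much free capacity the pool may claim are illustrative assumptions, not values or rules taken from the embodiment.

```python
def initial_pool_size(free_cores, cores_per_container,
                      expected_containers=None, max_share=0.5):
    """Return a number of containers for the initial pool specification.

    free_cores:          unused CPU cores reported for the cluster (assumed metric)
    cores_per_container: cores each resource container needs
    expected_containers: preset expected demand of the application, if known
    max_share:           assumed cap on the fraction of free cores the pool may take
    """
    if cores_per_container <= 0:
        raise ValueError("cores_per_container must be positive")
    # More free resources allow a larger pool; fewer give a smaller one.
    affordable = int(free_cores * max_share) // cores_per_container
    if expected_containers is None:
        return affordable
    # Never reserve more than the application is expected to need.
    return min(expected_containers, affordable)
```

For instance, with 64 free cores and 2-core containers the sketch yields 16 containers, and it shrinks to 2 when only 8 cores are free.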
In a specific implementation, the startup unit 502 may be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. 2. More specifically, the processor 202 may execute the communication module and the resource allocation module in the memory unit 204, so that the communication interface 208 starts the application controller by means of an instruction.
The receiving unit 504 is configured to receive a first resource request for the resource pool, sent by the application controller according to the initial specification of the resource pool.
In a specific implementation, the receiving unit 504 may be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. 2. More specifically, the processor 202 may execute the communication module in the memory unit 204, so that the communication interface 208 receives the resource request for the resource pool sent by the application controller according to the initial specification of the resource pool.
The allocating unit 506 is configured to allocate initial resource containers to the resource pool according to the first resource request.
In a specific implementation, the allocating unit 506 may be implemented by the processor 202 and the memory unit 204 shown in FIG. 2. More specifically, the processor 202 may execute the resource allocation module in the memory unit 204 to allocate the node resources of the resource pool to the application controller.
The sending unit 508 is configured to send a first resource allocation message for the resource pool to the application controller, where the first resource allocation message contains information about the nodes on which the initial resource containers allocated to the resource pool by the allocating unit 506 are located.
In a specific implementation, the sending unit 508 may be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. 2. More specifically, the processor 202 may execute the communication module in the memory unit 204, so that the communication interface 208 sends the first resource allocation message for the resource pool to the application controller.
This embodiment of the present invention is the apparatus embodiment of the resource manager in the embodiments of FIG. 3 and FIG. 4. The feature descriptions in the embodiments of FIG. 3 and FIG. 4 apply to this embodiment and are not repeated here.
FIG. 6 is a schematic diagram of the logical structure of an apparatus 600 for allocating resource containers in a distributed system according to an embodiment of the present invention. The distributed system includes a resource manager and a node manager. The resource manager is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and the resource containers are configured to execute tasks of an application. The resource manager starts the apparatus 600 when a trigger condition is met, configures an initial specification of the resource pool managed by the apparatus 600, and allocates initial resource containers to the resource pool according to the initial specification of the resource pool; the initial resource containers in the resource pool have already been started.
The apparatus 600 includes:
The receiving unit 602 is configured to receive a resource allocation request from a client, where the resource allocation request is used to request resource containers for an application running on the client and carries resource requirement information of the application.
In a specific implementation, the receiving unit 602 may be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. 2. More specifically, the processor 202 may execute the communication module in the memory unit 204, so that the communication interface 208 receives the resource allocation request from the client.
The allocating unit 604 is configured to select idle resource containers from the resource pool according to the resource requirement information of the application and allocate them to the client.
In a specific implementation, the allocating unit 604 may be implemented by the processor 202 and the memory unit 204 shown in FIG. 2. More specifically, the processor 202 may execute the resource allocation module in the memory unit 204 to allocate the node resources of the resource pool to the application controller.
As shown in FIG. 7, the apparatus 600 further includes a sending unit 606. Before the receiving unit 602 receives the resource allocation request from the client, the sending unit 606 is configured to send a first resource request to the resource manager according to the initial specification of the resource pool, where the first resource request carries a resource quantity determined according to the initial specification of the resource pool. The receiving unit 602 is further configured to receive a first resource allocation message for the resource pool sent by the resource manager, where the first resource allocation message contains information about the nodes on which the initial resource containers allocated by the resource manager to the resource pool are located. The sending unit 606 is further configured to send a startup request to the node manager, requesting the node manager to start the initial resource containers of the resource pool.
Specifically, the resource allocation request contains an identifier of the client. After the allocating unit 604 selects idle resource containers from the resource pool and allocates them to the client, the sending unit 606 is configured to send an indication message to each of the resource containers allocated to the client. The indication message carries the identifier of the client and indicates that the resource container has been allocated to the client.
As shown in FIG. 7, the apparatus 600 further includes a determining unit 608, configured to determine the number of idle resource containers remaining in the resource pool. If the number of remaining idle resource containers is less than a preset first threshold, the sending unit 606 is configured to send a second resource request to the resource manager; the receiving unit 602 is further configured to receive a second resource allocation message for the resource pool sent by the resource manager; and the sending unit 606 is further configured to send, according to the second resource allocation message, a startup request to the node manager, requesting the node manager to start the additional resource containers allocated to the resource pool.
If the number of remaining idle resource containers is greater than a preset second threshold, the sending unit 606 is configured to send a resource release message to at least one of the idle resource containers, where the resource release message is used to release the resources occupied by the at least one resource container.
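The two thresholds act as a low and a high water mark on the idle count. A minimal sketch of that decision follows; the function name and the returned action labels are hypothetical, and topping the pool back up to exactly the first threshold is an assumption, since the text does not state how many containers the second resource request asks for.

```python
def plan_pool_adjustment(idle_count, first_threshold, second_threshold):
    """Decide whether the pool should grow, shrink, or stay as it is.

    Returns an (action, container_count) pair, where action is one of
    "request", "release", or "hold" (illustrative labels).
    """
    if first_threshold > second_threshold:
        raise ValueError("first_threshold must not exceed second_threshold")
    if idle_count < first_threshold:
        # Too few idle containers: ask the resource manager for more.
        # Assumption: request just enough to reach the first threshold.
        return ("request", first_threshold - idle_count)
    if idle_count > second_threshold:
        # Too many idle containers: release the surplus.
        return ("release", idle_count - second_threshold)
    return ("hold", 0)
```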
Optionally, the allocating unit 604 is further configured to maintain state information of each resource container in the resource pool, where the state information indicates whether the corresponding resource container is idle. That the allocating unit 604 is configured to select idle resource containers from the resource pool and allocate them to the client includes: the allocating unit 604 is configured to select idle resource containers in the resource pool according to the state information of each resource container in the resource pool, and to allocate the selected idle resource containers to the client.
After selecting idle resource containers from the resource pool and allocating them to the client, the allocating unit 604 is further configured to set the state of each resource container allocated to the client to not idle.
After the allocating unit 604 sets the state of each resource container allocated to the client to not idle, the receiving unit 602 is further configured to receive a status update message from each resource container allocated to the client, where the status update message indicates that the task assigned for the client has been completed. The allocating unit 604 is further configured to set the state of each resource container allocated to the client to idle according to the status update message.
Optionally, the resource allocation request further contains user permission information. The receiving unit 602 is further configured to verify the user permission information against a preset user permission library, where the user permission library contains user permission information.
This embodiment of the present invention is the apparatus embodiment of the application controller in the embodiment of FIG. 3. The feature descriptions in the embodiment of FIG. 3 apply to this embodiment and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules is only a division by logical function, and other divisions are possible in an actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module, each module may exist physically on its own, or two or more modules may be integrated into one module. The integrated module may be implemented in hardware, or in hardware combined with software functional modules.
An integrated module implemented in the form of software functional modules may be stored in a computer-readable storage medium. The software functional modules are stored in a storage medium and include a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of protection of the technical solutions of the embodiments of the present invention.

Claims (29)

  1. A distributed system, wherein the distributed system comprises a resource manager and a node manager;
    the resource manager is configured to: start an application controller when a trigger condition is met; configure an initial specification of a resource pool managed by the application controller; receive a first resource request for the resource pool sent by the application controller according to the initial specification of the resource pool; allocate initial resource containers to the resource pool according to the first resource request; and send a first resource allocation message for the resource pool to the application controller;
    the application controller is configured to: obtain the first resource allocation message for the resource pool sent by the resource manager; and request, according to information indicated in the first resource allocation message about the nodes on which the initial resource containers allocated to the resource pool are located, the node manager to start the initial resource containers of the resource pool; and
    the node manager is configured to start the initial resource containers according to the request of the application controller.
  2. The system according to claim 1, wherein that the resource manager is configured to start the application controller when the trigger condition is met comprises: the resource manager starts the application controller upon receiving a request to start the application controller in advance or a request to configure the resource pool in advance.
  3. The system according to claim 1 or 2, wherein that the resource manager is configured to configure the initial specification of the resource pool managed by the application controller comprises: the resource manager is configured to configure the initial specification of the resource pool managed by the application controller according to preset expected resource requirement information of the application, or according to collected usage information of node resources of the distributed system.
  4. The system according to any one of claims 1 to 3, wherein the application controller is further configured to: receive a resource allocation request from a client, wherein the resource allocation request is used to request resource containers for an application running on the client; and select, according to resource requirement information of the application in the resource allocation request, idle resource containers from the resource pool and allocate them to the client.
  5. The system according to claim 4, wherein the resource allocation request contains an identifier of the client, and after selecting idle resource containers from the resource pool and allocating them to the client, the application controller is further configured to: send an indication message to each of the resource containers allocated to the client, wherein the indication message carries the identifier of the client.
  6. The system according to any one of claims 1 to 5, wherein the application controller is further configured to: determine the number of idle resource containers remaining in the resource pool; if the number of remaining idle resource containers is less than a preset first threshold, send a second resource request to the resource manager; receive a second resource allocation message for the resource pool sent by the resource manager; and send, according to the second resource allocation message for the resource pool, a startup request to the node manager, requesting the node manager to start the additional resource containers allocated to the resource pool.
  7. The system according to any one of claims 1 to 6, wherein the application controller is further configured to: determine the number of idle resource containers remaining in the resource pool; and if the number of remaining idle resource containers is greater than a preset second threshold, send a resource release message to at least one of the idle resource containers.
  8. A method for allocating resource containers in a distributed system, wherein the distributed system comprises a resource manager and a node manager, the resource manager is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and the resource containers are configured to execute tasks of an application;
    the method comprises:
    starting, by the resource manager, an application controller when a trigger condition is met, and configuring an initial specification of a resource pool managed by the application controller;
    receiving, by the resource manager, a first resource request for the resource pool sent by the application controller according to the initial specification of the resource pool;
    allocating, by the resource manager, initial resource containers to the resource pool according to the first resource request; and
    sending, by the resource manager, a first resource allocation message for the resource pool to the application controller, wherein the first resource allocation message contains information about the nodes on which the initial resource containers allocated by the resource manager to the resource pool are located.
  9. The method according to claim 8, wherein the starting, by the resource manager, of the application controller when the trigger condition is met comprises:
    starting, by the resource manager, the application controller upon receiving a request to start the application controller in advance or a request to configure the resource pool in advance.
  10. The method according to claim 8 or 9, wherein the configuring, by the resource manager, of the initial specification of the resource pool managed by the application controller comprises:
    configuring, by the resource manager, the initial specification of the resource pool managed by the application controller according to preset expected resource requirement information of the application; or
    configuring, by the resource manager, the initial specification of the resource pool managed by the application controller according to collected usage information of the node resources of the distributed system.
  11. A method for allocating resource containers in a distributed system, wherein the distributed system comprises a resource manager and a node manager, the resource manager is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, the resource manager starts an application controller when a trigger condition is met, configures an initial specification of a resource pool managed by the application controller, and allocates initial resource containers to the resource pool according to the initial specification of the resource pool, and the initial resource containers have been started;
    the method comprises:
    receiving, by the application controller, a resource allocation request from a client, wherein the resource allocation request is used to request resource containers for an application running on the client and carries resource requirement information of the application; and
    所述应用控制器根据所述应用的资源需求信息,从所述资源池中选择空闲的资源容器分配给所述客户端。The application controller selects an idle resource container from the resource pool to allocate to the client according to the resource requirement information of the application.
  12. The method according to claim 11, wherein before the application controller receives the resource allocation request from the client, the method further comprises:
    sending, by the application controller, a first resource request to the resource manager according to the initial specification of the resource pool;
    obtaining, by the application controller, a first resource allocation message of the resource pool sent by the resource manager, wherein the first resource allocation message contains information about the nodes on which the initial resource containers allocated by the resource manager to the resource pool are located;
    sending, by the application controller, a start request to the node manager, requesting the node manager to start the initial resource containers of the resource pool.
  13. The method according to claim 11 or 12, wherein the resource allocation request contains an identifier of the client, and after the application controller selects the idle resource containers from the resource pool and allocates them to the client, the method further comprises:
    sending, by the application controller, an indication message to each of the resource containers allocated to the client, wherein the indication message carries the identifier of the client.
  14. The method according to any one of claims 11 to 13, wherein the method further comprises:
    determining, by the application controller, the number of idle resource containers remaining in the resource pool, and if the number of remaining idle resource containers is less than a preset first threshold, sending a second resource request to the resource manager;
    receiving, by the application controller, a second resource allocation message of the resource pool sent by the resource manager;
    sending, by the application controller, a start request to the node manager according to the second resource allocation message of the resource pool, requesting the node manager to start the newly added resource containers allocated to the resource pool.
  15. The method according to any one of claims 11 to 14, wherein the method further comprises:
    determining, by the application controller, the number of idle resource containers remaining in the resource pool, and if the number of remaining idle resource containers is greater than a preset second threshold, sending a resource release message to at least one of the idle resource containers.
  16. The method according to any one of claims 11 to 15, wherein the application controller maintains state information of each resource container in the resource pool, and the state information indicates whether the corresponding resource container is idle;
    the selecting, by the application controller, of idle resource containers from the resource pool and allocating them to the client comprises:
    selecting, by the application controller, idle resource containers in the resource pool according to the state information of each resource container in the resource pool, and allocating the selected idle resource containers to the client.
  17. An apparatus for allocating resource containers in a distributed system, wherein the distributed system comprises the apparatus and a node manager, the apparatus is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and the resource containers are used to execute tasks of an application,
    the apparatus comprises:
    a starting unit, configured to start an application controller when a trigger condition is met, and to configure an initial specification of a resource pool managed by the application controller;
    a receiving unit, configured to receive a first resource request of the resource pool that is sent by the application controller according to the initial specification of the resource pool;
    an allocating unit, configured to allocate initial resource containers to the resource pool according to the first resource request;
    a sending unit, configured to send a first resource allocation message of the resource pool to the application controller, wherein the first resource allocation message contains information about the nodes on which the initial resource containers allocated by the allocating unit to the resource pool are located.
  18. The apparatus according to claim 17, wherein the starting unit being configured to start the application controller when the trigger condition is met comprises:
    the starting unit is configured to start the application controller when the receiving unit receives a request to start the application controller in advance, or a request to configure the resource pool in advance.
  19. The apparatus according to claim 17 or 18, wherein the starting unit being configured to configure the initial specification of the resource pool managed by the application controller comprises:
    the starting unit is configured to configure the initial specification of the resource pool managed by the application controller according to preset expected resource requirement information of the application; or
    the starting unit is configured to configure the initial specification of the resource pool managed by the application controller according to collected usage information of the node resources of the distributed system.
  20. An apparatus for allocating resource containers in a distributed system, wherein the distributed system comprises a resource manager and a node manager, the resource manager is configured to manage node resources of the distributed system, the node manager is configured to start resource containers based on the node resources, and the resource manager, when a trigger condition is met, starts the apparatus, configures an initial specification of a resource pool managed by the apparatus, and allocates initial resource containers to the resource pool according to the initial specification of the resource pool, the initial resource containers having been started;
    the apparatus comprises:
    a receiving unit, configured to receive a resource allocation request from a client, wherein the resource allocation request is used to request resource containers for an application running on the client, and the resource allocation request carries resource requirement information of the application;
    an allocating unit, configured to select, according to the resource requirement information of the application, idle resource containers from the resource pool and allocate them to the client.
  21. The apparatus according to claim 20, wherein the apparatus further comprises a sending unit, and before the receiving unit receives the resource allocation request from the client, the sending unit is configured to send a first resource request to the resource manager according to the initial specification of the resource pool;
    the receiving unit is further configured to receive a first resource allocation message of the resource pool sent by the resource manager, wherein the first resource allocation message contains information about the nodes on which the initial resource containers allocated by the resource manager to the resource pool are located;
    the sending unit is further configured to send a start request to the node manager, requesting the node manager to start the initial resource containers of the resource pool.
  22. The apparatus according to claim 20 or 21, wherein the apparatus further comprises a sending unit, the resource allocation request contains an identifier of the client, and after the allocating unit selects the idle resource containers from the resource pool and allocates them to the client, the sending unit is configured to send an indication message to each of the resource containers allocated to the client, wherein the indication message carries the identifier of the client.
  23. The apparatus according to any one of claims 20 to 22, wherein the apparatus further comprises a determining unit and a sending unit, the determining unit is configured to determine the number of idle resource containers remaining in the resource pool, and if the number of remaining idle resource containers is less than a preset first threshold, the sending unit is configured to send a second resource request to the resource manager;
    the receiving unit is further configured to receive a second resource allocation message of the resource pool sent by the resource manager;
    the sending unit is further configured to send a start request to the node manager according to the second resource allocation message of the resource pool, requesting the node manager to start the newly added resource containers allocated to the resource pool.
  24. The apparatus according to any one of claims 20 to 23, wherein the apparatus further comprises a determining unit and a sending unit, the determining unit is configured to determine the number of idle resource containers remaining in the resource pool, and if the number of remaining idle resource containers is greater than a preset second threshold, the sending unit is configured to send a resource release message to at least one of the idle resource containers.
  25. The apparatus according to any one of claims 20 to 24, wherein the allocating unit is further configured to maintain state information of each resource container in the resource pool, and the state information indicates whether the corresponding resource container is idle;
    the allocating unit being configured to select idle resource containers from the resource pool and allocate them to the client comprises: the allocating unit is configured to select idle resource containers in the resource pool according to the state information of each resource container in the resource pool, and to allocate the selected idle resource containers to the client.
  26. A computer-readable medium, comprising computer-executable instructions, wherein when a processor of a computer executes the computer-executable instructions, the computer performs the method according to any one of claims 8 to 10.
  27. A computing device, comprising: a processor, a memory, a bus, and a communication interface;
    the memory is configured to store executable instructions, the processor is connected to the memory through the bus, and when the computing device runs, the processor executes the executable instructions stored in the memory, so that the apparatus performs the method according to any one of claims 8 to 10.
  28. A computer-readable medium, comprising computer-executable instructions, wherein when a processor of a computer executes the computer-executable instructions, the computer performs the method according to any one of claims 11 to 16.
  29. A computing device, comprising: a processor, a memory, a bus, and a communication interface;
    the memory is configured to store executable instructions, the processor is connected to the memory through the bus, and when the computing device runs, the processor executes the executable instructions stored in the memory, so that the apparatus performs the method according to any one of claims 11 to 16.
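
The application-controller behavior recited in claims 11 and 13 to 16 can be illustrated with a minimal sketch. It is not the claimed implementation: the class and method names (`Container`, `ApplicationController`, `allocate`, `containers_to_request`, `release_surplus`), the batch size, and the policy of releasing every idle container above the second threshold are assumptions made only for illustration. The claims themselves require only that at least one idle container be released, and the messages exchanged with the resource manager and node manager are reduced here to return values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Container:
    """A pre-started resource container held in the pool (hypothetical model)."""
    container_id: str
    node: str
    idle: bool = True                 # state information of claim 16
    client_id: Optional[str] = None   # set when the container is handed to a client


@dataclass
class ApplicationController:
    """Toy bookkeeping for the resource pool; names are illustrative assumptions."""
    first_threshold: int    # request more containers when idle count falls below this
    second_threshold: int   # release containers when idle count rises above this
    pool: Dict[str, Container] = field(default_factory=dict)

    def add_started(self, containers: List[Container]) -> None:
        # Containers that the node manager has already started join the pool.
        for c in containers:
            self.pool[c.container_id] = c

    def idle_containers(self) -> List[Container]:
        return [c for c in self.pool.values() if c.idle]

    def allocate(self, client_id: str, count: int) -> List[Container]:
        # Claim 11 and 16: pick idle containers using the per-container state.
        idle = self.idle_containers()
        if len(idle) < count:
            raise RuntimeError("not enough idle containers in the pool")
        chosen = idle[:count]
        for c in chosen:
            c.idle = False
            c.client_id = client_id   # stands in for the indication message of claim 13
        return chosen

    def containers_to_request(self, batch: int) -> int:
        # Claim 14: below the first threshold, ask the resource manager for more.
        return batch if len(self.idle_containers()) < self.first_threshold else 0

    def release_surplus(self) -> List[str]:
        # Claim 15: above the second threshold, release idle containers.
        # Assumed policy: release everything beyond the threshold.
        idle = self.idle_containers()
        surplus = idle[self.second_threshold:] if len(idle) > self.second_threshold else []
        for c in surplus:
            del self.pool[c.container_id]
        return [c.container_id for c in surplus]


# Usage: six pre-started containers, thresholds of 2 (first) and 4 (second).
ac = ApplicationController(first_threshold=2, second_threshold=4)
ac.add_started([Container(f"c{i}", node=f"node{i % 2}") for i in range(6)])
released = ac.release_surplus()          # six idle containers exceed the second threshold
granted = ac.allocate("client-A", 3)     # served from the pool, no container start-up
to_request = ac.containers_to_request(batch=4)  # one idle container is below the first threshold
```

Because the containers are started before any client request arrives, `allocate` only updates state information; the container start-up cost is paid when the pool is filled or refilled, not on the request path.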
PCT/CN2015/099258 2015-12-28 2015-12-28 Resource allocation method, device, and system WO2017113074A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2015/099258 WO2017113074A1 (en) 2015-12-28 2015-12-28 Resource allocation method, device, and system
CN201580084802.8A CN108293041B (en) 2015-12-28 2015-12-28 Distributed system, resource container allocation method, resource manager and application controller

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/099258 WO2017113074A1 (en) 2015-12-28 2015-12-28 Resource allocation method, device, and system

Publications (1)

Publication Number Publication Date
WO2017113074A1 true WO2017113074A1 (en) 2017-07-06

Family

ID=59224029

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/099258 WO2017113074A1 (en) 2015-12-28 2015-12-28 Resource allocation method, device, and system

Country Status (2)

Country Link
CN (1) CN108293041B (en)
WO (1) WO2017113074A1 (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109649A (en) * 2018-02-01 2019-08-09 中国电信股份有限公司 For container control method, device and the containment system of Web service
CN111274022A (en) * 2018-12-05 2020-06-12 北京华胜天成科技股份有限公司 Server resource allocation method and system
CN111427675A (en) * 2020-03-20 2020-07-17 腾讯科技(深圳)有限公司 Data processing method and device and computer readable storage medium
CN111949407A (en) * 2020-08-13 2020-11-17 北京字节跳动网络技术有限公司 Resource allocation method and device
CN112015542A (en) * 2019-05-29 2020-12-01 潘仲光 Resource collection method, device and storage medium
CN112052084A (en) * 2019-06-05 2020-12-08 杭州海康威视数字技术股份有限公司 Resource allocation method and computer equipment
CN112306640A (en) * 2020-11-12 2021-02-02 广州方硅信息技术有限公司 Container dispensing method, apparatus, device and medium therefor
CN112653571A (en) * 2020-08-20 2021-04-13 国家电网公司华中分部 Hybrid scheduling method based on virtual machine and container
CN113419839A (en) * 2021-07-20 2021-09-21 北京字节跳动网络技术有限公司 Resource scheduling method and device for multi-type jobs, electronic equipment and storage medium
CN113507441A (en) * 2021-06-08 2021-10-15 中国联合网络通信集团有限公司 Security resource expansion method, security protection management platform and data node
CN115454391A (en) * 2022-11-11 2022-12-09 零氪科技(北京)有限公司 Client, client construction method and device, electronic equipment and storage medium
CN117873738A (en) * 2024-03-12 2024-04-12 苏州元脑智能科技有限公司 Resource allocation method, device, electronic equipment and storage medium
CN114840125B (en) * 2022-03-30 2024-04-26 曙光信息产业(北京)有限公司 Device resource allocation and management method, device resource allocation and management device, device resource allocation and management medium, and program product

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112835996A (en) * 2019-11-22 2021-05-25 北京初速度科技有限公司 Map production system and method thereof
CN111429091A (en) * 2020-03-19 2020-07-17 北京字节跳动网络技术有限公司 Resource allocation method and device, electronic equipment and storage medium
CN111694649B (en) * 2020-06-12 2023-07-18 北京火山引擎科技有限公司 Resource scheduling method, device, computer equipment and storage medium
CN112083932B (en) * 2020-08-18 2022-02-25 上海交通大学 Function preheating system and method on virtual network equipment
CN113391906B (en) * 2021-06-25 2024-03-01 北京字节跳动网络技术有限公司 Job updating method, job updating device, computer equipment and resource management system
CN113395291B (en) * 2021-06-30 2023-03-17 北京爱奇艺科技有限公司 Flow control method and device, electronic equipment and storage medium
CN115827255B (en) * 2023-02-16 2023-04-21 中国电力科学研究院有限公司 Application resource self-adaptive allocation management method and system for concentrator
CN116389172B (en) * 2023-06-05 2023-09-19 国网四川省电力公司信息通信公司 Multi-tenant-based container cloud platform resource security management method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731595A (en) * 2015-03-26 2015-06-24 江苏物联网研究发展中心 Big-data-analysis-oriented mixing computing system
CN104765870A (en) * 2015-04-26 2015-07-08 成都创行信息科技有限公司 Delay scheduling method related to network data
CN104780146A (en) * 2014-01-13 2015-07-15 华为技术有限公司 Resource manage method and device
CN105045656A (en) * 2015-06-30 2015-11-11 深圳清华大学研究院 Virtual container based big data storage and management method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8706798B1 (en) * 2013-06-28 2014-04-22 Pepperdata, Inc. Systems, methods, and devices for dynamic resource monitoring and allocation in a cluster system

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109649A (en) * 2018-02-01 2019-08-09 中国电信股份有限公司 For container control method, device and the containment system of Web service
CN110109649B (en) * 2018-02-01 2023-08-08 中国电信股份有限公司 Container control method, device and container system for Web service
CN111274022A (en) * 2018-12-05 2020-06-12 北京华胜天成科技股份有限公司 Server resource allocation method and system
CN111274022B (en) * 2018-12-05 2024-05-14 北京华胜天成科技股份有限公司 Server resource allocation method and system
CN112015542A (en) * 2019-05-29 2020-12-01 潘仲光 Resource collection method, device and storage medium
CN112052084A (en) * 2019-06-05 2020-12-08 杭州海康威视数字技术股份有限公司 Resource allocation method and computer equipment
CN111427675B (en) * 2020-03-20 2023-03-14 腾讯科技(深圳)有限公司 Data processing method and device and computer readable storage medium
CN111427675A (en) * 2020-03-20 2020-07-17 腾讯科技(深圳)有限公司 Data processing method and device and computer readable storage medium
CN111949407B (en) * 2020-08-13 2024-04-12 抖音视界有限公司 Resource allocation method and device
CN111949407A (en) * 2020-08-13 2020-11-17 北京字节跳动网络技术有限公司 Resource allocation method and device
CN112653571A (en) * 2020-08-20 2021-04-13 国家电网公司华中分部 Hybrid scheduling method based on virtual machine and container
CN112653571B (en) * 2020-08-20 2024-03-22 国家电网公司华中分部 Mixed scheduling method based on virtual machine and container
CN112306640A (en) * 2020-11-12 2021-02-02 广州方硅信息技术有限公司 Container dispensing method, apparatus, device and medium therefor
CN113507441A (en) * 2021-06-08 2021-10-15 中国联合网络通信集团有限公司 Security resource expansion method, security protection management platform and data node
CN113419839A (en) * 2021-07-20 2021-09-21 北京字节跳动网络技术有限公司 Resource scheduling method and device for multi-type jobs, electronic equipment and storage medium
CN114840125B (en) * 2022-03-30 2024-04-26 曙光信息产业(北京)有限公司 Device resource allocation and management method, device resource allocation and management device, device resource allocation and management medium, and program product
CN115454391A (en) * 2022-11-11 2022-12-09 零氪科技(北京)有限公司 Client, client construction method and device, electronic equipment and storage medium
CN115454391B (en) * 2022-11-11 2023-06-16 零氪科技(北京)有限公司 Client, client construction method, device, electronic equipment and storage medium
CN117873738A (en) * 2024-03-12 2024-04-12 苏州元脑智能科技有限公司 Resource allocation method, device, electronic equipment and storage medium
CN117873738B (en) * 2024-03-12 2024-05-24 苏州元脑智能科技有限公司 Resource allocation method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN108293041A (en) 2018-07-17
CN108293041B (en) 2020-10-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15911705

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15911705

Country of ref document: EP

Kind code of ref document: A1