CN107436806A - Resource scheduling method and system - Google Patents

Resource scheduling method and system

Info

Publication number
CN107436806A
Authority
CN
China
Prior art keywords
resource
module
subtask
task
state machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610367792.5A
Other languages
Chinese (zh)
Inventor
赵光亚
章佳磊
李振辉
刘大健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suning Commerce Group Co Ltd
Original Assignee
Suning Commerce Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Commerce Group Co Ltd
Priority to CN201610367792.5A
Publication of CN107436806A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G06F9/5022 Mechanisms to release resources
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the invention disclose a resource scheduling method and system, relating to the field of electronic information technology, that can reduce system latency and improve the overall computing performance and execution efficiency of the system. The method includes: according to a task submitted by a client service module, creating, in a node management module, an application management module for managing the task and an application management runtime container; generating subtasks from the task by the application management module; allocating computing resources for the generated subtasks and creating threads according to the allocated computing resources, where each subtask corresponds to one thread; and running each subtask on its corresponding thread. The invention is suited to efficient resource scheduling in computing scenarios with massive numbers of lightweight tasks.

Description

Resource scheduling method and system
Technical field
The present invention relates to the field of electronic information technology, and in particular to a resource scheduling method and system.
Background
With the development of big data and related industries, distributed and cloud computing technologies have been developed and refined to meet ever-growing computing demands. Distributed computing frameworks such as MapReduce and Storm are the main tools. Although these frameworks differ in architecture and algorithms, each must run on top of a corresponding distributed resource scheduling framework.
Current distributed resource scheduling frameworks, such as Yarn and Mesos, use resources such as CPU and memory by creating child processes. This satisfies a single task's demand for large amounts of computing resources and suits the big data scenarios in which each task is expensive and long-running.
However, in scenarios with high concurrency where each task is cheap and short-lived, the actual running time of each process is very short, yet the high concurrency forces processes to be created frequently. This increases system latency and lowers the execution efficiency of the whole system.
Summary of the invention
Embodiments of the invention provide a resource scheduling method and system that can reduce system latency and improve the overall computing performance and execution efficiency of the system.
To achieve the above objective, embodiments of the invention adopt the following technical solutions:
In a first aspect, embodiments of the invention provide a resource scheduling method, including: according to a task submitted by the client service module (Client), creating, in a node management module (NodeManager), an application management module (AppMaster) for managing the task and an application management runtime container (AppMaster Container); generating subtasks from the task by the application management module (AppMaster); allocating computing resources for the generated subtasks and creating threads according to the allocated computing resources, where each subtask corresponds to one thread; and running each subtask on its corresponding thread.
With reference to the first aspect, in a first possible implementation of the first aspect, allocating computing resources for the generated subtasks and creating threads according to the allocated computing resources includes: the resource scheduling module (ResourceManager) allocates computing resources to the application management module (AppMaster) according to the generated subtasks, and threads are created according to the allocated computing resources.
With reference to the first possible implementation of the first aspect, in a second possible implementation, an application instance state machine, a node instance state machine and a resource container instance state machine are provided in the resource scheduling module (ResourceManager). The application instance state machine corresponds to an application instance, the node instance state machine corresponds to a node instance, and the resource container instance state machine corresponds to a resource container instance. Each state machine changes its state according to the events it receives.
With reference to the first aspect or the first possible implementation of the first aspect, in a third possible implementation, running each subtask on its corresponding thread includes: starting the application management runtime container in the node management module associated with the task, and running the task execution modules (AppWorker) in each node management module (NodeManager) to execute the subtasks, where each task execution module corresponds to one subtask and runs on the thread corresponding to that subtask.
With reference to the second possible implementation of the first aspect, in a fourth possible implementation, the method further includes: the resource scheduling module (ResourceManager) stores the task submitted by the client service module (Client) in a distributed column-oriented database (Hbase). The high-availability (HA) cluster hosting the resource scheduling module (ResourceManager) relies on a Zookeeper-based distributed lock service, exchanges data through a leader election interface, and wraps the low-level Zookeeper application programming interface (API) with Apache Curator.
In a second aspect, embodiments of the invention provide a resource scheduling system, including: a resource scheduling module (ResourceManager), configured to create, according to a task submitted by the client service module (Client), an application management module (AppMaster) for managing the task and an application management runtime container (AppMaster Container) in a node management module (NodeManager); and the application management module (AppMaster), configured to generate subtasks from the task. The resource scheduling module (ResourceManager) is further configured to allocate computing resources for the generated subtasks and to create threads according to the allocated computing resources, where each subtask corresponds to one thread, so that each subtask runs on its corresponding thread.
With reference to the second aspect, in a first possible implementation of the second aspect, the resource scheduling module (ResourceManager) is specifically configured to allocate computing resources to the application management module (AppMaster) according to the generated subtasks, with threads created according to the allocated computing resources.
With reference to the first possible implementation of the second aspect, in a second possible implementation, an application instance state machine, a node instance state machine and a resource container instance state machine are provided in the resource scheduling module (ResourceManager). The application instance state machine corresponds to an application instance, the node instance state machine corresponds to a node instance, and the resource container instance state machine corresponds to a resource container instance. Each state machine changes its state according to the events it receives.
With reference to the second aspect or the first possible implementation of the second aspect, in a third possible implementation, the application management module is further configured to give notice to start the application management runtime container in the node management module (NodeManager) associated with the task. The node management module (NodeManager) is configured to run task execution modules (AppWorker) to execute the subtasks, where each task execution module corresponds to one subtask and runs on the thread corresponding to that subtask.
With reference to the second possible implementation of the second aspect, in a fourth possible implementation, the resource scheduling module (ResourceManager) is further configured to store the task submitted by the client service module (Client) in a distributed column-oriented database (Hbase). The high-availability (HA) cluster hosting the resource scheduling module (ResourceManager) relies on a Zookeeper-based distributed lock service, exchanges data through a leader election interface, and wraps the low-level Zookeeper application programming interface (API) with Apache Curator.
The resource scheduling method and system provided by the embodiments of the invention use a distributed, hierarchical resource scheduling strategy with the thread as the resource unit, to provide efficient resource scheduling in computing scenarios with massive numbers of lightweight tasks. Traditional schemes schedule CPU and memory resources by dedicating a process to each task. This embodiment instead uses the thread as the resource scheduling unit to meet the computing demands of massive lightweight tasks, which greatly reduces the overhead consumed by the system itself. This effectively improves the computing efficiency of each node, reduces system latency, and improves the overall computing performance and execution efficiency of the system.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the system architecture provided by an embodiment of the invention;
Fig. 2 is a flow chart of the resource scheduling method provided by an embodiment of the invention;
Fig. 3 is a schematic diagram of the interaction flow provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of the RMI interface scheme provided by an embodiment of the invention.
Detailed description
To help those skilled in the art better understand the technical solutions of the invention, the invention is described in further detail below with reference to the drawings and specific embodiments. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it. Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural. It should be further understood that the word "comprising" used in this specification indicates the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. When an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or coupling. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items. Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the invention belongs. It should also be understood that terms such as those defined in general dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the prior art and, unless specifically defined herein, are not to be interpreted in an idealized or overly formal sense.
An embodiment of the invention provides a resource scheduling system, as shown in Fig. 1, including:
A resource scheduling module (ResourceManager), configured to create, according to a task submitted by the client service module (Client), an application management module (AppMaster) for managing the task and an application management runtime container (AppMaster Container) in a node management module (NodeManager). The resource scheduling module (ResourceManager) is further configured to allocate computing resources for the generated subtasks and to create threads according to the allocated computing resources, where each subtask corresponds to one thread, so that each subtask runs on its corresponding thread.
The application management module (AppMaster), configured to generate subtasks from the task.
Specifically, the resource scheduling module (ResourceManager) allocates computing resources to the application management module (AppMaster) according to the generated subtasks, and threads are created according to the allocated computing resources.
An application instance state machine (RMApp state machine for short), a node instance state machine (RMNode state machine for short) and a resource container instance state machine (RMContainer state machine for short) are provided in the resource scheduling module (ResourceManager). The application instance state machine corresponds to an application instance, the node instance state machine corresponds to a node instance, and the resource container instance state machine corresponds to a resource container instance. Each state machine changes its state according to the events it receives.
Further, the application management module is also configured to give notice to start the AppMaster Container in the node management module (NodeManager) associated with the task.
The node management module (NodeManager) is configured to run AppWorkers to execute the subtasks, where each task execution module corresponds to one subtask and runs on the thread corresponding to that subtask. In this embodiment, the resource scheduling module (ResourceManager) is further configured to store the task submitted by the client service module (Client) in Hbase (a distributed, column-oriented open-source database). The high-availability (HA) cluster hosting the resource scheduling module (ResourceManager) relies on the distributed lock service of Zookeeper (a distributed, open-source coordination service for distributed applications), exchanges data through a leader election interface, and wraps the low-level Zookeeper application programming interface (API) with Apache Curator (a high-level API framework built on top of Zookeeper).
This embodiment uses a distributed, hierarchical resource scheduling strategy with the thread as the resource unit, to provide efficient resource scheduling in computing scenarios with massive numbers of lightweight tasks. Traditional schemes schedule CPU and memory resources by dedicating a process to each task. This embodiment instead uses the thread as the resource scheduling unit to meet the computing demands of massive lightweight tasks, which greatly reduces the overhead consumed by the system itself. This effectively improves the computing efficiency of each node, reduces system latency, and improves the overall computing performance and execution efficiency of the system.
The method flow provided by this embodiment can be used in a system with the architecture shown in Fig. 1, which includes: RM (ResourceManager, the resource scheduling module, also called the core-node resource scheduling module), Client (the client service module), NM (NodeManager, the node management module), AM (AppMaster, the application management module) and AW (AppWorker, the task execution module). In this embodiment, computing resources include hardware resources such as CPU and memory. The modules have the following roles:
RM: handles the task instances (called "tasks" in this embodiment) sent by Client; starts and monitors AMs; monitors NMs and handles changes to, requests for and releases of computing resources; and manages and allocates computing resources at the AM's request.
NM: starts and manages all threads and Containers (computing resource containers) on its Node (node); monitors the applications running on that node; and monitors the node's state and reports it to RM periodically.
AM: splits the single application instance submitted by Client into subtasks and schedules them; starts and monitors AWs; and requests thread computing resources from RM and releases them. The logic for splitting and scheduling subtasks can be implemented by the user according to the required computation model.
AW: executes subtasks. The computation model run on an AW can be set by the user according to the concrete application scenario.
Client: submits task instances to RM and monitors the state of each node in the system.
An embodiment of the invention provides a resource scheduling method, as shown in Fig. 2, including:
S1. According to a task submitted by the client service module, create, in a node management module, an application management module for managing the task and an application management runtime container.
RM may create a new AM for the task or assign an existing AM to it. In this embodiment the thread is the resource scheduling unit, and RM performs global resource scheduling.
S2. Generate subtasks from the task by the application management module.
Each subtask corresponds to one thread. In this embodiment, the AM created for each application instance performs resource and task scheduling for that single application instance. AppMaster exposes an interface for splitting subtasks, so that users can implement a specific splitting scheme according to business needs and cluster size; each AppWorker then executes its assigned task on the thread resources allocated to it. For example, in a cloud crawler system, suppose an application contains 1000 links to crawl and the cluster has 100 worker nodes in total. To maximize parallelism, the subtask-splitting interface in AppMaster can split the application into 100 subtasks, one per worker node, each containing 10 links to crawl.
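The crawler example above can be sketched as a small splitting function. This is only an illustration of one possible splitting scheme, not the interface defined by the patent: the function name `split_into_subtasks` and the example URLs are hypothetical, and the patent leaves the actual splitting logic to the user.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_subtasks(items: Sequence[T], worker_count: int) -> List[List[T]]:
    """Split items into at most worker_count contiguous, near-equal chunks."""
    if worker_count <= 0:
        raise ValueError("worker_count must be positive")
    base, extra = divmod(len(items), worker_count)
    chunks: List[List[T]] = []
    start = 0
    for i in range(worker_count):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer items than workers: stop once everything is assigned
        chunks.append(list(items[start:start + size]))
        start += size
    return chunks


# 1000 links to crawl, 100 worker nodes (hypothetical URLs).
links = [f"https://example.com/page/{n}" for n in range(1000)]
subtasks = split_into_subtasks(links, worker_count=100)
```

With these inputs each of the 100 subtasks holds 10 links, matching the example in the text. Other schemes (by host, by estimated cost) would fit the same interface.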
S3. Allocate computing resources for the generated subtasks, and create threads according to the allocated computing resources.
An AppWorker is assigned to each subtask, and the AW executes its subtask on the allocated thread resources. A thread created from the allocated computing resources can be regarded as a thread resource.
S4. Run each subtask on its corresponding thread.
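Steps S3 and S4 can be illustrated on a single machine with one thread per subtask. The sketch below is an assumption-laden simplification: it ignores the distributed allocation by RM, and the names `run_subtasks_on_threads` and `work` are hypothetical.

```python
import threading
from typing import Callable, Dict, List


def run_subtasks_on_threads(subtasks: List[List[str]],
                            work: Callable[[List[str]], int]) -> Dict[int, int]:
    """Run every subtask on its own thread and collect results by subtask index."""
    results: Dict[int, int] = {}
    lock = threading.Lock()

    def worker(index: int, payload: List[str]) -> None:
        value = work(payload)
        with lock:  # protect the shared result dictionary
            results[index] = value

    # One thread per subtask, mirroring the one-to-one mapping in the text.
    threads = [
        threading.Thread(target=worker, args=(i, s), name=f"AppWorker-{i}")
        for i, s in enumerate(subtasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Each "subtask" is a list of items; the stand-in work just counts them.
results = run_subtasks_on_threads([["a", "b"], ["c"], ["d", "e", "f"]], work=len)
```

Creating a thread is far cheaper than creating a process, which is the overhead argument the embodiment makes for short-lived tasks.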
This embodiment provides a distributed, hierarchical resource scheduling strategy and system with the thread as the resource unit, to provide efficient resource scheduling in computing scenarios with massive numbers of lightweight tasks. Traditional schemes schedule CPU and memory resources by dedicating a process to each task. This embodiment instead uses the thread as the resource scheduling unit to meet the computing demands of massive lightweight tasks, which greatly reduces the overhead consumed by the system itself. This effectively improves the computing efficiency of each node, reduces system latency, and improves the overall computing performance and execution efficiency of the system.
In this embodiment, allocating computing resources for the generated subtasks and creating threads according to the allocated computing resources includes: the resource scheduling module allocates computing resources to the application management module according to the generated subtasks, and threads are created according to the allocated computing resources. Running each subtask on its corresponding thread includes: starting the AppMaster Container (application management runtime container) in the node management module associated with the task, and running the AppWorkers in each NodeManager (node management module) to execute the subtasks, where each task execution module corresponds to one subtask and runs on the thread corresponding to that subtask.
Specifically, when the system with the architecture shown in Fig. 1 executes the method flow shown in Fig. 2, the main resource scheduling process is as shown in Fig. 3 and includes:
1. Client submits a task to RM;
2. For the task submitted by Client, RM creates or assigns an AM;
3. RM notifies NM to start the AppMaster Container;
4. AM decomposes the task into the multiple subtasks of the single application instance and schedules each subtask, dispatching each subtask to its corresponding AW;
5. AM requests from RM the computing resources needed to run the AWs;
6. RM allocates computing resources at the AM's request;
7. RM returns the allocated computing resources to AM;
8. AM notifies NM to start Containers to run the AWs;
9. Each AW executes its corresponding subtask;
10. NM reports the execution result of each subtask to AM;
11. AM notifies RM to release the allocated computing resources;
12. Meanwhile, AM monitors until all subtasks have finished;
13. NM informs RM that the AM has ended;
14. RM informs Client that the task has ended.
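The allocate and release steps of the flow above (steps 5 to 7 and step 11) amount to bookkeeping of thread slots per node. The following is a minimal sketch under stated assumptions: the class `ThreadSlotAllocator`, the node names and the "most free slots first" placement rule are all hypothetical, since the patent does not specify how RM chooses nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ThreadSlotAllocator:
    """Tracks free thread slots per node; a hypothetical stand-in for RM bookkeeping."""
    free_slots: Dict[str, int]
    granted: Dict[str, List[str]] = field(default_factory=dict)

    def allocate(self, app_id: str, count: int) -> List[str]:
        """Grant `count` thread slots to an application, or raise if the cluster is full."""
        if count > sum(self.free_slots.values()):
            raise RuntimeError("not enough free thread slots")
        grant: List[str] = []
        for _ in range(count):
            # Assumed policy: place each thread on the node with the most free slots.
            node = max(self.free_slots, key=lambda n: self.free_slots[n])
            self.free_slots[node] -= 1
            grant.append(node)
        self.granted.setdefault(app_id, []).extend(grant)
        return grant

    def release(self, app_id: str) -> None:
        """Return every slot held by the application to its node."""
        for node in self.granted.pop(app_id, []):
            self.free_slots[node] += 1


rm = ThreadSlotAllocator(free_slots={"nm-1": 2, "nm-2": 2})
grant = rm.allocate("app-1", 3)   # steps 5 to 7: AM requests, RM allocates and replies
rm.release("app-1")               # step 11: AM asks RM to release the resources
```

Because the unit being tracked is a thread rather than a process or a CPU and memory bundle, a grant is just a counter update per node.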
Specifically, communication between the components of the system shown in Fig. 1 is implemented over RMI. The RMI interface relationships between modules are shown in Fig. 4. Interface 1-1, ClientRmi, is configured in Client. Interfaces 2-1 RMClientRmi, 2-2 RMAdminRmi, 2-3 RMAppMasterRmi and 2-4 RMResourceTrackerRmi are configured in RM. Interfaces 3-1 AMResourceRmi and 3-2 AMContainerRmi are configured in AM. Interfaces 4-1 NMAdminRmi and 4-2 NMContainerRmi are configured in NM. The messages exchanged between modules through these interfaces are illustrated below:
Client: through ClientRmi, Client receives messages sent by RM, for example messages notifying it of an RM active/standby switchover, of node state updates, or of an application's final state.
ResourceManager: through RMClientRmi, RM receives the messages sent by Client to submit and stop tasks. Through RMAdminRmi, RM receives the management messages sent by Client, for example messages for setting node attributes, reading node attributes, reading node state, resetting a node, reading application state and reading RM state, as well as messages from an operator at the Client to trigger an RM active/standby switchover manually; RM also synchronizes its resource cache through RMAdminRmi. Through RMAppMasterRmi, RM receives messages sent by AM, for example the AM's registration message, messages indicating that an application has finished, and messages in which the AM requests computing resources from RM or announces their release and updates its state. Through RMResourceTrackerRmi, RM receives messages sent by NM, for example the registration message an NM sends after coming online and the messages in which an NM updates its heartbeat and running state.
AppMaster: through AMResourceRmi, AM receives the computing resources that RM allocates to it. RM may allocate computing resources in real time in response to the AM's request, or may return them to the AM asynchronously after the AM has sent its request. Through AMContainerRmi, AM receives the replies sent by RM or NM indicating that task execution has finished.
NodeManager: through NMContainerRmi, NM receives the messages sent by RM or AM to trigger the start of a Container. RM also uses NMContainerRmi to obtain Container status information from NM and to stop Containers. Through NMAdminRmi, RM obtains a node's configuration, sets or modifies it, and triggers a node restart or cache update.
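The interface relationships listed above can be captured as a small lookup table, which makes it easy to check which interface a given pair of modules would use. The interface names and sender/receiver pairs are taken from the description; the `RmiEndpoint` type and the `interfaces_for` helper are hypothetical, and the sketch models only the routing, not RMI itself.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class RmiEndpoint(NamedTuple):
    owner: str               # module in which the interface is configured
    senders: FrozenSet[str]  # modules whose messages arrive through it


RMI_ENDPOINTS: Dict[str, RmiEndpoint] = {
    "ClientRmi": RmiEndpoint("Client", frozenset({"RM"})),
    "RMClientRmi": RmiEndpoint("RM", frozenset({"Client"})),
    "RMAdminRmi": RmiEndpoint("RM", frozenset({"Client"})),
    "RMAppMasterRmi": RmiEndpoint("RM", frozenset({"AM"})),
    "RMResourceTrackerRmi": RmiEndpoint("RM", frozenset({"NM"})),
    "AMResourceRmi": RmiEndpoint("AM", frozenset({"RM"})),
    "AMContainerRmi": RmiEndpoint("AM", frozenset({"RM", "NM"})),
    "NMContainerRmi": RmiEndpoint("NM", frozenset({"RM", "AM"})),
    "NMAdminRmi": RmiEndpoint("NM", frozenset({"RM"})),
}


def interfaces_for(sender: str, receiver: str) -> List[str]:
    """Return, in sorted order, the interfaces through which sender reaches receiver."""
    return sorted(
        name for name, endpoint in RMI_ENDPOINTS.items()
        if endpoint.owner == receiver and sender in endpoint.senders
    )


am_to_rm = interfaces_for("AM", "RM")
rm_to_nm = interfaces_for("RM", "NM")
```

An empty result, for example for Client to NM, reflects that the description defines no direct interface between those two modules.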
In this embodiment, RMI serves as the inter-process communication mechanism. It provides the bridging interfaces between modules within a highly encapsulated framework, along with a large number of management and monitoring interfaces. Compared with prior-art approaches that handle inter-process tasks over traditional interfaces, the use of RMI and highly encapsulated framework interfaces reduces the difficulty and length of upper-layer application development relative to traditional scheduling frameworks. The many monitoring and management interfaces also make it easy to manage and monitor the cluster's operation, which improves the maintainability of the system and lowers operations cost.
Further, an application instance state machine, a node instance state machine and a resource container instance state machine are provided in the resource scheduling module (ResourceManager). The application instance state machine corresponds to an application instance, the node instance state machine corresponds to a node instance, and the resource container instance state machine corresponds to a resource container instance. Each state machine changes its state according to the events it receives. The components inside each module in this embodiment are of three types: service (Service), event handler (EventHandler) and state machine (StateMachine). State transitions are triggered in an asynchronous, event-driven manner. Compared with a computation model based on traditional function calls, the asynchronous event-driven approach used in this embodiment offers better concurrency, and the asynchronous events make it easier to design highly cohesive, loosely coupled functional components. For example, the components in RM are classified as shown in Table 1:
Table 1
Each state machine is a key data structure of its module in the system and is responsible for generating and handling events and for performing and recording state transitions. In this embodiment, three kinds of state machines, RMApp, RMNode, and RMContainer, are configured in the resource scheduling module ResourceManager, corresponding to the application, node, and resource container instance types respectively. The state machines change state by sending events to one another, which realizes the scheduling and allocation of thread resources. For example, the initial state of a state machine is NEW; the intermediate states are NEW_SAVING, SUBMITTED, ALLOCATED, LAUNCHED, and RUNNING; and the final states are FINISHED, KILLED, FAILED, and EXPIRED. Depending on the event type and the parameters carried by the event, a state transition may also involve branching or looping between states. This realizes hierarchical scheduling of computing resources and spreads out the complexity and load of the central node, which makes it easier to scale out the cluster. It also lets users customize AppMaster and AppWorker implementations with different functions, so that multiple heterogeneous computing models can share thread resources and run in parallel within the same system.
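As an illustration of the asynchronous, event-driven state transitions described above, the following is a minimal sketch. The state names follow the example in the text; the event names, the transition table, and the Dispatcher class are assumptions made for illustration and are not taken from the embodiment.

```python
import queue
import threading
from enum import Enum


class State(Enum):
    NEW = "NEW"
    NEW_SAVING = "NEW_SAVING"
    SUBMITTED = "SUBMITTED"
    ALLOCATED = "ALLOCATED"
    LAUNCHED = "LAUNCHED"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    KILLED = "KILLED"
    FAILED = "FAILED"
    EXPIRED = "EXPIRED"


TERMINAL = {State.FINISHED, State.KILLED, State.FAILED, State.EXPIRED}

# Hypothetical events: (current state, event) -> next state.
TRANSITIONS = {
    (State.NEW, "START"): State.NEW_SAVING,
    (State.NEW_SAVING, "SAVED"): State.SUBMITTED,
    (State.SUBMITTED, "RESOURCE_GRANTED"): State.ALLOCATED,
    (State.ALLOCATED, "LAUNCH"): State.LAUNCHED,
    (State.LAUNCHED, "REGISTERED"): State.RUNNING,
    (State.RUNNING, "DONE"): State.FINISHED,
}
# Hypothetical events that end the machine from any non-final state.
ABORT_EVENTS = {"KILL": State.KILLED, "FAIL": State.FAILED, "EXPIRE": State.EXPIRED}


class StateMachine:
    def __init__(self, name):
        self.name = name
        self.state = State.NEW
        self.history = [State.NEW]  # records every state entered

    def handle(self, event):
        """Apply one event; return False if it is not valid in this state."""
        if self.state in TERMINAL:
            return False
        nxt = ABORT_EVENTS.get(event)
        if nxt is None:
            nxt = TRANSITIONS.get((self.state, event))
        if nxt is None:
            return False
        self.state = nxt
        self.history.append(nxt)
        return True


class Dispatcher:
    """Delivers queued events to state machines on a background thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self._machines = {}
        threading.Thread(target=self._run, daemon=True).start()

    def register(self, machine):
        self._machines[machine.name] = machine

    def post(self, target, event):
        # Returns immediately; the event is handled asynchronously.
        self._queue.put((target, event))

    def drain(self):
        # Block until every posted event has been handled.
        self._queue.join()

    def _run(self):
        while True:
            target, event = self._queue.get()
            try:
                machine = self._machines.get(target)
                if machine is not None:
                    machine.handle(event)
            finally:
                self._queue.task_done()
```

Senders only enqueue events and never call a state machine directly, which is what keeps the components loosely coupled in this sketch.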
This embodiment further includes: the resource scheduling module (ResourceManager) stores the tasks submitted by the client service module (Client) in a distributed columnar database (HBase). For the high-availability (HA) cluster in which the resource scheduling module (ResourceManager) runs, data interaction is performed through a leader election interface based on the distributed lock service of Zookeeper, and the underlying Zookeeper application programming interface (API) is encapsulated by Apache Curator. Specifically, ResourceManager HA (High Availability cluster) is implemented with the election strategy provided by Zookeeper to ensure business continuity: RM runs on two or more nodes, which are divided into an active node and standby nodes. The remaining modules of the system shown in Figure 1 can all be deployed as clusters, which provides load balancing and exception handling and ensures high availability of the whole system. Accordingly, every part of the system of this embodiment that runs in a production environment is required to have no single point of failure. For example, RM HA is implemented as follows: the RM high-availability framework is based on the distributed lock service of ZK (Zookeeper), encapsulates the underlying ZK API (Application Programming Interface) through Apache Curator, and provides a leader election interface. HBase serves as shared storage for the information of submitted tasks, which ensures that a standby machine can continue executing unfinished tasks after it starts. The scheme of this embodiment therefore suits concrete applications in distributed systems that use the thread as the unit of resource scheduling, for example lightweight computing scenarios such as a distributed web crawler cluster.
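The failover behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's implementation: LockService is an in-memory stand-in for the Zookeeper-based distributed lock, TaskStore is an in-memory stand-in for the shared HBase storage, and all class and method names are hypothetical. No real Zookeeper, Curator, or HBase API is used.

```python
import threading


class LockService:
    """In-memory stand-in for the distributed lock used for leader election."""

    def __init__(self):
        self._guard = threading.Lock()
        self.leader = None

    def try_acquire(self, node_id):
        # The first node to ask becomes leader and stays leader until release.
        with self._guard:
            if self.leader is None:
                self.leader = node_id
            return self.leader == node_id

    def release(self, node_id):
        with self._guard:
            if self.leader == node_id:
                self.leader = None


class TaskStore:
    """In-memory stand-in for the shared storage of submitted tasks."""

    def __init__(self):
        self._rows = {}

    def save(self, task_id):
        self._rows[task_id] = "SUBMITTED"

    def mark_finished(self, task_id):
        self._rows[task_id] = "FINISHED"

    def unfinished(self):
        return sorted(t for t, s in self._rows.items() if s != "FINISHED")


class ResourceManager:
    def __init__(self, node_id, locks, store):
        self.node_id = node_id
        self.locks = locks
        self.store = store
        self.active = False

    def campaign(self):
        """Try to become the active node; return the tasks to resume if elected."""
        self.active = self.locks.try_acquire(self.node_id)
        return self.store.unfinished() if self.active else []

    def submit(self, task_id):
        if not self.active:
            raise RuntimeError("only the active node accepts tasks")
        # Persist before scheduling so a standby node can recover the task.
        self.store.save(task_id)

    def crash(self):
        # Simulates losing the lock, as when the active node's session ends.
        self.active = False
        self.locks.release(self.node_id)
```

A typical sequence is: the first node wins the election and accepts tasks, the second node stays on standby, and after the first node crashes the second node's next call to campaign returns the tasks that were submitted but not finished.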
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The device embodiment in particular is described relatively briefly because it is substantially similar to the method embodiment; for related details, refer to the description of the method embodiment. The foregoing is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

  1. A resource scheduling method, characterized by comprising:
    creating, according to a task submitted by a client service module (Client), an application management module (AppMaster) for managing the task and an application management running container (AppMaster Container) in a node management module (NodeManager);
    generating subtasks from the task by the application management module (AppMaster);
    allocating computing resources corresponding to the generated subtasks, and establishing threads according to the allocated computing resources, wherein one subtask corresponds to one thread;
    running each subtask on its corresponding thread.
  2. The method according to claim 1, characterized in that allocating computing resources corresponding to the generated subtasks and establishing threads according to the allocated computing resources comprises:
    a resource scheduling module (ResourceManager) allocating computing resources to the application management module (AppMaster) according to the generated subtasks, and establishing threads according to the allocated computing resources.
  3. The method according to claim 2, characterized in that an application instance state machine, a node instance state machine, and a resource container instance state machine are set in the resource scheduling module (ResourceManager), wherein the application instance state machine corresponds to an application instance, the node instance state machine corresponds to a node instance, the resource container instance state machine corresponds to a resource container instance, and each state machine changes its state according to the events it receives.
  4. The method according to claim 1 or 2, characterized in that running each subtask on its corresponding thread comprises:
    starting the application management running container in the node management module associated with the task, and running a task execution module (AppWorker) in each node management module (NodeManager) to execute the subtasks, wherein one task execution module corresponds to one subtask and runs on the thread corresponding to that subtask.
  5. The method according to claim 3, characterized by further comprising:
    the resource scheduling module (ResourceManager) storing the task submitted by the client service module (Client) in a distributed columnar database (HBase), wherein, for the high-availability (HA) cluster in which the resource scheduling module (ResourceManager) runs, data interaction is performed through a leader election interface based on the distributed lock service of Zookeeper, and the underlying Zookeeper application programming interface (API) is encapsulated by Apache Curator.
  6. A resource scheduling system, characterized by comprising:
    a resource scheduling module (ResourceManager), configured to create, according to a task submitted by a client service module (Client), an application management module (AppMaster) for managing the task and an application management running container (AppMaster Container) in a node management module (NodeManager);
    the application management module (AppMaster), configured to generate subtasks from the task;
    the resource scheduling module (ResourceManager) being further configured to allocate computing resources corresponding to the generated subtasks and to establish threads according to the allocated computing resources, wherein one subtask corresponds to one thread, so that each subtask runs on its corresponding thread.
  7. The system according to claim 6, characterized in that the resource scheduling module (ResourceManager) is specifically configured to allocate computing resources to the application management module (AppMaster) according to the generated subtasks and to establish threads according to the allocated computing resources.
  8. The system according to claim 7, characterized in that an application instance state machine, a node instance state machine, and a resource container instance state machine are set in the resource scheduling module (ResourceManager), wherein the application instance state machine corresponds to an application instance, the node instance state machine corresponds to a node instance, the resource container instance state machine corresponds to a resource container instance, and each state machine changes its state according to the events it receives.
  9. The system according to claim 6 or 7, characterized in that the application management module is further configured to notify the start of the application management running container in the node management module (NodeManager) associated with the task;
    the node management module (NodeManager) is configured to run a task execution module (AppWorker) to execute the subtasks, wherein one task execution module corresponds to one subtask and runs on the thread corresponding to that subtask.
  10. The system according to claim 8, characterized in that the resource scheduling module (ResourceManager) is further configured to store the task submitted by the client service module (Client) in a distributed columnar database (HBase), wherein, for the high-availability (HA) cluster in which the resource scheduling module (ResourceManager) runs, data interaction is performed through a leader election interface based on the distributed lock service of Zookeeper, and the underlying Zookeeper application programming interface (API) is encapsulated by Apache Curator.
CN201610367792.5A 2016-05-27 2016-05-27 A kind of resource regulating method and system Pending CN107436806A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610367792.5A CN107436806A (en) 2016-05-27 2016-05-27 A kind of resource regulating method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610367792.5A CN107436806A (en) 2016-05-27 2016-05-27 A kind of resource regulating method and system

Publications (1)

Publication Number Publication Date
CN107436806A true CN107436806A (en) 2017-12-05

Family

ID=60454535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610367792.5A Pending CN107436806A (en) 2016-05-27 2016-05-27 A kind of resource regulating method and system

Country Status (1)

Country Link
CN (1) CN107436806A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103279390A (en) * 2012-08-21 2013-09-04 中国科学院信息工程研究所 Parallel processing system for small operation optimizing
US20140059310A1 (en) * 2012-08-24 2014-02-27 Vmware, Inc. Virtualization-Aware Data Locality in Distributed Data Processing
CN103944769A (en) * 2014-05-05 2014-07-23 江苏物联网研究发展中心 RPC (Remote Procedure Call) protocol based cluster resource unified management system
US20140245298A1 (en) * 2013-02-27 2014-08-28 Vmware, Inc. Adaptive Task Scheduling of Hadoop in a Virtualized Environment
CN104112049A (en) * 2014-07-18 2014-10-22 西安交通大学 P2P (peer-to-peer) architecture based cross-data-center MapReduce task scheduling system and P2P architecture based cross-data-center MapReduce task scheduling method
CN104615526A (en) * 2014-12-05 2015-05-13 北京航空航天大学 Monitoring system of large data platform
CN104951372A (en) * 2015-06-16 2015-09-30 北京工业大学 Method for dynamic allocation of Map/Reduce data processing platform memory resources based on prediction
CN105512083A (en) * 2015-11-30 2016-04-20 华为技术有限公司 YARN based resource management method, device and system

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108363649A (en) * 2017-12-29 2018-08-03 微梦创科网络科技(中国)有限公司 A kind of method and device of distribution statistical log visit capacity
CN108345497A (en) * 2018-01-17 2018-07-31 千寻位置网络有限公司 GNSS positions execution method and system, the positioning device of simulation offline
CN108491253A (en) * 2018-01-30 2018-09-04 济南浪潮高新科技投资发展有限公司 A kind of calculating task processing method and edge calculations equipment
CN110647393A (en) * 2018-06-27 2020-01-03 厦门本能管家科技有限公司 Elastic process management system and method
CN109361532A (en) * 2018-09-11 2019-02-19 上海天旦网络科技发展有限公司 The high-availability system and method and computer readable storage medium of network data analysis
CN109361532B (en) * 2018-09-11 2021-08-24 上海天旦网络科技发展有限公司 High availability system and method for network data analysis and computer readable storage medium
CN109471621A (en) * 2018-09-26 2019-03-15 西安电子科技大学工程技术研究院有限公司 A kind of tools build method under linux system based on big data
WO2020112029A1 (en) * 2018-11-30 2020-06-04 Purple Ds Private Ltd. System and method for facilitating participation in a blockchain environment
CN109684036A (en) * 2018-12-17 2019-04-26 武汉烽火信息集成技术有限公司 A kind of container cluster management method, storage medium, electronic equipment and system
CN109684036B (en) * 2018-12-17 2021-08-10 武汉烽火信息集成技术有限公司 Container cluster management method, storage medium, electronic device and system
CN109525436B (en) * 2018-12-19 2022-09-16 福建新大陆软件工程有限公司 Method and system for switching main application program and standby application program
CN109525436A (en) * 2018-12-19 2019-03-26 福建新大陆软件工程有限公司 Application program main/standby switching method and system
CN110442446A (en) * 2019-06-29 2019-11-12 西南电子技术研究所(中国电子科技集团公司第十研究所) The method of processing high-speed digital signal data flow in real time
CN110442446B (en) * 2019-06-29 2022-12-13 西南电子技术研究所(中国电子科技集团公司第十研究所) Method for real-time processing high-speed digital signal data stream
CN111651276A (en) * 2020-06-04 2020-09-11 杭州海康威视系统技术有限公司 Scheduling method and device and electronic equipment
CN112965800A (en) * 2021-03-09 2021-06-15 上海焜耀网络科技有限公司 Distributed computing task scheduling system
CN113515356A (en) * 2021-04-13 2021-10-19 中国航天科工集团八五一一研究所 Lightweight distributed resource management and task scheduler and method
CN113515356B (en) * 2021-04-13 2022-11-25 中国航天科工集团八五一一研究所 Lightweight distributed resource management and task scheduler and method
CN113448710A (en) * 2021-07-01 2021-09-28 星辰天合(北京)数据科技有限公司 Distributed application system based on business resources
CN113448710B (en) * 2021-07-01 2024-04-09 北京星辰天合科技股份有限公司 Distributed application system based on business resources
CN113641495A (en) * 2021-08-12 2021-11-12 成都中科大旗软件股份有限公司 Distributed scheduling method and system based on big data calculation
CN115242596A (en) * 2022-06-13 2022-10-25 中国科学院信息工程研究所 User-oriented network test bed scene service scheduling method and device
CN115242596B (en) * 2022-06-13 2024-04-30 中国科学院信息工程研究所 User-oriented network test bed scene service scheduling method and device

Similar Documents

Publication Publication Date Title
CN107436806A (en) A kind of resource regulating method and system
CN103944769B (en) Cluster resource system for unified management based on RPC agreements
CN102760074B (en) Method and its system for high load capacity operation flow scalability
CN100489790C (en) Processing management device, computer system, distributed processing method
CN105045658B (en) A method of realizing that dynamic task scheduling is distributed using multinuclear DSP embedded
CN103279390B (en) A kind of parallel processing system (PPS) towards little optimization of job
CN104050029B (en) A kind of task scheduling system
CN110247954A (en) A kind of dispatching method and system of distributed task scheduling
CN105912401A (en) Distributed data batch processing system and method
CN107020635B (en) Method for operating multi-master-node robot operating system on multiple robots
CN104503832B (en) A kind of scheduling virtual machine system and method for fair and efficiency balance
CN105404549B (en) Scheduling virtual machine system based on yarn framework
CN105786611A (en) Method and device for task scheduling of distributed cluster
CN110958311A (en) YARN-based shared cluster elastic expansion system and method
Li et al. Research on distributed architecture based on SOA
CN109783225A (en) A kind of tenant's priority management method and system of multi-tenant big data platform
CN108073414B (en) Implementation method for merging multithreading concurrent requests and submitting and distributing results in batches based on Jedis
CN109739640A (en) A kind of container resource management system based on Shen prestige framework
CN102916992A (en) Method and system for scheduling cloud computing remote resources unifiedly
CN113535362A (en) Distributed scheduling system architecture and micro-service workflow scheduling method
CN108170417B (en) Method and device for integrating high-performance job scheduling framework in MESOS cluster
CN101006426A (en) Processing management device, computer system, distributed processing method, and computer program
CN101540776A (en) Grid middleware system supporting adaptive scheduling
CN110740047B (en) Network slice management arrangement system
CN102137162A (en) CAD (Computer Aided Design) integrated system based on mode of software used as service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20171205)