CN108845884A - Physical resource allocation method, apparatus, computer device and storage medium - Google Patents

Info

Publication number
CN108845884A
Authority
CN
China
Prior art keywords
task
spark
resource
spark task
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810621848.4A
Other languages
Chinese (zh)
Other versions
CN108845884B (en
Inventor
黄志辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN201810621848.4A
Publication of CN108845884A
Application granted
Publication of CN108845884B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/508 Monitor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

This application relates to a physical resource allocation method, apparatus, computer device, and storage medium. The method includes: receiving a Spark task and a corresponding configuration file submitted by a terminal; reading the resource allocation parameters of the Spark task from the configuration file and allocating physical resources according to those parameters; executing the Spark task on the allocated physical resources while monitoring its execution efficiency; adjusting the resource allocation parameters in the configuration file when the monitored execution efficiency falls below a threshold; and rescheduling the Spark task from the already allocated physical resources to physical resources that match the adjusted parameters, where it continues to execute. With this method, resources can be reallocated to a Spark task dynamically and promptly, without waiting for a version update of the Spark task, which improves the task's running efficiency.

Description

Physical resource allocation method, apparatus, computer device and storage medium
Technical field
This application relates to the field of computer technology, and in particular to a physical resource allocation method, apparatus, computer device, and storage medium.
Background
After development is complete, a Spark (a computing engine for large-scale data processing) task is submitted to a task scheduling platform. The task scheduling platform can schedule and execute multiple Spark tasks, and it must allocate suitable physical resources, such as CPU (Central Processing Unit) and memory, to each of them. An unreasonable resource allocation makes a Spark task run inefficiently or even prevents it from running at all. In the traditional approach, however, the resource allocation strategy for a Spark task is static: even when the allocation turns out to be unreasonable, physical resources can be reallocated only after the Spark task undergoes a version update, which severely affects the task's running efficiency.
Summary of the invention
In view of the above technical problem, it is necessary to provide a physical resource allocation method, apparatus, computer device, and storage medium that can dynamically and promptly reallocate resources to a Spark task without depending on a version update of the Spark task, and thereby improve the task's running efficiency.
A physical resource allocation method, the method including: receiving a Spark task and a corresponding configuration file submitted by a terminal; reading the resource allocation parameters of the Spark task from the configuration file, and allocating physical resources according to the resource allocation parameters; executing the Spark task on the allocated physical resources; monitoring the execution efficiency of the Spark task during its execution; when the monitored execution efficiency is below a threshold, adjusting the resource allocation parameters in the configuration file; and rescheduling the Spark task from the allocated physical resources to physical resources that match the adjusted resource allocation parameters, where it continues to execute.
In one embodiment, before receiving the Spark task and corresponding configuration file submitted by the terminal, the method further includes: receiving a Spark task development request sent by the terminal, the development request containing an entry function identifier; identifying the function queue corresponding to the entry function identifier, the function queue including multiple business functions; converting the business functions into corresponding basic tasks; calling the group decorator corresponding to the entry function identifier to encapsulate the basic tasks into multiple task groups; and configuring a scheduling order for the task groups and packaging the task groups based on that order to obtain the Spark task.
In one embodiment, the Spark task includes a Shell script in which a callback function to the configuration file is preset. Reading the resource allocation parameters of the Spark task from the configuration file includes: starting the Spark task by executing the Shell script; generating, based on the callback function, a callback instruction for the configuration file; pulling the corresponding configuration file according to the callback instruction; and reading the resource allocation parameters from the pulled configuration file.
In one embodiment, executing the Spark task on the allocated physical resources includes: splitting the Spark task into multiple task groups, each with a corresponding task group identifier; splitting each task group into multiple basic tasks, each with a corresponding log decorator; executing the basic tasks on the allocated physical resources and generating an execution log for each basic task; using the log decorator to add the task group identifier of each basic task to that task's execution log; and, when the Spark task finishes, aggregating the execution logs that carry the same task group identifier to generate a task log for each task group identifier.
In one embodiment, monitoring the execution efficiency of the Spark task includes: calculating the total task amount of the Spark task; calculating the task duration of the Spark task from the total task amount; calling a task running monitoring component at a preset time frequency to collect running information of the Spark task; calculating, from the running information, the task amount executed by the Spark task at multiple time nodes; and calculating the execution efficiency of the Spark task at those time nodes from the executed task amount and the task duration.
In one embodiment, adjusting the resource allocation parameters in the configuration file when the monitored execution efficiency is below the threshold includes: comparing the execution efficiency with the threshold; if the execution efficiency is below the threshold, calculating the remaining task amount from the total task amount and the executed task amount, calculating the remaining duration from the task duration and the current time node, and estimating the physical resources that need to be added from the remaining task amount and the remaining duration; otherwise, calculating the resource utilization from the resource usage information that the running information records for two adjacent time nodes, and estimating the physical resources that need to be released from the resource utilization; and adjusting the resource allocation parameters according to the estimate.
In one embodiment, adjusting the resource allocation parameters in the configuration file when the monitored execution efficiency is below the threshold includes: comparing the execution efficiency with the threshold; if the execution efficiency is below the threshold, marking the Spark task as executing abnormally and obtaining the task log of the Spark task; locating the cause of the abnormality from the task log; and, if the cause includes insufficient physical resources, generating a resource adjustment prompt page from the resource allocation parameters recorded in the configuration file and sending the page to the terminal, so that the terminal adjusts the resource allocation parameters on the resource adjustment prompt page.
A physical resource allocation apparatus, the apparatus including: a resource allocation module, configured to receive a Spark task and a corresponding configuration file submitted by a terminal, read the resource allocation parameters of the Spark task from the configuration file, and allocate physical resources according to the resource allocation parameters; an efficiency monitoring module, configured to execute the Spark task on the allocated physical resources and to monitor the execution efficiency of the Spark task during its execution; and a resource adjustment module, configured to adjust the resource allocation parameters in the configuration file when the monitored execution efficiency is below a threshold, and to reschedule the Spark task from the allocated physical resources to physical resources that match the adjusted resource allocation parameters, where it continues to execute.
A computer device, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program: receiving a Spark task and a corresponding configuration file submitted by a terminal; reading the resource allocation parameters of the Spark task from the configuration file, and allocating physical resources according to the resource allocation parameters; executing the Spark task on the allocated physical resources; monitoring the execution efficiency of the Spark task during its execution; when the monitored execution efficiency is below a threshold, adjusting the resource allocation parameters in the configuration file; and rescheduling the Spark task from the allocated physical resources to physical resources that match the adjusted resource allocation parameters, where it continues to execute.
A computer-readable storage medium storing a computer program that, when executed by a processor, implements the following steps: receiving a Spark task and a corresponding configuration file submitted by a terminal; reading the resource allocation parameters of the Spark task from the configuration file, and allocating physical resources according to the resource allocation parameters; executing the Spark task on the allocated physical resources; monitoring the execution efficiency of the Spark task during its execution; when the monitored execution efficiency is below a threshold, adjusting the resource allocation parameters in the configuration file; and rescheduling the Spark task from the allocated physical resources to physical resources that match the adjusted resource allocation parameters, where it continues to execute.
With the above physical resource allocation method, apparatus, computer device, and storage medium, physical resources can be allocated to the Spark task submitted by the terminal according to the resource allocation parameters recorded in the submitted configuration file; the Spark task can be executed on the allocated physical resources; by monitoring the execution efficiency of the Spark task, the resource allocation parameters in the configuration file can be adjusted according to the monitoring result; and the Spark task is then scheduled onto physical resources that match the adjusted parameters. Because the resource allocation parameters are stored separately in a configuration file, independent of the Spark task itself, they can be modified freely without being tied to a version update of the Spark task. Because the execution efficiency is monitored in real time and the allocated physical resources are adjusted dynamically in response, the allocation tracks the Spark task's actual demand for physical resources, which improves its execution efficiency.
Brief description of the drawings
Fig. 1 is an application scenario diagram of the physical resource allocation method in one embodiment;
Fig. 2 is a flow diagram of the physical resource allocation method in one embodiment;
Fig. 3 is a structural block diagram of the physical resource allocation apparatus in one embodiment;
Fig. 4 is an internal structure diagram of the computer device in one embodiment.
Detailed description of embodiments
To make the objectives, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
The physical resource allocation method provided by this application can be applied in the application environment shown in Fig. 1, in which a terminal 102 communicates with a server 104 over a network. The terminal 102 may be, but is not limited to, a personal computer, laptop, smartphone, tablet, or portable wearable device, and the server 104 may be implemented as a server cluster composed of multiple servers. The terminal 102 submits a Spark task and its corresponding configuration file to the server 104. A task scheduling platform deployed on the server 104 schedules and executes the Spark task. The platform allocates physical resources to the Spark task based on the resource allocation parameters recorded in the configuration file, schedules the task onto the allocated resources for execution, and monitors its execution efficiency at a preset time frequency. The platform compares the execution efficiency with a threshold. If the execution efficiency is below the threshold, the platform calculates the remaining task amount from the total task amount and the executed task amount, calculates the remaining duration from the task duration and the current time node, and estimates the physical resources that need to be added from the remaining task amount and the remaining duration. If the execution efficiency is greater than or equal to the threshold, the platform calculates the resource utilization from the resource usage information that the running information records for two adjacent time nodes, and estimates the physical resources that need to be released from the resource utilization. The platform then stops executing the Spark task, adjusts the resource allocation parameters according to the estimate, and schedules the Spark task onto physical resources that match the adjusted parameters.
In this physical resource allocation process, because the resource allocation parameters are stored separately in a configuration file, independent of the Spark task itself, they can be modified freely without being tied to a version update of the Spark task. Because the execution efficiency is monitored in real time and the allocated physical resources are adjusted dynamically in response, the allocation tracks the Spark task's actual demand for physical resources, which improves its execution efficiency.
In one embodiment, as shown in Fig. 2, a physical resource allocation method is provided. Taking its application to the server in Fig. 1 as an example, the method includes the following steps:
Step 202, receive the Spark task and corresponding configuration file submitted by the terminal.
The business logic script corresponding to the Spark task includes a Shell script. Task scheduling personnel record the resource allocation parameters of the Spark task in the configuration file and preset a callback function to the configuration file in the Shell script. The resource allocation parameters may be estimated in advance by the task scheduling personnel according to the task amount of the Spark task.
The server cluster, composed of multiple servers, includes a master node (Master) and multiple worker nodes (Worker). Task scheduling personnel submit the Spark task and corresponding configuration file from the terminal to the master node with the spark-submit command. The task scheduling platform is deployed on the master node and schedules and executes the Spark tasks submitted by multiple terminals. The platform stores the configuration file separately from the Spark task and starts a corresponding Driver process for each Spark task. Depending on the preset deploy mode (deploy-mode), the Driver process starts the Spark task either locally or on one of the worker nodes in the cluster.
Step 204, read the resource allocation parameters of the Spark task from the configuration file, and allocate physical resources according to the resource allocation parameters.
The task scheduling platform starts the Spark task through the Driver process and allocates physical resources for it. Specifically, the Driver process calls the Shell script of the Spark task, generates a callback instruction for the configuration file, and reads the resource allocation parameters from the configuration file according to that instruction. Based on the parameters it has read, the Driver process applies to the cluster manager for the physical resources needed to run the Spark task. The cluster manager may be a Spark Standalone cluster, a YARN resource management cluster, or the like. Physical resources include memory, CPU, and so on. According to the resource allocation parameters, the cluster manager starts a certain number of Executor processes on the worker nodes of the cluster. Note that the Driver process and each Executor process themselves also occupy some physical resources.
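As an illustration of step 204, the following minimal Python sketch reads resource allocation parameters from a configuration file and turns them into `spark-submit` arguments. The JSON layout, the key names, and both helper functions are assumptions made for this example and do not come from the application; the command-line flags are standard `spark-submit` options (`--num-executors` applies to YARN and Kubernetes deployments).

```python
import json

# Hypothetical configuration file content; key names are assumptions for illustration.
EXAMPLE_CONFIG = (
    '{"num_executors": 4, "executor_memory": "4g", '
    '"executor_cores": 2, "driver_memory": "2g"}'
)

REQUIRED_KEYS = ("num_executors", "executor_memory", "executor_cores", "driver_memory")


def read_allocation_params(config_text):
    """Parse the resource allocation parameters stored apart from the Spark task."""
    params = json.loads(config_text)
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise ValueError(f"missing resource allocation parameters: {missing}")
    return params


def build_submit_command(params, app_path, master="yarn", deploy_mode="cluster"):
    """Translate the parameters into a spark-submit argument list."""
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        "--num-executors", str(params["num_executors"]),
        "--executor-memory", str(params["executor_memory"]),
        "--executor-cores", str(params["executor_cores"]),
        "--driver-memory", str(params["driver_memory"]),
        app_path,
    ]


params = read_allocation_params(EXAMPLE_CONFIG)
command = build_submit_command(params, "job.py")
```

Because the parameters live in their own file, changing the allocation only requires editing that file and rebuilding the argument list; the Spark task itself is untouched.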
Step 206, execute the Spark task on the allocated physical resources.
After the physical resources needed to execute the Spark task have been obtained, the task scheduling platform begins scheduling and executing the Spark task through the Driver process. Specifically, the Driver process splits the Spark task into multiple asynchronously executed task groups (stages), each of which includes multiple basic tasks (tasks) that execute asynchronously and/or concurrently. The Driver process distributes the basic tasks of one task group across multiple Executor processes for execution. A basic task is the smallest execution unit. The result of each basic task is stored in the memory of its Executor process or in a disk file on the worker node where it runs. When all basic tasks of the current task group have finished, the Driver process writes the intermediate results to local disk files on each worker node and schedules the next task group. This cycle repeats until the whole Spark task has been executed.
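The stage-by-stage execution described above can be imitated in plain Python. The sketch below is not Spark code: a thread pool stands in for the Executor processes, and `run_stages` is a hypothetical helper. It only shows the control flow in which the tasks of one task group run concurrently and the next task group starts after the current one has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stages(stages, max_workers=4):
    """Run task groups one after another; tasks inside a group run concurrently."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for stage in stages:
            # pool.map yields results in submission order, and list() blocks
            # until every task of this stage has completed.
            results.append(list(pool.map(lambda task: task(), stage)))
    return results


# Two stages: the first has two tasks, the second has one.
example_stages = [[lambda: 1, lambda: 2], [lambda: 3]]
stage_results = run_stages(example_stages)
```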
Step 208, monitor the execution efficiency of the Spark task during its execution.
During execution of the Spark task, the task scheduling platform monitors the task's execution efficiency through the Driver process, that is, it calculates the execution speed of the basic tasks. The execution speed of a basic task is directly related to the physical resources of the corresponding Executor process, such as its number of CPU cores. In general, one CPU core executes one thread at a time. When physical resources are sufficient and multiple basic tasks are assigned to one Executor process, the Executor can run them concurrently on multiple threads, which improves the execution efficiency of the Spark task.
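The application does not give a formula for execution efficiency. One plausible reading, sketched below purely as an assumption, compares the fraction of the total task amount completed at a time node with the fraction of the expected task duration that has elapsed. A value of 1.0 means the task is exactly on schedule, and values below 1.0 mean it is falling behind. The function name and parameters are hypothetical.

```python
def execution_efficiency(done_tasks, total_tasks, elapsed_s, duration_s):
    """Ratio of actual progress to expected progress at one time node (assumed formula)."""
    if total_tasks <= 0 or duration_s <= 0:
        raise ValueError("total_tasks and duration_s must be positive")
    if elapsed_s <= 0:
        return 1.0  # nothing can be judged before the task has run
    progress = done_tasks / total_tasks
    expected_progress = min(elapsed_s / duration_s, 1.0)
    return progress / expected_progress


on_schedule = execution_efficiency(50, 100, 300, 600)
behind = execution_efficiency(25, 100, 300, 600)
```

In this reading, the threshold of step 210 would be a number such as 0.8, compared against the value computed at each sampling time node.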
Step 210, when the monitored execution efficiency is below the threshold, adjust the resource allocation parameters in the configuration file.
The task scheduling platform compares the execution efficiency with the threshold through the Driver process. The threshold can be set freely according to actual needs and may also change dynamically; this is not limited here. If the execution efficiency is below the threshold, the current Spark task is at risk of having insufficient physical resources. The task scheduling platform then generates a stop-execution instruction and sends it to the corresponding worker nodes to terminate the corresponding Driver and Executor processes. The platform estimates the physical resources that need to be added and adjusts the resource allocation parameters recorded in the Spark task's configuration file accordingly. If the execution efficiency is greater than or equal to the threshold, the current Spark task has no risk, or only a low risk, of insufficient physical resources. In that case the platform checks whether the physical resources allocated to the Spark task include idle resources, estimates the physical resources that need to be released, and adjusts the resource allocation parameters recorded in the configuration file accordingly.
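The two estimation branches can be sketched as follows. The application states which quantities enter each estimate but not the arithmetic, so the formulas here are assumptions: the add branch scales the executor count by the ratio of required to observed throughput, and the release branch frees executors in proportion to the idle share of CPU cores averaged over two adjacent samples. All names, including the `num_executors` key, are hypothetical.

```python
import math


def executors_to_add(total_tasks, done_tasks, elapsed_s, duration_s, executors):
    """Extra executors needed to finish the remaining work in the remaining time."""
    remaining_tasks = total_tasks - done_tasks
    remaining_s = duration_s - elapsed_s
    if remaining_tasks <= 0:
        return 0
    if done_tasks <= 0 or remaining_s <= 0:
        raise ValueError("not enough progress or time left to estimate")
    # Observed throughput per executor is done_tasks / (elapsed_s * executors);
    # dividing the required throughput by it gives the executor count needed.
    needed = math.ceil(remaining_tasks * elapsed_s * executors / (remaining_s * done_tasks))
    return max(0, needed - executors)


def executors_to_release(used_cores_prev, used_cores_now, allocated_cores, executors):
    """Executors that look idle, based on two adjacent usage samples."""
    utilization = (used_cores_prev + used_cores_now) / (2 * allocated_cores)
    idle = math.floor(executors * (1 - utilization))
    return max(0, min(idle, executors - 1))  # always keep at least one executor


def adjust_params(params, efficiency, threshold, progress, usage):
    """Return a copy of the parameters with an adjusted executor count."""
    current = params["num_executors"]
    if efficiency < threshold:
        delta = executors_to_add(executors=current, **progress)
    else:
        delta = -executors_to_release(executors=current, **usage)
    return {**params, "num_executors": current + delta}


progress = {"total_tasks": 1000, "done_tasks": 200, "elapsed_s": 300, "duration_s": 600}
usage = {"used_cores_prev": 4, "used_cores_now": 4, "allocated_cores": 16}
scaled_up = adjust_params({"num_executors": 4}, 0.4, 0.8, progress, usage)
scaled_down = adjust_params({"num_executors": 8}, 1.0, 0.8, progress, usage)
```

In the first call the task has completed 200 of 1000 units in half the expected duration, so the sketch raises the executor count from 4 to 16. In the second call only a quarter of the allocated cores are in use, so 6 of 8 executors are released.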
Step 212, reschedule the Spark task from the allocated physical resources to physical resources that match the adjusted resource allocation parameters, where it continues to execute.
Based on the adjusted resource allocation parameters, the task scheduling platform starts a new Driver process for the Spark task and calls it to reallocate physical resources in the manner described above, that is, to restart a certain number of Executor processes on multiple worker nodes in the cluster. The Driver process schedules the Spark task onto the physical resources that match the adjusted parameters: the basic tasks into which the Spark task is split are sent to the newly allocated Executor processes for execution. Through the Driver process, the platform continues to monitor the execution efficiency of the Spark task and adjusts the resource allocation parameters accordingly until the Spark task finishes.
Traditionally, the resource allocation parameters are hard-coded in the Shell script of the Spark task, so they can be changed only when the Spark task undergoes a version update. This makes the parameters inconvenient to modify, which in turn affects the running efficiency and results of the Spark task.
In this embodiment, physical resources can be allocated to the Spark task submitted by the terminal according to the resource allocation parameters recorded in the submitted configuration file; the Spark task can be executed on the allocated physical resources; by monitoring the execution efficiency of the Spark task, the resource allocation parameters in the configuration file can be adjusted according to the monitoring result; and the Spark task is then scheduled onto physical resources that match the adjusted parameters. Because the resource allocation parameters are stored separately in a configuration file, independent of the Spark task itself, they can be modified freely without being tied to a version update of the Spark task. Because the execution efficiency is monitored in real time and the allocated physical resources are adjusted dynamically in response, the allocation tracks the Spark task's actual demand for physical resources, which improves its execution efficiency.
In one embodiment, before receiving the Spark task and corresponding configuration file submitted by the terminal, the method further includes: receiving a Spark task development request sent by the terminal, the development request containing an entry function identifier; identifying the function queue corresponding to the entry function identifier, the function queue including multiple business functions; converting the business functions into corresponding basic tasks; calling the group decorator corresponding to the entry function identifier to encapsulate the basic tasks into multiple task groups; and configuring a scheduling order for the task groups and packaging the task groups based on that order to obtain the Spark task.
A Spark task implements certain business functionality through multiple business functions. For ease of description, the business functions that together implement one piece of business functionality are called a function queue. A Spark task has a corresponding business logic script, which includes multiple function queues; different function queues implement different business functionality. The granularity at which business functionality is divided can be freely defined by the Spark task developer. When the business logic of the Spark task changes, the function queues corresponding to some or all of the business functionality change accordingly. The business functions in a function queue are arranged in scheduling order, and the first business function in that order is called the entry function.
The Spark task above may be developed with a distributed parallel computing framework provided in this embodiment. The framework includes a task decorator, a group decorator, and a group container. The task decorator converts a business function into a corresponding basic task (task). The group decorator encapsulates the basic tasks in one task queue into a corresponding task group (Stage). The group container encapsulates multiple task groups into a corresponding task cluster (Job).
When a developer uses the distributed parallel computing framework provided by this application to develop a Spark task, the developer can send a development request to the server through the terminal. The server invokes the distributed parallel computing framework according to the development request. Based on a first call request sent by the terminal, the server returns the task decorator to the terminal. The terminal adds a task decorator at each entry function in the business logic script. Specifically, a callback function corresponding to the task decorator is added to the entry function, the callback object of the callback function is set to the task decorator, and the callback condition is set to the entry function being triggered for execution. When the entry function is triggered, the callback function generates a callback instruction for the corresponding task decorator; the server calls that task decorator according to the callback instruction and uses it to encapsulate each business function in the entry function's function queue as a corresponding basic task.
Based on a second call request sent by the terminal, the server returns the group decorator to the terminal. The group decorator includes multiple functional elements, such as a state checker, a precondition detector, and a garbage cleaner. The state checker checks the execution state of the encapsulated task group. The precondition detector detects whether the current task group meets its execution conditions. The garbage cleaner cleans up garbage when the current task group is cancelled. The group decorator also records multiple function parameters, such as the task group identifier Stage_id. The terminal adds a group decorator at each entry function of the business logic script. Specifically, the terminal adds a callback function corresponding to the group decorator to the entry function, sets the callback object to the group decorator, sets the callback condition to the generation of the task queue corresponding to the entry function, and configures the functional elements and parameters of the group decorator for each entry function. When the entry function is triggered, the callback function generates a callback instruction for the corresponding group decorator; the server calls that group decorator according to the instruction, and the group decorator identifies the task queue corresponding to the entry function and encapsulates the basic tasks in that queue into a corresponding task group (Stage).
Based on a third call request sent by the terminal, the server returns a group container to the terminal. The developer adds the group container to the business logic script at the terminal and configures the task cluster identifier corresponding to the group container. The group container is used to encapsulate multiple task groups into a task cluster (Job). In other words, the group container holds multiple task groups, and these task groups are the encapsulated objects of that group container. As shown in the implementation script of the group wrapper above, the developer adds the task cluster identifier Job_id at the terminal to the group decorator corresponding to each encapsulated object, so as to establish the encapsulation relationship between task groups. Multiple group containers can be added to a business logic script. The task cluster identifier in the group wrapper determines which task cluster a task group is encapsulated into. It is readily understood that the one or more task clusters (Job) obtained by encapsulation are the Spark task described above.
The group container itself provides an asynchronous execution function within the distributed parallel computing framework. After the group container is added to the business logic script, the developer can use the asynchronous execution function to predefine, in the group container, arrangement rules for the scheduling order among multiple task groups. The arrangement rules include multiple task group identifiers and the sequential scheduling order in which the identified task groups are executed asynchronously.
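One way to picture such arrangement rules is a list of "waves" of task group identifiers: task groups in the same wave are submitted asynchronously, and the next wave starts only after the current one has finished. The sketch below assumes this representation; `GroupContainer`, `order` and the stage names are hypothetical, and a thread pool stands in for the cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


class GroupContainer:
    """Holds several task groups (Stage) and runs them as one task cluster (Job)."""

    def __init__(self, job_id: str,
                 stages: Dict[str, Callable[[], object]],
                 order: List[List[str]]):
        self.job_id = job_id
        self.stages = stages
        self.order = order  # arrangement rules: waves of task group identifiers

    def run(self) -> List[str]:
        finished = []
        with ThreadPoolExecutor() as pool:
            for wave in self.order:
                # Submit every task group of the wave asynchronously.
                futures = [(sid, pool.submit(self.stages[sid])) for sid in wave]
                # Wait for the whole wave before scheduling the next one.
                for sid, future in futures:
                    future.result()
                    finished.append(sid)
        return finished


job = GroupContainer(
    "job_1",
    {"extract": lambda: 1, "clean": lambda: 2,
     "enrich": lambda: 3, "report": lambda: 4},
    [["extract"], ["clean", "enrich"], ["report"]],
)
completion_order = job.run()
```

Here `clean` and `enrich` may overlap in time, while `report` is scheduled only after both have completed.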
Most traditional distributed parallel computing frameworks can only control scheduling at the level of individual tasks and lack a business-level concurrency control mechanism. If a developer wants business-level task scheduling across multiple tasks, one or more data tables must be maintained during development to record the scheduling order between tasks, which is very inconvenient for the developer.
In this embodiment, the group decorator is integrated into the Spark task in advance, and the group decorator itself provides task encapsulation and scheduling-order arrangement capabilities. Multiple scattered basic tasks are encapsulated, according to the business logic, into task groups or task clusters that each implement a particular business function, so that task scheduling can be performed at the business level without maintaining additional data tables. Compared with the traditional approach of filling the scheduling order of scattered basic tasks into data tables one by one, this greatly simplifies the development of Spark tasks.
In one embodiment, the Spark task includes a Shell script; the Shell script is preset with a callback function for the configuration file; and reading the resource allocation parameters of the Spark task from the configuration file includes: starting the Spark task by executing the Shell script; generating a callback instruction for the configuration file based on the callback function preset in the Shell script; pulling the corresponding configuration file according to the callback instruction; and reading the resource allocation parameters from the pulled configuration file.
The business logic script corresponding to the Spark task includes a Shell script, a Submit script, a Class script and the like. The task scheduling platform starts the Spark task by loading the Shell script. The Shell script records the data parameters needed to execute the Spark task, such as the input parameters of the Submit script or the Class script. Traditionally, the resource allocation parameters of the Spark task are also recorded in the Shell script, and the Shell script is fixedly packaged in the Spark task, so these parameters can only be changed together with a version update of the Spark task. This embodiment records the resource allocation parameters in a configuration file that is independent of the Spark task, and presets a callback function for the configuration file in the Shell script. When the Shell script is scheduled for execution, a callback instruction is generated based on the callback function. The Driver process pulls the corresponding configuration file according to the file identifier carried in the callback instruction and reads the resource allocation parameters from the pulled configuration file.
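Reading the resource allocation parameters from a configuration file kept outside the Spark task can be sketched as follows. The JSON format, the file naming scheme and the keys `executor_instances`, `executor_cores` and `executor_memory_gb` are assumptions for illustration; the source does not specify the file format or the parameter names.

```python
import json
import os
import tempfile


def pull_config(config_dir: str, file_id: str) -> dict:
    """Pull the configuration file named by the identifier in the callback instruction."""
    path = os.path.join(config_dir, f"{file_id}.json")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def read_resource_params(config: dict) -> dict:
    """Keep only the (hypothetical) resource allocation parameters."""
    keys = ("executor_instances", "executor_cores", "executor_memory_gb")
    return {key: config[key] for key in keys}


# Demo: a configuration file stored independently of the Spark task package.
config_dir = tempfile.mkdtemp()
with open(os.path.join(config_dir, "spark_task_42.json"), "w",
          encoding="utf-8") as handle:
    json.dump({"executor_instances": 4, "executor_cores": 2,
               "executor_memory_gb": 8, "owner": "demo"}, handle)

params = read_resource_params(pull_config(config_dir, "spark_task_42"))
```

Because the file lives outside the task package, editing it changes the next allocation without repackaging the Spark task.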
In this embodiment, the resource allocation parameters are stored separately in a configuration file, independent of the Spark task itself, so they can be modified flexibly and freely without being constrained by version updates of the Spark task.
In one embodiment, executing the Spark task based on the allocated physical resources includes: splitting the Spark task into multiple task groups, each task group having a corresponding task group identifier; splitting each task group into multiple basic tasks, each basic task having a corresponding log decorator; executing the basic tasks based on the allocated physical resources and generating an execution log for each basic task; using the log decorator to add the task group identifier of each basic task to its execution log; and, when execution of the Spark task finishes, collecting the execution logs that record the same task group identifier and generating a task log corresponding to each task group identifier.
The server can split the Spark task into multiple task groups according to the logic of wide and narrow dependencies, or according to the reverse of the encapsulation process described above, and split each task group into multiple basic tasks. The server schedules the task groups in a task cluster for execution according to the scheduling order arranged in advance in that task cluster. In other words, the server distributes the basic tasks of the task group that is first in the scheduling order to multiple worker nodes for execution; once all basic tasks of that task group have finished, the basic tasks of the next task group in the scheduling order are executed. When a basic task is executed, a corresponding execution log is generated.
Because the master node of the server distributes the basic tasks to multiple worker nodes in the cluster, the execution logs generated by different basic tasks are scattered across multiple servers or virtual machines. As a result, the developer can only see scattered, black-box scheduling information at the level of individual tasks. Developers, however, usually care only about task scheduling at the business level, so this makes consulting the logs very inconvenient.
To solve this problem, the distributed parallel computing framework provided in this embodiment further includes a log decorator. The log decorator enables the Spark task to generate logs along the business dimension, that is, it causes the execution logs of the multiple basic tasks that implement the same business function to be displayed together.
When a developer uses the distributed parallel computing framework provided by this application to develop a Spark task, the server returns a log decorator to the terminal based on a fourth call request sent by the terminal. Through the terminal, the developer adds a log decorator at each entry function of the business logic script. The multiple business functions in the function queue corresponding to one entry function share the same log decorator. Through the terminal, the developer specifies a log processing mode in the deployed log decorator. The log decorator supports several log processing modes, such as generating a log for each task group or generating a log for each task cluster (Job). According to the specified log processing mode, the corresponding task group identifier or task cluster identifier is added to the log decorator.
When a basic task is executed, a corresponding execution log is generated and the corresponding log decorator is called; the log decorator inserts the corresponding task group identifier or task cluster (Job) identifier into the execution log according to the specified log processing mode. When a piece of business finishes executing, the log decorator collects, from the multiple servers or virtual machines that executed this business, the execution logs that carry the same task group identifier or task cluster identifier, integrates the collected execution logs, generates a task log corresponding to each task group identifier or task cluster identifier, and returns the task log to the terminal for display.
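The tagging and collection steps above can be illustrated with a small sketch. The names `log_decorator`, `EXECUTION_LOGS` and `collect_task_logs` are hypothetical, and a single in-memory list stands in for execution logs that would in practice be scattered across worker nodes.

```python
import functools
from collections import defaultdict

# Stand-in for execution logs scattered across several servers or virtual machines.
EXECUTION_LOGS = []


def log_decorator(stage_id):
    """Tag the execution log of each basic task with its task group identifier."""
    def wrap(basic_task):
        @functools.wraps(basic_task)
        def run(*args, **kwargs):
            result = basic_task(*args, **kwargs)
            EXECUTION_LOGS.append(
                {"stage_id": stage_id, "task": basic_task.__name__, "result": result}
            )
            return result
        return run
    return wrap


def collect_task_logs(records):
    """Group execution logs by task group identifier into one task log each."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["stage_id"]].append(record["task"])
    return dict(grouped)


@log_decorator("stage_orders")
def load_orders():
    return 3


@log_decorator("stage_orders")
def clean_orders():
    return 2


@log_decorator("stage_report")
def build_report():
    return "done"


load_orders()
build_report()
clean_orders()
task_logs = collect_task_logs(EXECUTION_LOGS)
```

Although the basic tasks run interleaved, the collected task logs are organised per task group, which is the business-level view the paragraph describes.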
In this embodiment, a log decorator corresponding to each entry function is preset in the Spark task. The log decorator itself provides the ability to collect logs at the business level: after multiple servers execute the basic tasks and generate scattered execution logs, log collection and integration are performed automatically at the business level, which solves the problem of logs being inconvenient to consult because they are scattered.
In one embodiment, monitoring the execution efficiency of the Spark task includes: calculating the total task amount of the Spark task; estimating the task duration of the Spark task from the total task amount; calling a task run monitoring component at a preset time frequency to collect operation information of the Spark task; calculating, from the operation information, the task execution amount of the Spark task at multiple time nodes; and calculating the execution efficiency of the Spark task at the multiple time nodes from the task execution amount and the task duration.
During execution of the Spark task, the Driver process monitors the execution efficiency of the Spark task in real time. The execution efficiency of a Spark task refers to the task execution amount per unit time. Specifically, the Driver process calculates the total task amount of the Spark task and estimates the task duration of the Spark task from the total task amount. The task scheduling platform can call a preset task run monitoring component at a preset time frequency to collect the operation information of the Spark task. The task run monitoring component can be a REST (Representational State Transfer) interface or the like. The operation information of the Spark task includes the execution state of each task group at the current moment, the task amount of the task groups whose execution state is "executed", and the task amount of the basic tasks already executed within the task groups whose execution state is "executing".
The Driver process calculates the task execution amount of the Spark task at the current moment from the task amount of the task groups whose execution state is "executed" and the task amount of the basic tasks already executed within the task groups whose execution state is "executing". The Driver process calculates the execution duration from the recorded start time of the Spark task and the current time. The Driver process then calculates the execution efficiency of the Spark task at the current time node from the task execution amount and the execution duration.
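The calculation in this paragraph can be written out as a short sketch. The record layout (`state`, `task_amount`, `finished_task_amount`) is an assumption about what the monitoring component returns, and times are plain seconds.

```python
def task_execution_amount(stages):
    """Sum finished work: whole 'executed' task groups plus the finished
    basic tasks of task groups that are still 'executing'."""
    done = 0
    for stage in stages:
        if stage["state"] == "executed":
            done += stage["task_amount"]
        elif stage["state"] == "executing":
            done += stage["finished_task_amount"]
    return done


def execution_efficiency(executed, start_time, now):
    """Task execution amount per unit time since the Spark task started."""
    elapsed = now - start_time
    if elapsed <= 0:
        raise ValueError("current time must be later than the start time")
    return executed / elapsed


# Hypothetical operation information collected at one time node.
operation_info = [
    {"state": "executed", "task_amount": 100},
    {"state": "executing", "task_amount": 80, "finished_task_amount": 40},
    {"state": "pending", "task_amount": 60},
]

executed = task_execution_amount(operation_info)
efficiency = execution_efficiency(executed, start_time=0.0, now=70.0)
```

With these sample values, 140 units of work were completed in 70 seconds, giving an efficiency of 2 units per second.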
In this embodiment, the execution efficiency of the Spark task at different monitoring time nodes is calculated from the task execution amount obtained by real-time monitoring and the task duration, estimated in advance, that executing the Spark task requires. The calculated execution efficiency therefore follows the actual running state of the Spark task more closely, which improves the accuracy of the execution efficiency calculation.
In one embodiment, adjusting the resource allocation parameters in the configuration file when the execution efficiency is detected to be below a threshold includes: comparing whether the execution efficiency is below the threshold; if so, calculating the remaining task amount from the total task amount and the task execution amount, calculating the remaining duration from the task duration and the current time node, and estimating the physical resources that need to be added from the remaining task amount and the remaining duration; otherwise, calculating the resource utilization from the resource usage information of two adjacent time nodes recorded in the operation information, and estimating the physical resources that need to be released from the resource utilization; and adjusting the resource allocation parameters according to the estimation result.
Based on the Driver process, the task scheduling platform can adjust the resource allocation parameters automatically according to the result of monitoring the execution efficiency of the Spark task. Specifically, the Driver process compares whether the execution efficiency is below the threshold. If so, the Driver process calculates the remaining task amount from the estimated total task amount of the Spark task and the task execution amount, and calculates the remaining duration from the estimated task duration required to execute the Spark task and the current time node. From the remaining task amount and the remaining duration, the Driver process calculates the target execution efficiency of the Spark task. The Driver process reads the resource allocation parameters recorded in the configuration file and, from the monitored actual execution efficiency of the Spark task at the current moment and the physical resources currently allocated to it, determines the target physical resources needed to reach the target execution efficiency. It is readily understood that the difference between the target physical resources and the allocated physical resources is the physical resources that need to be added. The Driver process adjusts the resource allocation parameters recorded in the configuration file according to the target physical resources.
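The source does not state how the target physical resources are derived from the target and actual efficiencies. The sketch below assumes, purely for illustration, that execution efficiency scales linearly with the number of executors; the function name and its parameters are hypothetical.

```python
import math


def additional_executors(total_tasks, executed_tasks, task_duration,
                         elapsed, allocated_executors):
    """Estimate how many executors to add, assuming efficiency grows
    linearly with the executor count (an illustrative assumption)."""
    remaining_tasks = total_tasks - executed_tasks
    remaining_time = task_duration - elapsed
    if remaining_time <= 0 or elapsed <= 0 or executed_tasks <= 0:
        raise ValueError("need positive remaining time, elapsed time and progress")
    target_efficiency = remaining_tasks / remaining_time
    actual_efficiency = executed_tasks / elapsed
    target_executors = math.ceil(
        allocated_executors * target_efficiency / actual_efficiency
    )
    # The difference between target and allocated resources is what must be added.
    return max(0, target_executors - allocated_executors)


# Behind schedule: 200 of 1000 units done after half of the planned duration.
behind = additional_executors(1000, 200, task_duration=100, elapsed=50,
                              allocated_executors=4)
# Ahead of schedule: nothing needs to be added.
ahead = additional_executors(1000, 600, task_duration=100, elapsed=50,
                             allocated_executors=4)
```

In the first case the target efficiency (16 units per second) is four times the actual efficiency (4 units per second), so under the linear assumption 16 executors are needed and 12 must be added.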
The operation information of the Spark task collected by the preset task run monitoring component further includes resource usage information of the Spark task, such as CPU usage and remaining memory capacity. If the execution efficiency is greater than or equal to the threshold, the Driver process calculates the resource utilization of the physical resources from the collected resource usage information of two adjacent time nodes, and determines from the resource utilization whether the allocated physical resources include idle physical resources. The Driver process reads the resource allocation parameters recorded in the configuration file and determines the idle physical resources that need to be released from the resource allocation parameters and the resource utilization. The Driver process adjusts the resource allocation parameters recorded in the configuration file according to the idle physical resources.
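A simple way to turn two adjacent utilization samples into a release decision is sketched below. The averaging rule and the `headroom` safety margin are assumptions introduced for this example; the source only states that idle resources are determined from the resource allocation parameters and the resource utilization.

```python
import math


def releasable_executors(cpu_samples, allocated_executors, headroom=0.25):
    """Estimate idle executors from CPU usage ratios (0.0 to 1.0) sampled at
    two adjacent time nodes, keeping a hypothetical safety margin."""
    utilization = sum(cpu_samples) / len(cpu_samples)
    # Fraction of the allocation to keep, capped at the full allocation.
    keep_ratio = min(1.0, utilization + headroom)
    needed = math.ceil(allocated_executors * keep_ratio)
    return max(0, allocated_executors - needed)


# Lightly loaded: usage of 25% at both time nodes with 8 executors allocated.
idle = releasable_executors((0.25, 0.25), allocated_executors=8)
# Heavily loaded: nothing can be released.
busy = releasable_executors((0.9, 1.0), allocated_executors=8)
```

With 25% utilization and a 25% margin, half of the 8 executors are kept and 4 are reported as releasable.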
In this embodiment, the resource allocation parameters are adjusted automatically according to the result of monitoring the execution efficiency of the Spark task. When the execution efficiency is below the threshold, physical resources are added in time to safeguard the execution efficiency and the execution success rate of the Spark task; when the execution efficiency is greater than or equal to the threshold, physical resources are released in time, which improves physical resource utilization and reduces waste of physical resources.
In one embodiment, adjusting the resource allocation parameters in the configuration file when the execution efficiency is detected to be below a threshold includes: comparing whether the execution efficiency is below the threshold; if so, marking the Spark task as executing abnormally and obtaining the task log of the Spark task; locating the cause of the abnormality according to the task log; and, if the cause includes insufficient physical resources, generating a resource adjustment prompt page according to the resource allocation parameters recorded in the configuration file and sending the resource adjustment prompt page to the terminal, so that the terminal adjusts the resource allocation parameters on the resource adjustment prompt page.
When the resource allocation parameters need to be adjusted, the Driver process sends a resource adjustment prompt to the terminal. Specifically, the Driver process compares whether the execution efficiency is below the threshold. If so, the Spark task may be executing abnormally; the Driver process obtains the task log of the Spark task, traverses the execution records in the task log, filters out the execution records that indicate abnormal execution, and uses these records to locate the cause of the abnormality in the business logic script corresponding to the Spark task. If the cause includes insufficient physical resources, execution of the Spark task is stopped, a resource adjustment prompt page is generated according to the resource allocation parameters recorded in the configuration file, and the resource adjustment prompt page is sent to the terminal. Task scheduling personnel can adjust the resource allocation parameters on the resource adjustment prompt page through the terminal. In another embodiment, task scheduling personnel can stop the running Spark task at any time, modify the corresponding resource allocation parameters in the configuration file, and then restart the Spark task, without being constrained by the Spark version.
Based on the adjusted resource allocation parameters, the task scheduling platform starts a new Driver process for the Spark task, calls the newly started Driver process to reallocate physical resources to the Spark task according to the adjusted resource allocation parameters, and restarts the abnormal Spark task on the reallocated physical resources.
In this embodiment, whether the resource allocation parameters need to be adjusted is determined automatically from the result of monitoring the execution efficiency of the Spark task. If they do, a resource adjustment prompt is sent to the terminal in time, and the originally configured physical resource parameters are displayed on the resource adjustment prompt page for the user to refer to when making modifications.
It should be understood that although the steps in the flowchart of Fig. 2 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless expressly stated otherwise herein, there is no strict restriction on the order in which these steps are executed, and they can be executed in other orders. Moreover, at least some of the steps in Fig. 2 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but can be executed at different times, and they are not necessarily executed sequentially; they can be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 3, a physical resource allocation apparatus is provided, including a resource allocation module 302, an efficiency monitoring module 304 and a resource adjustment module 306, wherein:
The resource allocation module 302 is configured to receive a Spark task submitted by a terminal and a corresponding configuration file, read the resource allocation parameters of the Spark task from the configuration file, and allocate physical resources according to the resource allocation parameters.
The efficiency monitoring module 304 is configured to execute the Spark task based on the allocated physical resources and, during execution of the Spark task, monitor the execution efficiency of the Spark task.
The resource adjustment module 306 is configured to adjust the resource allocation parameters in the configuration file when the execution efficiency is detected to be below a threshold, and to schedule the Spark task from the allocated physical resources onto physical resources that match the adjusted resource allocation parameters to continue execution.
In one embodiment, the apparatus further includes a task packaging module 308, configured to: receive a Spark task development request sent by the terminal, the development request containing an entry function identifier; identify the function queue corresponding to the entry function identifier, the function queue including multiple business functions; convert the business functions into corresponding basic tasks; call the group decorator corresponding to the entry function identifier to encapsulate the basic tasks into multiple task groups; and configure the scheduling order of the task groups and package the task groups based on the scheduling order to obtain the Spark task.
In one embodiment, the Spark task includes a Shell script; the Shell script is preset with a callback function for the configuration file; and the resource allocation module 302 is further configured to: start the Spark task by executing the Shell script; generate a callback instruction for the configuration file based on the callback function preset in the Shell script; pull the corresponding configuration file according to the callback instruction; and read the resource allocation parameters from the pulled configuration file.
In one embodiment, the efficiency monitoring module 304 is further configured to: split the Spark task into multiple task groups, each task group having a corresponding task group identifier; split each task group into multiple basic tasks, each basic task having a corresponding log decorator; execute the basic tasks based on the allocated physical resources and generate an execution log for each basic task; use the log decorator to add the task group identifier of each basic task to its execution log; and, when execution of the Spark task finishes, collect the execution logs that record the same task group identifier and generate a task log corresponding to each task group identifier.
In one embodiment, the efficiency monitoring module 304 is further configured to: calculate the total task amount of the Spark task; estimate the task duration of the Spark task from the total task amount; call a task run monitoring component at a preset time frequency to collect operation information of the Spark task; calculate, from the operation information, the task execution amount of the Spark task at multiple time nodes; and calculate the execution efficiency of the Spark task at the multiple time nodes from the task execution amount and the task duration.
In one embodiment, the resource adjustment module 306 is further configured to: compare whether the execution efficiency is below the threshold; if so, calculate the remaining task amount from the total task amount and the task execution amount, calculate the remaining duration from the task duration and the current time node, and estimate the physical resources that need to be added from the remaining task amount and the remaining duration; otherwise, calculate the resource utilization from the resource usage information of two adjacent time nodes recorded in the operation information, and estimate the physical resources that need to be released from the resource utilization; and adjust the resource allocation parameters according to the estimation result.
In one embodiment, the resource adjustment module 306 is further configured to: compare whether the execution efficiency is below the threshold; if so, mark the Spark task as executing abnormally and obtain the task log of the Spark task; locate the cause of the abnormality according to the task log; and, if the cause includes insufficient physical resources, generate a resource adjustment prompt page according to the resource allocation parameters recorded in the configuration file and send the resource adjustment prompt page to the terminal, so that the terminal adjusts the resource allocation parameters on the resource adjustment prompt page.
For specific limitations on the physical resource allocation apparatus, reference may be made to the limitations on the physical resource allocation method above, which are not repeated here. Each module in the physical resource allocation apparatus can be implemented in whole or in part by software, hardware or a combination thereof. Each module can be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device can be a server, and its internal structure can be as shown in Fig. 4. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores configuration files. The network interface of the computer device communicates with an external terminal through a network connection. When executed by the processor, the computer program implements a physical resource allocation method.
Those skilled in the art will understand that the structure shown in Fig. 4 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, and the processor implements the following steps when executing the computer program: receiving a Spark task submitted by a terminal and a corresponding configuration file; reading the resource allocation parameters of the Spark task from the configuration file and allocating physical resources according to the resource allocation parameters; executing the Spark task based on the allocated physical resources; monitoring the execution efficiency of the Spark task during its execution; adjusting the resource allocation parameters in the configuration file when the execution efficiency is detected to be below a threshold; and scheduling the Spark task from the allocated physical resources onto physical resources that match the adjusted resource allocation parameters to continue execution.
In one embodiment, the processor further implements the following steps when executing the computer program: receiving a Spark task development request sent by the terminal, the development request containing an entry function identifier; identifying the function queue corresponding to the entry function identifier, the function queue including multiple business functions; converting the business functions into corresponding basic tasks; calling the group decorator corresponding to the entry function identifier to encapsulate the basic tasks into multiple task groups; and configuring the scheduling order of the task groups and packaging the task groups based on the scheduling order to obtain the Spark task.
In one embodiment, the Spark task includes a Shell script; the Shell script is preset with a callback function for the configuration file; and the processor further implements the following steps when executing the computer program: starting the Spark task by executing the Shell script; generating a callback instruction for the configuration file based on the callback function preset in the Shell script; pulling the corresponding configuration file according to the callback instruction; and reading the resource allocation parameters from the pulled configuration file.
In one embodiment, the processor further implements the following steps when executing the computer program: splitting the Spark task into multiple task groups, each task group having a corresponding task group identifier; splitting each task group into multiple basic tasks, each basic task having a corresponding log decorator; executing the basic tasks based on the allocated physical resources and generating an execution log for each basic task; using the log decorator to add the task group identifier of each basic task to its execution log; and, when execution of the Spark task finishes, collecting the execution logs that record the same task group identifier and generating a task log corresponding to each task group identifier.
In one embodiment, the processor further implements the following steps when executing the computer program: calculating the total task amount of the Spark task; estimating the task duration of the Spark task from the total task amount; calling a task run monitoring component at a preset time frequency to collect operation information of the Spark task; calculating, from the operation information, the task execution amount of the Spark task at multiple time nodes; and calculating the execution efficiency of the Spark task at the multiple time nodes from the task execution amount and the task duration.
In one embodiment, the processor further implements the following steps when executing the computer program: comparing whether the execution efficiency is below the threshold; if so, calculating the remaining task amount from the total task amount and the task execution amount, calculating the remaining duration from the task duration and the current time node, and estimating the physical resources that need to be added from the remaining task amount and the remaining duration; otherwise, calculating the resource utilization from the resource usage information of two adjacent time nodes recorded in the operation information, and estimating the physical resources that need to be released from the resource utilization; and adjusting the resource allocation parameters according to the estimation result.
In one embodiment, the processor further implements the following steps when executing the computer program: comparing whether the execution efficiency is below the threshold; if so, marking the Spark task as executing abnormally and obtaining the task log of the Spark task; locating the cause of the abnormality according to the task log; and, if the cause includes insufficient physical resources, generating a resource adjustment prompt page according to the resource allocation parameters recorded in the configuration file and sending the resource adjustment prompt page to the terminal, so that the terminal adjusts the resource allocation parameters on the resource adjustment prompt page.
In one embodiment, a kind of computer readable storage medium is provided, computer program is stored thereon with, is calculated Machine program realizes following steps when being executed by processor:Receive the Spark task and corresponding configuration file that terminal is submitted;From matching The resource allocation parameters for reading Spark task in file are set, carry out physical source distributing according to resource allocation parameters;Based on distribution Physical resource execute Spark task;During Spark task execution, the execution efficiency of Spark task is monitored;When monitoring When execution efficiency is lower than threshold value, the resource allocation parameters in configuration file are adjusted;By Spark task from allocated object It is continued to execute in reason scheduling of resource to the physical resource being adapted with resource allocation parameters adjusted.
In one embodiment, the computer program further implements the following steps when executed by the processor: receiving a Spark task development request sent by the terminal, the development request containing an entry function identifier; identifying the function queue corresponding to the entry function identifier, the function queue including multiple business functions; converting the business functions into corresponding basic tasks; calling the group decorator corresponding to the entry function identifier to encapsulate the basic tasks into multiple task groups; and configuring the scheduling order of the task groups and packaging the task groups according to that order to obtain the Spark task.
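The grouping and packaging steps above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not disclose how the group decorator is implemented, so the registry `_GROUPS`, the decorator `group`, the helpers `package` and `run`, and the sample business functions are all hypothetical names.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical registry mapping a task-group name to its basic tasks.
_GROUPS: Dict[str, List[Callable[[], str]]] = defaultdict(list)


def group(name: str):
    """Group decorator (assumed design): registers the decorated business
    function as a basic task belonging to the task group `name`."""
    def decorator(func: Callable[[], str]) -> Callable[[], str]:
        _GROUPS[name].append(func)
        return func
    return decorator


# Sample business functions; the names are illustrative only.
@group("extract")
def load_policies() -> str:
    return "policies loaded"


@group("extract")
def load_claims() -> str:
    return "claims loaded"


@group("report")
def build_report() -> str:
    return "report built"


def package(order: List[str]) -> List[Tuple[str, List[Callable[[], str]]]]:
    """Package the task groups in the configured scheduling order."""
    return [(name, list(_GROUPS[name])) for name in order]


def run(packaged) -> List[Tuple[str, List[str]]]:
    """Execute every basic task, group by group, in packaging order."""
    return [(name, [task() for task in tasks]) for name, tasks in packaged]
```

Calling `run(package(["extract", "report"]))` executes the two `extract` tasks before the `report` task, which mirrors the configured scheduling order.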
In one embodiment, the Spark task includes a Shell script, and the Shell script is preset with a callback function for the configuration file. The computer program further implements the following steps when executed by the processor: starting the Spark task by executing the Shell script; generating a callback instruction for the configuration file based on the callback function preset in the Shell script; pulling the corresponding configuration file according to the callback instruction; and reading the resource allocation parameters from the pulled configuration file.
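As a rough illustration of reading resource allocation parameters from a pulled configuration file, the sketch below assumes a simple `key=value` file layout, which the patent does not specify. The property names and the `spark-submit` flags are standard Spark ones; the helper names `parse_config` and `spark_submit_args` are hypothetical.

```python
from typing import Dict, List


def parse_config(text: str) -> Dict[str, str]:
    """Parse `key=value` lines, skipping blank lines and `#` comments.
    The file layout is an assumption made for this example."""
    params: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params


def spark_submit_args(params: Dict[str, str]) -> List[str]:
    """Translate resource allocation parameters into spark-submit flags."""
    mapping = {
        "spark.executor.instances": "--num-executors",
        "spark.executor.cores": "--executor-cores",
        "spark.executor.memory": "--executor-memory",
    }
    args: List[str] = []
    for key, flag in mapping.items():
        if key in params:
            args += [flag, params[key]]
    return args
```

Because the parameters live in the file rather than in the script, adjusting the allocation only requires rewriting the file before the task is rescheduled.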
In one embodiment, the computer program further implements the following steps when executed by the processor: splitting the Spark task into multiple task groups, each task group having a corresponding task group identifier; splitting each task group into multiple basic tasks, each basic task having a corresponding log decorator; executing the basic tasks on the allocated physical resources and generating an execution log for each basic task; using the log decorator to add the task group identifier of the basic task to its execution log; and, when the Spark task finishes executing, collecting the execution logs that carry the same task group identifier to generate a task log for each task group identifier.
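A minimal sketch of the log decorator idea follows. It assumes an in-memory list as the execution log store, and every name in it (`EXECUTION_LOGS`, `log_decorator`, `collect_task_logs`, the sample tasks) is hypothetical, not taken from the patent.

```python
import functools
from collections import defaultdict
from typing import Dict, List

# Hypothetical in-memory store of execution log records.
EXECUTION_LOGS: List[dict] = []


def log_decorator(group_id: str):
    """Wrap a basic task so that each run appends an execution log record
    tagged with the task group identifier."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            status = "ok"
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                status = f"failed: {exc}"
                raise
            finally:
                EXECUTION_LOGS.append(
                    {"group_id": group_id, "task": func.__name__, "status": status}
                )
        return wrapper
    return decorator


def collect_task_logs(records: List[dict]) -> Dict[str, List[str]]:
    """Merge records that share a task group identifier into one task log."""
    logs: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        logs[record["group_id"]].append(f'{record["task"]}: {record["status"]}')
    return dict(logs)


# Sample basic tasks; names and group identifiers are illustrative.
@log_decorator("g1")
def clean():
    return 1


@log_decorator("g1")
def join():
    return 2


@log_decorator("g2")
def export():
    raise RuntimeError("disk full")
```

Tagging each record at write time means the per-group task log can be assembled with a single pass over the records once the Spark task finishes.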
In one embodiment, the computer program further implements the following steps when executed by the processor: calculating the total task amount of the Spark task; calculating the task duration of the Spark task from the total task amount; calling a task-run monitoring component at a preset time frequency to collect the run information of the Spark task; calculating, from the run information, the task execution amount of the Spark task at multiple time nodes; and calculating, from the task execution amount and the task duration, the execution efficiency of the Spark task at the multiple time nodes.
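The text does not give a formula for execution efficiency. One plausible reading, used here purely as an assumption, is the ratio of the fraction of work completed to the fraction of the planned task duration that has elapsed, so that a value below 1 means the task is behind schedule. The function name and parameters are hypothetical.

```python
def execution_efficiency(done: float, total: float,
                         elapsed: float, duration: float) -> float:
    """Assumed metric: (done / total) / (elapsed / duration).

    done     -- task execution amount at the current time node
    total    -- total task amount
    elapsed  -- time since the task started
    duration -- planned task duration derived from the total task amount
    """
    if total <= 0 or duration <= 0:
        raise ValueError("total and duration must be positive")
    if elapsed <= 0:
        return 1.0  # nothing can be judged before the first time node
    return (done / total) / (elapsed / duration)
```

For example, a task that has finished 250 of 1000 units after 300 of its planned 600 seconds has completed a quarter of the work in half the time, giving an efficiency of 0.5 under this definition.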
In one embodiment, the computer program further implements the following steps when executed by the processor: comparing whether the execution efficiency is lower than the threshold; if so, calculating the remaining task amount from the total task amount and the task execution amount, calculating the remaining duration from the task duration and the current time node, and estimating the physical resources that need to be added from the remaining task amount and the remaining duration; otherwise, calculating the resource utilization from the resource usage information recorded in the run information for two adjacent time nodes, and estimating the physical resources that need to be released from the resource utilization; and adjusting the resource allocation parameters according to the estimation result.
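The two estimation branches can be sketched as below. The patent states which quantities feed each estimate but not the arithmetic, so the formulas are assumptions: throughput is taken to scale linearly with the number of executors, and utilization is averaged over the two adjacent time nodes. All function and parameter names are hypothetical.

```python
import math
from typing import List, Tuple


def executors_to_add(total: float, done: float, duration: float,
                     elapsed: float, executors: int) -> int:
    """Estimate extra executors needed to finish the remaining work in the
    remaining time, assuming throughput scales linearly with executors."""
    remaining = total - done          # remaining task amount
    time_left = duration - elapsed    # remaining duration
    if remaining <= 0:
        return 0
    if time_left <= 0 or done <= 0 or elapsed <= 0:
        raise ValueError("cannot estimate without progress and time left")
    # required = (remaining / time_left) / (done / elapsed / executors)
    required = math.ceil(remaining * elapsed * executors / (time_left * done))
    return max(0, required - executors)


def executors_to_release(samples: List[Tuple[float, float]],
                         executors: int) -> int:
    """Estimate releasable executors from (used, allocated) resource samples
    taken at two adjacent time nodes; at least one executor is kept."""
    utilization = sum(used / allocated for used, allocated in samples) / len(samples)
    needed = max(1, math.ceil(executors * utilization))
    return executors - needed
```

With 250 of 1000 units done after 300 of 600 seconds on 4 executors, `executors_to_add` returns 8 under the linear-scaling assumption. With utilization samples of 6/16 and 10/16 on 4 executors, `executors_to_release` returns 2.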
In one embodiment, the computer program further implements the following steps when executed by the processor: comparing whether the execution efficiency is lower than the threshold; if so, marking the execution of the Spark task as abnormal and obtaining the task log of the Spark task; locating the cause of the abnormality according to the task log; if the cause includes insufficient physical resources, generating a resource adjustment prompt page according to the resource allocation parameters recorded in the configuration file, and sending the resource adjustment prompt page to the terminal, so that the terminal adjusts the resource allocation parameters on the resource adjustment prompt page.
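How the cause is located from the task log is not described. A simple possibility, shown here only as an assumption, is to scan the log for messages that typically indicate memory exhaustion. The pattern list is illustrative rather than exhaustive, and the plain-text prompt stands in for the resource adjustment prompt page. All names are hypothetical.

```python
import re
from typing import Dict, Optional, Set

# Illustrative patterns that often accompany memory exhaustion in JVM or
# YARN logs; a real deployment would need its own, broader list.
RESOURCE_PATTERNS = [
    re.compile(r"OutOfMemoryError"),
    re.compile(r"exceeding memory limits", re.IGNORECASE),
]

INSUFFICIENT = "insufficient_physical_resources"


def locate_causes(task_log: str) -> Set[str]:
    """Return the set of abnormality causes found in the task log."""
    causes: Set[str] = set()
    for line in task_log.splitlines():
        if any(pattern.search(line) for pattern in RESOURCE_PATTERNS):
            causes.add(INSUFFICIENT)
    return causes


def build_prompt(causes: Set[str], params: Dict[str, str]) -> Optional[str]:
    """Build the text of a resource adjustment prompt, or None if the
    causes do not include insufficient physical resources."""
    if INSUFFICIENT not in causes:
        return None
    lines = ["Physical resources appear insufficient. Current allocation:"]
    lines += [f"  {key} = {value}" for key, value in sorted(params.items())]
    return "\n".join(lines)
```

Showing the currently configured parameters in the prompt gives the user at the terminal a starting point for the manual adjustment.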
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of these technical features contains no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A physical resource allocation method, the method comprising:
receiving a Spark task and a corresponding configuration file submitted by a terminal;
reading resource allocation parameters of the Spark task from the configuration file, and allocating physical resources according to the resource allocation parameters;
executing the Spark task on the allocated physical resources;
monitoring the execution efficiency of the Spark task while the Spark task executes;
when the execution efficiency is detected to be lower than a threshold, adjusting the resource allocation parameters in the configuration file, and scheduling the Spark task from the already allocated physical resources onto physical resources corresponding to the adjusted resource allocation parameters to continue executing.
2. The method according to claim 1, characterized in that, before receiving the Spark task and the corresponding configuration file submitted by the terminal, the method further comprises:
receiving a Spark task development request sent by the terminal, the development request containing an entry function identifier;
identifying the function queue corresponding to the entry function identifier, the function queue including multiple business functions;
converting the multiple business functions into corresponding multiple basic tasks;
calling the group decorator corresponding to the entry function identifier to encapsulate the multiple basic tasks into multiple task groups;
configuring the scheduling order of the multiple task groups, and packaging the multiple task groups according to the scheduling order to obtain the Spark task.
3. The method according to claim 1, characterized in that the Spark task includes a Shell script, the Shell script is preset with a callback function for the configuration file, and reading the resource allocation parameters of the Spark task from the configuration file comprises:
starting the Spark task by executing the Shell script;
generating a callback instruction for the configuration file based on the callback function;
pulling the corresponding configuration file according to the callback instruction;
reading the resource allocation parameters from the pulled configuration file.
4. The method according to claim 1, characterized in that executing the Spark task on the allocated physical resources comprises:
splitting the Spark task into multiple task groups, each task group having a corresponding task group identifier;
splitting each task group into multiple basic tasks, each basic task having a corresponding log decorator;
executing the multiple basic tasks on the allocated physical resources, and generating an execution log for each basic task;
using the log decorator to add the task group identifier of the basic task to the execution log of the corresponding basic task;
when the Spark task finishes executing, collecting the execution logs that carry the same task group identifier to generate a task log for each task group identifier.
5. The method according to claim 1, characterized in that monitoring the execution efficiency of the Spark task comprises:
calculating the total task amount of the Spark task;
calculating the task duration of the Spark task from the total task amount;
calling a task-run monitoring component at a preset time frequency to collect the run information of the Spark task;
calculating, from the run information, the task execution amount of the Spark task at multiple time nodes;
calculating, from the task execution amount and the task duration, the execution efficiency of the Spark task at the multiple time nodes.
6. The method according to claim 5, characterized in that adjusting the resource allocation parameters in the configuration file when the execution efficiency is detected to be lower than the threshold comprises:
comparing whether the execution efficiency is lower than the threshold;
if so, calculating the remaining task amount from the total task amount and the task execution amount; calculating the remaining duration from the task duration and the current time node; and estimating the physical resources that need to be added from the remaining task amount and the remaining duration;
otherwise, calculating the resource utilization from the resource usage information recorded in the run information for two adjacent time nodes; and estimating the physical resources that need to be released from the resource utilization;
adjusting the resource allocation parameters according to the estimation result.
7. The method according to claim 1, characterized in that adjusting the resource allocation parameters in the configuration file when the execution efficiency is detected to be lower than the threshold comprises:
comparing whether the execution efficiency is lower than the threshold;
if so, marking the execution of the Spark task as abnormal, and obtaining the task log of the Spark task;
locating the cause of the abnormality according to the task log;
if the cause of the abnormality includes insufficient physical resources, generating a resource adjustment prompt page according to the resource allocation parameters recorded in the configuration file, and sending the resource adjustment prompt page to the terminal, so that the terminal adjusts the resource allocation parameters on the resource adjustment prompt page.
8. A physical resource allocation device, characterized in that the device comprises:
a resource allocation module, configured to receive a Spark task and a corresponding configuration file submitted by a terminal, read the resource allocation parameters of the Spark task from the configuration file, and allocate physical resources according to the resource allocation parameters;
an efficiency monitoring module, configured to execute the Spark task on the allocated physical resources, and to monitor the execution efficiency of the Spark task while the Spark task executes;
a resource adjustment module, configured to adjust the resource allocation parameters in the configuration file when the execution efficiency is detected to be lower than a threshold, and to schedule the Spark task from the already allocated physical resources onto physical resources that match the adjusted resource allocation parameters to continue executing.
9. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 7 when executed by a processor.
CN201810621848.4A 2018-06-15 2018-06-15 Physical resource allocation method, device, computer equipment and storage medium Active CN108845884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810621848.4A CN108845884B (en) 2018-06-15 2018-06-15 Physical resource allocation method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108845884A true CN108845884A (en) 2018-11-20
CN108845884B CN108845884B (en) 2024-04-19

Family

ID=64202053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810621848.4A Active CN108845884B (en) 2018-06-15 2018-06-15 Physical resource allocation method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN108845884B (en)

Also Published As

Publication number Publication date
CN108845884B (en) 2024-04-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant