CN107168782A - A parallel computing system based on Spark and GPU - Google Patents
A parallel computing system based on Spark and GPU
- Publication number
- CN107168782A (application number CN201710270400.8A)
- Authority
- CN
- China
- Prior art keywords
- gpu
- resource
- stage
- task
- spark
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F9/4881 — Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F15/17318 — Parallel communications techniques, e.g. gather, scatter, reduce, broadcast, multicast, all-to-all
- G06F15/17331 — Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]
- G06F9/5038 — Allocation of resources to service a request, the resource being a machine, considering the execution order of a plurality of tasks, e.g. taking priority or time-dependency constraints into consideration
- G06F9/505 — Allocation of resources to service a request, the resource being a machine, considering the load
- G06F9/5083 — Techniques for rebalancing the load in a distributed system
Abstract
The invention belongs to the field of parallel computing, and specifically provides a parallel computing framework system based on Spark and GPU. The invention builds on the YARN resource management platform: by improving its resource manager and node manager, GPU resources in a heterogeneous cluster can be effectively perceived, supporting the management and scheduling of cluster GPU resources. Under the YARN deployment mode, Spark's job scheduling and task execution mechanisms are then improved so that they support the scheduling and execution of GPU-type tasks. By introducing GPU-resource tags at stages such as resource application, resource allocation, DAG generation, stage division and task execution, the execution engine can perceive GPU tasks and execute them effectively in a heterogeneous cluster. At the same time, combining Spark's own efficient in-memory computing with the advantages of GPU many-core parallel computing, an effective programming model under this framework is proposed. The invention can effectively process both data-intensive and computation-intensive jobs, greatly improving job-processing efficiency.
Description
Technical field
The invention belongs to the field of parallel computing, and in particular relates to a parallel computing framework system based on Spark and GPU.
Background art
In today's society, the scale of data to be processed in every industry is growing massively, and big data has attracted wide attention across all sectors. Big data undoubtedly contains a wealth of useful information; if it can be mined and used properly, it will greatly advance scientific research and economic development. Because the information it contains can support business decisions and scientific research, big data technology has developed rapidly and been applied in many industries. In the big-data era everything is data-centric: mining and analyzing massive historical data can yield effective information that cannot be obtained in any other way, improving the accuracy of decision-making.
The development of distributed computing provides an effective means of fully exploiting the value of data. Distributed computing can use clusters of inexpensive computers to analyze massive data quickly, effectively reducing the cost of data analysis. In this environment a batch of distributed computing frameworks has emerged. Among them, Spark, thanks to its in-memory computing model, can markedly improve the efficiency of data processing, and it is widely used in fields such as machine learning and interactive analysis.
At the same time, because a GPU possesses a large number of cores, it can achieve higher computational efficiency than a CPU alone in many applications, and this speed-up is often measured in factors of tens or hundreds. Compared with simply upgrading CPUs, using GPUs for parallel computation is often cheaper and more effective, which gives the GPU an important position in high-performance computing.
Although Spark can effectively process data-intensive jobs, it is less suitable for computation-intensive jobs. Cluster scale is also limited: with CPUs alone, the processing performance for large-batch jobs still leaves much room for improvement. If support for GPU devices were introduced into Spark, it could combine Spark's own efficient in-memory computing with the advantages of GPU many-core parallel computing, significantly improving the efficiency of processing massive data.
The native Spark framework provides no support for GPU devices. The existing way to invoke GPU acceleration in Spark is to call C/C++ programs from Java/Scala code, which has many drawbacks. Because Spark cannot perceive GPU computing tasks, it cannot distinguish CPU tasks from GPU tasks; when scheduling, it may start a GPU task on a node without a GPU device, causing the task to fail. Moreover, the YARN resource manager supports scheduling only CPU and memory resources; it cannot perceive GPU resources and therefore cannot provide GPU allocation and scheduling to the Spark framework above it. Owing to these limitations of YARN and Spark themselves, the traditional way of performing GPU computation in Spark cannot adapt to a heterogeneous cluster environment.
Summary of the invention
The object of the present invention is to provide a Spark- and GPU-based parallel computing system that processes jobs efficiently and can adapt to a heterogeneous cluster environment.
The parallel computing system based on Spark and GPU provided by the present invention integrates Spark with GPUs; it can effectively process both data-intensive and computation-intensive jobs, greatly improving job-processing efficiency.
The parallel computing framework system based on Spark and GPU provided by the present invention includes:
Component 1: an improved resource management platform that supports scheduling and management of multi-dimensional resources such as GPU, CPU and memory;
Component 2: an improved Spark distributed computing framework that supports scheduling and execution of GPU-type tasks.
(1) The improved resource management platform:
YARN's resource manager and node manager are improved so that GPU resources in a heterogeneous cluster can be effectively perceived, supporting the management and scheduling of cluster GPU resources. The improvements cover the resource representation model, the resource scheduling model, the resource preemption model, the resource isolation mechanism and the GPU device binding mechanism.
(2) The improved Spark distributed computing framework:
Spark's resource application and allocation mechanism, job scheduling mechanism and task execution mechanism are improved to support the scheduling and execution of GPU-type tasks. By introducing GPU-resource tags at stages such as resource application, resource allocation, DAG generation, stage division and task execution, the execution engine can perceive GPU tasks and execute them effectively in a heterogeneous cluster.
In the present invention, the improved resource management platform can manage and schedule multi-dimensional resources including GPU resources. Specifically:
For the resource representation model, the number of GPU devices contained in a node is first made configurable, and the resource representation protocol is modified to add a representation of GPU resources. When a node starts, the node manager initializes its resource list and reports the node's resource information to the resource manager through the heartbeat mechanism.
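The extended resource vector described above can be sketched as follows. This is an illustrative Python sketch by the editor, not the patent's implementation; the class and field names are assumptions.

```python
# Sketch: extending the resource representation with a GPU dimension.
from dataclasses import dataclass

@dataclass
class Resource:
    cpu: int   # virtual cores
    gpu: int   # GPU devices (the added dimension)
    mem: int   # memory in MB

    def fits_in(self, other: "Resource") -> bool:
        # A request fits only if every dimension, GPU included, fits.
        return (self.cpu <= other.cpu
                and self.gpu <= other.gpu
                and self.mem <= other.mem)

# On node start, the node manager would build such a vector for its total
# resources and report it to the resource manager in each heartbeat.
node_total = Resource(cpu=16, gpu=2, mem=65536)
request = Resource(cpu=4, gpu=1, mem=8192)
print(request.fits_in(node_total))  # True
```

A scheduler that checks only CPU and memory would wrongly place a 1-GPU request on a GPU-less node; the extra dimension makes that check explicit.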
For the resource scheduling model, the present invention adds GPU, together with CPU and memory resources, to the hierarchical management queues of the resource management platform. This not only keeps resource management uniform, but also allows more flexible permission settings for GPU resources, making it better suited to large clusters handling multi-user workloads. The present invention modifies the resource scheduling module according to the DRF (Dominant Resource Fairness) algorithm, adding scheduling and management of GPU resources. The algorithm is as follows:
(1) Initialize variables. Here R = <totalCPU, totalGPU, totalMem> denotes the total CPU, GPU and memory resources of the cluster; C = <usedCPU, usedGPU, usedMem> denotes the amounts of CPU, GPU and memory already consumed in the cluster; s_i denotes the share of job i's dominant resource in the corresponding total resource; U_i = <CPU_i, GPU_i, Mem_i> denotes the resources already allocated to job i; and D_i = <CPU_i, GPU_i, Mem_i> denotes the resources each task of job i needs.
Each time a job is chosen for resource allocation, the following steps are performed in turn:
(2) Choose the job with the smallest dominant-resource share s_i for execution.
(3) If C + D_i ≤ R, allocate the resources to job i and update C = C + D_i, U_i = U_i + D_i, s_i = max{U_i / R}. Otherwise the cluster resources cannot meet the demand and allocation stops.
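The DRF loop above can be sketched in a few lines of Python. This is a minimal illustration by the editor, under the assumption that each resource vector is <cpu, gpu, mem>; the job names and demand numbers are invented.

```python
# Sketch of the DRF allocation loop with a GPU dimension added.
def drf_allocate(R, jobs, rounds):
    """R: total <cpu, gpu, mem>; jobs: {name: per-task demand D_i}."""
    C = [0, 0, 0]                      # consumed resources
    U = {j: [0, 0, 0] for j in jobs}   # resources allocated per job
    for _ in range(rounds):
        # dominant share s_i = max_r U_i[r] / R[r]
        s = {j: max(x / t for x, t in zip(u, R)) for j, u in U.items()}
        j = min(jobs, key=lambda k: s[k])   # step (2): smallest s_i
        D = jobs[j]
        if any(c + d > t for c, d, t in zip(C, D, R)):
            break                           # step (3): demand cannot be met
        C = [c + d for c, d in zip(C, D)]   # C = C + D_i
        U[j] = [x + d for x, d in zip(U[j], D)]  # U_i = U_i + D_i
    return U

R = [9, 2, 18]                              # totalCPU, totalGPU, totalMem
jobs = {"A": [1, 0, 4], "B": [3, 1, 1]}     # B is the GPU-hungry job
print(drf_allocate(R, jobs, rounds=10))
```

Because job B's dominant resource is GPU (one of two devices per task), DRF stops favoring B once its GPU share outweighs A's memory share, which is exactly the fairness behavior the modified scheduler aims for.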
For the resource preemption model, the resource scheduler sets, for each queue in the hierarchical queues, an upper and a lower bound on the use of every kind of resource. The resource scheduler allocates the resources of lightly loaded queues to more heavily loaded queues to improve cluster resource utilization. But when a new application is submitted to a lightly loaded queue, the scheduler uses the resource preemption mechanism to reclaim resources occupied by other queues, so that the resources originally belonging to that queue can be allocated back to it. When resource preemption occurs, GPU resources must be released.
This work is delegated to the node manager, and a new releaseGPU method is added to release GPU resources. The resource manager sends the list of resources to be released to the corresponding node manager through the heartbeat mechanism; when the node manager detects that a resource entity to be released contains GPU resources, it calls the releaseGPU method to release them. The resource manager then redistributes the released resources to the relevant queues.
For the resource isolation model, because Cgroups provide good isolation and support isolating GPU resources, the present invention uses a Cgroups-based scheme to isolate GPU resources.
For the GPU device binding mechanism, when the resource entity allocated to a task contains GPU resources, the corresponding node manager must bind a GPU device on the node to that resource entity. If the node has multiple idle GPU resources, one must be selected for allocation. The present invention represents GPU running-state information as a list of <GPU device number, resource entity number> pairs, each entry identifying the correspondence between a GPU device and a resource entity. The node manager initializes this list when the node starts, according to the relevant configuration file and the GPU device information on the node.
When a new task requests GPU resources, the node manager searches the list to find GPU devices in the idle state and assigns them to the task. If multiple GPU resources on the node manager's node are idle, GPU resources are allocated round-robin. Meanwhile, the correspondence between running resource entities and GPU resources is persisted to a database; if the node manager needs to restart, it can read the GPU device allocation information directly from the database, avoiding reallocating node resources.
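The binding table and round-robin selection described above can be sketched as follows. This is a hypothetical Python sketch; the class name, method names and identifiers are the editor's assumptions, not the patent's code, and the database persistence step is omitted.

```python
# Sketch: node manager's <GPU device, resource entity> binding table,
# with idle devices handed out round-robin.
class GpuBinder:
    def __init__(self, device_ids):
        self.bindings = {d: None for d in device_ids}  # device -> entity (None = idle)
        self.devices = device_ids
        self._next = 0                                  # round-robin cursor

    def acquire(self, entity_id):
        # Scan at most one full cycle of devices for an idle one.
        for _ in range(len(self.devices)):
            dev = self.devices[self._next]
            self._next = (self._next + 1) % len(self.devices)
            if self.bindings[dev] is None:
                self.bindings[dev] = entity_id
                return dev
        return None  # no idle GPU: the task must wait

    def release_gpu(self, entity_id):
        # Counterpart of the releaseGPU method added to the node manager.
        for dev, ent in self.bindings.items():
            if ent == entity_id:
                self.bindings[dev] = None

binder = GpuBinder(["gpu0", "gpu1"])
print(binder.acquire("container_01"))  # gpu0
print(binder.acquire("container_02"))  # gpu1
print(binder.acquire("container_03"))  # None, both devices busy
binder.release_gpu("container_01")
print(binder.acquire("container_03"))  # gpu0, freed and rebound
```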
In the present invention, the improved Spark distributed computing framework improves the Spark kernel so that it supports the scheduling and execution of GPU-type tasks. Specifically:
When a job is submitted, if the application controller of the Spark application detects that the application needs GPU resources, the required GPU resources are added to the resource request description during resource application.
Two kinds of Container are applied for: CPU-type Containers and GPU-type Containers. Because a GPU-type task also needs a CPU to handle data processing, data transfer and GPU startup, a GPU-type Container needs, besides one unit of GPU resource, a specified number of CPU cores. When applying for resources, the numbers of Containers of the two types must be determined. Let executorCores denote the number of CPU cores each Container contains, totalCores the number of CPU cores of the application, and GPUNum the amount of GPU resources of the application; then the number of GPU-type Containers is GPUNum, and the number of non-GPU Containers is (totalCores - GPUNum * executorCores) / executorCores. The configured amount of memory is then checked to determine whether the total memory can satisfy the memory needed by all Containers, before further processing. After the resource request is sent, the resource scheduler does not immediately return resources satisfying the request; the Spark application controller must keep communicating with the resource manager through the heartbeat mechanism to probe whether the requested resources have been allocated. After receiving the requested resources, the application controller adds them to the program's internal to-be-allocated resource list, from which they are assigned to specific tasks for execution.
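The container-count arithmetic above can be written as a small checked helper. The parameter names follow the text (executorCores, totalCores, GPUNum); the divisibility check is an assumption added by the editor, since the formula only yields a whole number of containers when totalCores divides evenly.

```python
# Sketch: computing GPU-type and non-GPU container counts.
def container_counts(total_cores, gpu_num, executor_cores):
    gpu_containers = gpu_num                       # one container per requested GPU
    remaining = total_cores - gpu_num * executor_cores
    if remaining < 0 or remaining % executor_cores != 0:
        raise ValueError("totalCores must cover all containers exactly")
    cpu_containers = remaining // executor_cores   # (totalCores - GPUNum*executorCores)/executorCores
    return gpu_containers, cpu_containers

# e.g. 16 cores total, 2 GPUs, 2 cores per executor:
print(container_counts(16, 2, 2))  # (2, 6)
```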
GPU tasks must be identified in the Spark interfaces. The present invention proposes the mapPartitionsGPU operator and mapPartitionsGPURDD for handling GPU tasks.
After Spark's job scheduler DAGScheduler generates the DAG graph and begins dividing stages, a field must be added to identify whether the current stage contains GPU operations. Inside a stage, according to whether the computation methods running on each RDD need GPU resources, the internal RDDs are divided into two kinds: RDDs that need GPU resources and RDDs that do not. If a stage contains an RDD that needs GPU resources, then when allocating resources to the partitions of the RDDs in that stage, enough GPU resources should be allocated, even if possibly only one RDD needs them during computation; otherwise a task might fail during computation because no usable GPU resource is available. To identify whether a stage contains an RDD that needs GPU resources, a field flagGPU is added to the stage; when flagGPU is true, the stage contains an RDD that needs GPU resources. With the flagGPU field set, in the next step of resource allocation, the task manager can recognize the stage and allocate GPU resources for it.
In the present invention, the flow by which the job scheduler DAGScheduler inside Spark identifies stage types is as follows:
(1) After the DAG is generated, stages are divided. When generating a stage, check whether the flagGPU field of any RDD contained in the stage is true; if so, the stage needs GPU resources during execution, and the stage's flagGPU field is marked true. This later serves as the task manager's basis for allocating GPU resources.
(2) The execution engine's stage-submission algorithm is a recursive process: it first submits the last stage in the DAG graph, then checks whether all parent stages of that stage have been submitted, and if so starts executing the task-set corresponding to this stage. If some parent stages have not been submitted, it recursively submits the parent stages and performs the same check. The final result is that stages are executed from front to back according to the DAG graph. The benefit is that when the current stage executes, its input data is guaranteed to be ready; and when partition data in an RDD is lost, the most recently generated partition data can be found by walking the DAG graph backwards and re-executed to recover the lost partition.
(3) After a stage is submitted, the task manager divides the stage into a task-set and applies to the cluster manager for the resources needed for execution. The number of tasks in the task-set equals the number of RDD partitions. The task manager first checks whether the stage's flagGPU field is true; if so it allocates containers that include GPU resources. During container allocation, if multiple containers are available, the choice follows the localization strategy: prefer the local node, then other nodes in the same rack, then nodes in other racks. Tasks are then started on the nodes holding the resources, and intermediate and final task results are stored in the storage system. During this process, if the number of containers containing GPU resources is smaller than the number of GPU-type tasks, the tasks not yet allocated GPU resources must wait; when other tasks finish and GPU resources become idle, they are then allocated.
(4) After a task finishes, its resources are returned. The recovered containers are added to the to-be-allocated list for use by other tasks.
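Step (3) above, matching a GPU-flagged task-set against containers with locality preference, can be sketched as follows. This is an illustrative Python sketch by the editor; the tuple layout, locality labels and names are assumptions, not the patent's data structures.

```python
# Sketch: assigning a GPU-flagged stage's tasks to containers that hold
# a GPU, preferring node-local, then rack-local, then remote containers.
LOCALITY = {"node_local": 0, "rack_local": 1, "any": 2}

def assign(tasks_gpu_flag, num_tasks, containers):
    """containers: list of (id, has_gpu, locality). Returns (assigned, waiting)."""
    # GPU tasks may only use containers that actually include a GPU.
    pool = [c for c in containers if c[1]] if tasks_gpu_flag else list(containers)
    pool.sort(key=lambda c: LOCALITY[c[2]])        # localization strategy
    assigned = {f"task{i}": pool[i][0] for i in range(min(num_tasks, len(pool)))}
    waiting = max(0, num_tasks - len(pool))        # tasks waiting for a free GPU
    return assigned, waiting

containers = [("c1", False, "node_local"),
              ("c2", True, "rack_local"),
              ("c3", True, "node_local")]
# 3 GPU tasks but only 2 GPU containers: one task must wait.
print(assign(True, 3, containers))
```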
Based on the improved framework, the present invention proposes an effective programming model for GPU-type tasks.
In Spark, the data in an RDD consists of several partitions, which are ultimately assigned, in units of partitions, to some node to be computed. According to the granularity at which partition data is executed, GPU computation with Spark can be broadly divided into two kinds:
(1) GPU computation completed in units of partitions: all the data of an RDD partition is put onto the GPU to be computed in parallel, improving execution efficiency;
(2) GPU computation completed in units of single records: the data of an RDD partition is put onto the GPU record by record, accelerating processing in units of single records.
In the improved framework, the newly added mapPartitionsGPU operator can perceive GPU-type tasks and takes partition data as input. The main execution logic of the operator is as follows:
(1) First initialize the GPU device in the method;
(2) Then judge whether the execution granularity for partition data is per-partition or per-record. If it is per-partition, the partition data is transferred into GPU memory using the CUDA API; this process may involve a data-format conversion, turning the partition data in the RDD into a format the GPU can process. The GPU is then invoked to compute the data in parallel, and after computation the output is transferred back to main memory. If the execution granularity is per-record, each record of the partition is processed sequentially: one record's data is copied into GPU memory at a time, the GPU is invoked to compute it in parallel, and after computation the output is copied back to main memory. After all records are processed, the outputs of all records must be converted into a partition collection;
(3) Release the GPU device and return a partition-collection iterator.
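The control flow above can be sketched schematically. This editor's sketch replaces the CUDA transfers and the GPU kernel with a plain placeholder function, so only the per-partition versus per-record granularity is illustrated; all names are assumptions, not the patent's operator implementation.

```python
# Schematic stand-in for the mapPartitionsGPU execution logic.
def gpu_compute(batch):
    return [x * x for x in batch]   # placeholder for a GPU kernel

def map_partitions_gpu(partition, per_partition=True):
    # (1) initialize GPU device (placeholder; real code would use CUDA)
    if per_partition:
        # (2a) copy the whole partition to the device, compute in
        #      parallel, then copy the results back
        out = gpu_compute(list(partition))
    else:
        # (2b) copy one record at a time, then collect the per-record
        #      outputs into a partition collection
        out = []
        for record in partition:
            out.extend(gpu_compute([record]))
    # (3) release GPU device, return an iterator over the partition
    return iter(out)

print(list(map_partitions_gpu([1, 2, 3])))                        # [1, 4, 9]
print(list(map_partitions_gpu([1, 2, 3], per_partition=False)))   # [1, 4, 9]
```

Both granularities produce the same result; the per-partition path amortizes the host-to-device transfer over the whole partition, which is why the text presents it as the higher-throughput option.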
Compared with the prior art, the advantages and effects of the present invention are:
1. The improved resource management platform proposed by the invention can perceive the CPU, memory and GPU resources in a heterogeneous cluster and manage and schedule them effectively;
2. The improved Spark distributed computing framework can effectively distinguish GPU-type tasks, handle resource application and allocation accordingly in stages such as DAG generation and stage division, and correctly schedule and execute GPU-type jobs;
3. The framework proposed by the invention can adapt to heterogeneous environments in which only some cluster nodes possess GPU devices, including single nodes with multiple GPUs; it correctly assigns GPU-type tasks to the nodes in the cluster that contain GPU resources, solving the problem that the traditional way of executing GPU tasks cannot work properly in a heterogeneous cluster environment.
Brief description of the drawings
Fig. 1 shows the allocation and release of GPU devices.
Fig. 2 is the execution flow chart of the improved framework.
Fig. 3 shows the working principle of the mapPartitionsGPU operator.
Embodiment
The technical scheme of the present invention is further illustrated below in conjunction with the accompanying drawings. With reference to Fig. 1 (GPU device allocation and release), the embodiment mainly includes:
1. For resource representation, the number of GPU devices contained in a node is first made configurable, and the resource representation protocol is modified to add a representation of GPU resources. When a node starts, the node manager initializes its resource list and reports the node's resource information to the resource manager through the heartbeat mechanism.
2. For resource scheduling, the present invention adds GPU, together with CPU and memory resources, to the hierarchical management queues of the resource management platform.
3. The resource manager sends the list of resources to be released to the corresponding node manager through the heartbeat mechanism; when the node manager detects that a resource entity to be released contains GPU resources, it calls the releaseGPU method to release them. The resource manager then redistributes the released resources to the relevant queues.
4. For resource isolation, because Cgroups provide good isolation and support isolating GPU resources, the present invention uses a Cgroups-based scheme to isolate GPU resources.
5. For dynamic binding of GPU devices, when the resource entity allocated to a task contains GPU resources, the corresponding node manager must bind a GPU device on the node to that resource entity. The present invention represents GPU running-state information as a list of <GPU device number, resource entity number> pairs; the node manager initializes this list when the node starts, according to the relevant configuration file and the GPU device information on the node. When a new task requests GPU resources, the node manager searches the list to find GPU devices in the idle state and assigns them to the task. If multiple GPU resources on the node manager's node are idle, GPU resources are allocated round-robin. Meanwhile, the correspondence between running resource entities and GPU resources is persisted to a database.
6. The present invention proposes the mapPartitionsGPU operator and mapPartitionsGPURDD for handling GPU tasks. After the DAG graph is generated and stage division begins, a field must be added to identify whether the current stage contains GPU operations.
7. When the task manager divides a stage into a task-set, it first checks whether the stage carries the GPU flag; if so it allocates containers containing GPU resources for it.
8. Tasks that carry the GPU flag are scheduled onto nodes containing GPU devices for execution.
References:
[1] Ali Ghodsi, Matei Zaharia, et al. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. UC Berkeley.
[2] M. Zaharia, M. Chowdhury, T. Das, et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. CA, USA: USENIX Association, 2012.
[3] Janki Bhimani, Miriam Leeser, Ningfang Mi. Accelerating K-Means clustering with parallel implementations and GPU computing. 2015 IEEE High Performance Extreme Computing Conference (HPEC), 2015, 1-6.
[4] Huang Chao-Qiang, Yang Shu-Qiang. RDDShare: Reusing Results of Spark RDD. 2016 IEEE First International Conference on Data Science in Cyberspace (DSC), 2016, 290-295.
[5] Jie Zhu, Juanjuan Li. GPU-In-Hadoop: Enabling MapReduce Across Distributed Heterogeneous Platforms. IEEE ICIS 2014, 2014, 1-6.
Claims (6)
1. A parallel computing system based on Spark and GPU, characterized by including:
an improved resource management platform that supports scheduling and management of multi-dimensional resources such as GPU, CPU and memory;
an improved Spark distributed computing framework that supports scheduling and execution of GPU-type tasks;
(1) the improved resource management platform including:
improvements to YARN's resource manager and node manager so that GPU resources in a heterogeneous cluster can be effectively perceived, supporting the management and scheduling of cluster GPU resources; the improvements covering the resource representation model, the resource scheduling model, the resource preemption model, the resource isolation mechanism and the GPU device binding mechanism;
(2) the improved Spark distributed computing framework including:
improvements to Spark's resource application and allocation mechanism, job scheduling mechanism and task execution mechanism so that it supports scheduling and execution of GPU-type tasks; by introducing GPU-resource tags at stages such as resource application, resource allocation, DAG generation, stage division and task execution, its execution engine can perceive GPU tasks and execute them effectively in a heterogeneous cluster.
2. The parallel computing system based on Spark and GPU according to claim 1, wherein the improved resource management platform supports the management and scheduling of multi-dimensional resources including GPU resources:
Regarding the resource representation model: the number of GPU devices contained in a node is first defined by the user, and the resource representation protocol is modified to add a representation of GPU resources; when a node starts, the node manager initializes its resource list and reports the node's resource information to the resource manager through the heartbeat mechanism;
Regarding the resource scheduling model: GPU resources are added, together with CPU and memory resources, into the hierarchical management queues of the resource management platform; the resource scheduling module is modified according to the DRF (Dominant Resource Fairness) algorithm so that it also schedules and manages GPU resources; the algorithm is as follows:
(1) Initialize the variables, where R = &lt;totalCPU, totalGPU, totalMem&gt; denotes the total CPU, GPU and memory resources of the cluster, C = &lt;usedCPU, usedGPU, usedMem&gt; denotes the CPU, GPU and memory resources already consumed in the cluster, si denotes the share of job i's dominant resource relative to the corresponding total resource, Ui = &lt;CPUi, GPUi, Memi&gt; denotes the resources already allocated to job i, and Di = &lt;CPUi, GPUi, Memi&gt; denotes the resources required by each task of job i; each time a job is chosen for resource allocation, the following steps are performed in turn:
(2) Choose the job with the smallest dominant resource share si;
(3) If C + Di ≤ R, allocate the resources to job i and update C = C + Di, Ui = Ui + Di, si = max{Ui/R} (the largest fraction, over the three resource types, of the total that job i now holds); otherwise, the cluster resources cannot meet the demand and allocation stops;
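The allocation loop above can be sketched as follows; this is an illustrative Python rendering of the DRF steps, where the cluster capacities, job names and per-task demands are invented for the example rather than taken from the claim:

```python
# Sketch of the claimed DRF-style loop extended with GPU as a third dimension.
def drf_allocate(total, jobs, demands, rounds):
    used = {r: 0 for r in total}                       # C: consumed cluster resources
    alloc = {j: {r: 0 for r in total} for j in jobs}   # U_i: resources held by job i
    share = {j: 0.0 for j in jobs}                     # s_i: dominant resource share
    for _ in range(rounds):
        # (2) pick the job with the smallest dominant share s_i
        job = min(jobs, key=lambda j: share[j])
        d = demands[job]
        # (3) allocate one task's demand only if C + D_i <= R, else stop
        if any(used[r] + d[r] > total[r] for r in total):
            break
        for r in total:
            used[r] += d[r]
            alloc[job][r] += d[r]
        share[job] = max(alloc[job][r] / total[r] for r in total)
    return alloc

total = {"cpu": 9, "gpu": 4, "mem": 18}
demands = {"A": {"cpu": 1, "gpu": 1, "mem": 2},   # GPU-heavy job
           "B": {"cpu": 3, "gpu": 0, "mem": 1}}   # CPU-heavy job
result = drf_allocate(total, ["A", "B"], demands, rounds=10)
print(result)
```

With these toy numbers the loop ends with the GPU-heavy job holding 3 of 4 GPUs and the CPU-heavy job holding 6 of 9 cores, i.e. each job's dominant share is equalized as far as capacity allows.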
Regarding the resource preemption model: the resource scheduler sets an upper and a lower bound on the usage of each kind of resource for every queue in the hierarchical queues; the resource scheduler lends the resource quota of lightly loaded queues to more heavily loaded queues in order to improve cluster resource utilization; but when a new application is submitted to a lightly loaded queue, the scheduler reclaims, through the resource preemption mechanism, the resources occupied by other queues, so that the resources originally belonging to that queue can be allocated back to it; when resource preemption occurs, GPU resources must be released; this work is completed by the node manager, and a new releaseGPU method is added here to release GPU resources; the resource manager sends the list of resources to be released to the corresponding node manager through the heartbeat mechanism, and when the node manager detects that a resource entity to be released contains GPU resources, it calls the releaseGPU method to release them; the resource manager then further allocates the released resources to the relevant queues;
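The release path just described can be sketched as follows; this is a simplified stand-in in which GPU bindings live in a plain dictionary and entity identifiers are invented, whereas the real mechanism operates on YARN resource entities and heartbeat messages:

```python
# Sketch of the releaseGPU path: the resource manager sends a release list via
# heartbeat, and the node manager frees devices only for entities holding a GPU.
class NodeManager:
    def __init__(self):
        self.gpu_binding = {}  # resource entity id -> GPU device number

    def release_gpu(self, entity_id):
        # newly added releaseGPU method: unbind this entity's GPU device
        return self.gpu_binding.pop(entity_id, None)

    def on_heartbeat_release(self, release_list):
        freed = []
        for entity in release_list:
            if entity["id"] in self.gpu_binding:   # entity contains GPU resource
                freed.append(self.release_gpu(entity["id"]))
        return freed

nm = NodeManager()
nm.gpu_binding = {"c1": 0, "c3": 1}
freed = nm.on_heartbeat_release([{"id": "c1"}, {"id": "c2"}, {"id": "c3"}])
print(freed)  # the freed devices go back to the resource manager's queues
```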
Regarding the resource isolation model: GPU resources are isolated using the Cgroups scheme;
Regarding the GPU device binding mechanism: when the resource entity allocated to a task contains GPU resources, the corresponding node manager needs to bind a GPU device on the node to that resource entity; if there are multiple idle GPU resources on the node, one of them must be selected for allocation; the running state of the GPUs is expressed as a list of &lt;GPU device number, resource entity number&gt; pairs, in which each entry identifies the correspondence between a GPU device and its resource entity; when the node starts, the node manager initializes this list according to the relevant configuration file and the GPU device information on the node;
When a new task requests GPU resources, the node manager searches this list to obtain the GPU devices that are in the idle state and assigns them to the task; if multiple GPU resources on the node manager's node are idle, GPU resources are allocated by round robin; meanwhile, the correspondence between running resource entities and GPU resources is saved into a database; in case the node manager needs to restart, the GPU device allocation information can be read directly from the database, avoiding the reallocation of node resources.
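The binding list and round-robin selection can be sketched as follows; the device numbering, task names and the in-memory dictionary standing in for the persistence database are assumptions of the example:

```python
# Sketch of the <GPU device number, resource entity number> binding list with
# round-robin selection among idle devices.
class GPUBinder:
    def __init__(self, num_devices):
        # binding list: device number -> resource entity (None means idle)
        self.binding = {dev: None for dev in range(num_devices)}
        self.next_dev = 0  # round-robin cursor

    def bind(self, entity_id):
        # scan devices round-robin, starting after the last one handed out
        n = len(self.binding)
        for i in range(n):
            dev = (self.next_dev + i) % n
            if self.binding[dev] is None:         # idle device found
                self.binding[dev] = entity_id
                self.next_dev = (dev + 1) % n
                return dev
        return None                               # no idle GPU: caller must wait

    def unbind(self, entity_id):
        for dev, ent in self.binding.items():
            if ent == entity_id:
                self.binding[dev] = None

binder = GPUBinder(num_devices=2)
a = binder.bind("task-a")   # gets device 0
b = binder.bind("task-b")   # gets device 1
binder.unbind("task-a")
c = binder.bind("task-c")   # cursor wraps and reuses the freed device 0
print(a, b, c)
```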
3. The parallel computing system based on Spark and GPU according to claim 2, wherein the improved Spark distributed computing framework improves the Spark kernel so that it supports the scheduling and execution of GPU-type tasks:
When a job is submitted, if the application controller of the Spark application detects that the application needs GPU resources, the required GPU resources are added to the resource request description during resource application;
Two kinds of Containers are applied for: CPU-type Containers and GPU-type Containers; because a GPU-type task also needs the CPU to complete data processing, data transfer and GPU launch, a GPU-type Container needs, in addition to one unit of GPU resource, a specified number of CPU cores; when applying for resources, the numbers of the two kinds of Containers to apply for are determined; here, executorCores denotes the number of CPU cores each Container contains, totalCores denotes the total number of CPU cores of the application, and GPUNum denotes the number of GPU resources of the application; the number of GPU-type Containers is then GPUNum, and the number of non-GPU-type Containers is (totalCores-GPUNum*executorCores)/executorCores; next, based on the configured amount of memory resources, it is checked whether the total memory can satisfy the memory required by all Containers, for further processing; after the resource request is sent, the resource scheduler does not immediately return resources that satisfy the request; the corresponding application controller of Spark must continuously communicate with the resource manager through the heartbeat mechanism to probe whether the requested resources have been allocated; after receiving the requested resources, the application controller adds them to the program's internal list of resources to be allocated, from which they are assigned to the tasks that actually execute;
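The container arithmetic above can be captured in a small helper; the parameter names mirror executorCores, totalCores and GPUNum from the claim, while the per-container memory figure is a simplified stand-in for the memory validation described:

```python
# Sketch of the two-kind container plan: GPU-type containers (one per GPU) and
# CPU-type containers filling the remaining cores, plus a total-memory check.
def plan_containers(total_cores, gpu_num, executor_cores,
                    mem_per_container, total_mem):
    gpu_containers = gpu_num
    cpu_containers = (total_cores - gpu_num * executor_cores) // executor_cores
    n = gpu_containers + cpu_containers
    if n * mem_per_container > total_mem:
        raise ValueError("total memory cannot satisfy all containers")
    return gpu_containers, cpu_containers

# e.g. 16 cores, 2 GPUs, 2 cores per executor -> 2 GPU-type + 6 CPU-type containers
plan = plan_containers(16, 2, 2, mem_per_container=4, total_mem=64)
print(plan)
```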
In the Spark interface, GPU tasks are identified through the mapPartitionsGPU operator and MapPartitionsGPURDD, which are used to process GPU tasks;
After generating the DAG graph, the Spark job scheduler DAGScheduler adds, when it begins to divide stages, a field identifying whether the current stage contains GPU operations; inside a stage, the internal RDDs are divided into two kinds according to whether the computation running on each RDD needs GPU resources: RDDs that need GPU resources and RDDs that do not; if a stage contains an RDD that needs GPU resources, then sufficient GPU resources are allocated when resources are allocated for the partitions of the RDDs in this stage, even if possibly only one RDD needs them during computation; otherwise, a task might fail during computation because no usable GPU resource is available; in order to identify whether a stage contains RDDs that need GPU resources, a field flagGPU is added to the stage; when flagGPU is true, the stage contains RDDs that need GPU resources; by setting the flagGPU field, the task manager can recognize such stages in the subsequent resource allocation step and allocate GPU resources for them.
4. The parallel computing system based on Spark and GPU according to claim 3, wherein the flow for identifying stage types inside the Spark job scheduler DAGScheduler is as follows:
(1) After the DAG is generated, stages are divided; when generating a stage, it is detected whether the flagGPU field of any RDD contained inside the stage is true; if so, the stage needs GPU resources during execution, and the flagGPU field of the stage is marked true; this later serves as the basis on which the task manager allocates GPU resources;
(2) The algorithm by which the execution engine submits stages is a recursive process: the last stage in the DAG graph is submitted first, and it is then checked whether all parent stages of that stage have been submitted; if they have, the task set corresponding to this stage starts to execute; if some parent stage has not been submitted, the parent stage is submitted recursively and the same check is made; the final result is that stages are executed from front to back according to the DAG graph;
(3) After a stage is submitted, the task manager starts dividing the stage into a task set and applies to the cluster manager for the required resources; the number of tasks contained in the task set equals the number of RDD partitions; the task manager first detects whether the flagGPU field of the stage is true, and if so, allocates containers containing GPU resources for it; during container allocation, if multiple containers are available for selection, the choice is made according to the locality strategy, i.e., the local node, other nodes in the same rack, and nodes in other racks are selected in turn; tasks are then started on the nodes holding the resources, and the intermediate and final task results are stored in the storage system; during this process, if the number of containers containing GPU resources is smaller than the number of GPU-type tasks, the tasks not yet allocated GPU resources must wait temporarily; when other tasks finish and GPU resources become idle, they are then allocated;
(4) After a task finishes, its resources are returned; the reclaimed containers are added to the to-be-allocated list for use by other tasks.
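Steps (1)-(2) of the flow above can be sketched as follows, with toy Stage objects standing in for DAGScheduler internals; the stage names and per-RDD GPU flags are invented for the example:

```python
# Sketch of flagGPU stage marking and recursive parents-first stage submission.
class Stage:
    def __init__(self, name, rdd_needs_gpu, parents=()):
        self.name = name
        self.parents = list(parents)
        # (1) a stage needs GPU resources if any RDD inside it does
        self.flag_gpu = any(rdd_needs_gpu)

def submit(stage, order):
    """(2) recursive submission: all parent stages first, then this stage."""
    for parent in stage.parents:
        if parent.name not in order:
            submit(parent, order)
    if stage.name not in order:
        order.append(stage.name)

s1 = Stage("shuffle-map", rdd_needs_gpu=[False, False])
s2 = Stage("gpu-map", rdd_needs_gpu=[False, True], parents=[s1])
s3 = Stage("result", rdd_needs_gpu=[False], parents=[s2])

order = []
submit(s3, order)                  # submit the last stage in the DAG
gpu_stages = [s.name for s in (s1, s2, s3) if s.flag_gpu]
print(order, gpu_stages)
```

The task manager would then request GPU-bearing containers only for the stages in `gpu_stages`, exactly as step (3) describes.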
5. The parallel computing system based on Spark and GPU according to claim 4, wherein an effective programming model for GPU-type tasks is proposed:
In Spark, the data in an RDD consists of several partitions, which are finally distributed, in units of partitions, to several nodes where the computation is completed; according to the execution granularity on partition data, GPU computation with Spark falls into two main types:
(1) GPU computation completed in units of a partition, i.e., all the data in an RDD partition is put into the GPU at once to complete parallel computation, so as to improve execution efficiency;
(2) GPU computation completed in units of a single record, i.e., the data in an RDD partition is put into the GPU record by record to complete the computation, accelerating processing at the granularity of single records.
6. The parallel computing system based on Spark and GPU according to claim 4, wherein the newly added mapPartitionsGPU operator can perceive GPU-type tasks and processes partition data as input; the main execution logic of the operator is as follows:
(1) The GPU device is first initialized in the method;
(2) It is then judged whether the execution granularity on the partition data is a whole partition or a single record; if the unit is a partition, the partition data is transferred into GPU device memory using the CUDA API, a process that may involve data format conversion, converting the partition data of the RDD into a data format the GPU can process; the GPU is then called to compute on the data in parallel, and after the computation completes, the output result is transferred back to main memory; if the execution granularity is a single record, the records of the partition are processed sequentially one by one: each time, one record's data is copied into GPU device memory, the GPU is called to compute on it in parallel, and after the computation completes, the output result is copied back to main memory; after all records have been processed, the output results of all records are assembled into a partition set;
(3) The GPU device is released, and a partition-set iterator is returned.
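The operator's control flow can be mocked in Python as follows; the "GPU" here is simulated by a plain function, since the claim leaves the actual CUDA invocation layer unspecified, and the kernel and input data are invented for the example:

```python
# Mock of the mapPartitionsGPU execution logic: (1) init the device, (2) run
# at partition or record granularity, (3) release the device and return an
# iterator over the resulting partition.
def map_partitions_gpu(partition, kernel, per_record=False):
    device = {"initialized": True}          # (1) init GPU device (simulated)
    try:
        if not per_record:
            # (2a) partition granularity: ship the whole partition at once
            data = list(partition)          # format conversion / copy-in
            results = kernel(data)          # parallel computation on the "GPU"
        else:
            # (2b) record granularity: copy and compute one record at a time
            results = []
            for record in partition:
                results.extend(kernel([record]))
        return iter(results)                # (3) return the partition iterator
    finally:
        device["initialized"] = False       # (3) release the GPU device

square = lambda xs: [x * x for x in xs]
out_part = list(map_partitions_gpu([1, 2, 3], square))
out_rec = list(map_partitions_gpu([1, 2, 3], square, per_record=True))
print(out_part, out_rec)
```

Both granularities produce the same result; the partition-granularity path simply amortizes the copy-in/copy-out cost over the whole partition, which is why the claim prefers it for efficiency.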
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710270400.8A CN107168782A (en) | 2017-04-24 | 2017-04-24 | A kind of concurrent computational system based on Spark and GPU |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107168782A true CN107168782A (en) | 2017-09-15 |
Family
ID=59813923
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107168782A (en) |
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108596824A (en) * | 2018-03-21 | 2018-09-28 | 华中科技大学 | A kind of method and system optimizing rich metadata management based on GPU |
CN108652610A (en) * | 2018-06-04 | 2018-10-16 | 成都皓图智能科技有限责任公司 | A kind of non-contact detection method that more popular feelings are jumped |
CN108762921A (en) * | 2018-05-18 | 2018-11-06 | 电子科技大学 | A kind of method for scheduling task and device of the on-line optimization subregion of Spark group systems |
CN109032809A (en) * | 2018-08-13 | 2018-12-18 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Heterogeneous parallel scheduling system based on remote sensing image storage position |
CN109086137A (en) * | 2018-08-06 | 2018-12-25 | 清华四川能源互联网研究院 | GPU concurrent computation resource configuration method and device |
CN109254851A (en) * | 2018-09-30 | 2019-01-22 | 武汉斗鱼网络科技有限公司 | A kind of method and relevant apparatus for dispatching GPU |
CN109743453A (en) * | 2018-12-29 | 2019-05-10 | 出门问问信息科技有限公司 | A kind of multi-screen display method and device |
CN109977306A (en) * | 2019-03-14 | 2019-07-05 | 北京达佳互联信息技术有限公司 | Implementation method, system, server and the medium of advertisement engine |
CN109995965A (en) * | 2019-04-08 | 2019-07-09 | 复旦大学 | A kind of ultrahigh resolution video image real-time calibration method based on FPGA |
CN110018817A (en) * | 2018-01-05 | 2019-07-16 | 中兴通讯股份有限公司 | The distributed operation method and device of data, storage medium and processor |
CN110109747A (en) * | 2019-05-21 | 2019-08-09 | 北京百度网讯科技有限公司 | Method for interchanging data and system, server based on Apache Spark |
CN110134521A (en) * | 2019-05-28 | 2019-08-16 | 北京达佳互联信息技术有限公司 | Method, apparatus, resource manager and the storage medium of resource allocation |
CN110351384A (en) * | 2019-07-19 | 2019-10-18 | 深圳前海微众银行股份有限公司 | Big data platform method for managing resource, device, equipment and readable storage medium storing program for executing |
CN110442446A (en) * | 2019-06-29 | 2019-11-12 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | The method of processing high-speed digital signal data flow in real time |
CN110458294A (en) * | 2019-08-19 | 2019-11-15 | Oppo广东移动通信有限公司 | Model running method, apparatus, terminal and storage medium |
CN110704186A (en) * | 2019-09-25 | 2020-01-17 | 国家计算机网络与信息安全管理中心 | Computing resource allocation method and device based on hybrid distribution architecture and storage medium |
CN110795219A (en) * | 2019-10-24 | 2020-02-14 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Resource scheduling method and system suitable for multiple computing frameworks |
CN110879753A (en) * | 2019-11-19 | 2020-03-13 | 中国移动通信集团广东有限公司 | GPU acceleration performance optimization method and system based on automatic cluster resource management |
CN110955526A (en) * | 2019-12-16 | 2020-04-03 | 湖南大学 | Method and system for realizing multi-GPU scheduling in distributed heterogeneous environment |
CN111240844A (en) * | 2020-01-13 | 2020-06-05 | 星环信息科技(上海)有限公司 | Resource scheduling method, equipment and storage medium |
CN111314401A (en) * | 2018-12-12 | 2020-06-19 | 百度在线网络技术(北京)有限公司 | Resource allocation method, device, system, terminal and computer readable storage medium |
CN111400035A (en) * | 2020-03-04 | 2020-07-10 | 杭州海康威视系统技术有限公司 | Video memory allocation method and device, electronic equipment and storage medium |
CN111656323A (en) * | 2018-01-23 | 2020-09-11 | 派泰克集群能力中心有限公司 | Dynamic allocation of heterogeneous computing resources determined at application runtime |
CN112035261A (en) * | 2020-09-11 | 2020-12-04 | 杭州海康威视数字技术股份有限公司 | Data processing method and system |
CN112711448A (en) * | 2020-12-30 | 2021-04-27 | 安阳师范学院 | Agent technology-based parallel component assembling and performance optimizing method |
CN112835996A (en) * | 2019-11-22 | 2021-05-25 | 北京初速度科技有限公司 | Map production system and method thereof |
CN113515361A (en) * | 2021-07-08 | 2021-10-19 | 中国电子科技集团公司第五十二研究所 | Lightweight heterogeneous computing cluster system facing service |
CN113808001A (en) * | 2021-11-19 | 2021-12-17 | 南京芯驰半导体科技有限公司 | Method and system for single system to simultaneously support multiple GPU (graphics processing Unit) work |
CN114840125B (en) * | 2022-03-30 | 2024-04-26 | 曙光信息产业(北京)有限公司 | Device resource allocation and management method, device resource allocation and management device, device resource allocation and management medium, and program product |
CN112035261B (en) * | 2020-09-11 | 2024-10-01 | 杭州海康威视数字技术股份有限公司 | Data processing method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104407921A (en) * | 2014-12-25 | 2015-03-11 | 浪潮电子信息产业股份有限公司 | Time-based method for dynamically scheduling YARN task resources |
CN105022670A (en) * | 2015-07-17 | 2015-11-04 | 中国海洋大学 | Heterogeneous distributed task processing system and processing method in cloud computing platform |
EP3067797A1 (en) * | 2015-03-12 | 2016-09-14 | International Business Machines Corporation | Creating new cloud resource instruction set architecture |
CN106506266A (en) * | 2016-11-01 | 2017-03-15 | 中国人民解放军91655部队 | Network flow analysis method based on GPU, Hadoop/Spark mixing Computational frame |
Non-Patent Citations (2)
Title |
---|
刘德波: "基于YARN的GPU集群系统研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
郑伟: "Spark下MPI/GPU并行计算处理机制的研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | |
Application publication date: 20170915 |