CN107168782A - A parallel computing system based on Spark and GPU - Google Patents

A parallel computing system based on Spark and GPU

Info

Publication number
CN107168782A
CN107168782A (application CN201710270400.8A)
Authority
CN
China
Prior art keywords
gpu
resource
stage
task
spark
Prior art date
Legal status
Pending
Application number
CN201710270400.8A
Other languages
Chinese (zh)
Inventor
郑健 (Zheng Jian)
杜姗姗 (Du Shanshan)
冯瑞 (Feng Rui)
金城 (Jin Cheng)
薛向阳 (Xue Xiangyang)
Current Assignee
Fudan University
Original Assignee
Fudan University
Priority date
Filing date
Publication date
Application filed by Fudan University
Priority to CN201710270400.8A
Publication of CN107168782A
Status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 15/17318: Parallel communications techniques, e.g. gather, scatter, reduce, broadcast, multicast, all-to-all
    • G06F 15/17331: Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]
    • G06F 9/5038: Allocation of resources to service a request, the resource being a machine, considering the execution order of a plurality of tasks, e.g. taking priority or time-dependency constraints into consideration
    • G06F 9/505: Allocation of resources to service a request, the resource being a machine, considering the load
    • G06F 9/5083: Techniques for rebalancing the load in a distributed system

Abstract

The invention belongs to the field of parallel computing, and specifically discloses a parallel computing framework system based on Spark and GPU. Building on the YARN resource management platform, the invention improves YARN's resource manager and node manager so that GPU resources in a heterogeneous cluster can be effectively perceived, thereby supporting the management and scheduling of cluster GPU resources. Under the YARN deployment mode, Spark's job scheduling and task execution mechanisms are then improved to support the scheduling and execution of GPU-type tasks. By introducing markers for GPU resources in the resource application, resource allocation, DAG generation, stage division and task execution phases, the execution engine is made aware of GPU tasks and executes them effectively on a heterogeneous cluster. At the same time, combining Spark's own efficient in-memory computing with the advantages of GPU many-core parallel computing, an effective programming model under this framework is proposed. The invention can effectively process both data-intensive and compute-intensive jobs, greatly improving job processing efficiency.

Description

A parallel computing system based on Spark and GPU
Technical field
The invention belongs to the field of parallel computing, and in particular relates to a parallel computing framework system based on Spark and GPU.
Background art
In today's society, the scale of data to be processed in every industry shows a trend toward the massive, and big data has attracted wide attention across society. Big data undoubtedly contains a wealth of useful information; if it can be reasonably mined and used, it will greatly facilitate scientific research and the social economy. The information contained in big data can assist business decision-making and scientific research, so big data has seen rapid development and application in many industries. In the big data era, everything is data-centric: mining and analyzing massive historical data can yield much effective information that cannot be obtained in any other way, thereby improving the accuracy of decision-making.
The development of distributed computing has provided an effective means to fully exploit the value of data. Distributed computing can use cheap computer clusters to rapidly analyze massive data, effectively reducing the cost of data analysis. In this environment, a batch of distributed computing framework technologies has emerged. Among them, Spark, thanks to its in-memory computing, can effectively improve the efficiency of data processing, and it is widely applied in fields such as machine learning and interactive analysis.
At the same time, because the GPU possesses numerous cores, it achieves higher computational efficiency than the CPU alone in many applications, and this acceleration is often measured in tens or hundreds of times. Compared with simply improving CPU performance, parallel computing with GPUs is often cheaper and more effective, which gives the GPU an important position in high-performance computing.
Although Spark can effectively process data-intensive jobs, it is less suitable for compute-intensive jobs. Cluster scale cannot be extended without limit, and if only CPUs are used for computation, the processing performance for large-batch jobs still leaves much to be desired. If support for GPU devices were introduced into Spark, so that it could fully exploit both Spark's own efficient in-memory computing and the advantages of GPU many-core parallel computing, the efficiency of processing massive data would be significantly improved.
Native Spark does not support GPU devices. The existing solution for invoking GPU acceleration in Spark is to call C/C++ programs from Java/Scala code, which has many drawbacks. Because Spark cannot perceive GPU computing tasks, it cannot distinguish CPU tasks from GPU tasks; when scheduling tasks, it may start a GPU task on a node without a GPU device, causing the task to fail. Moreover, the YARN resource manager only supports the scheduling of CPU and memory resources; it cannot perceive GPU resources, so it cannot provide the allocation and scheduling of GPU resources to the Spark framework above it. Owing to these limitations of YARN and Spark themselves, the traditional way of performing GPU computing in Spark cannot adapt to a heterogeneous cluster environment.
Summary of the invention
The object of the invention is to provide a parallel computing system based on Spark and GPU that offers high processing efficiency and adapts to heterogeneous cluster environments.
The parallel computing system based on Spark and GPU provided by the invention integrates Spark with the GPU; it can effectively process both data-intensive and compute-intensive jobs, greatly improving job processing efficiency.
The parallel computing framework system based on Spark and GPU provided by the invention includes:
Component one, an improved resource management platform, which supports the scheduling and management of multi-dimensional resources such as GPU, CPU and memory;
Component two, an improved Spark distributed computing framework, which supports the scheduling and execution of GPU-type tasks.
(1) The improved resource management platform includes:
improvements to YARN's resource manager and node manager so that GPU resources in a heterogeneous cluster can be effectively perceived, thereby supporting the management and scheduling of cluster GPU resources. The improvements cover the resource representation model, the resource scheduling model, the resource preemption model, the resource isolation mechanism, and the GPU device binding mechanism.
(2) The improved Spark distributed computing framework includes:
improvements to Spark's resource application and allocation mechanism, job scheduling mechanism and task execution mechanism so that it supports the scheduling and execution of GPU-type tasks. By introducing markers for GPU resources at the resource application, resource allocation, DAG generation, stage division and task execution phases, the execution engine is made aware of GPU tasks and executes them effectively on a heterogeneous cluster.
In the invention, the improved resource management platform supports the management and scheduling of multi-dimensional resources including GPU resources. Specifically:
For the resource representation model, the number of GPU devices contained in a node is first made configurable, and the resource representation protocol is modified to add an expression of GPU resources. When a node starts, the node manager initializes the resource list and reports the node's resource information to the resource manager through the heartbeat mechanism, as sketched below.
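The following is a minimal Scala sketch of what such an extended resource representation could look like. NodeResource, buildNodeResource and the configuration keys are illustrative assumptions; the patent itself modifies YARN's internal resource protocol rather than defining these types.

    // Sketch of a node resource representation extended with a GPU dimension.
    case class NodeResource(virtualCores: Int, memoryMB: Long, gpus: Int)

    // On node start-up the node manager would build this from the node's
    // configuration and report it to the resource manager via heartbeat.
    def buildNodeResource(conf: Map[String, String]): NodeResource =
      NodeResource(
        virtualCores = conf.getOrElse("node.cpu.vcores", "8").toInt,
        memoryMB     = conf.getOrElse("node.memory.mb", "16384").toLong,
        gpus         = conf.getOrElse("node.gpu.count", "0").toInt // new GPU field
      )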
For the resource scheduling model, the invention adds the GPU, together with CPU and memory resources, to the hierarchical management queues of the resource management platform. This not only keeps resource management uniform, but also allows more flexible permission settings on GPU resources, which suits multi-user job scenarios on large clusters. The invention modifies the resource scheduling module according to the DRF algorithm, adding the scheduling and management of GPU resources. The algorithm is as follows:
(1) Initialize variables. R = <totalCPU, totalGPU, totalMem> denotes the total amount of CPU, GPU and memory resources in the cluster. C = <usedCPU, usedGPU, usedMem> denotes the amounts of CPU, GPU and memory resources already consumed in the cluster. s_i denotes the share of job i's dominant resource relative to the corresponding total. U_i = <CPU_i, GPU_i, Mem_i> denotes the resources already allocated to job i. D_i = <CPU_i, GPU_i, Mem_i> denotes the resources required by each task of job i.
Each time a job is chosen for resource allocation, the following steps are performed in turn:
(2) Select the job with the minimum dominant-resource share s_i for execution.
(3) If C + D_i ≤ R, allocate resources to job i and update C = C + D_i, U_i = U_i + D_i, s_i = max{U_i / R}. Otherwise the cluster resources cannot satisfy the demand and allocation stops (see the sketch after these steps).
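A minimal Scala sketch of this GPU-aware DRF loop follows. Res, Job and drfAllocate are illustrative names, not the scheduler's real data structures, and the sketch assumes each job has a non-zero demand.

    // Sketch of the modified DRF loop described above (illustrative names).
    case class Res(cpu: Double, gpu: Double, mem: Double) {
      def +(o: Res): Res = Res(cpu + o.cpu, gpu + o.gpu, mem + o.mem)
      def fitsIn(o: Res): Boolean = cpu <= o.cpu && gpu <= o.gpu && mem <= o.mem
      // Per-dimension shares; a dimension the cluster lacks contributes 0.
      def sharesOf(total: Res): Seq[Double] =
        Seq((cpu, total.cpu), (gpu, total.gpu), (mem, total.mem))
          .map { case (used, t) => if (t == 0) 0.0 else used / t }
    }

    case class Job(id: Int, demand: Res, var allocated: Res = Res(0, 0, 0)) {
      // s_i: the largest share this job holds of any single resource.
      def dominantShare(total: Res): Double = allocated.sharesOf(total).max
    }

    def drfAllocate(total: Res, jobs: Seq[Job]): Unit = {
      var consumed = Res(0, 0, 0)                      // C
      var feasible = jobs.nonEmpty
      while (feasible) {
        val job = jobs.minBy(_.dominantShare(total))   // step (2): minimum s_i
        if ((consumed + job.demand).fitsIn(total)) {   // step (3): C + D_i <= R
          consumed = consumed + job.demand             // C   = C + D_i
          job.allocated = job.allocated + job.demand   // U_i = U_i + D_i
        } else feasible = false                        // demand unmet: stop
      }
    }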
For the resource preemption model, the resource scheduler sets upper and lower bounds on the use of each kind of resource for every queue in the hierarchical queues. The resource scheduler allocates the resources of lightly loaded queues to more heavily loaded queues to improve cluster resource utilization; but when a new application is submitted to a lightly loaded queue, the scheduler reclaims resources occupied by other queues through the resource preemption mechanism, so as to return the resources that originally belonged to that queue. When resource preemption occurs, GPU resources need to be released.
This work is delegated to the node manager, and a new releaseGPU method is added to release GPU resources. The resource manager sends the list of resources to be released to the corresponding node manager through the heartbeat mechanism; when the node manager detects that a resource entity to be released contains GPU resources, it calls releaseGPU to release them. The resource manager then allocates the released resources to the relevant queues. A sketch of this release path follows.
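The sketch below illustrates the release path under stated assumptions: releaseGPU is the method named above, while ResourceEntity, the gpuOwner map and onReleaseRequest are names introduced only for illustration.

    import scala.collection.mutable

    // Illustrative types; gpuOwner maps GPU device number -> holding entity.
    case class ResourceEntity(id: String, cpuCores: Int, memoryMB: Long, gpus: Int)

    class NodeManagerGpuSupport(gpuOwner: mutable.Map[Int, Option[String]]) {
      // Heartbeat handler: entities the resource manager asks to release.
      def onReleaseRequest(toRelease: Seq[ResourceEntity]): Unit =
        toRelease.foreach { entity =>
          if (entity.gpus > 0) releaseGPU(entity.id) // GPU detected: release it
          // ... CPU and memory go back through the existing release path
        }

      // The new releaseGPU method: unbind every device held by the entity.
      def releaseGPU(entityId: String): Unit =
        gpuOwner.collect { case (dev, Some(holder)) if holder == entityId => dev }
          .toList
          .foreach(dev => gpuOwner(dev) = None)
    }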
For the resource isolation model, since Cgroups provides good isolation performance and supports isolating GPU resources, the invention uses a Cgroups-based scheme to isolate GPU resources.
For the GPU device binding mechanism, when a resource entity allocated to a task contains GPU resources, the corresponding node manager needs to bind a GPU device on the node to that resource entity. If there are multiple idle GPU resources on the node, one must be selected for allocation. The invention expresses the running state of the GPUs as a list of <GPU device number, resource entity number> pairs, where each entry identifies the correspondence between a GPU device and a resource entity. The node manager initializes this list when the node starts, according to the relevant configuration file and the GPU device information on the node.
When a new task requests GPU resources, the node manager searches the list to obtain a GPU device in the idle state and assigns it to the task. If multiple GPU resources on the node manager's node are idle, GPU resources are allocated in round-robin fashion. Meanwhile, the correspondence between running resource entities and GPU resources is persisted to a database, so that if the node manager needs to restart, the GPU device allocation information can be read directly from the database, avoiding a reallocation of node resources. A sketch of this binding table follows.
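The following Scala sketch shows the binding list and the round-robin selection among idle devices. GpuBindingTable, acquire and snapshot are illustrative names, and snapshot stands in for the database persistence described above.

    import scala.collection.mutable

    // Sketch of the <GPU device number, resource entity number> list with
    // round-robin selection among idle devices (illustrative names).
    class GpuBindingTable(deviceIds: IndexedSeq[Int]) {
      private val owner = mutable.Map(deviceIds.map(_ -> Option.empty[String]): _*)
      private var next = 0 // round-robin cursor

      /** Bind an idle GPU device to a resource entity; None if all are busy. */
      def acquire(entityId: String): Option[Int] = {
        val order = deviceIds.indices.map(i => deviceIds((next + i) % deviceIds.size))
        order.find(dev => owner(dev).isEmpty).map { dev =>
          owner(dev) = Some(entityId)
          next = (deviceIds.indexOf(dev) + 1) % deviceIds.size
          dev
        }
      }

      /** Current bindings, as would be persisted so that a restarted
        * node manager can reload them instead of reallocating. */
      def snapshot: Map[Int, String] =
        owner.collect { case (dev, Some(entity)) => dev -> entity }.toMap
    }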
In the invention, the improved Spark distributed computing framework improves the Spark kernel so that it supports the scheduling and execution of GPU-type tasks. Specifically:
When a job is submitted, if the application controller of the Spark application detects that the application needs GPU resources, the required GPU resources are added to the resource request description during resource application.
Two kinds of Containers are requested: CPU-type Containers and GPU-type Containers. Because a GPU-type task also needs the CPU to complete data processing, data transfer and GPU start-up, a GPU-type Container needs, in addition to one unit of GPU resource, a specified number of CPU cores. When applying for resources, the numbers of the two Container types must be determined. Let executorCores denote the number of CPU cores each Container contains, totalCores the number of CPU cores of the application, and GPUNum the number of GPU resources of the application; then the number of GPU-type Containers is GPUNum, and the number of non-GPU Containers is (totalCores - GPUNum * executorCores) / executorCores (see the worked example below). The configured amount of memory resources is then checked to verify that the total memory can satisfy all Containers before processing continues. After the resource request is sent, the resource scheduler does not immediately return resources that satisfy the request; the Spark application controller must keep communicating with the resource manager through the heartbeat mechanism to probe whether the requested resources have been allocated. After receiving the requested resources, the application controller adds them to a to-be-allocated resource list inside the program, to be assigned to the tasks that actually execute.
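The container arithmetic above can be checked with a small worked example; containerCounts is an illustrative helper, not part of Spark's API.

    // Number of GPU-type and CPU-type containers, per the formula above.
    def containerCounts(totalCores: Int, gpuNum: Int, executorCores: Int): (Int, Int) = {
      val gpuContainers = gpuNum // one GPU-type container per requested GPU
      val cpuContainers = (totalCores - gpuNum * executorCores) / executorCores
      (gpuContainers, cpuContainers)
    }

    // Example: totalCores = 16, executorCores = 4, GPUNum = 2 gives
    // 2 GPU-type containers and (16 - 2 * 4) / 4 = 2 CPU-type containers.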
In the Spark interface, GPU tasks need to be identified. The invention proposes the mapPartitionsGPU operator and mapPartitionsGPURDD for handling GPU tasks.
After Spark's job scheduler DAGScheduler generates the DAG and begins dividing stages, a field must be added to identify whether the current stage contains GPU operations. Inside a stage, according to whether the computation run on each RDD needs GPU resources, the internal RDDs are divided into two kinds: RDDs that need GPU resources and RDDs that do not. If a stage contains an RDD that needs GPU resources, then when allocating resources for the partitions of the RDDs in that stage, sufficient GPU resources should be allocated, even if possibly only one RDD needs them during computation; otherwise a task might fail during computation because no usable GPU resource is available. To identify whether a stage contains an RDD that needs GPU resources, a flagGPU field is added to the stage; when flagGPU is true, the stage contains an RDD that needs GPU resources. With the flagGPU field set, in the next step of resource allocation the task manager can recognize the stage and allocate GPU resources for it.
In the invention, the flow for identifying stage types in Spark's internal job scheduler DAGScheduler is as follows (a sketch of steps (1) and (3) is given after the list):
(1) After the DAG is generated, stages are divided. When a stage is generated, the scheduler checks whether the flagGPU field of any RDD contained in the stage is true; if so, the stage needs GPU resources during execution, and the stage's flagGPU field is marked true. This serves as the basis on which the task manager later allocates GPU resources.
(2) The engine's stage-submission algorithm is a recursive process: it first submits the last stage in the DAG, then checks whether all parent stages of that stage have been submitted; if so, it starts executing the task set corresponding to this stage. If some parent stages have not been submitted, it recursively submits the parents and makes the same check. The end result is that stages are executed from front to back according to the DAG. The benefit of doing so is that a stage's input data is guaranteed to be ready when it executes, and when a partition of an RDD is lost, the most recently generated partition data can be found by walking the DAG backwards and re-executed to recover the lost partition.
(3) After a stage is submitted, the task manager divides the stage into a task set and applies to the cluster manager for the resources needed for execution. The number of tasks in the task set equals the number of RDD partitions. The task manager first checks whether the stage's flagGPU field is true, and if so allocates a container that contains GPU resources for it. During container allocation, if multiple containers are available, the choice is made according to the locality strategy: the local node, other nodes in the same rack, and nodes in other racks, in that order. Tasks are then started on the nodes holding the resources, and intermediate and final task results are saved to the storage system. During this process, if the number of containers containing GPU resources is smaller than the number of GPU-type tasks, tasks that have not yet been allocated GPU resources must wait; when other tasks finish and GPU resources become idle, they are then allocated.
(4) After a task finishes, its resources are returned. The recovered container is added to the to-be-allocated list for other tasks to use.
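Steps (1) and (3) can be sketched as follows; RddInfo, StageInfo, Container and pickContainer are simplified stand-ins for Spark's scheduler internals, not its real API.

    // Step (1): a stage's flagGPU is derived from the RDDs it contains.
    case class RddInfo(id: Int, flagGPU: Boolean)
    case class StageInfo(rdds: Seq[RddInfo]) {
      val flagGPU: Boolean = rdds.exists(_.flagGPU)
    }
    case class Container(host: String, rack: String, hasGPU: Boolean)

    // Step (3): a GPU stage only matches containers that carry a GPU, and
    // candidates are ordered by locality (local node, same rack, other racks).
    // A GPU task that finds no eligible container (None) waits for one.
    def pickContainer(stage: StageInfo, free: Seq[Container],
                      localHost: String, localRack: String): Option[Container] = {
      val eligible = if (stage.flagGPU) free.filter(_.hasGPU) else free
      eligible.sortBy { c =>
        if (c.host == localHost) 0 else if (c.rack == localRack) 1 else 2
      }.headOption
    }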
Based on the improved framework, the invention proposes an effective programming model for GPU-type tasks.
In Spark, the data of an RDD consists of several partitions, and it is finally distributed, partition by partition, to the nodes where the computation is completed. According to the granularity at which partition data is executed, GPU computation with Spark can be broadly divided into two types:
(1) GPU computation with the partition as the unit: all the data in an RDD partition is handed to the GPU at once to compute in parallel, improving execution efficiency;
(2) GPU computation with the single record as the unit: the data in an RDD partition is handed to the GPU record by record, accelerating processing one record at a time.
In the improved framework, the newly added mapPartitionsGPU operator can perceive GPU-type tasks and takes partition data as input. The main execution logic of the operator is as follows (a sketch follows the list):
(1) The GPU device is first initialized inside the method;
(2) The method then judges whether the execution granularity for the partition data is the partition or the single record. If it is the partition, the partition data is transferred into GPU memory using the CUDA API; this step may involve a data-format conversion, turning the partition data of the RDD into a format the GPU can process. The GPU is then invoked to compute on the data in parallel, and when the computation completes the output is transferred back to main memory. If the execution granularity is the single record, each record of the partition is processed sequentially: each time, one record is copied into GPU memory, the GPU computes on it in parallel, and when the computation completes the output is copied back to main memory. After all records are processed, the per-record outputs are assembled into a partition collection;
(3) The GPU device is released, and a partition iterator is returned.
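The operator's logic can be sketched in Scala as below. The GPU-side calls (initGPU, runOnGPU, releaseGPU) are passed in as functions standing for the CUDA interactions, for example via a JNI/JCuda binding; they and the parameter names are assumptions, not the patent's concrete implementation.

    // Sketch of the mapPartitionsGPU execution logic described above.
    def mapPartitionsGPU[T, U](
        partition: Iterator[T],
        perPartition: Boolean,        // granularity: whole partition vs. one record
        initGPU: () => Unit,          // (1) device initialization
        runOnGPU: Seq[T] => Seq[U],   // copy-in, kernel launch, copy-out
        releaseGPU: () => Unit        // (3) device release
    ): Iterator[U] = {
      initGPU()
      val out: Seq[U] =
        if (perPartition)
          runOnGPU(partition.toSeq)   // (2a) whole partition at once
        else
          partition.flatMap(rec => runOnGPU(Seq(rec))).toList // (2b) record by record
      releaseGPU()
      out.iterator                    // the resulting partition iterator
    }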
Compared with the prior art, the advantages and effects of the invention are:
1. The improved resource management platform proposed by the invention can perceive the CPU, memory and GPU resources in a heterogeneous cluster and manage and schedule them effectively;
2. The improved Spark distributed computing framework can effectively distinguish GPU-type tasks, handle resource application and allocation in a targeted way at phases such as DAG generation and stage division, and correctly schedule and execute GPU-type jobs;
3. The framework proposed by the invention adapts to heterogeneous environments in which only some cluster nodes possess GPU devices and a single node may hold multiple cards; it correctly dispatches GPU-type tasks to the nodes in the cluster that contain GPU resources, solving the problem that the traditional way of executing GPU tasks cannot work normally in a heterogeneous cluster environment.
Brief description of the drawings
Fig. 1 shows GPU device allocation and release.
Fig. 2 is a flow chart of execution under the improved framework.
Fig. 3 shows the working principle of mapPartitionsGPU.
Detailed description of embodiments
The technical scheme of the invention is further described below with reference to the accompanying drawings.
Fig. 1 is a block diagram of model training and image recognition. The embodiment mainly includes:
1. For resource representation, the number of GPU devices contained in a node is first made configurable, and the resource representation protocol is modified to add an expression of GPU resources. When a node starts, the node manager initializes the resource list and reports the node's resource information to the resource manager through the heartbeat mechanism.
2. For resource scheduling, the invention adds the GPU, together with CPU and memory resources, to the hierarchical management queues of the resource management platform.
3. The resource manager sends the list of resources to be released to the corresponding node manager through the heartbeat mechanism; when the node manager detects that a resource entity to be released contains GPU resources, it calls the releaseGPU method to release them. The resource manager then allocates the released resources to the relevant queues.
4. For resource isolation, since Cgroups provides good isolation performance and supports isolating GPU resources, the invention uses a Cgroups-based scheme to isolate GPU resources.
5. For the dynamic binding of GPU devices, when a resource entity allocated to a task contains GPU resources, the corresponding node manager needs to bind a GPU device on the node to that resource entity. The invention expresses the running state of the GPUs as a list of <GPU device number, resource entity number> pairs; the node manager initializes this list when the node starts, according to the relevant configuration file and the GPU device information on the node. When a new task requests GPU resources, the node manager searches the list to obtain a GPU device in the idle state and assigns it to the task. If multiple GPU resources on the node are idle, GPU resources are allocated round-robin. Meanwhile, the correspondence between running resource entities and GPU resources is persisted to a database.
6. The invention proposes the mapPartitionsGPU operator and mapPartitionsGPURDD for handling GPU tasks. After the DAG is generated and stage division begins, a field must be added to identify whether the current stage contains GPU operations.
7. When the task manager divides a stage into a task set, it first checks whether the stage carries the GPU flag, and if so allocates a container containing GPU resources for it.
8. Tasks carrying the GPU flag are scheduled to execute on nodes containing GPU devices.
References:
[1] Ali Ghodsi, Matei Zaharia, et al. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. UC Berkeley.
[2] M. Zaharia, M. Chowdhury, T. Das, et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. CA, USA: USENIX Association, 2012.
[3] Janki Bhimani, Miriam Leeser, Ningfang Mi. Accelerating K-Means clustering with parallel implementations and GPU computing. In: 2015 IEEE High Performance Extreme Computing Conference (HPEC), 2015, 1-6.
[4] Huang Chao-Qiang, Yang Shu-Qiang. RDDShare: Reusing Results of Spark RDD. In: 2016 IEEE First International Conference on Data Science in Cyberspace (DSC), 2016, 290-295.
[5] Jie Zhu, Juanjuan Li. GPU-In-Hadoop: Enabling MapReduce Across Distributed Heterogeneous Platforms. In: IEEE ICIS 2014, 2014, 1-6.

Claims (6)

1. A parallel computing system based on Spark and GPU, characterized by including:
an improved resource management platform, which supports the scheduling and management of multi-dimensional resources such as GPU, CPU and memory;
an improved Spark distributed computing framework, which supports the scheduling and execution of GPU-type tasks;
(1) the improved resource management platform includes:
improvements to YARN's resource manager and node manager so that GPU resources in a heterogeneous cluster can be effectively perceived, thereby supporting the management and scheduling of cluster GPU resources, covering the resource representation model, the resource scheduling model, the resource preemption model, the resource isolation mechanism, and the GPU device binding mechanism;
(2) the improved Spark distributed computing framework includes:
improvements to Spark's resource application and allocation mechanism, job scheduling mechanism and task execution mechanism so that it supports the scheduling and execution of GPU-type tasks; by introducing markers for GPU resources at the resource application, resource allocation, DAG generation, stage division and task execution phases, the execution engine is made aware of GPU tasks and executes them effectively on a heterogeneous cluster.
2. The parallel computing system based on Spark and GPU according to claim 1, characterized in that the improved resource management platform supports the management and scheduling of multi-dimensional resources including GPU resources:
for the resource representation model, the number of GPU devices contained in a node is first made configurable, and the resource representation protocol is modified to add an expression of GPU resources; when a node starts, the node manager initializes the resource list and reports the node's resource information to the resource manager through the heartbeat mechanism;
for the resource scheduling model, the GPU, together with CPU and memory resources, is added to the hierarchical management queues of the resource management platform; the resource scheduling module is modified according to the DRF algorithm, adding the scheduling and management of GPU resources; the algorithm is as follows:
(1) initialize variables, where R = <totalCPU, totalGPU, totalMem> denotes the total amount of CPU, GPU and memory resources in the cluster, C = <usedCPU, usedGPU, usedMem> denotes the amounts of CPU, GPU and memory resources already consumed in the cluster, s_i denotes the share of job i's dominant resource relative to the corresponding total, U_i = <CPU_i, GPU_i, Mem_i> denotes the resources already allocated to job i, and D_i = <CPU_i, GPU_i, Mem_i> denotes the resources required by each task of job i; each time a job is chosen for resource allocation, the following steps are performed in turn:
(2) select the job with the minimum dominant-resource share s_i for execution;
(3) if C + D_i ≤ R, allocate resources to job i and update C = C + D_i, U_i = U_i + D_i, s_i = max{U_i / R}; otherwise the cluster resources cannot satisfy the demand and allocation stops;
for the resource preemption model, the resource scheduler sets upper and lower bounds on the use of each kind of resource for every queue in the hierarchical queues; the resource scheduler allocates the resources of lightly loaded queues to more heavily loaded queues to improve cluster resource utilization; but when a new application is submitted to a lightly loaded queue, the scheduler reclaims resources occupied by other queues through the resource preemption mechanism, so as to return the resources that originally belonged to that queue; when resource preemption occurs, GPU resources need to be released; this work is completed by the node manager, and a new releaseGPU method is added to release GPU resources; the resource manager sends the list of resources to be released to the corresponding node manager through the heartbeat mechanism, and when the node manager detects that a resource entity to be released contains GPU resources, it calls releaseGPU to release them; the resource manager then allocates the released resources to the relevant queues;
for the resource isolation model, GPU resources are isolated using a Cgroups-based scheme;
for the GPU device binding mechanism, when a resource entity allocated to a task contains GPU resources, the corresponding node manager needs to bind a GPU device on the node to that resource entity; if there are multiple idle GPU resources on the node, one must be selected for allocation; the running state of the GPUs is expressed as a list of <GPU device number, resource entity number> pairs, where each entry identifies the correspondence between a GPU device and a resource entity; the node manager initializes this list when the node starts, according to the relevant configuration file and the GPU device information on the node;
when a new task requests GPU resources, the node manager searches the list to obtain a GPU device in the idle state and assigns it to the task; if multiple GPU resources on the node manager's node are idle, GPU resources are allocated round-robin; meanwhile, the correspondence between running resource entities and GPU resources is persisted to a database, so that if the node manager needs to restart, the GPU device allocation information can be read directly from the database, avoiding a reallocation of node resources.
3. The parallel computing system based on Spark and GPU according to claim 2, characterized in that the improved Spark distributed computing framework improves the Spark kernel so that it supports the scheduling and execution of GPU-type tasks:
when a job is submitted, if the application controller of the Spark application detects that the application needs GPU resources, the required GPU resources are added to the resource request description during resource application;
two kinds of Containers are requested: CPU-type Containers and GPU-type Containers; because a GPU-type task also needs the CPU to complete data processing, data transfer and GPU start-up, a GPU-type Container needs, in addition to one unit of GPU resource, a specified number of CPU cores; when applying for resources, the numbers of the two Container types are determined; with executorCores denoting the number of CPU cores each Container contains, totalCores the number of CPU cores of the application, and GPUNum the number of GPU resources of the application, the number of GPU-type Containers is GPUNum and the number of non-GPU Containers is (totalCores - GPUNum * executorCores) / executorCores; the configured amount of memory resources is then checked to verify that the total memory can satisfy all Containers before processing continues; after the resource request is sent, the resource scheduler does not immediately return resources that satisfy the request, and the Spark application controller must keep communicating with the resource manager through the heartbeat mechanism to probe whether the requested resources have been allocated; after receiving the requested resources, the application controller adds them to a to-be-allocated resource list inside the program, to be assigned to the tasks that actually execute;
in the Spark interface, GPU tasks are identified through the mapPartitionsGPU operator and mapPartitionsGPURDD, which handle GPU tasks;
after Spark's job scheduler DAGScheduler generates the DAG and begins dividing stages, a field is added to identify whether the current stage contains GPU operations; inside a stage, according to whether the computation run on each RDD needs GPU resources, the internal RDDs are divided into two kinds: RDDs that need GPU resources and RDDs that do not; if a stage contains an RDD that needs GPU resources, then when allocating resources for the partitions of the RDDs in that stage, sufficient GPU resources are allocated, even if possibly only one RDD needs them during computation; otherwise a task might fail during computation because no usable GPU resource is available; to identify whether a stage contains an RDD that needs GPU resources, a flagGPU field is added to the stage; when flagGPU is true, the stage contains an RDD that needs GPU resources; with the flagGPU field set, in the next step of resource allocation the task manager can recognize the stage and allocate GPU resources for it.
4. The parallel computing system based on Spark and GPU according to claim 3, characterized in that the flow for identifying stage types in Spark's internal job scheduler DAGScheduler is as follows:
(1) after the DAG is generated, stages are divided; when a stage is generated, the scheduler checks whether the flagGPU field of any RDD contained in the stage is true, and if so the stage needs GPU resources during execution and the stage's flagGPU field is marked true, serving as the basis on which the task manager later allocates GPU resources;
(2) the engine's stage-submission algorithm is a recursive process: it first submits the last stage in the DAG, then checks whether all parent stages of that stage have been submitted; if so, it starts executing the task set corresponding to this stage; if some parent stages have not been submitted, it recursively submits the parents and makes the same check; the end result is that stages are executed from front to back according to the DAG;
(3) after a stage is submitted, the task manager divides the stage into a task set and applies to the cluster manager for the resources needed for execution; the number of tasks in the task set equals the number of RDD partitions; the task manager first checks whether the stage's flagGPU field is true, and if so allocates a container that contains GPU resources for it; during container allocation, if multiple containers are available, the choice is made according to the locality strategy, i.e. the local node, other nodes in the same rack, and nodes in other racks, in that order; tasks are then started on the nodes holding the resources, and intermediate and final task results are saved to the storage system; during this process, if the number of containers containing GPU resources is smaller than the number of GPU-type tasks, tasks that have not yet been allocated GPU resources must wait, and when other tasks finish and GPU resources become idle they are then allocated;
(4) after a task finishes, its resources are returned; the recovered container is added to the to-be-allocated list for other tasks to use.
5. The parallel computing system based on Spark and GPU according to claim 4, characterized in that an effective programming model for GPU-type tasks is proposed:
in Spark, the data of an RDD consists of several partitions, and it is finally distributed, partition by partition, to the nodes where the computation is completed; according to the granularity at which partition data is executed, GPU computation with Spark is broadly divided into two types:
(1) GPU computation with the partition as the unit, i.e. all the data in an RDD partition is handed to the GPU at once to compute in parallel, improving execution efficiency;
(2) GPU computation with the single record as the unit, i.e. the data in an RDD partition is handed to the GPU record by record, accelerating processing one record at a time.
6. The parallel computing system based on Spark and GPU according to claim 4, characterized in that the newly added mapPartitionsGPU operator can perceive GPU-type tasks and takes partition data as input; the main execution logic of the operator is as follows:
(1) the GPU device is first initialized inside the method;
(2) the method then judges whether the execution granularity for the partition data is the partition or the single record; if it is the partition, the partition data is transferred into GPU memory using the CUDA API, which may involve a data-format conversion that turns the partition data of the RDD into a format the GPU can process; the GPU is then invoked to compute on the data in parallel, and when the computation completes the output is transferred back to main memory; if the execution granularity is the single record, each record of the partition is processed sequentially: each time, one record is copied into GPU memory, the GPU computes on it in parallel, and when the computation completes the output is copied back to main memory; after all records are processed, the per-record outputs are assembled into a partition collection;
(3) the GPU device is released, and a partition iterator is returned.
CN201710270400.8A 2017-04-24 2017-04-24 A parallel computing system based on Spark and GPU Pending CN107168782A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710270400.8A 2017-04-24 2017-04-24 A parallel computing system based on Spark and GPU

Publications (1)

Publication Number Publication Date
CN107168782A 2017-09-15

Family

ID=59813923

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710270400.8A Pending A parallel computing system based on Spark and GPU

Country Status (1)

Country Link
CN (1) CN107168782A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104407921A (en) * 2014-12-25 2015-03-11 浪潮电子信息产业股份有限公司 Time-based method for dynamically scheduling yarn task resources
EP3067797A1 (en) * 2015-03-12 2016-09-14 International Business Machines Corporation Creating new cloud resource instruction set architecture
CN105022670A (en) * 2015-07-17 2015-11-04 中国海洋大学 Heterogeneous distributed task processing system and processing method in cloud computing platform
CN106506266A (en) * 2016-11-01 2017-03-15 中国人民解放军91655部队 Network flow analysis method based on GPU, Hadoop/Spark mixing Computational frame

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘德波 (Liu Debo): "Research on a GPU Cluster System Based on YARN", China Master's Theses Full-text Database, Information Science and Technology *
郑伟 (Zheng Wei): "Research on MPI/GPU Parallel Computing Processing Mechanisms under Spark", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110018817A (en) * 2018-01-05 2019-07-16 中兴通讯股份有限公司 The distributed operation method and device of data, storage medium and processor
CN111656323A (en) * 2018-01-23 2020-09-11 派泰克集群能力中心有限公司 Dynamic allocation of heterogeneous computing resources determined at application runtime
CN108596824A (en) * 2018-03-21 2018-09-28 华中科技大学 A kind of method and system optimizing rich metadata management based on GPU
CN108762921A (en) * 2018-05-18 2018-11-06 电子科技大学 A kind of method for scheduling task and device of the on-line optimization subregion of Spark group systems
CN108652610A (en) * 2018-06-04 2018-10-16 成都皓图智能科技有限责任公司 A kind of non-contact detection method that more popular feelings are jumped
CN109086137A (en) * 2018-08-06 2018-12-25 清华四川能源互联网研究院 GPU concurrent computation resource configuration method and device
CN109032809A (en) * 2018-08-13 2018-12-18 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Heterogeneous parallel scheduling system based on remote sensing image storage position
CN109254851A (en) * 2018-09-30 2019-01-22 武汉斗鱼网络科技有限公司 A kind of method and relevant apparatus for dispatching GPU
CN111314401A (en) * 2018-12-12 2020-06-19 百度在线网络技术(北京)有限公司 Resource allocation method, device, system, terminal and computer readable storage medium
CN109743453A (en) * 2018-12-29 2019-05-10 出门问问信息科技有限公司 A kind of multi-screen display method and device
CN109977306A (en) * 2019-03-14 2019-07-05 北京达佳互联信息技术有限公司 Implementation method, system, server and the medium of advertisement engine
CN109977306B (en) * 2019-03-14 2021-08-20 北京达佳互联信息技术有限公司 Method, system, server and medium for implementing advertisement engine
CN109995965B (en) * 2019-04-08 2021-12-03 复旦大学 Ultrahigh-resolution video image real-time calibration method based on FPGA
CN109995965A (en) * 2019-04-08 2019-07-09 复旦大学 A kind of ultrahigh resolution video image real-time calibration method based on FPGA
CN110109747B (en) * 2019-05-21 2021-05-14 北京百度网讯科技有限公司 Apache Spark-based data exchange method, system and server
CN110109747A (en) * 2019-05-21 2019-08-09 北京百度网讯科技有限公司 Method for interchanging data and system, server based on Apache Spark
CN110134521A (en) * 2019-05-28 2019-08-16 北京达佳互联信息技术有限公司 Method, apparatus, resource manager and the storage medium of resource allocation
CN110442446A (en) * 2019-06-29 2019-11-12 西南电子技术研究所(中国电子科技集团公司第十研究所) The method of processing high-speed digital signal data flow in real time
CN110442446B (en) * 2019-06-29 2022-12-13 西南电子技术研究所(中国电子科技集团公司第十研究所) Method for real-time processing high-speed digital signal data stream
CN110458294B (en) * 2019-08-19 2022-02-25 Oppo广东移动通信有限公司 Model operation method, device, terminal and storage medium
CN110458294A (en) * 2019-08-19 2019-11-15 Oppo广东移动通信有限公司 Model running method, apparatus, terminal and storage medium
CN110704186B (en) * 2019-09-25 2022-05-24 国家计算机网络与信息安全管理中心 Computing resource allocation method and device based on hybrid distribution architecture and storage medium
CN110704186A (en) * 2019-09-25 2020-01-17 国家计算机网络与信息安全管理中心 Computing resource allocation method and device based on hybrid distribution architecture and storage medium
CN110795219A (en) * 2019-10-24 2020-02-14 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Resource scheduling method and system suitable for multiple computing frameworks
CN110795219B (en) * 2019-10-24 2022-03-18 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Resource scheduling method and system suitable for multiple computing frameworks
CN110879753B (en) * 2019-11-19 2024-04-05 中国移动通信集团广东有限公司 GPU acceleration performance optimization method and system based on automatic cluster resource management
CN110879753A (en) * 2019-11-19 2020-03-13 中国移动通信集团广东有限公司 GPU acceleration performance optimization method and system based on automatic cluster resource management
CN112835996A (en) * 2019-11-22 2021-05-25 北京初速度科技有限公司 Map production system and method thereof
CN110955526A (en) * 2019-12-16 2020-04-03 湖南大学 Method and system for realizing multi-GPU scheduling in distributed heterogeneous environment
CN111240844A (en) * 2020-01-13 2020-06-05 星环信息科技(上海)有限公司 Resource scheduling method, equipment and storage medium
CN111400035A (en) * 2020-03-04 2020-07-10 杭州海康威视系统技术有限公司 Video memory allocation method and device, electronic equipment and storage medium
CN112035261A (en) * 2020-09-11 2020-12-04 杭州海康威视数字技术股份有限公司 Data processing method and system
CN112711448A (en) * 2020-12-30 2021-04-27 安阳师范学院 Agent technology-based parallel component assembling and performance optimizing method
CN113515361A (en) * 2021-07-08 2021-10-19 中国电子科技集团公司第五十二研究所 Lightweight heterogeneous computing cluster system facing service
CN113808001A (en) * 2021-11-19 2021-12-17 南京芯驰半导体科技有限公司 Method and system for single system to simultaneously support multiple GPU (graphics processing Unit) work
CN114840125B (en) * 2022-03-30 2024-04-26 曙光信息产业(北京)有限公司 Device resource allocation and management method, device resource allocation and management device, device resource allocation and management medium, and program product

Similar Documents

Publication Publication Date Title
CN107168782A (en) A parallel computing system based on Spark and GPU
WO2021208546A1 (en) Multi-dimensional resource scheduling method in kubernetes cluster architecture system
US7945913B2 (en) Method, system and computer program product for optimizing allocation of resources on partitions of a data processing system
CN111344688B (en) Method and system for providing resources in cloud computing
CN101727357B (en) Method and apparatus for allocating resources in a compute farm
CN108845874B (en) Dynamic resource allocation method and server
US11816509B2 (en) Workload placement for virtual GPU enabled systems
CN108762896A (en) Hadoop-cluster-based task scheduling method and computer device
CN103729246B (en) Method and device for dispatching tasks
US20070226743A1 (en) Parallel-distributed-processing program and parallel-distributed-processing system
US8527988B1 (en) Proximity mapping of virtual-machine threads to processors
CN108572873A (en) A load-balancing method and device for solving the Spark data skew problem
CN102937918A (en) Data block balancing method in operation process of HDFS (Hadoop Distributed File System)
CN103297499A (en) Scheduling method and system based on cloud platform
CN110990154B (en) Big data application optimization method, device and storage medium
CN104881322A (en) Method and device for dispatching cluster resource based on packing model
CN114356543A (en) Kubernetes-based multi-tenant machine learning task resource scheduling method
CN113672391A (en) Parallel computing task scheduling method and system based on Kubernetes
US20210390405A1 (en) Microservice-based training systems in heterogeneous graphic processor unit (gpu) cluster and operating method thereof
CN115994567A (en) Asynchronous scheduling method for parallel computing tasks of deep neural network model
CN102184124B (en) Task scheduling method and system
CN112306642A (en) Workflow scheduling method based on stable matching game theory
CN107423114A (en) A virtual machine dynamic migration method based on service classification
CN106648866B (en) Resource scheduling method based on KVM platform and capable of meeting task time limit requirements
CN113419827A (en) High-performance computing resource scheduling fair sharing method

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
WD01: Invention patent application deemed withdrawn after publication (application publication date: 20170915)