CN106547627A - Method and system for accelerating Spark MLlib data processing - Google Patents

Method and system for accelerating Spark MLlib data processing

Info

Publication number
CN106547627A
CN106547627A
Authority
CN
China
Prior art keywords
mllib
fpga
algorithms
spark
opencl
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611056361.3A
Other languages
Chinese (zh)
Inventor
王丽 (Wang Li)
陈继承 (Chen Jicheng)
王洪伟 (Wang Hongwei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority claimed from CN201611056361.3A
Publication of CN106547627A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources to service a request
    • G06F9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing

Abstract

An embodiment of the invention discloses a method for accelerating Spark MLlib data processing, comprising: judging whether an MLlib algorithm meets FPGA OpenCL parallelization design conditions; if so, distributing the core computation part of the MLlib algorithm to an FPGA for computation; otherwise, running the MLlib algorithm on the Spark platform as before. Algorithms are partitioned on demand according to their properties, and those that meet the conditions are assigned to the FPGA for parallel computation, which alleviates the problem of excessive memory overhead to some extent. Because the FPGA processes tasks in parallel, task computation speed improves, so the overall Spark MLlib data processing speed increases and the computing performance of the Spark platform is enhanced. In addition, the embodiment of the invention also provides a corresponding system, further making the method more practical; the system has corresponding advantages.

Description

Method and system for accelerating Spark MLlib data processing
Technical field
The present invention relates to the fields of big data, cloud processing, and heterogeneous acceleration, and in particular to a method and system for accelerating Spark MLlib data processing.
Background art
With the arrival of the Internet of Things and the 5G communication era, the big data field faces enormous change: larger and higher-dimensional information must be exchanged in real time between data centers and intelligent terminals, which places higher requirements on data processing speed. To meet these requirements, large-scale databases and deep learning must be accelerated.
Spark is an open-source universal parallel framework in the style of Hadoop MapReduce, developed by UC Berkeley's AMP Lab (the AMP laboratory of the University of California, Berkeley), and is an efficient distributed computing system. It supports in-memory distributed datasets and, in addition to providing interactive queries, can optimize iterative workloads. Spark is implemented in the Scala language and uses Scala as its application framework, including the related tests and data generators. It is currently the most active, popular, and efficient general-purpose big data computing platform in the global big data field.
ML (Machine Learning) is a multi-disciplinary field that studies how machines can simulate or realize human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance. It is the core of artificial intelligence and the fundamental way to make computers intelligent; its applications span every field of AI, and it mainly uses induction and synthesis rather than deduction. The "machine" referred to here is a computer, for example an electronic computer, a neutron computer, a photonic computer, or a neurocomputer.
The machine learning library MLlib (Machine Learning lib) is a general, fast engine designed specifically for massive data processing. Spark was originally designed to support iterative tasks, which matches the characteristics of machine learning algorithms, so MLlib is applied on the Spark platform. MLlib is Spark's scalable machine learning library; it is Spark's implementation of commonly used machine learning algorithms and applications. The main machine learning algorithms — classification, regression, clustering, association rules, recommendation, dimensionality reduction, optimization, and feature extraction and selection — as well as the mathematical statistics methods used for feature preprocessing and algorithm evaluation, are all included in MLlib.
In the prior art, in a Spark distributed computing system, Spark is responsible for task scheduling, and during execution Spark calls MLlib algorithms to analyze and compute the computing tasks. In the big data era, with data scale growing without bound and high requirements on data processing speed, the computing performance of MLlib algorithms under the Spark framework falls somewhat short of this challenge. Therefore, how to improve or optimize the computing performance (computation speed) of MLlib algorithms under the Spark framework is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a method and system for accelerating Spark MLlib data processing, achieving faster Spark MLlib data processing and improving computing performance on the Spark platform.
To solve the above technical problem, the embodiments of the present invention provide the following technical solutions:
In one aspect, an embodiment of the present invention provides a method for accelerating Spark MLlib data processing, comprising:
judging whether an MLlib algorithm meets FPGA OpenCL parallel optimization design conditions;
when it is judged that the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions, expressing the core computation part of the MLlib algorithm as a parallel design description based on OpenCL and porting it to the FPGA for parallel computation, the OpenCL special-purpose interfaces being called from Scala so as to realize the fusion of the FPGA and the Spark platform; otherwise, the MLlib algorithm performs its computation on the Spark platform.
Preferably, judging whether the MLlib algorithm meets the FPGA parallel optimization design conditions is as follows:
when the time the MLlib algorithm takes to compute exceeds a preset time and the MLlib algorithm is suited to parallelized computation, the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions; otherwise, it does not meet the FPGA OpenCL parallel optimization design conditions.
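The judging step above can be made concrete with a small sketch. The patent specifies only the two conditions (runtime exceeding a preset time, and suitability for parallelized computation); the task descriptor, threshold value, and function names below are hypothetical Python illustrations, not the patented Scala/OpenCL implementation:

```python
from dataclasses import dataclass

# Hypothetical descriptor for an MLlib computing task; the patent defines
# no concrete data structure, only the two conditions tested below.
@dataclass
class MllibTask:
    name: str
    estimated_runtime_ms: float  # measured or predicted runtime on Spark
    parallelizable: bool         # suited to FPGA OpenCL parallelization

PRESET_TIME_MS = 100.0  # illustrative value for the patent's "preset time"

def meets_fpga_opencl_condition(task: MllibTask) -> bool:
    """A task is offloaded only if it is both slow AND parallelizable."""
    return task.estimated_runtime_ms > PRESET_TIME_MS and task.parallelizable

def dispatch(task: MllibTask) -> str:
    # Core computation goes to the FPGA; everything else stays on Spark.
    return "fpga" if meets_fpga_opencl_condition(task) else "spark"
```

Note that both conditions must hold: a fast task is not worth the offload overhead, and a slow but serial task gains nothing from the FPGA's parallel lanes.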
Preferably, after expressing the core computation part of the MLlib algorithm as a parallel design description based on OpenCL and porting it to the FPGA for parallel computation, the method further comprises:
the FPGA returning the computation result to the Spark platform for display.
Preferably, before porting to the FPGA for parallel computation, the method further comprises:
performing parallel design on the MLlib algorithm using OpenCL.
An embodiment of the present invention further provides a system for accelerating Spark MLlib data processing, comprising:
a Spark distributed big data processing apparatus, multiple FPGAs, and multiple compute nodes,
wherein the Spark distributed big data processing apparatus is deployed on the multiple compute nodes, and each compute node is connected with one or more of the FPGAs;
the Spark distributed big data processing apparatus is used to judge whether an MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions; when it is judged that the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions, the core computation part of the MLlib algorithm is expressed as a parallel design description based on OpenCL and ported to the FPGAs for parallel computation, the OpenCL special-purpose interfaces being called from Scala to realize the fusion of the FPGAs and the Spark platform; otherwise, the MLlib algorithm performs its computation on the Spark platform;
the FPGAs are used to process the MLlib algorithms that meet the FPGA OpenCL parallel optimization design conditions;
the compute nodes are used to perform resource allocation and task scheduling for the computing tasks on the Spark platform.
Preferably, the Spark distributed big data processing apparatus determines that an MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions when the time the algorithm takes to compute exceeds a preset time and the algorithm is suited to parallelized computation; otherwise, the conditions are not met.
Preferably, the system further comprises:
a PCIe high-speed interface, used for data transfer and result return between the FPGAs and the Spark platform.
Preferably, the Spark distributed big data processing apparatus further comprises:
an optimization unit, used to perform parallel design on the MLlib algorithm using OpenCL.
Preferably, the compute nodes are connected with the FPGAs through the PCIe high-speed interface.
An embodiment of the present invention provides a method for accelerating Spark MLlib data processing: judging whether a Spark MLlib algorithm meets FPGA OpenCL parallelization design conditions; if so, expressing the core computation part of the MLlib algorithm as a parallel design description based on OpenCL and assigning it to the FPGA for computation; otherwise, running the MLlib algorithm on the Spark platform as before.
Algorithms are partitioned on demand according to their properties, and the MLlib algorithms that meet the conditions are assigned to the FPGA for computation, avoiding the traditional situation in which Spark MLlib algorithms process all tasks; this alleviates the problem of excessive memory overhead to some extent. Because the FPGA processes tasks in parallel, task computation speed improves, so the overall Spark MLlib data processing speed increases and the computing performance of the Spark platform is enhanced. In addition, the embodiments of the present invention also provide a corresponding system for the Spark data processing method, further making the method more practical; the system has corresponding advantages.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative work.
Fig. 1 is a schematic framework diagram of an exemplary application scenario provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a Spark MLlib data processing acceleration method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another Spark MLlib data processing acceleration method provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of a Spark MLlib data processing acceleration system provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of another Spark MLlib data processing acceleration system provided by an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", etc. in the description, claims, and above drawings of this application are used to distinguish different objects, not to describe a specific order. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, but may include steps or units that are not listed.
Through research, the inventors found that in existing Spark distributed computing systems, Spark is responsible for task scheduling and analyzes and computes computing tasks by calling the algorithms in MLlib. Facing unbounded growth in data scale and high requirements on data processing speed, the computing performance of MLlib algorithms under the Spark framework falls somewhat short.
An FPGA (Field-Programmable Gate Array) is a product developed further on the basis of programmable devices such as PAL, GAL, and CPLD. It is a new kind of heterogeneous-computing acceleration device, composed of programmable logic blocks and an interconnect network; it can execute multiple threads under different logic configurations and realize pipelined and parallel designs, giving it strong parallel processing capability. FPGAs have many advantages in the big data processing field, such as pipelined parallel computation, low power consumption, and dynamic reconfigurability.
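The pipelined parallelism credited to the FPGA above can be quantified with a toy latency model (an illustrative assumption, not FPGA code): with S one-cycle pipeline stages, N items take S·N cycles when processed sequentially but only S + N − 1 cycles when pipelined, because a new result completes every cycle once the pipeline is full.

```python
def sequential_cycles(num_items: int, num_stages: int) -> int:
    # Without pipelining, each item passes through every stage alone.
    return num_items * num_stages

def pipelined_cycles(num_items: int, num_stages: int) -> int:
    # With a full pipeline, one item completes per cycle after the first
    # item has filled all the stages (num_stages cycles of fill latency).
    return num_stages + num_items - 1
```

For 1000 items through a 5-stage pipeline this is 1004 cycles rather than 5000, which is why pipelined FPGA designs pay off on large, regular workloads.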
With the development of technology, algorithm development for FPGAs using the OpenCL high-level language has replaced compilation and development with traditional HDL hardware description languages. OpenCL can be written directly for FPGA program verification, which provides a convenient guarantee for quickly developing FPGAs and applying them widely in the general-purpose computing field.
Heterogeneous computing refers to a way of computing in which the system is composed of computing units with different instruction sets and architectures; different computing tasks can be distributed to them according to the structural characteristics of each computing subsystem. Common computing units include the central processing unit (CPU), graphics processing unit (GPU), digital signal processor (DSP), application-specific integrated circuit (ASIC), FPGA, and so on. For example, the CPU and GPU can "compute cooperatively and accelerate each other", thereby breaking through the bottleneck of CPU development. This pattern can improve a server's computing performance, energy efficiency, and real-time computing capability.
In view of this, the present application combines FPGA boards and the Spark platform into a heterogeneous platform, connecting the FPGAs with the compute nodes of the Spark platform for heterogeneous computing. By running complex, time-consuming tasks that suit parallelized computation on the FPGA, computation speed is improved, thereby achieving faster Spark MLlib data processing and improving the computing performance of MLlib algorithms under the Spark framework.
Before the technical solutions of the embodiments of the present invention are presented, some possible application scenarios they involve are first introduced by example with reference to Fig. 1. As shown in Fig. 1, each CPU compute node of the Spark platform is connected with N FPGAs through a PCIe interface. The Spark platform's compiled language, Scala, calls the OpenCL special-purpose interfaces of the FPGA's compiled language so that the two are compiled together under the Spark framework. Through training analysis of the current tasks, the Spark processor assigns tasks suited to FPGA computation to the FPGAs for processing, while ordinary tasks are processed using the MLlib algorithms. By adding FPGA boards to process computing tasks in parallel, the overall speed of Spark MLlib task processing is improved and computing performance is enhanced.
It should be noted that the above application scenario is presented only to help understand the idea and principle of the present application; the embodiments of the present application are not restricted in this regard. On the contrary, the embodiments of the present application can be applied to any applicable scenario.
Having described the technical solutions of the embodiments of the present invention, the various non-limiting embodiments of the present application are described in detail below.
Embodiment one:
Referring first to Fig. 2, Fig. 2 is a schematic flowchart of a Spark MLlib data processing acceleration method provided by an embodiment of the present invention. The embodiment may include the following content:
S201: Judge whether the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions.
S202: When it is judged that the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions, express the core computation part of the MLlib algorithm as a parallel design description based on OpenCL and port it to the FPGA for parallel computation, the OpenCL special-purpose interfaces being called from Scala to realize the fusion of the FPGA and the Spark platform; otherwise, the MLlib algorithm performs its computation on the Spark platform.
The compiled language of the Spark platform may be Scala, and that of the FPGA may be OpenCL. Having Scala call the OpenCL special-purpose interfaces so that the two are compiled together under the Spark framework realizes the combination of the FPGA and the Spark platform. Of course, where necessary, languages other than Scala and the OpenCL high-level language may be chosen as the compiled languages; other applicable languages may be selected, and the embodiments of the present invention are not limited in this respect. Preferably, however, the chosen languages should be mature, well-compatible, short in development cycle, stable, and convenient.
The Spark platform analyzes the MLlib algorithms and can distribute or schedule tasks and resources through training of the machine learning algorithms in MLlib. When allocating MLlib algorithms, the respective advantages of the machine learning algorithms in MLlib and of the FPGA can be taken into account; the concrete allocation process is described below.
The time an MLlib algorithm takes to compute can serve as one of the judgment conditions for distributing tasks. The factors that affect this time are mainly the complexity of the current task and the amount of computation. For example, suppose computing task 1 analyzes the tables relating to the same project among 8000 business support system (BSS) tables, while computing task 2 analyzes only the first 4000 of those 8000 tables. Since the two have the same computational complexity but task 1 involves more computation than task 2, task 1 necessarily takes more time. Computing task 3 classifies and summarizes the 8000 tables by project; clearly task 3 is far more complex than tasks 1 and 2 and necessarily takes the longest.
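The two runtime factors named above (task complexity and amount of computation) can be captured in an assumed toy cost model; the linear form and the numbers below are illustrative only, not from the patent:

```python
def estimated_runtime(complexity: float, volume: int) -> float:
    # Assumed toy model: runtime grows with task complexity times the
    # amount of data processed (e.g. number of BSS tables scanned).
    return complexity * volume

# Task 1: scan 8000 tables; task 2: scan only the first 4000 at the
# same complexity; task 3: classify and summarize all 8000 tables
# (a higher assumed complexity factor).
task1 = estimated_runtime(1.0, 8000)
task2 = estimated_runtime(1.0, 4000)
task3 = estimated_runtime(3.0, 8000)
```

Under this model task 1 costs exactly twice task 2, and task 3 dominates both — matching the ordering the example argues for.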
Because an FPGA can execute multiple threads under different logic configurations and can realize pipelined parallel processing, it has strong parallel processing capability. That is, a task to be processed on the FPGA should be a computing task suited to parallelized processing. Parallelism here refers to completing two or more pieces of work, identical or different, at the same moment or within the same time interval; as long as they overlap in time, parallelism exists. For example: intra-instruction parallelism, where the micro-operations within an instruction's execution operate in parallel as far as possible; inter-instruction parallelism, where two or more instructions execute in parallel; task-level parallelism, where a program is decomposed into multiple tasks so that two or more tasks are processed in parallel; and operation-level parallelism, where two or more operations are processed in parallel, as in multiprogramming and time-sharing systems. In addition, for data processing there is bit-level parallelism, for example bit-parallel (operating on all bits of one binary word at once), bit-serial across words (operating on the same bit position of multiple words at once), and fully parallel (operating on all bits of many words at once). For example, tasks that can be decomposed, or that are iterative or looping, suit parallelized processing; tasks that cannot be parallelized are those whose steps are tightly coupled logically and must be processed serially — for instance, when the result of the previous step must feed into the computation of the next step, results can only be produced one at a time, one after another, and cannot be processed in parallel.
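The distinction drawn above — decomposable, iterative, or looping tasks versus tasks whose every step depends on the previous result — can be illustrated with a minimal Python sketch (thread-based purely for illustration; the patent targets FPGA parallel lanes):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_friendly(xs):
    # Each element is independent of the others, so the work can be
    # split across workers (or, analogously, FPGA lanes).
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda x: x * x, xs))

def serial_only(xs):
    # Each step consumes the previous step's result, so the loop
    # carries a dependency and must run one step at a time.
    acc = 0
    out = []
    for x in xs:
        acc = acc * 2 + x
        out.append(acc)
    return out
```

The first function's iterations commute and can run in any order; the second's cannot, which is exactly the property the dispatch decision must test for.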
In summary, when the time an MLlib algorithm takes to compute exceeds the preset time (i.e. it is complex and time-consuming) and the algorithm suits parallelized computation, the MLlib algorithm is eligible: the core computation part of the algorithm is expressed as a parallel design description based on OpenCL and ported to the FPGA for parallel computation. Otherwise, the MLlib algorithm does not meet the conditions and continues to compute on the Spark platform — for example, reading data, or the parts involving little computation. For instance, counting the students named "Wang Er" in every school nationwide and retrospectively analyzing their school grades over the years can be investigated province by province against the schools' educational administration databases, and suits assignment to the FPGA for processing. Of course, the above example is given so that those skilled in the art can better understand the solution of this application; it does not limit it in practice.
It should be noted that an MLlib algorithm may not initially suit parallel computation but may suit it after some processing. Preferably, therefore, before a computing task assigned to the FPGA is computed, the MLlib algorithm can be given a parallel design using the FPGA's compiled language (such as OpenCL). In this way, MLlib algorithms can be fully deployed on the FPGA, avoiding the situation where some algorithm cannot be processed on the FPGA merely because it does not yet suit parallel computation, which would forfeit the acceleration the FPGA provides.
One compute node can be connected with one or more FPGAs, i.e. MLlib algorithms can be assigned to one or more FPGAs for simultaneous processing. For tasks of the same complexity and the same amount of computation, increasing the number of FPGAs further saves computation time and improves computation speed. For example, if analyzing 1000 BSS tables within 1 ms requires 1 FPGA, then analyzing 8000 BSS tables within 1 ms requires 8 FPGAs working simultaneously. Of course, the above example is given so that those skilled in the art can better understand the solution of this application; it does not limit it in practice.
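The sizing arithmetic in the example above generalizes to a one-line calculation (the throughput figure of 1000 tables per FPGA per millisecond is the patent's illustrative number; the function name is hypothetical):

```python
import math

def fpgas_needed(num_tables: int, tables_per_fpga_per_ms: float,
                 deadline_ms: float) -> int:
    # Tables one FPGA can finish within the deadline:
    capacity = tables_per_fpga_per_ms * deadline_ms
    # Round up: a partially loaded board is still a whole board.
    return math.ceil(num_tables / capacity)
```

So 8000 tables against a 1 ms deadline needs 8 boards, and 8500 tables needs 9 — the ceiling matters whenever the workload does not divide evenly.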
It should be noted that when the FPGA executes a processing task, Scala calls the OpenCL special-purpose interface; the interface locates the OpenCL core code compiled in advance, migrates that code to the FPGA board, and starts the FPGA board to execute the computation part of the algorithm.
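The interface behavior described above — locating pre-compiled OpenCL core code and handing it to the board — might be sketched as a lookup table. The registry, file names, and the `.aocx` binary extension (common in FPGA OpenCL toolchains) are assumptions for illustration, not details from the patent:

```python
# Hypothetical registry mapping algorithm names to pre-compiled OpenCL
# kernel binaries; the patent only states that the interface "finds the
# OpenCL core code compiled in advance" and migrates it to the board.
PRECOMPILED_KERNELS = {
    "kmeans_core": "kernels/kmeans_core.aocx",
    "als_core": "kernels/als_core.aocx",
}

def locate_kernel(algorithm: str) -> str:
    """Return the path of the pre-compiled kernel binary for an algorithm."""
    try:
        return PRECOMPILED_KERNELS[algorithm]
    except KeyError:
        # No pre-compiled kernel: the task must stay on the Spark platform.
        raise ValueError(f"no pre-compiled OpenCL kernel for {algorithm!r}")
```

A missing entry is the signal that the algorithm has not been through the offline OpenCL compilation step and therefore cannot be offloaded.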
As can be seen from the above, the embodiment of the present invention judges whether a Spark MLlib algorithm meets the FPGA OpenCL parallelization design conditions; if so, the core computation part of the MLlib algorithm is distributed to the FPGA for computation; otherwise, the MLlib algorithm continues to compute on the Spark platform. Algorithms are partitioned on demand according to their properties, and the MLlib algorithms that meet the conditions are assigned to the FPGA for computation, avoiding the traditional situation in which Spark MLlib algorithms process all tasks; this alleviates the problem of excessive memory overhead to some extent. Because the FPGA processes tasks in parallel, task computation speed improves, so the overall Spark MLlib data processing speed increases and the computing performance of the Spark platform is enhanced.
Optionally, in some implementations of this embodiment, as shown in Fig. 3, the method may further include:
the FPGA returning the computation result to the Spark platform for display, i.e. outputting the computation result.
In some scenarios the computation result needs to be output. For example, when the compute node is a GPU processing simulation data, the simulated image often needs to be output; if some of the related computing tasks are computed on the FPGA, the computation result must be returned to the Spark platform, and the final result is displayed to the user through the display device of the Spark platform. Providing return of FPGA computation results further extends the range of computing tasks that can be processed, thereby improving the computing performance of MLlib on the Spark platform as a whole.
The embodiment of the present invention also provides a corresponding system for Spark MLlib data processing acceleration, further making the method more practical. The system for accelerating Spark MLlib data processing provided by the embodiment of the present invention is introduced below; the system described below and the method for accelerating Spark MLlib data processing described above may be cross-referenced.
Embodiment two:
Referring to Fig. 4, Fig. 4 is a structural diagram of a Spark MLlib data processing acceleration system provided by an embodiment of the present invention. The system may include:
a Spark distributed big data processing apparatus 401, used to judge whether an MLlib algorithm meets the FPGA parallel optimization design conditions; when it is judged that the MLlib algorithm meets the FPGA parallel optimization design conditions, the core computation part of the MLlib algorithm is expressed as a parallel design description based on OpenCL and ported to the FPGA for parallel computation, the OpenCL special-purpose interfaces being called from Scala to realize the fusion of the FPGA and the Spark platform; otherwise, the MLlib algorithm performs its computation on the Spark platform;
an FPGA 402, used to process the MLlib algorithms that meet the FPGA parallel optimization design conditions;
a compute node 403, used to perform resource allocation and task scheduling for the computing tasks on the Spark platform.
There may be multiple compute nodes; the Spark distributed big data processing apparatus is deployed on the multiple compute nodes, and each compute node is connected with one or more FPGAs, preferably through a PCIe high-speed interface.
It should be noted that the processing apparatus may also include:
an optimization unit, used to perform parallel design on the MLlib algorithm using OpenCL.
Optionally, in some implementations of this embodiment, as shown in Fig. 5, the system may further include:
a PCIe high-speed interface 404, used for data transfer and result return between the FPGAs and the Spark platform.
It should be noted that, where necessary, other types of interfaces may also be used for data transfer between the FPGAs and the Spark platform; the embodiments of the present invention place no restriction on this.
The functions of the functional modules of the Spark MLlib data processing acceleration system described in this embodiment can be implemented according to the method in the above method embodiment; for the concrete implementation process, refer to the related description of the above method embodiment, which is not repeated here.
As can be seen from the above, the embodiment of the present invention judges whether a Spark MLlib algorithm meets the FPGA OpenCL parallelization design conditions; if so, the core computation part of the MLlib algorithm is distributed to the FPGA for computation; otherwise, the MLlib algorithm continues to compute on the Spark platform. Algorithms are partitioned on demand according to their properties, and those that meet the conditions are assigned to the FPGA, avoiding the traditional situation in which Spark MLlib algorithms process all tasks; this alleviates the problem of excessive memory overhead to some extent. Because the FPGA processes tasks in parallel, task computation speed improves, so the overall Spark MLlib data processing speed increases and the computing performance of the Spark platform is enhanced.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts among the embodiments, reference may be made to one another. Since the apparatus disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and reference may be made to the method description for the relevant parts.
Those skilled in the art may further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered as going beyond the scope of the present invention.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The method and system for accelerating Spark MLlib data processing provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention; the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. It should be pointed out that those of ordinary skill in the art may make improvements and modifications to the present invention without departing from the principles of the invention, and such improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims (9)

1. A method for accelerating Spark MLlib data processing, characterized by comprising:
judging whether an MLlib algorithm meets FPGA OpenCL parallel optimization design conditions;
when it is judged that the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions, implementing a parallel design description of the core computation part of the MLlib algorithm based on OpenCL, and porting the core computation part to an FPGA to perform parallel computation, wherein the OpenCL calls a dedicated OpenCL interface through Scala so as to achieve the fusion of the FPGA and the Spark platform; otherwise, running the MLlib algorithm on the Spark platform.
2. The method according to claim 1, characterized in that judging whether the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions comprises:
when the computation time of the MLlib algorithm exceeds a preset time and the MLlib algorithm is suitable for parallelized computation, the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions; otherwise, it does not meet the FPGA parallel optimization design conditions.
3. The method according to claim 2, characterized by further comprising, after implementing the parallel design description of the core computation part of the MLlib algorithm based on OpenCL and porting it to the FPGA to perform parallel computation:
returning, by the FPGA, the computation result to the Spark platform for display.
4. The method according to any one of claims 1 to 3, characterized by further comprising, before porting to the FPGA to perform parallel computation:
performing the parallel design of the MLlib algorithm using the OpenCL.
5. A system for accelerating Spark MLlib data processing, characterized by comprising:
a Spark distributed big-data processing apparatus, a plurality of FPGAs, and a plurality of compute nodes,
wherein the Spark distributed big-data processing apparatus is deployed on the plurality of compute nodes, and each of the compute nodes is connected to one or more of the FPGAs;
the Spark distributed big-data processing apparatus is configured to judge whether an MLlib algorithm meets FPGA OpenCL parallel optimization design conditions; when it is judged that the MLlib algorithm meets the FPGA parallel optimization design conditions, to implement a parallel design description of the core computation part of the MLlib algorithm based on OpenCL and port the core computation part to the FPGA to perform parallel computation, wherein the OpenCL calls a dedicated OpenCL interface through Scala so as to achieve the fusion of the FPGA and the Spark platform; otherwise, the MLlib algorithm runs on the Spark platform;
the FPGA is configured to process the MLlib algorithms that meet the FPGA OpenCL parallel optimization design conditions;
the compute nodes are configured to perform resource allocation and task scheduling for the computing tasks on the Spark platform.
6. The system according to claim 5, characterized in that the Spark distributed big-data processing apparatus is a module configured such that, when the computation time of the MLlib algorithm exceeds a preset time and the MLlib algorithm is suitable for parallelized computation, the MLlib algorithm meets the FPGA OpenCL parallel optimization design conditions, and otherwise does not meet the FPGA parallel optimization design conditions.
7. The system according to claim 6, characterized by further comprising:
a PCIe high-speed interface, configured for data transfer between the FPGA and the Spark platform and for returning computation results.
8. The system according to any one of claims 5 to 7, characterized in that the Spark distributed big-data processing apparatus further comprises:
an optimization unit, configured to perform the parallel design of the MLlib algorithms using the OpenCL.
9. The system according to claim 7, characterized in that the compute node is connected to the FPGA through the PCIe high-speed interface.
CN201611056361.3A 2016-11-24 2016-11-24 Method and system for accelerating Spark MLlib data processing Pending CN106547627A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611056361.3A CN106547627A (en) 2016-11-24 2016-11-24 Method and system for accelerating Spark MLlib data processing


Publications (1)

Publication Number Publication Date
CN106547627A true CN106547627A (en) 2017-03-29

Family

ID=58395380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611056361.3A Pending CN106547627A (en) Method and system for accelerating Spark MLlib data processing

Country Status (1)

Country Link
CN (1) CN106547627A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122490A (en) * 2017-05-18 2017-09-01 郑州云海信息技术有限公司 The data processing method and system of aggregate function in a kind of Querying by group
CN107391432A (en) * 2017-08-11 2017-11-24 中国计量大学 A kind of heterogeneous Computing device and computing node interconnection network
CN107480770A (en) * 2017-07-27 2017-12-15 中国科学院自动化研究所 The adjustable neutral net for quantifying bit wide quantifies the method and device with compression
CN107612681A (en) * 2017-09-25 2018-01-19 郑州云海信息技术有限公司 A kind of data processing method based on SM3 algorithms, apparatus and system
CN107612682A (en) * 2017-09-25 2018-01-19 郑州云海信息技术有限公司 A kind of data processing method based on SHA512 algorithms, apparatus and system
CN107862386A (en) * 2017-11-03 2018-03-30 郑州云海信息技术有限公司 A kind of method and device of data processing
CN108958852A (en) * 2018-07-16 2018-12-07 济南浪潮高新科技投资发展有限公司 A kind of system optimization method based on FPGA heterogeneous platform
WO2019041708A1 (en) * 2017-08-29 2019-03-07 武汉斗鱼网络科技有限公司 Classification model training system and realisation method therefor
CN110209631A (en) * 2019-05-10 2019-09-06 普华诚信信息技术有限公司 Big data processing method and its processing system
CN110597615A (en) * 2018-06-12 2019-12-20 杭州海康威视数字技术股份有限公司 Method for processing coding instruction and node equipment
CN111722834A (en) * 2020-07-23 2020-09-29 哈尔滨工业大学 Robot-oriented EKF-SLAM algorithm acceleration method
TWI714078B (en) * 2019-05-07 2020-12-21 國立高雄大學 System and method for scheduling big data analysis platform based on deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120173476A1 (en) * 2011-01-04 2012-07-05 Nasir Rizvi System and Method for Rule-Based Asymmetric Data Reporting
CN105447285A (en) * 2016-01-20 2016-03-30 杭州菲数科技有限公司 Method for improving OpenCL hardware execution efficiency
CN105956666A (en) * 2016-04-29 2016-09-21 浪潮(北京)电子信息产业有限公司 Machine learning method and system
CN106155635A (en) * 2015-04-03 2016-11-23 北京奇虎科技有限公司 A kind of data processing method and device




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170329