CN107862386A - A kind of method and device of data processing - Google Patents
A kind of method and device of data processing
- Publication number
- CN107862386A (application number CN201711070292.6A)
- Authority
- CN
- China
- Prior art keywords
- fpga accelerator
- calculation
- result
- accelerator cards
- block data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Stored Programmes (AREA)
Abstract
The invention discloses a data processing method applied to Mahout machine learning algorithms. The method includes: reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters; transmitting each data block and the parameters to the FPGA accelerator card of each cluster node; calling a hardware application program so that each FPGA accelerator card performs iterative computation on its data block and returns the computation result; judging whether the algorithm has converged; and outputting the computation result if it has converged. Using FPGA accelerator cards to accelerate Mahout machine learning algorithms effectively improves their computing performance. The invention also discloses a data processing device that has the same technical effect.
Description
Technical field
The present invention relates to the field of machine learning algorithms, and more particularly to a method and device for data processing.
Background art
In the big data era, data processing is moving toward intelligent data mining, which has greatly promoted the research and application of machine learning algorithms. Mahout is an open source project under the ASF (Apache Software Foundation) that provides scalable implementations of classic machine learning algorithms, with the aim of helping developers build intelligent applications more conveniently and quickly. Mahout's most notable feature is that it is implemented on Hadoop and therefore inherits all of Hadoop's advantages. The core of the Hadoop framework is HDFS (Hadoop Distributed File System) and MapReduce: HDFS provides storage for massive data, and MapReduce provides computation over that data. With Hadoop, Mahout converts many algorithms that previously ran on a single machine to the MapReduce model, which meets the need for data-parallel processing to a certain extent and increases the data volume and processing performance the algorithms can handle. However, because the processing capability of a single node is limited, Hadoop has to scale out the cluster to improve computing performance. Such expansion tends to increase system cost and energy consumption rapidly, which greatly reduces the performance gain that cluster expansion brings.
Therefore, how to improve the processing capability of a single Hadoop node is a problem that those skilled in the art urgently need to solve.
Summary of the invention
An object of the present invention is to provide a data processing method that runs complex Mahout machine learning algorithms on FPGA accelerator cards, effectively improving the computing performance of those algorithms. Another object of the present invention is to provide a data processing device.
To solve the above technical problem, the present invention provides a data processing method applied to Mahout machine learning algorithms, including:
reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters;
transmitting each data block and the parameters to the corresponding FPGA accelerator card;
calling a hardware application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result;
judging whether the algorithm has converged;
if the algorithm has converged, outputting the computation result.
Preferably, calling the hardware application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result includes:
calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result.
Preferably, calling the OpenCL application program includes:
calling the OpenCL application program using JNI.
The present invention also provides a device for accelerating Mahout machine learning algorithms, including:
an initialization unit, for reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters;
a transmission unit, for transmitting each data block and the parameters to the corresponding FPGA accelerator card;
an execution unit, for calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result;
a judging unit, for judging whether the algorithm has converged and, if the algorithm has converged, outputting the computation result.
Preferably, the execution unit includes:
an execution subunit, for calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result.
Preferably, the execution subunit includes:
a calling subunit, for calling the OpenCL application program using JNI.
In the data processing method provided by the present invention, a sample set is read, the data in the sample set is partitioned into blocks, and parameters are initialized; each data block and the parameters are transmitted to the corresponding FPGA accelerator card; an OpenCL application program is called so that each FPGA accelerator card performs iterative computation on its data block and returns the computation result; whether the algorithm has converged is judged; and if the algorithm has converged, the computation result is output.
It can be seen that the present invention, on the basis of a thorough analysis of Mahout machine learning algorithms, moves the parts of those algorithms that are computationally intensive and suited to parallel computation onto FPGA accelerator cards, and performs the relevant computation in the FPGA accelerator cards by calling a hardware application program. Accelerating Mahout machine learning algorithms with FPGA accelerator cards effectively improves their computing performance.
The data processing device provided by the present invention has the same technical effect.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the data processing method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the data processing device provided by an embodiment of the present invention.
Detailed description of the embodiments
The core of the present invention is to provide a data processing method that accelerates complex Mahout machine learning algorithms on FPGA (Field-Programmable Gate Array) accelerator cards, effectively improving the computing performance of those algorithms. Another core of the present invention is to provide a data processing device that has the same technical effect.
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, which is a schematic diagram of the data processing method provided by an embodiment of the present invention, the method may include the following steps:
S100: Read a sample set, partition the data in the sample set into blocks, and initialize parameters;
Operations such as reading the sample set and initializing the parameters can be performed according to the existing mechanisms of the Mahout machine learning algorithm.
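As a minimal sketch of S100, assuming the sample set is an in-memory list and the parameters are k initial centroids as in k-means (the patent does not name a specific algorithm), the partitioning and initialization might look like the following. The names `split_into_blocks` and `init_parameters` are hypothetical and are not part of Mahout.

```python
import random


def split_into_blocks(samples, num_blocks):
    """Split samples into num_blocks contiguous blocks of near-equal size."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    base, extra = divmod(len(samples), num_blocks)
    blocks, start = [], 0
    for i in range(num_blocks):
        # The first `extra` blocks take one additional sample each.
        size = base + (1 if i < extra else 0)
        blocks.append(samples[start:start + size])
        start += size
    return blocks


def init_parameters(samples, k, seed=0):
    """Pick k distinct samples as initial parameters (assumed: k-means centroids)."""
    rng = random.Random(seed)
    return rng.sample(samples, k)


# Ten two-dimensional samples split into three blocks.
samples = [(float(i), float(i % 3)) for i in range(10)]
blocks = split_into_blocks(samples, 3)
params = init_parameters(samples, k=2)
```

Contiguous, near-equal blocks are only one possible partitioning; a real deployment would more likely follow the HDFS block layout.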
S200: Transmit each data block and the parameters to the corresponding FPGA accelerator card;
Specifically, after the data in the sample set has been partitioned into blocks, the parameters and the different data blocks are transmitted to the corresponding FPGA accelerator cards. Each original Hadoop cluster node, together with its FPGA accelerator card, forms a new Hadoop cluster node, and the FPGA accelerator card communicates with the original Hadoop cluster node over the PCIe bus.
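The mapping of blocks to cards in S200 can be illustrated with a small host-side model. This is a sketch under stated assumptions: `AcceleratorCard` is a hypothetical stand-in for one card attached to a cluster node, the round-robin assignment is an assumption (the text only says each block goes to its "corresponding" card), and no real PCIe transfer takes place.

```python
from dataclasses import dataclass, field


@dataclass
class AcceleratorCard:
    """Hypothetical stand-in for one FPGA card attached to a cluster node."""
    node_id: int
    params: list = None
    blocks: list = field(default_factory=list)

    def receive(self, block, params):
        # A real system would perform a host-to-device transfer over PCIe here.
        self.params = list(params)
        self.blocks.append(block)


def distribute(blocks, params, cards):
    """Send block i to card i % len(cards); every card gets the same parameters."""
    for i, block in enumerate(blocks):
        cards[i % len(cards)].receive(block, params)
    return cards


cards = distribute(
    [[1, 2], [3, 4], [5]],
    params=[0.5],
    cards=[AcceleratorCard(0), AcceleratorCard(1)],
)
```

Each card receives its own copy of the parameters, which mirrors the text's statement that the parameters travel together with the data blocks.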
S300: Call a hardware application program, perform iterative computation on each data block using each FPGA accelerator card, and return the computation result;
FPGA accelerator cards support the Verilog hardware description language, VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very High Speed Integrated Circuit), the high-level language OpenCL (Open Computing Language), and others. The iterative computation on each data block in the FPGA accelerator cards can therefore be implemented by calling a program written in Verilog or VHDL.
Preferably, calling the hardware application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result includes:
calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result.
Specifically, the kernel program of the OpenCL application is compiled to generate the logic circuit of the FPGA accelerator card, which serves as the configuration parameters of the FPGA. After receiving the parameters and data blocks, the FPGA accelerator cards simultaneously execute the same algorithm on their respective parameters and data blocks, and then return the computation results to the CPU.
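The pattern described here, in which every card runs the same kernel on its own block and the host collects the results, can be sketched on the host side as follows. This is only an illustration: threads stand in for the FPGA cards, the `kernel` function is a hypothetical one-dimensional k-means assignment step chosen for the example, and no OpenCL code is involved.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(block, centroids):
    """Same computation on every card: per-centroid sum and count for one block."""
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for x in block:
        # Assign the sample to its nearest centroid.
        nearest = min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
        sums[nearest] += x
        counts[nearest] += 1
    return sums, counts


def run_on_all_cards(blocks, centroids):
    """Run the kernel on every block concurrently and collect results on the host."""
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        # pool.map returns results in the same order as the input blocks.
        return list(pool.map(lambda b: kernel(b, centroids), blocks))


results = run_on_all_cards([[1.0, 2.0], [9.0, 10.0, 11.0]], centroids=[0.0, 10.0])
```

The host would then merge the per-block sums and counts to update the parameters before the next iteration.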
Compared with the Verilog and VHDL hardware languages, OpenCL is relatively easy to develop in and has a relatively short development cycle. This preferred scheme therefore makes it relatively easy to call an OpenCL application program, perform iterative computation on each data block using each FPGA accelerator card, and return the computation result.
The interface through which Mahout calls the OpenCL application program can be implemented with tools such as JNI (Java Native Interface) and g++. JNI provides a set of APIs (Application Programming Interfaces) for communication between Java and other languages (mainly C and C++), and allows Java code to interact with code written in those languages.
Preferably, calling the OpenCL application program includes:
calling the OpenCL application program using JNI.
Specifically, JNI is used to let Java code interact with OpenCL code, so that Mahout can call the OpenCL application program.
S400: Judge whether the algorithm has converged;
S500: If the algorithm has converged, output the computation result;
S600: If the algorithm has not converged, recalculate.
Specifically, when the algorithm does not converge, the computation can be performed again, or the algorithm can be redesigned and reimplemented. The choice can be made according to the actual situation and is not specifically limited by the present invention.
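Steps S400 to S600 amount to a host-side loop around the accelerated computation. The following is a minimal sketch, assuming convergence means that the largest parameter change in one iteration falls below a tolerance; the patent does not specify the criterion, and `iterate_until_converged`, `tol`, and `max_iters` are hypothetical names introduced for illustration.

```python
def iterate_until_converged(step, params, tol=1e-6, max_iters=100):
    """Repeat `step` until the largest parameter change is below `tol`.

    Returns (params, iterations_used, converged).
    """
    for iteration in range(1, max_iters + 1):
        new_params = step(params)
        change = max(abs(a - b) for a, b in zip(new_params, params))
        params = new_params
        if change < tol:
            # S500: the algorithm has converged, so the result can be output.
            return params, iteration, True
    # S600: not converged within the budget; the caller may recompute or redesign.
    return params, max_iters, False


def halve_towards_one(params):
    """Toy update step standing in for one accelerated iteration."""
    return [p + (1.0 - p) / 2 for p in params]


result, iters, converged = iterate_until_converged(halve_towards_one, [0.0])
```

The `max_iters` bound is a design choice of this sketch: it guarantees that the loop ends even when the algorithm never converges, leaving the decision of what to do next to the caller.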
In summary, in the data processing method provided by the present invention, a sample set is read, the data in the sample set is partitioned into blocks, and parameters are initialized; each data block and the parameters are transmitted to the corresponding FPGA accelerator card; an OpenCL application program is called so that each FPGA accelerator card performs iterative computation on its data block and returns the computation result; whether the algorithm has converged is judged; and if the algorithm has converged, the computation result is output.
It can be seen that the present invention, on the basis of a thorough analysis of Mahout machine learning algorithms, moves the parts of those algorithms that are computationally intensive and suited to parallel computation onto FPGA accelerator cards, and performs the relevant computation in the FPGA accelerator cards by calling a hardware application program. Accelerating the algorithms with FPGA accelerator cards effectively improves the computing performance of Mahout machine learning algorithms.
The present invention also provides a data processing device, which is introduced below. The device described below and the method described above may be referred to in correspondence with each other.
As shown in Fig. 2, the device includes an initialization unit 1, a transmission unit 2, an execution unit 3, and a judging unit 4.
The initialization unit 1 is used for reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters;
the transmission unit 2 is used for transmitting each data block and the parameters to the corresponding FPGA accelerator card;
the execution unit 3 is used for calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result;
the judging unit 4 is used for judging whether the algorithm has converged and, if the algorithm has converged, outputting the computation result.
Preferably, the execution unit 3 includes:
an execution subunit, for calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result.
Preferably, the execution subunit includes:
a calling subunit, for calling the OpenCL application program using JNI.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, see the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The method and device for data processing provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. It should be noted that those of ordinary skill in the art can make improvements and modifications to the present invention without departing from its principle, and such improvements and modifications also fall within the scope of protection of the claims of the present invention.
Claims (6)
- 1. A data processing method, characterized in that it is applied to Mahout machine learning algorithms and includes: reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters; transmitting each data block and the parameters to the corresponding FPGA accelerator card; calling a hardware application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result; judging whether the algorithm has converged; and, if the algorithm has converged, outputting the computation result.
- 2. The method according to claim 1, characterized in that calling the hardware application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result includes: calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result.
- 3. The method according to claim 2, characterized in that calling the OpenCL application program includes: calling the OpenCL application program using JNI.
- 4. A data processing device, characterized in that it is applied to Mahout machine learning algorithms and includes: an initialization unit, for reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters; a transmission unit, for transmitting each data block and the parameters to the corresponding FPGA accelerator card; an execution unit, for calling a hardware application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result; and a judging unit, for judging whether the algorithm has converged and, if the algorithm has converged, outputting the computation result.
- 5. The device according to claim 4, characterized in that the execution unit includes: an execution subunit, for calling an OpenCL application program, performing iterative computation on each data block using each FPGA accelerator card, and returning the computation result.
- 6. The device according to claim 5, characterized in that the execution subunit includes: a calling subunit, for calling the OpenCL application program using JNI.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711070292.6A CN107862386A (en) | 2017-11-03 | 2017-11-03 | A kind of method and device of data processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711070292.6A CN107862386A (en) | 2017-11-03 | 2017-11-03 | A kind of method and device of data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107862386A true CN107862386A (en) | 2018-03-30 |
Family
ID=61700600
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711070292.6A Pending CN107862386A (en) | 2017-11-03 | 2017-11-03 | A kind of method and device of data processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107862386A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102750309A (en) * | 2012-03-19 | 2012-10-24 | 南京大学 | Parallelization support vector machine (SVM) solving method based on Hadoop |
CN106547627A (en) * | 2016-11-24 | 2017-03-29 | 郑州云海信息技术有限公司 | The method and system that a kind of Spark MLlib data processings accelerate |
CN107122243A (en) * | 2017-04-12 | 2017-09-01 | 杭州远算云计算有限公司 | Heterogeneous Cluster Environment and CFD computational methods for CFD simulation calculations |
CN107292330A (en) * | 2017-05-02 | 2017-10-24 | 南京航空航天大学 | A kind of iterative label Noise Identification algorithm based on supervised learning and semi-supervised learning double-point information |
-
2017
- 2017-11-03 CN CN201711070292.6A patent/CN107862386A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180330 |