CN107862386A

CN107862386A - Method and device for data processing

Info

Publication number: CN107862386A
Application number: CN201711070292.6A
Authority: CN
Inventors: 曹芳
Original assignee: Zhengzhou Yunhai Information Technology Co Ltd
Current assignee: Zhengzhou Yunhai Information Technology Co Ltd
Priority date: 2017-11-03
Filing date: 2017-11-03
Publication date: 2018-03-30

Abstract

The invention discloses a data processing method applied to Mahout machine learning algorithms. The method includes: reading a sample set, partitioning the data in the sample set into blocks, and initializing the parameters; transmitting each data block and the parameters to the FPGA accelerator card of each cluster node; calling a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result; determining whether the algorithm has converged; and outputting the computation result if it has converged. Accelerating Mahout machine learning algorithms with FPGA accelerator cards effectively improves their computing performance. The invention also discloses a data processing device that has the same technical effect.

Description

Method and device for data processing

Technical field

The present invention relates to the field of machine learning algorithms, and in particular to a method and device for data processing.

Background art

In the big data era, data processing is moving toward intelligent data mining, which has greatly promoted the research and application of machine learning algorithms. Mahout is an open source project of the ASF (Apache Software Foundation) that provides scalable implementations of classic machine learning algorithms, with the aim of helping developers create intelligent applications more quickly and conveniently. Mahout's most prominent feature is that it is built on Hadoop and therefore inherits all of Hadoop's advantages. The core of the Hadoop framework consists of HDFS (Hadoop Distributed File System) and MapReduce: HDFS provides storage for massive data, and MapReduce provides computation over that data. With Hadoop, Mahout converts many algorithms that previously ran on a single machine into MapReduce form, which meets the need for data-parallel processing to a certain extent and increases both the data volume an algorithm can handle and its processing performance. However, because the processing capability of a single node is limited, Hadoop has to scale out the cluster to improve computing performance. Such cluster expansion often causes system cost and energy consumption to grow rapidly, which greatly reduces the performance gain that the expansion brings.

Therefore, how to improve the processing capability of a single Hadoop node is a problem that those skilled in the art urgently need to solve.

Summary of the invention

An object of the present invention is to provide a data processing method that runs complex Mahout machine learning algorithms on FPGA accelerator cards, effectively improving the computing performance of Mahout machine learning algorithms. Another object of the present invention is to provide a data processing device.

To solve the above technical problem, the present invention provides a data processing method applied to Mahout machine learning algorithms, including:

reading a sample set, partitioning the data in the sample set into blocks, and initializing the parameters;

transmitting each data block and the parameters to the corresponding FPGA accelerator card;

calling a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result;

determining whether the algorithm has converged;

if the algorithm has converged, outputting the computation result.

Preferably, calling a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result includes:

calling an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result.

Preferably, calling the OpenCL application includes:

calling the OpenCL application through JNI.

The present invention also provides a device for accelerating Mahout machine learning algorithms, including:

an initialization unit, configured to read a sample set, partition the data in the sample set into blocks, and initialize the parameters;

a transmission unit, configured to transmit each data block and the parameters to the corresponding FPGA accelerator card;

an execution unit, configured to call an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result;

a judging unit, configured to determine whether the algorithm has converged and, if it has, to output the computation result.

Preferably, the execution unit includes:

an execution subunit, configured to call an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result.

Preferably, the execution subunit includes:

a calling subunit, configured to call the OpenCL application through JNI.

The data processing method provided by the present invention reads a sample set, partitions the data in the sample set into blocks, and initializes the parameters; transmits each data block and the parameters to the corresponding FPGA accelerator card; calls an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result; determines whether the algorithm has converged; and, if the algorithm has converged, outputs the computation result.

It can be seen that, based on a thorough analysis of Mahout machine learning algorithms, the present invention moves the parts of those algorithms that are computationally intensive and suited to parallel computation onto FPGA accelerator cards, and performs the related computation on the cards by calling a hardware application. Accelerating Mahout machine learning algorithms with FPGA accelerator cards effectively improves their computing performance.

The data processing device provided by the present invention has the same technical effect.

Brief description of the drawings

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from the drawings provided without creative effort.

Fig. 1 is a schematic diagram of the data processing method provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of the data processing device provided by an embodiment of the present invention.

Detailed description of the embodiments

The core of the present invention is to provide a data processing method that accelerates complex Mahout machine learning algorithms on FPGA (Field-Programmable Gate Array) accelerator cards, effectively improving the computing performance of Mahout machine learning algorithms. Another core of the present invention is to provide a data processing device that has the same technical effect.

To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Please refer to Fig. 1, which is a schematic diagram of the data processing method provided by an embodiment of the present invention. With reference to Fig. 1, the method may include the following steps:

S100: Read a sample set, partition the data in the sample set into blocks, and initialize the parameters;

Operations such as reading the sample set and initializing the parameters can be performed according to the existing mechanisms of the Mahout machine learning algorithms.

S200: Transmit each data block and the parameters to the corresponding FPGA accelerator card;

Specifically, after the data in the sample set has been partitioned into blocks, the parameters and the different data blocks are transmitted to the corresponding FPGA accelerator cards. Each original Hadoop cluster node and its corresponding FPGA accelerator card together form a new Hadoop cluster node, and the FPGA accelerator card communicates with the original Hadoop cluster node over the PCIe bus.
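The partitioning in S100 and the pairing of blocks with parameters in S200 can be sketched as follows. This is a minimal host-side illustration in Python, not the patented implementation: the function names `partition_samples` and `build_transfers` are hypothetical, the near-equal contiguous split is an assumption (the text does not specify a partitioning policy), and the actual PCIe transfer to a card is not modelled.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[float, ...]


def partition_samples(samples: Sequence[Sample], num_cards: int) -> List[List[Sample]]:
    """Split the sample set into num_cards contiguous blocks of near-equal size.

    Assumption: one block per FPGA accelerator card (one card per cluster node).
    """
    if num_cards <= 0:
        raise ValueError("num_cards must be positive")
    base, extra = divmod(len(samples), num_cards)
    blocks: List[List[Sample]] = []
    start = 0
    for card in range(num_cards):
        # The first `extra` cards receive one additional sample.
        size = base + (1 if card < extra else 0)
        blocks.append(list(samples[start:start + size]))
        start += size
    return blocks


def build_transfers(samples: Sequence[Sample], params: Sequence[float],
                    num_cards: int) -> Dict[int, Tuple[List[Sample], List[float]]]:
    """Pair every block with its own copy of the parameters, keyed by card index."""
    return {card: (block, list(params))
            for card, block in enumerate(partition_samples(samples, num_cards))}
```

Each entry of the returned dictionary corresponds to what one card would receive: its data block plus the shared initial parameters.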

S300: Call a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result;

FPGA accelerator cards support the Verilog hardware description language, VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very-High-Speed Integrated Circuit), the high-level language OpenCL (Open Computing Language), and others. Iterative computation on each data block in the FPGA accelerator cards can therefore be realized by calling logic implemented in Verilog or VHDL.

Preferably, calling a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result includes:

calling an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result.

Specifically, the kernel program of the OpenCL application is compiled to generate the logic circuit of the FPGA accelerator card, which serves as the configuration parameters of the FPGA. After receiving the parameters and a data block, all FPGA accelerator cards simultaneously execute the same algorithm on their respective parameters and data blocks, and then return the computation results to the CPU.
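The pattern of "same algorithm on every card, results returned to the CPU" can be illustrated with a small host-side simulation. The sketch below is written under several assumptions: each card is stood in for by a Python thread, the kernel is one k-means assignment step over one-dimensional samples (the text does not name a specific algorithm), and `card_kernel` and `run_on_all_cards` are hypothetical names. No real OpenCL or FPGA call is made.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def card_kernel(block: List[float], centroids: List[float]) -> Tuple[List[float], List[int]]:
    """Stand-in for the FPGA kernel: per-centroid partial sums and counts for one block."""
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for x in block:
        # Assign the sample to its nearest centroid (squared distance).
        nearest = min(range(len(centroids)), key=lambda k: (x - centroids[k]) ** 2)
        sums[nearest] += x
        counts[nearest] += 1
    return sums, counts


def run_on_all_cards(blocks: List[List[float]], centroids: List[float]) -> List[float]:
    """Run the same kernel on every block concurrently, then merge results on the host."""
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        partials = list(pool.map(lambda b: card_kernel(b, centroids), blocks))
    new_centroids = []
    for k in range(len(centroids)):
        total = sum(p[0][k] for p in partials)
        count = sum(p[1][k] for p in partials)
        # Keep the old centroid when no sample was assigned to it.
        new_centroids.append(total / count if count else centroids[k])
    return new_centroids
```

Returning partial sums and counts rather than per-block means keeps the merge on the host exact regardless of how unevenly the samples are spread across blocks.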

Compared with the Verilog and VHDL hardware description languages, OpenCL is relatively easy to develop in and has a relatively short development cycle. This preferred scheme therefore makes it comparatively easy to call an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result.

The interface through which Mahout calls the OpenCL application can be implemented with tools such as JNI (Java Native Interface) and g++. JNI provides a set of APIs (Application Programming Interfaces) for communication between Java and other languages (mainly C and C++), allowing Java code to interact with code written in those languages.

Preferably, calling the OpenCL application includes:

calling the OpenCL application through JNI.

Specifically, JNI is used to implement the interaction between the Java language and the OpenCL language, so that Mahout can call the OpenCL application.

S400: Determine whether the algorithm has converged;

S500: If the algorithm has converged, output the computation result;

S600: If the algorithm has not converged, recompute.

Specifically, when the algorithm does not converge, the computation can be run again, or the algorithm implementation can be redesigned. The choice depends on the actual situation, and the present invention places no specific limitation on it.
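The control flow of S300 through S600 amounts to a host-side loop around the accelerated step. The following sketch shows one way such a loop might look. The convergence criterion (largest parameter change below a tolerance) and the iteration cap are assumptions, since the text leaves both open, and `iterate_until_converged` and `halve_towards_target` are hypothetical names used only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def iterate_until_converged(step: Callable[[List[float]], List[float]],
                            params: Sequence[float],
                            tol: float = 1e-6,
                            max_iters: int = 100) -> Tuple[List[float], bool, int]:
    """Repeat step(params) until the largest parameter change falls below tol.

    Returns (params, converged, iterations). `step` stands in for one round of
    accelerated computation on all cards plus the merge on the host.
    """
    current = list(params)
    for iteration in range(1, max_iters + 1):
        updated = step(current)
        change = max(abs(a - b) for a, b in zip(updated, current))
        current = updated
        if change < tol:
            return current, True, iteration   # S500: output the result
    return current, False, max_iters          # S600: caller decides how to recompute


def halve_towards_target(params: List[float],
                         target: Sequence[float] = (2.0, -1.0)) -> List[float]:
    """Toy step used for demonstration: move each parameter halfway to a fixed target."""
    return [p + 0.5 * (t - p) for p, t in zip(params, target)]
```

Calling `iterate_until_converged(halve_towards_target, [0.0, 0.0])` converges well within the default iteration cap. When the returned flag is false, the caller can rerun with different initial parameters or switch to a different algorithm implementation, matching the two options described above.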

In summary, the data processing method provided by the present invention reads a sample set, partitions the data in the sample set into blocks, and initializes the parameters; transmits each data block and the parameters to the corresponding FPGA accelerator card; calls an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result; determines whether the algorithm has converged; and, if the algorithm has converged, outputs the computation result.

It can be seen that, based on a thorough analysis of Mahout machine learning algorithms, the present invention moves the parts of those algorithms that are computationally intensive and suited to parallel computation onto FPGA accelerator cards, and performs the related computation on the cards by calling a hardware application. Accelerating the algorithms with FPGA accelerator cards effectively improves the computing performance of Mahout machine learning algorithms.

The present invention also provides a data processing device, which is introduced below. The device described below and the method described above may be referred to in correspondence with each other.

As shown in Fig. 2, the device includes an initialization unit 1, a transmission unit 2, an execution unit 3, and a judging unit 4.

The initialization unit 1 is configured to read a sample set, partition the data in the sample set into blocks, and initialize the parameters;

the transmission unit 2 is configured to transmit each data block and the parameters to the corresponding FPGA accelerator card;

the execution unit 3 is configured to call an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result;

the judging unit 4 is configured to determine whether the algorithm has converged and, if it has, to output the computation result.

Preferably, the execution unit 3 includes:

an execution subunit, configured to call an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result.

Preferably, the execution subunit includes:

a calling subunit, configured to call the OpenCL application through JNI.
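To show how the four units of Fig. 2 might cooperate, the sketch below models each unit as a method of a single Python class. It is only an illustration under stated assumptions: the class name `DataProcessingDevice`, the `kernel` and `combine` callables, the strided partitioning, and the tolerance-based convergence test are all hypothetical, and `kernel` is an ordinary Python function standing in for the JNI call into an OpenCL application.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class DataProcessingDevice:
    """The four units of Fig. 2 modelled as methods (illustrative only)."""
    kernel: Callable[[List[float], List[float]], List[float]]  # (block, params) -> block result
    combine: Callable[[List[List[float]]], List[float]]        # merges per-block results
    num_cards: int = 2
    tol: float = 1e-6

    def initialize(self, samples: Sequence[float], params: Sequence[float]):
        """Initialization unit 1: partition the samples and initialize the parameters."""
        blocks = [list(samples[i::self.num_cards]) for i in range(self.num_cards)]
        return blocks, list(params)

    def transmit(self, blocks, params) -> List[Tuple[List[float], List[float]]]:
        """Transmission unit 2: pair each block with a copy of the parameters."""
        return [(block, list(params)) for block in blocks]

    def execute(self, transfers) -> List[float]:
        """Execution unit 3: run the kernel on every block and merge the results."""
        return self.combine([self.kernel(block, params) for block, params in transfers])

    def judge(self, old: List[float], new: List[float]) -> bool:
        """Judging unit 4: converged when the largest parameter change is below tol."""
        return max(abs(a - b) for a, b in zip(old, new)) < self.tol

    def run(self, samples, params, max_iters: int = 100) -> List[float]:
        blocks, params = self.initialize(samples, params)
        for _ in range(max_iters):
            new_params = self.execute(self.transmit(blocks, params))
            if self.judge(params, new_params):
                return new_params
            params = new_params
        raise RuntimeError("algorithm did not converge within max_iters")
```

As a usage example, a kernel that returns `[sum(block), len(block)]` together with a `combine` that divides the total sum by the total count makes `run` estimate the mean of the sample set.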

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.

The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

The method and device for data processing provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. It should be noted that those of ordinary skill in the art can make improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the scope of protection of the claims of the present invention.

Claims

1. A data processing method, characterized in that it is applied to Mahout machine learning algorithms and includes:

reading a sample set, partitioning the data in the sample set into blocks, and initializing the parameters;

transmitting each data block and the parameters to the corresponding FPGA accelerator card;

calling a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result;

determining whether the algorithm has converged;

if the algorithm has converged, outputting the computation result.

2. The method according to claim 1, characterized in that calling a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result includes:

calling an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result.

3. The method according to claim 2, characterized in that calling the OpenCL application includes:

calling the OpenCL application through JNI.

4. A data processing device, characterized in that it is applied to Mahout machine learning algorithms and includes:

an initialization unit, configured to read a sample set, partition the data in the sample set into blocks, and initialize the parameters;

a transmission unit, configured to transmit each data block and the parameters to the corresponding FPGA accelerator card;

an execution unit, configured to call a hardware application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result;

a judging unit, configured to determine whether the algorithm has converged and, if it has, to output the computation result.

5. The device according to claim 4, characterized in that the execution unit includes:

an execution subunit, configured to call an OpenCL application so that each FPGA accelerator card performs iterative computation on its data block and returns a computation result.

6. The device according to claim 5, characterized in that the execution subunit includes:

a calling subunit, configured to call the OpenCL application through JNI.