CN107862386A - Data processing method and device - Google Patents

Data processing method and device Download PDF

Info

Publication number
CN107862386A
Authority
CN
China
Prior art keywords
fpga accelerator
calculation
result
accelerator cards
block data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711070292.6A
Other languages
Chinese (zh)
Inventor
曹芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201711070292.6A
Publication of CN107862386A
Pending legal-status Current

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/18 - File system types
    • G06F16/182 - Distributed file systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 - Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 - Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a data processing method applied to Mahout machine learning algorithms. The method includes: reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters; transmitting each data block and the parameters to the FPGA accelerator card of each cluster node; invoking a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result; judging whether the algorithm has converged; and outputting the computation result if it has converged. Using FPGA accelerator cards to accelerate Mahout machine learning algorithms effectively improves the computing performance of those algorithms. The invention also discloses a data processing device with the same technical effect.

Description

Data processing method and device
Technical field
The present invention relates to the field of machine learning algorithms, and in particular to a data processing method and device.
Background art
In the big-data era, data processing is moving toward intelligent data mining, which has greatly promoted research on and application of machine learning algorithms. Mahout is an open-source project under the ASF (Apache Software Foundation). It provides implementations of some scalable classic machine learning algorithms and is intended to help developers create intelligent applications more easily and efficiently. Mahout's most distinctive feature is that it is implemented on top of Hadoop, so it has all of Hadoop's advantages. The core design of the Hadoop framework consists of the Hadoop Distributed File System (HDFS) and MapReduce: HDFS provides storage for massive data, and MapReduce provides the computation on that data. By means of Hadoop, Mahout converts many algorithms that previously ran on a single machine into the MapReduce pattern, meeting the need for parallel data processing to a certain extent and improving both the data volume an algorithm can handle and its processing performance. However, because the processing capability of a single node is limited, Hadoop has to improve computing performance by enlarging the cluster, and such cluster expansion often causes system cost and energy consumption to grow rapidly, greatly reducing the performance gain that the expansion brings.
Therefore, how to improve the processing capability of a single Hadoop node is an urgent problem for those skilled in the art.
Summary of the invention
An object of the present invention is to provide a data processing method that runs complex Mahout machine learning algorithms on FPGA accelerator cards, effectively improving the computing performance of Mahout machine learning algorithms. Another object of the present invention is to provide a data processing device.
To solve the above technical problem, the present invention provides a data processing method applied to Mahout machine learning algorithms, including:
reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters;
transmitting each data block and the parameters to the corresponding FPGA accelerator card;
invoking a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result;
judging whether the algorithm has converged;
outputting the computation result if the algorithm has converged.
Preferably, invoking a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result includes:
invoking an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result.
Preferably, invoking the OpenCL application includes:
invoking the OpenCL application through JNI.
The present invention also provides a device for accelerating Mahout machine learning algorithms, including:
an initialization unit for reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters;
a transmission unit for transmitting each data block and the parameters to the corresponding FPGA accelerator card;
an execution unit for invoking an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result;
a judging unit for judging whether the algorithm has converged and outputting the computation result if it has.
Preferably, the execution unit includes:
an execution subunit for invoking an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result.
Preferably, the execution subunit includes:
a calling subunit for invoking the OpenCL application through JNI.
With the data processing method provided by the invention, a sample set is read, the data in the sample set is partitioned into blocks, and parameters are initialized; each data block and the parameters are transmitted to the corresponding FPGA accelerator card; an OpenCL application is invoked so that each FPGA accelerator card iteratively computes on its data block and returns the computation result; whether the algorithm has converged is judged; and if it has, the computation result is output.
It can be seen that, on the basis of a thorough analysis of Mahout machine learning algorithms, the present invention offloads their computationally intensive and highly parallel parts to FPGA accelerator cards and performs the relevant computation there by invoking a hardware application. Accelerating Mahout machine learning algorithms with FPGA accelerator cards effectively improves their computing performance.
The data processing device provided by the invention has the same technical effect.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a schematic diagram of the data processing method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the data processing device provided by an embodiment of the present invention.
Detailed description of the embodiments
The core of the present invention is to provide a data processing method that accelerates complex Mahout machine learning algorithms on FPGA (Field-Programmable Gate Array) accelerator cards, effectively improving the computing performance of Mahout machine learning algorithms. Another core of the present invention is to provide a data processing device with the same technical effect.
To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely in combination with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them; all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, a schematic diagram of the data processing method provided by an embodiment of the present invention, the method may include the following steps:
S100: read a sample set, partition the data in the sample set into blocks, and initialize parameters.
Reading the sample set, initializing the parameters and similar operations can be performed according to the existing mechanisms of the Mahout machine learning algorithm, as illustrated by the sketch below.
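To make step S100 concrete, the following minimal Java sketch reads a sample set of numeric vectors from a local text file, partitions it into as many data blocks as there are cluster nodes, and initializes the parameters of an assumed k-means-style Mahout algorithm by picking random initial centroids. It is only an illustration under those assumptions: the class name SampleSetInit, the input file samples.txt and the choice of k-means are not part of Mahout, and in a real job the sample set would be read from HDFS and the blocks would correspond to MapReduce input splits.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    public class SampleSetInit {

        // Reads one whitespace-separated numeric vector per line (illustrative format).
        static List<double[]> readSampleSet(String path) throws IOException {
            List<double[]> samples = new ArrayList<>();
            for (String line : Files.readAllLines(Paths.get(path))) {
                String[] tokens = line.trim().split("\\s+");
                double[] v = new double[tokens.length];
                for (int i = 0; i < tokens.length; i++) {
                    v[i] = Double.parseDouble(tokens[i]);
                }
                samples.add(v);
            }
            return samples;
        }

        // Partitions the sample set into numBlocks roughly equal data blocks (the S100 blocking step).
        static List<List<double[]>> partition(List<double[]> samples, int numBlocks) {
            List<List<double[]>> blocks = new ArrayList<>();
            for (int b = 0; b < numBlocks; b++) {
                blocks.add(new ArrayList<>());
            }
            for (int i = 0; i < samples.size(); i++) {
                blocks.get(i % numBlocks).add(samples.get(i));
            }
            return blocks;
        }

        // Initializes the parameters: here, k randomly chosen samples as initial centroids.
        static List<double[]> initParameters(List<double[]> samples, int k, long seed) {
            List<double[]> shuffled = new ArrayList<>(samples);
            Collections.shuffle(shuffled, new Random(seed));
            return new ArrayList<>(shuffled.subList(0, k));
        }

        public static void main(String[] args) throws IOException {
            List<double[]> samples = readSampleSet("samples.txt");   // hypothetical input file
            List<List<double[]>> blocks = partition(samples, 4);     // e.g. four cluster nodes
            List<double[]> centroids = initParameters(samples, 3, 42L);
            System.out.println(blocks.size() + " blocks, " + centroids.size() + " initial centroids");
        }
    }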
S200: transmit each data block and the parameters to the corresponding FPGA accelerator cards.
Specifically, after the data in the sample set has been partitioned, the parameters and the different data blocks are transmitted to the corresponding FPGA accelerator cards. Each original Hadoop cluster node and its FPGA accelerator card together form a new Hadoop cluster node, and the FPGA accelerator card communicates with the original Hadoop cluster node over the PCIe bus.
S300: invoke a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result.
Because FPGA accelerator cards support the Verilog hardware description language, the VHDL (Very High Speed Integrated Circuit Hardware Description Language) hardware language, the OpenCL (Open Computing Language) high-level language and so on, iterating the computation on each data block inside the FPGA accelerator card can also be achieved by invoking Verilog or VHDL code.
Preferably, invoking a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result includes:
invoking an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result.
Specifically, the kernel program of the OpenCL application is compiled to generate the logic circuit of the FPGA accelerator card and serves as the FPGA configuration. After receiving the parameters and their data blocks, all the FPGA accelerator cards execute the same algorithm in parallel and then return the computation results to the CPU.
Compared with the Verilog and VHDL hardware languages, OpenCL is relatively easy to develop with and has a relatively short development cycle, so this preferred scheme makes it comparatively simple to invoke an OpenCL application and have each FPGA accelerator card iteratively compute on its data block and return the computation result.
The interface through which Mahout invokes the OpenCL application can be implemented with tools such as JNI (Java Native Interface) and g++. JNI provides a set of APIs (Application Programming Interfaces) that enable communication between Java and other languages (mainly C and C++), allowing Java code to interact with code written in those languages.
Preferably, invoking the OpenCL application includes:
invoking the OpenCL application through JNI.
Specifically, JNI is used to make the Java code interact with the OpenCL code, so that Mahout can invoke the OpenCL application, as sketched below.
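As an illustration of how the Java side could hand a data block and the current parameters to the OpenCL host program through JNI, the sketch below declares a native bridge method. It is only a sketch under stated assumptions: the library name mahout_fpga, the class FpgaKernelBridge, the method signature and the flattening of vectors into double arrays are illustrative choices and are not part of Mahout or of any particular FPGA SDK.

    // Illustrative JNI bridge between the Java (Mahout) side and an assumed native
    // OpenCL host program that drives the FPGA accelerator card of the local node.
    public class FpgaKernelBridge {

        static {
            // Loads the assumed native library (libmahout_fpga.so / mahout_fpga.dll),
            // which would wrap the OpenCL host calls that configure the card,
            // transfer the block over PCIe, launch the kernel and read back the result.
            System.loadLibrary("mahout_fpga");
        }

        /**
         * Runs one iteration of the algorithm on the local FPGA accelerator card.
         *
         * @param block  the data block of this node, flattened row-major
         * @param params the current parameters, e.g. flattened centroids
         * @param dim    the dimensionality of each sample vector
         * @return the partial computation result returned by the card, in an
         *         algorithm-specific layout (e.g. per-centroid sums and counts)
         */
        public native double[] iterateOnCard(double[] block, double[] params, int dim);
    }

On the native side, a matching C or C++ function generated from this declaration (for example with javac -h) would contain the actual OpenCL host code, which corresponds to the division of labour between Java and C/C++ described above.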
S400: judge whether the algorithm has converged.
S500: if the algorithm has converged, output the computation result.
S600: if the algorithm has not converged, recompute.
Specifically, when the algorithm does not converge, the computation can be re-run, or the algorithm implementation can be redesigned; this can be decided according to the actual situation and is not specifically limited by the present invention. A driver loop along these lines is sketched below.
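To make the S400 to S600 flow concrete, the following sketch shows a host-side iteration loop built on the bridge from the previous sketch. It is an illustration under assumptions: each element of blocks is one flattened data block, updateParameters is a placeholder for the algorithm-specific reduction of the partial results into new parameters, convergence is declared when the largest parameter change drops below a threshold, and an iteration cap stands in for the recompute-or-redesign handling of the non-convergent case.

    import java.util.List;

    public class IterationDriver {

        // Largest absolute difference between two parameter vectors of equal length.
        static double maxChange(double[] oldParams, double[] newParams) {
            double max = 0.0;
            for (int i = 0; i < oldParams.length; i++) {
                max = Math.max(max, Math.abs(newParams[i] - oldParams[i]));
            }
            return max;
        }

        // Iterates until the parameter change falls below epsilon (S400/S500) or the
        // iteration cap is reached, which is treated here as the S600 case.
        static double[] run(List<double[]> blocks, double[] params, int dim,
                            double epsilon, int maxIterations) {
            FpgaKernelBridge bridge = new FpgaKernelBridge(); // JNI bridge from the previous sketch
            for (int iter = 0; iter < maxIterations; iter++) {
                // S300: each block is processed by an FPGA accelerator card; shown
                // sequentially here, while in the cluster each node handles its own block.
                double[][] partials = new double[blocks.size()][];
                for (int b = 0; b < blocks.size(); b++) {
                    partials[b] = bridge.iterateOnCard(blocks.get(b), params, dim);
                }
                double[] newParams = updateParameters(partials, params);
                if (maxChange(params, newParams) < epsilon) {
                    return newParams; // S500: converged, output the computation result
                }
                params = newParams;   // S600: not converged yet, recompute with updated parameters
            }
            return params; // iteration cap reached; the caller may restart or redesign per S600
        }

        // Placeholder for the algorithm-specific parameter update (e.g. new k-means centroids).
        static double[] updateParameters(double[][] partials, double[] oldParams) {
            return oldParams.clone(); // illustrative no-op
        }
    }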
In summary, with the data processing method provided by the invention, a sample set is read, the data in the sample set is partitioned into blocks, and parameters are initialized; each data block and the parameters are transmitted to the corresponding FPGA accelerator card; an OpenCL application is invoked so that each FPGA accelerator card iteratively computes on its data block and returns the computation result; whether the algorithm has converged is judged; and if it has, the computation result is output.
It can be seen that, on the basis of a thorough analysis of Mahout machine learning algorithms, the present invention offloads their computationally intensive and highly parallel parts to FPGA accelerator cards and performs the relevant computation there by invoking a hardware application. Accelerating the algorithms with FPGA accelerator cards effectively improves the computing performance of Mahout machine learning algorithms.
The present invention also provides a data processing device, which is introduced below; the device described below and the method described above can be referred to in correspondence with each other.
As shown in Fig. 2, the device includes an initialization unit 1, a transmission unit 2, an execution unit 3 and a judging unit 4.
The initialization unit 1 reads a sample set, partitions the data in the sample set into blocks, and initializes parameters;
the transmission unit 2 transmits each data block and the parameters to the corresponding FPGA accelerator card;
the execution unit 3 invokes an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result;
the judging unit 4 judges whether the algorithm has converged and outputs the computation result if it has.
Preferably, the execution unit 3 includes:
an execution subunit for invoking an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result.
Preferably, the execution subunit includes:
a calling subunit for invoking the OpenCL application through JNI.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and the identical or similar parts of the embodiments can be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively simple, and the relevant points can be found in the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as going beyond the scope of the present invention.
The steps of the method or algorithm described in the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium well known in the technical field.
The data processing method and device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and embodiments of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims (6)

  1. A data processing method, characterized in that it is applied to Mahout machine learning algorithms and comprises:
    reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters;
    transmitting each data block and the parameters to the corresponding FPGA accelerator card;
    invoking a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result;
    judging whether the algorithm has converged; and
    outputting the computation result if the algorithm has converged.
  2. The method according to claim 1, characterized in that invoking a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result comprises:
    invoking an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result.
  3. The method according to claim 2, characterized in that invoking the OpenCL application comprises:
    invoking the OpenCL application through JNI.
  4. A data processing device, characterized in that it is applied to Mahout machine learning algorithms and comprises:
    an initialization unit for reading a sample set, partitioning the data in the sample set into blocks, and initializing parameters;
    a transmission unit for transmitting each data block and the parameters to the corresponding FPGA accelerator card;
    an execution unit for invoking a hardware application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result; and
    a judging unit for judging whether the algorithm has converged and outputting the computation result if it has.
  5. The device according to claim 4, characterized in that the execution unit comprises:
    an execution subunit for invoking an OpenCL application so that each FPGA accelerator card iteratively computes on its data block and returns the computation result.
  6. The device according to claim 5, characterized in that the execution subunit comprises:
    a calling subunit for invoking the OpenCL application through JNI.
CN201711070292.6A 2017-11-03 2017-11-03 A kind of method and device of data processing Pending CN107862386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711070292.6A CN107862386A (en) 2017-11-03 2017-11-03 A kind of method and device of data processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711070292.6A CN107862386A (en) 2017-11-03 2017-11-03 A kind of method and device of data processing

Publications (1)

Publication Number Publication Date
CN107862386A true CN107862386A (en) 2018-03-30

Family

ID=61700600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711070292.6A Pending CN107862386A (en) 2017-11-03 2017-11-03 A kind of method and device of data processing

Country Status (1)

Country Link
CN (1) CN107862386A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750309A (en) * 2012-03-19 2012-10-24 南京大学 Parallelization support vector machine (SVM) solving method based on Hadoop
CN106547627A (en) * 2016-11-24 2017-03-29 郑州云海信息技术有限公司 The method and system that a kind of Spark MLlib data processings accelerate
CN107122243A (en) * 2017-04-12 2017-09-01 杭州远算云计算有限公司 Heterogeneous Cluster Environment and CFD computational methods for CFD simulation calculations
CN107292330A (en) * 2017-05-02 2017-10-24 南京航空航天大学 A kind of iterative label Noise Identification algorithm based on supervised learning and semi-supervised learning double-point information

Similar Documents

Publication Publication Date Title
CN114546405B (en) Method and system for processing graphics using a unified intermediate representation
Igual et al. The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations
Chamberlain et al. Auto-Pipe: Streaming applications on architecturally diverse systems
KR20130114688A (en) Architecture optimizer
KR102371844B1 (en) Computing method applied to artificial intelligence chip, and artificial intelligence chip
CN106528171B (en) Method of interface, apparatus and system between a kind of heterogeneous computing platforms subsystem
US11556756B2 (en) Computation graph mapping in heterogeneous computer system
US10002225B2 (en) Static timing analysis with improved accuracy and efficiency
Souza et al. CAP Bench: a benchmark suite for performance and energy evaluation of low‐power many‐core processors
Lanzagorta et al. Introduction to reconfigurable supercomputing
Flasskamp et al. Performance estimation of streaming applications for hierarchical MPSoCs
US8555030B2 (en) Creating multiple versions for interior pointers and alignment of an array
Sanchez-Roman et al. An euler solver accelerator in FPGA for computational fluid dynamics applications
Khan et al. Accelerating SpMV multiplication in probabilistic model checkers using GPUs
Liu et al. A simulation framework for memristor-based heterogeneous computing architectures
CN107862386A (en) A kind of method and device of data processing
Hubert A survey of HW/SW cosimulation techniques and tools
US10133839B1 (en) Systems and methods for estimating a power consumption of a register-transfer level circuit design
Bombieri et al. HDTLib: an efficient implementation of SystemC data types for fast simulation at different abstraction levels
Yuan et al. Automatic enhanced CDFG generation based on runtime instrumentation
Saussard et al. A novel global methodology to analyze the embeddability of real-time image processing algorithms
Bhimani et al. Design space exploration of GPU Accelerated cluster systems for optimal data transfer using PCIe bus
Li et al. Formal and virtual multi-level design space exploration
Agharass et al. Hardware Software Co-design based CPU-FPGA Architecture: Overview and Evaluation
Stamoulias et al. Hardware accelerators for financial applications in HDL and High Level Synthesis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180330)