CN107273339A - Task processing method and device - Google Patents

Task processing method and device

Info

Publication number
CN107273339A
CN107273339A
Authority
CN
China
Prior art keywords
data node
matrix
data set
parallel
sub-data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710483201.5A
Other languages
Chinese (zh)
Inventor
刘姝
黄雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201710483201.5A priority Critical patent/CN107273339A/en
Publication of CN107273339A publication Critical patent/CN107273339A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/543User-generated data transfer, e.g. clipboards, dynamic data exchange [DDE], object linking and embedding [OLE]

Abstract

The present invention provides a task processing method and device. Multiple data nodes read their corresponding matrix sub-data sets in parallel, and the computing task of each matrix sub-data set is divided into multiple subtasks, i.e. multiple sub-matrix blocks. Each data node distributes its sub-matrix blocks to the corresponding compute nodes, which perform the computation in parallel (parallel reading of files and parallel execution of tasks), realizing efficient, low-cost task processing and greatly improving the user experience.

Description

Task processing method and device
Technical field
The present invention belongs to the field of high-performance computing, and in particular relates to a task processing method and device.
Background technology
With the continuing development of the big data era, large-scale data plays a crucial role in scientific computing and scientific statistics, and matrix multiplication is one of the most widely used algorithms in large-scale scientific computing. As data sets grow and computational complexity rises, the performance demanded of algorithms in scientific applications keeps increasing, and improving algorithm performance is vital to the progress of engineering and research projects. Constrained by computer hardware resources such as memory and computing speed, completing some large-scale matrix operations on a single machine often takes tens of days or even months; and on computing platforms with little memory, the capacity is often insufficient to hold a large-scale data set, which in severe cases causes the program to crash, limiting the feasibility of large-scale data-set computation.
Traditional matrix algorithms are generally implemented on a single compute node by summing the products of the matrix elements one after another, using a serial computation model: data are read in order, and each computation must finish before the next one starts. If the matrices to be multiplied have tens of thousands of rows and columns, the time complexity and cost of multiplying two such matrices serially are easy to imagine.
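The serial baseline just described can be sketched as a plain triple loop. This is an illustrative sketch only (the function name is ours; the patent contains no code):

```python
# Serial baseline (illustration, not from the patent): every element of
# the product is a running sum of element products, computed one after
# another - nothing proceeds until the previous step finishes.

def matmul_serial(A, B):
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must match"
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):          # one output row at a time
        for j in range(p):      # one output column at a time
            s = 0.0
            for k in range(m):  # serial accumulation
                s += A[i][k] * B[k][j]
            C[i][j] = s
    return C

if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(matmul_serial(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

For matrices with tens of thousands of rows and columns the innermost statement runs on the order of 10^12 times, which is the cost the background section alludes to.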
Therefore, there is an urgent need for an efficient, low-cost task processing scheme that solves the above technical problems.
Summary of the invention
The present invention provides a task processing method and device to solve the above problems.
An embodiment of the present invention provides a task processing method comprising the following steps: reading corresponding matrix sub-data sets in parallel through multiple data nodes, and dividing the computing task of each matrix sub-data set into multiple subtasks, i.e. multiple sub-matrix blocks;
distributing, by the data nodes, the sub-matrix blocks to the corresponding compute nodes, the compute nodes performing the computation in parallel.
An embodiment of the present invention provides a task processing device comprising a reading module, a division module, and a distribution computing module, where the reading module is connected to the distribution computing module through the division module;
the reading module is used for reading corresponding matrix sub-data sets in parallel through multiple data nodes;
the division module is used for dividing the computing task of each matrix sub-data set into multiple subtasks, i.e. multiple sub-matrix blocks;
the distribution computing module is used for distributing the sub-matrix blocks to the corresponding compute nodes, the compute nodes performing the computation in parallel.
Through the above scheme, in which corresponding matrix sub-data sets are read in parallel through multiple data nodes, the computing task of each matrix sub-data set is divided into multiple subtasks (i.e. multiple sub-matrix blocks), and the data nodes distribute the sub-matrix blocks to the corresponding compute nodes, which perform the computation in parallel (parallel reading of files and parallel execution of tasks), efficient, low-cost task processing is realized, greatly improving the user experience.
Through the above scheme, each sub-matrix block contains more than one row with consecutive row numbers, so that a compute node can send its computed result to the data node in a single message, reducing the number of communications and thus the communication overhead.
Brief description of the drawings
The accompanying drawings described herein are provided to give a further understanding of the present invention and constitute a part of this application. The illustrative embodiments of the present invention and their description are used to explain the present invention and do not constitute an undue limitation of the present invention. In the drawings:
Fig. 1 shows the flow chart of the task processing method of embodiment 1 of the present invention;
Fig. 2 shows the implementation architecture designed in embodiment 2 of the present invention;
Fig. 3 shows the flow chart of the master/slave process task distribution of embodiment 3 of the present invention;
Fig. 4 shows the structure of the task processing device of embodiment 4 of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments may be combined with one another.
Fig. 1 shows the flow chart of the task processing method of embodiment 1 of the present invention, which comprises the following steps:
Step 101: reading corresponding matrix sub-data sets in parallel through multiple data nodes, and dividing the computing task of each matrix sub-data set into multiple subtasks, i.e. multiple sub-matrix blocks;
Further, the process of reading the corresponding matrix sub-data sets in parallel through multiple data nodes is as follows:
each of the multiple data nodes opens the file storing the matrix through its corresponding data-reading master process, which returns a file handle;
the offset of each data-reading master process into the matrix file is computed, obtaining the logical position in the matrix file of the matrix sub-data set that each data-reading master process is to read;
each data-reading master process then reads its corresponding matrix sub-data set according to the logical position of that matrix sub-data set in the matrix file.
Preferably, after the multiple data nodes have read the corresponding matrix sub-data sets in parallel, the sub-data sets are stored in memory.
Further, the data node divides the matrix sub-data set evenly into multiple sub-matrix blocks according to the number of compute nodes and sends them to the corresponding compute nodes.
Preferably, each sub-matrix block contains more than one row, with consecutive row numbers.
If the number of rows of the matrix sub-data set read by the data node is M and the number of compute nodes is child_process, each compute node is assigned per_rank = M / child_process rows;
the remaining extra = M % child_process rows are distributed evenly to compute nodes 1 through extra, each of whose processes receives one additional row.
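The row distribution above can be checked with a short sketch; `distribute_rows` is our illustrative name, not part of the patent:

```python
def distribute_rows(M, child_process):
    """Rows assigned to each of child_process compute nodes, following
    the scheme in the description: per_rank = M // child_process rows
    each, plus one extra row for nodes 1..extra where
    extra = M % child_process."""
    per_rank = M // child_process
    extra = M % child_process
    return [per_rank + 1 if rank < extra else per_rank
            for rank in range(child_process)]

counts = distribute_rows(10, 4)
print(counts)              # [3, 3, 2, 2]
assert sum(counts) == 10   # every row is assigned exactly once
```

The extra rows go to the lowest-numbered nodes so that no node's load differs from another's by more than one row.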
Step 102: the data node distributes the sub-matrix blocks to the corresponding compute nodes, and the compute nodes perform the computation in parallel.
Further, after a compute node finishes computing its sub-matrix block, it sends the computed result to the corresponding data node.
Further, one data node is selected from among the data nodes as the statistics data node;
the statistics data node collects and stores the computed results obtained from the other data nodes.
On the one hand, the method proposed in the embodiments of the present invention divides a large-scale data set into numerous matrix sub-data sets by reading the file in parallel, solving the memory limitation problem; on the other hand, by dividing the computing task of each matrix sub-data set into multiple subtasks, multiple computing subtasks are executed in parallel, improving the computational efficiency of the algorithm on large-scale data sets.
(1) For the processing of large-scale data sets, parallel file reading based on parallel file I/O technology is used. When the matrix is stored in a file, several nodes of the multi-node cluster are selected as data nodes; by specifying explicit offsets into the file, each reads data from a different position, dividing the large matrix data set so that each data node obtains a subset of the whole matrix data set from the matrix file. Since multiple data nodes read data in parallel, these nodes can simultaneously act as MPI master processes that distribute and manage the data. On the one hand this approach solves the problem that the matrix is too large for a single node's memory to store; at the same time, reading via parallel file offset addresses lets all data nodes read in parallel, improving file access efficiency.
(2) To address the inefficiency of serial matrix computation, multiple-subtask division is used. In a multi-node cluster system, MPI multi-process communication technology divides the computing task of each matrix sub-data set read in parallel in (1) into multiple subtasks, i.e. the matrix is divided into multiple sub-matrix blocks that are handed to different processes. The node hosting a process that handles a sub-matrix block is a compute node (the node hosting a master process is a data node; the node hosting a slave process is a compute node), and the computing tasks on the compute nodes are executed in parallel.
The embodiments of the present invention provide an effective master/slave process coordination scheme: the master process completes the division of the task and the distribution and collection of the data, while the slave processes complete the parallel computation of the sub-data sets. If multiple data nodes read data in parallel, there are multiple master processes, each responsible for multiple slave processes (i.e. each data node is responsible for multiple compute nodes).
(3) The matrix multiplication optimization method proposed in the embodiments of the present invention can run on large-scale cluster platforms. Using MPI, large-scale data sets are read through parallel file reading by multiple data nodes: by setting different offset addresses, different data nodes simultaneously obtain sub-data sets from different positions in the file, which both improves file reading efficiency and solves the problem that a single node's limited memory cannot store the whole data set. To improve computational efficiency, MPI multi-process cooperation divides the computing task of a matrix data set into the computation of multiple sub-matrix blocks; each slave process is responsible for one sub-matrix block, and the slave processes complete the computation of their blocks in parallel.
(4) The distinguishing feature of the embodiments of the present invention is that, in a large-scale cluster system where a single node cannot store the whole large-scale data set, multiple data nodes are set to read the corresponding sub-data sets in parallel; the matrix data on each data node is divided into multiple sub-matrix blocks, which are distributed in parallel to multiple compute nodes; the nodes cooperate to process the different sub-matrix blocks, and finally the data nodes uniformly manage the communication and the collation of the result data.
A specific description follows.
In the optimization method for large-scale matrix multiplication based on a cluster system, the nodes are divided into data nodes and compute nodes. When the memory of a single node cannot store the data set, multiple nodes use parallel file I/O: by specifying explicit offsets into the file, different nodes read data from different positions; these nodes are called data nodes. Unlike a traditional MPI program, the MPI program realized by the embodiments of the present invention has multiple "master processes": each process on a data node that reads the file in parallel is called a master process. To improve computational efficiency, each master process further divides the sub-data set it has obtained (i.e. the matrix sub-data set on the data node is divided into multiple sub-matrix blocks) and distributes the blocks to the corresponding compute nodes (the nodes hosting the slave processes are called compute nodes). The multiple compute nodes then compute their sub-matrix blocks in parallel, replacing the serial computation used before the matrix was divided. Through two levels of parallelism (parallel data reading and parallel computation) and two levels of division (the large-scale data set into sub-data sets, and the matrix blocks into sub-matrix blocks), the algorithm optimization of large-scale matrix multiplication on a cluster platform is realized. The implementation architecture of the embodiments of the present invention is shown in Fig. 2: each data node obtains a matrix sub-data set (sub_dataset) from the database; each data node is responsible for different compute nodes (a compute node obtains its corresponding sub-matrix block, e.g. sub1, and returns the computed result res to its data node); finally the computed results are uniformly collected (res_dataset) and stored by one of the data nodes.
The embodiment in which multiple data nodes (i.e. multiple data-reading master processes) read the file in parallel is as follows. First, MPI_File_open(comm, filename, ...) is called; each process opens the file storing the matrix and receives a file handle, through which all of that process's file operations are performed. The offset of each process into its part of the matrix file is computed, giving the logical position in the matrix file of the data each process needs, so each process (data node) obtains a part of the matrix data set (a sub-data set). Each data-reading process then calls MPI_File_read_at() to read a different part of the matrix in the file, and each data node stores the sub-matrix data set it has obtained in its own memory for the next computation step.
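The offset arithmetic behind this parallel read can be sketched as follows. The helper below is our illustration under the assumption that the matrix is stored row-major as 8-byte doubles; the MPI calls named in the description appear only as comments:

```python
DOUBLE_SIZE = 8  # assumption: matrix stored row-major as 8-byte doubles

def read_plan(total_rows, cols, num_data_nodes):
    """For each data-reading master process, compute the (byte_offset,
    row_count) of the matrix sub-data set it should read; the byte
    offset is what would be passed to MPI_File_read_at()."""
    base = total_rows // num_data_nodes
    extra = total_rows % num_data_nodes
    plan, row = [], 0
    for rank in range(num_data_nodes):
        rows = base + (1 if rank < extra else 0)
        plan.append((row * cols * DOUBLE_SIZE, rows))
        # In the MPI implementation each process would now do roughly:
        #   MPI_File_open(comm, filename, ..., &fh);
        #   MPI_File_read_at(fh, byte_offset, buf, rows*cols, MPI_DOUBLE, &st);
        row += rows
    return plan

print(read_plan(10, 4, 3))  # [(0, 4), (128, 3), (224, 3)]
```

Because each process computes a disjoint byte range, all data nodes can issue their reads simultaneously without coordination, which is the source of the file-access speedup the description claims.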
As noted above, the method realized by the embodiments of the present invention uses MPI multi-process technology to divide the data set twice. The first division is that multiple data nodes (master processes) read the sub-data sets of the large-scale data set in parallel and store them in memory. The second division is that each data node divides its matrix sub-data set again into multiple sub-matrix blocks; the computation of each sub-matrix block is assigned to a corresponding compute node (slave process), and the compute nodes compute the blocks in parallel, converting the serial computation of one large-scale data set into parallel computation of sub-matrix blocks by multiple compute nodes. With multiple data nodes reading the data set in parallel and each data node corresponding to multiple compute nodes, the data node acts as a master process that evenly divides the matrix sub-data set into multiple sub-matrix blocks and sends them to the corresponding compute nodes (the slave processes). After the compute nodes finish computing the sub-matrix blocks in parallel, the computed results are uniformly sent to the corresponding data nodes; each data node completes the distribution and collection of data, and finally one data node is selected to collect and store the result data.
The flow in which the data node (master process) divides the sub-matrix blocks and the compute nodes (slave processes) compute them in parallel is shown in Fig. 3 and mainly includes the following key points:
the master process divides the sub-matrix blocks evenly and sends several consecutive rows of data to each slave process;
the slave processes compute their sub-matrix blocks in parallel and send the computed results to the master process;
the master process receives the feedback data from the slave processes, and the results are collected and stored.
In addition, during the division and transmission of the sub-matrix blocks, to guarantee contiguous memory reads, the data the master process sends to each slave process must be several consecutive rows of the matrix; after receiving its data from the master process, each slave process computes and then sends the computed result to the master process in a single message, which reduces the number of communications and thus the communication overhead. For the handling of the remainder: if the current matrix has M rows and there are child_process slave processes, each slave process first obtains per_rank = M / child_process rows; to keep the load of the processes balanced, the remaining extra = M % child_process rows are distributed evenly to slave processes 1 through extra, each of which receives one additional row. That is, the master process sends M / child_process + 1 rows of data to each of slave processes 1 through extra, and M / child_process rows of data to slave processes extra + 1 through child_process.
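The master/slave flow of Fig. 3 can be sketched as follows (our illustration, not the patent's code): the master sends each slave a block of consecutive rows of A, each slave multiplies its block by B, and the whole block result comes back in a single message. MPI_Send/MPI_Recv are replaced by plain function calls so the row bookkeeping can be checked:

```python
def split_consecutive(A, child_process):
    """Contiguous row blocks: slaves 1..extra get one extra row."""
    M = len(A)
    per_rank, extra = divmod(M, child_process)
    blocks, start = [], 0
    for rank in range(child_process):
        rows = per_rank + (1 if rank < extra else 0)
        blocks.append(A[start:start + rows])   # consecutive rows only
        start += rows
    return blocks

def slave_compute(block, B):
    """Each slave multiplies its sub-matrix block by B, then would send
    the whole block result back to the master in one message."""
    p = len(B[0])
    return [[sum(a[k] * B[k][j] for k in range(len(B))) for j in range(p)]
            for a in block]

def master(A, B, child_process):
    blocks = split_consecutive(A, child_process)
    results = [slave_compute(b, B) for b in blocks]   # in MPI: in parallel
    return [row for res in results for row in res]    # collect in row order

A = [[1, 0], [0, 1], [2, 2]]
B = [[3, 4], [5, 6]]
print(master(A, B, 2))  # [[3, 4], [5, 6], [16, 20]]
```

Because each slave's rows are consecutive, the master can reassemble the final product simply by concatenating the returned blocks in rank order.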
Fig. 4 shows the structure of the task processing device of embodiment 4 of the present invention, comprising a reading module, a division module, and a distribution computing module, where the reading module is connected to the distribution computing module through the division module;
the reading module is used for reading corresponding matrix sub-data sets in parallel through multiple data nodes;
the division module is used for dividing the computing task of each matrix sub-data set into multiple subtasks, i.e. multiple sub-matrix blocks;
the distribution computing module is used for distributing the sub-matrix blocks to the corresponding compute nodes, the compute nodes performing the computation in parallel.
Further, each sub-matrix block contains more than one row, with consecutive row numbers.
Through the above scheme, in which corresponding matrix sub-data sets are read in parallel through multiple data nodes, the computing task of each matrix sub-data set is divided into multiple subtasks (i.e. multiple sub-matrix blocks), and the data nodes distribute the sub-matrix blocks to the corresponding compute nodes, which perform the computation in parallel (parallel reading of files and parallel execution of tasks), efficient, low-cost task processing is realized, greatly improving the user experience.
Through the above scheme, each sub-matrix block contains more than one row with consecutive row numbers, so that a compute node can send its computed result to the data node in a single message, reducing the number of communications and thus the communication overhead.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (10)

1. A task processing method, characterized in that it comprises the following steps:
reading corresponding matrix sub-data sets in parallel through multiple data nodes, and dividing the computing task of each matrix sub-data set into multiple subtasks, i.e. multiple sub-matrix blocks;
distributing, by the data nodes, the sub-matrix blocks to the corresponding compute nodes, the compute nodes performing the computation in parallel.
2. The method according to claim 1, characterized in that the process of reading the corresponding matrix sub-data sets in parallel through multiple data nodes is:
each of the multiple data nodes opens the file storing the matrix through its corresponding data-reading master process, which returns a file handle;
the offset of each data-reading master process into the matrix file is computed, obtaining the logical position in the matrix file of the matrix sub-data set that each data-reading master process is to read;
each data-reading master process reads its corresponding matrix sub-data set according to the logical position of that matrix sub-data set in the matrix file.
3. The method according to claim 1, characterized in that after the multiple data nodes have read the corresponding matrix sub-data sets in parallel, the sub-data sets are stored in memory.
4. The method according to claim 1, characterized in that the data node divides the matrix sub-data set evenly into multiple sub-matrix blocks according to the number of compute nodes and sends them to the corresponding compute nodes.
5. The method according to claim 1, characterized in that each sub-matrix block contains more than one row, with consecutive row numbers.
6. The method according to claim 5, characterized in that if the number of rows of the matrix sub-data set read by the data node is M and the number of compute nodes is child_process, each compute node is assigned per_rank = M / child_process rows,
and the remaining extra = M % child_process rows are distributed evenly to compute nodes 1 through extra, each of whose processes receives one additional row.
7. The method according to claim 1, characterized in that after a compute node finishes computing its sub-matrix block, it sends the computed result to the corresponding data node.
8. The method according to claim 7, characterized in that one data node is selected from among the data nodes as the statistics data node;
the statistics data node collects and stores the computed results obtained from the other data nodes.
9. A task processing device, characterized in that it comprises a reading module, a division module, and a distribution computing module, where the reading module is connected to the distribution computing module through the division module;
the reading module is used for reading corresponding matrix sub-data sets in parallel through multiple data nodes;
the division module is used for dividing the computing task of each matrix sub-data set into multiple subtasks, i.e. multiple sub-matrix blocks;
the distribution computing module is used for distributing the sub-matrix blocks to the corresponding compute nodes, the compute nodes performing the computation in parallel.
10. The device according to claim 9, characterized in that each sub-matrix block contains more than one row, with consecutive row numbers.
CN201710483201.5A 2017-06-21 2017-06-21 A kind of task processing method and device Pending CN107273339A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710483201.5A CN107273339A (en) 2017-06-21 2017-06-21 A kind of task processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710483201.5A CN107273339A (en) 2017-06-21 2017-06-21 A kind of task processing method and device

Publications (1)

Publication Number Publication Date
CN107273339A true CN107273339A (en) 2017-10-20

Family

ID=60068355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710483201.5A Pending CN107273339A (en) 2017-06-21 2017-06-21 A kind of task processing method and device

Country Status (1)

Country Link
CN (1) CN107273339A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764490A (en) * 2018-08-28 2018-11-06 合肥本源量子计算科技有限责任公司 A kind of quantum virtual machine
CN109189732A (en) * 2018-08-03 2019-01-11 成都四方伟业软件股份有限公司 A kind of median analysis method and device
CN109669772A (en) * 2018-12-28 2019-04-23 第四范式(北京)技术有限公司 Calculate the parallel execution method and apparatus of figure
CN113254078A (en) * 2021-06-23 2021-08-13 北京睿芯高通量科技有限公司 Data stream processing method for efficiently executing matrix addition on GPDPU simulator
CN113568736A (en) * 2021-06-24 2021-10-29 阿里巴巴新加坡控股有限公司 Data processing method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040193841A1 (en) * 2003-03-31 2004-09-30 Fujitsu Limited Matrix processing device in SMP node distributed memory type parallel computer
CN102831102A (en) * 2012-07-30 2012-12-19 北京亿赞普网络技术有限公司 Method and system for carrying out matrix product operation on computer cluster
CN104461466A (en) * 2013-09-25 2015-03-25 广州中国科学院软件应用技术研究所 Method for increasing computing speed through parallel computing based on MPI and OpenMP hybrid programming model
CN104461467A (en) * 2013-09-25 2015-03-25 广州中国科学院软件应用技术研究所 Method for increasing calculation speed of SMP cluster system through MPI and OpenMP in hybrid parallel mode
CN105260342A (en) * 2015-09-22 2016-01-20 浪潮(北京)电子信息产业有限公司 Solving method and system for symmetric positive definite linear equation set
CN106062732A (en) * 2015-02-06 2016-10-26 华为技术有限公司 Data processing system, calculation node and data processing method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040193841A1 (en) * 2003-03-31 2004-09-30 Fujitsu Limited Matrix processing device in SMP node distributed memory type parallel computer
CN102831102A (en) * 2012-07-30 2012-12-19 北京亿赞普网络技术有限公司 Method and system for carrying out matrix product operation on computer cluster
CN104461466A (en) * 2013-09-25 2015-03-25 广州中国科学院软件应用技术研究所 Method for increasing computing speed through parallel computing based on MPI and OpenMP hybrid programming model
CN104461467A (en) * 2013-09-25 2015-03-25 广州中国科学院软件应用技术研究所 Method for increasing calculation speed of SMP cluster system through MPI and OpenMP in hybrid parallel mode
CN106062732A (en) * 2015-02-06 2016-10-26 华为技术有限公司 Data processing system, calculation node and data processing method
CN105260342A (en) * 2015-09-22 2016-01-20 浪潮(北京)电子信息产业有限公司 Solving method and system for symmetric positive definite linear equation set

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHOU CAN (周灿): "Research on Parallel Algorithms for Matrix Operations Based on MPI", China Master's Theses Full-text Database, Information Science and Technology *
LI XIAOWEI (李小卫) et al.: "Parallel I/O Methods Based on MPI", Microcomputer & Its Applications (《微型机与应用》) *
XU YANQIN (许彦芹) et al.: "Research and Implementation of an MPI+CUDA Model Based on SMP Clusters", Computer Engineering and Design (《计算机工程与设计》) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109189732A (en) * 2018-08-03 2019-01-11 成都四方伟业软件股份有限公司 A kind of median analysis method and device
CN108764490A (en) * 2018-08-28 2018-11-06 合肥本源量子计算科技有限责任公司 A kind of quantum virtual machine
CN109669772A (en) * 2018-12-28 2019-04-23 第四范式(北京)技术有限公司 Calculate the parallel execution method and apparatus of figure
CN111522640A (en) * 2018-12-28 2020-08-11 第四范式(北京)技术有限公司 Parallel execution method and equipment of computational graph
CN113254078A (en) * 2021-06-23 2021-08-13 北京睿芯高通量科技有限公司 Data stream processing method for efficiently executing matrix addition on GPDPU simulator
CN113254078B (en) * 2021-06-23 2024-04-12 北京中科通量科技有限公司 Data stream processing method for efficiently executing matrix addition on GPDPU simulator
CN113568736A (en) * 2021-06-24 2021-10-29 阿里巴巴新加坡控股有限公司 Data processing method and device

Similar Documents

Publication Publication Date Title
CN107273339A (en) A kind of task processing method and device
CN108875958A (en) Use the primary tensor processor of outer product unit
US8751556B2 (en) Processor for large graph algorithm computations and matrix operations
US7680765B2 (en) Iterate-aggregate query parallelization
CN103049241B (en) A kind of method improving CPU+GPU isomery device calculated performance
CN108875956B (en) Primary tensor processor
CN106951926A (en) The deep learning systems approach and device of a kind of mixed architecture
CN103324765B (en) A kind of multi-core synchronization data query optimization method based on row storage
CN104239144A (en) Multilevel distributed task processing system
CN103019728A (en) Effective complex report parsing engine and parsing method thereof
Tantalaki et al. Pipeline-based linear scheduling of big data streams in the cloud
WO2014052942A1 (en) Random number generator in a parallel processing database
CN110929884A (en) Classification method and device for distributed machine learning optimization based on column division
CN106371924B (en) A kind of method for scheduling task minimizing MapReduce cluster energy consumption
CN105786619B (en) Virtual machine distribution method and device
CN107085562A (en) A kind of neural network processor and design method based on efficient multiplexing data flow
CN107491416A (en) Reconfigurable Computation structure and calculating dispatching method and device suitable for Arbitrary Dimensions convolution demand
CN105677763A (en) Image quality evaluating system based on Hadoop
CN107402926A (en) A kind of querying method and query facility
CN106412124A (en) Task allocation system and task allocation method for parallel ordering cloud service platform
Nicol et al. Efficient aggregation of multiple PLs in distributed memory parallel simulations
CN106844320A (en) A kind of financial statement integration method and equipment
CN105608138B (en) A kind of system of optimization array data base concurrency data loading performance
CN104156505B (en) A kind of Hadoop cluster job scheduling method and devices based on user behavior analysis
CN107436865A (en) A kind of word alignment training method, machine translation method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20171020