A parallel computing method oriented to data intensity and data dependence
Technical field
The invention belongs to the technical field of parallel systems and relates to quantitative research on data partitioning and parallel scheduling; in particular, it proposes a method that uses a data dependency graph to establish and quantify a data relationship model, and a parallel computing method based on that data model.
Background technology
Parallel digital terrain analysis is the combination of parallel computing and geography. Parallel computing is one of the important research topics in computer science; it initially aimed at solving numerical or non-numerical computational problems and is mainly oriented toward scientific computing over large volumes of data. Digital Terrain Analysis (DTA) is the digital information processing that performs terrain attribute calculation and feature extraction on the basis of a Digital Elevation Model (DEM).
With the continual renewal of sensors and surveying techniques, the scale of DEM data keeps increasing, which makes processing such massive data in a stand-alone environment very difficult. Parallel computing techniques are therefore needed to overcome the computational bottleneck of a single processor.
At present, research on parallel digital terrain analysis concentrates mainly on data-parallel strategies. An analysis of the existing work shows that, although parallel digital terrain analysis has achieved a certain degree of development and has obtained performance improvements of varying degrees, the theory of data parallelism urgently needs systematic organization and a quantitative assessment of data granularity. Many scholars focus on data partitioning and storage strategies so as to manage massive spatial data more conveniently, but parallel algorithm design lacks a quantified theoretical foundation to guide the splitting and management of data. Digital terrain analysis achieves parallelism mainly by splitting data and tasks, and it exhibits complicated data dependence relations. Parallel digital terrain analysis has its own distinctive data characteristics; if the internal relations among the data are ignored, the relation between parallel task computation and the data cannot be properly determined in its intension and essence. How to carry out the decomposition of the data and the scheduling of the distribution of data and tasks are all key factors that affect parallelization efficiency.
Summary of the invention
In view of the above technical problems, the present invention proposes a parallel computation model, parallelization strategies, and a corresponding dispatching method for computations with data-dependence characteristics.
The parallel computation model of the present invention and its corresponding parallelization strategies and dispatching method comprise:
A parallel computing method oriented to data intensity and data dependence, comprising the following steps:
(1) for data with data-intensive characteristics, determining a partitioning method;
(2) for data with data-intensive characteristics, carrying out parallel data modeling and the corresponding parallelization strategies;
(3) dispatching on the basis of the parallel data model and parallelization strategies of step (2).
The determination process of the partitioning method of step (1) is:
Step 101: import the data; for a parallel computation model with data-intensive characteristics, first partition the data effectively, then perform the computation for each block separately, and finally fuse the results;
Step 102: select the data partitioning method according to the characteristics of the data to be processed: the methods comprise strip partitioning and block partitioning, where strip partitioning is divided into row partitioning and column partitioning, and block partitioning is divided into square partitioning and rectangular partitioning; determine the number of blocks n from the relation between Msize/(mem·k) and the number of computing nodes P: if ⌈Msize/(mem·k)⌉ > P, then n = ⌈Msize/(mem·k)⌉; otherwise n = P; wherein Msize is the size of the data that must be processed, mem is the memory provided by each node, k is the number of processors provided by each node, and P is the number of computing nodes;
Step 103: process the data with the parallel method corresponding to the chosen partitioning method.
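The block-count rule of step 102 can be sketched as follows. This is a minimal illustration, not the invention's implementation; the function and variable names are ours, and the ceiling-based reading of the elided formula is our interpretation of the text:

```python
import math

def block_count(msize: float, mem: float, k: int, p: int) -> int:
    """Choose the number of data blocks n for partitioning.

    msize: size of the data that must be processed
    mem:   memory provided by each computing node
    k:     number of processors per node
    p:     number of computing nodes
    """
    # Blocks needed so that each block fits within one processor's share of memory.
    needed = math.ceil(msize / (mem * k))
    # Use more blocks than nodes only when the data do not fit otherwise.
    return needed if needed > p else p

# Example: 100 GB of data, 4 GB memory per node, 2 processors per node, 8 nodes:
# 100 / (4 * 2) = 12.5, so 13 blocks are needed, which exceeds P = 8.
print(block_count(100, 4, 2, 8))  # 13
```

When the data fit comfortably (needed ≤ P), the rule falls back to n = P so that every node receives one block.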
When step (1) adopts the strip partitioning method, the process of step (2) specifically comprises:
Step 201: perform strip partitioning for the concrete data-intensive characteristics, dividing the data by rows or by columns;
Step 202: parallelization strategy: consider whether the processing algorithm induces dependence relations among the partitioned data blocks; if there is no dependence among the data blocks, execute step 203; if there is dependence among the data blocks, execute step 204;
Step 203: if the tasks of the data blocks are mutually independent, execute them in parallel directly;
Step 204: if the tasks of the data blocks depend on one another, the computation tasks can be executed in order, starting from data block 1 and data block n and proceeding toward the middle; at most two tasks execute in parallel at any time, so only two computing nodes are needed;
Step 205: carry out the follow-up work.
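The two-ends-to-middle execution order of step 204 can be sketched as a sequential simulation. This is a sketch under our reading of the text; the pairing of blocks per round and the function name are our own:

```python
def dependent_strip_rounds(n: int):
    """Order in which n strip blocks with chained dependences are computed:
    start from block 1 and block n and advance toward the middle, so at most
    two blocks (hence two computing nodes) run in parallel in each round."""
    rounds = []
    lo, hi = 1, n
    while lo < hi:
        rounds.append((lo, hi))   # these two blocks occupy the two nodes
        lo, hi = lo + 1, hi - 1
    if lo == hi:                  # odd n leaves a single middle block
        rounds.append((lo,))
    return rounds

print(dependent_strip_rounds(6))  # [(1, 6), (2, 5), (3, 4)]
print(dependent_strip_rounds(5))  # [(1, 5), (2, 4), (3,)]
```

The number of rounds matches the timing claims given later: n/2 rounds for even n and [n/2] + 1 rounds for odd n.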
When step (1) adopts the block partitioning method, the process of step (2) specifically comprises:
Step 301: partition the region data into square blocks; if the region data is square, execute step 302; if the region data is non-square, execute step 305;
Step 302: parallelization strategy: establish an n × n block partitioning; if no dependence exists among the data blocks, execute step 303; if dependence exists among the data blocks, execute step 304;
Step 303: if the tasks corresponding to the data blocks are mutually independent, under the shared-memory computing mode the available n² nodes can carry out the parallel computation and obtain the best efficiency; under the distributed-memory computing mode, the improvement in computing efficiency depends on the granularity g = Tcp/Tcc and the data distribution degree d = Tdis/Tcc, wherein Tcp is the computing time of each data block, Tcc is the communication time between data blocks, and Tdis is the distribution time of each data block; the optimal number of processors required by the computation is p = [g + d], wherein [ ] is the rounding operator;
Step 304: if dependence exists among the tasks corresponding to the data blocks, the partitioned data blocks can be divided into 2n − 1 layers; the number of blocks per layer first increases progressively, each layer contains at most n data blocks, and the computation of a lower layer depends on the results of the upper layer; therefore only the data blocks within the same layer can be computed in parallel, and the parallel computation needs at most n nodes;
Step 305: parallelization strategy: establish an n × m block partitioning; if no dependence exists among the data blocks, proceed as in step 303; if dependence exists among the data blocks, execute step 306;
Step 306: when dependence exists among the data blocks, the n × m partition has m + n − 1 layers, each layer contains at most n data blocks, and the parallel method is the same as in step 304; therefore at most n nodes are needed to carry out the parallel computation and reach the highest speed-up ratio.
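The layering of steps 304 and 306 can be sketched by grouping the blocks of an n × m partition into anti-diagonal layers. This is a sketch, not the invention's code; the row-major block coordinates are our own convention:

```python
def dependency_layers(n: int, m: int):
    """Group the blocks of an n x m partition into anti-diagonal layers.
    Layer l holds the blocks (i, j) with i + j == l; a layer can start only
    after the previous layer has finished.  There are n + m - 1 layers and
    each layer holds at most min(n, m) blocks."""
    layers = [[] for _ in range(n + m - 1)]
    for i in range(n):
        for j in range(m):
            layers[i + j].append((i, j))
    return layers

layers = dependency_layers(3, 3)           # square case: 2*3 - 1 = 5 layers
print(len(layers))                          # 5
print(max(len(layer) for layer in layers))  # 3 -> at most n nodes are needed
```

For the rectangular case, `dependency_layers(3, 4)` yields 3 + 4 − 1 = 6 layers with at most 3 blocks per layer, matching step 306.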
The dispatching method of step (3) comprises:
Step 401: the host node selects an effective data partitioning method to partition the data, then judges according to the processing algorithm whether the data blocks have dependences, and distributes the data;
Step 402: judge according to the processing algorithm whether the data blocks have dependences; if no dependence exists among the data blocks, execute step 403; if dependence exists among the data blocks, execute step 404;
Step 403: the host node distributes all the data blocks according to the computational resources, each node starts computing, and the computation results are sent back to the host node;
Step 404: if the data are partitioned by the strip partitioning method, execute step 405; otherwise execute step 408;
Step 405: the host node distributes the two end data blocks, which have no dependence on each other, i.e. data block 1 and data block n, to two nodes; the nodes start computing until the computation is complete;
Step 406: the host node continues by distributing data block 2 and data block n − 1 to the above two nodes; the nodes start computing on the basis of the previous results until the computation is complete;
Step 407: process data block n/2 and data block n/2 − 1 in the same manner; when all the blocks have been computed, the computation results are sent back to the host node; execute step 417;
Step 408: if the data are partitioned into n × n blocks, execute step 409; otherwise execute step 416;
Step 409: according to the data dependence graph, the computation of a lower layer depends on the results of the upper layer; the host node distributes the 1st-layer data, i.e. data block 1, the node starts computing, and the computation result is sent back to the host node;
Step 410: the host node distributes the 2nd-layer data: first the host node distributes data block 2, which depends on data block 1, to the original computing node; the node computes and sends the result back to the host node;
Step 411: at the same time the host node starts a new node and distributes data block 3, which depends on data block 1, together with the computation result of the previous layer, to the new node; the node computes and sends the result back to the host node;
Step 412: the host node distributes the 3rd-layer data: the host node starts another new node and distributes the data blocks that depend on the upper-layer data, together with the computation results of the upper layer, to the three nodes; the nodes compute and send the results back to the host node;
Step 413: process the following layers in the same manner, starting one new node per layer, until the processing of the nth layer is finished; at this point the number of nodes totals n;
Step 414: the host node distributes the (n+1)th-layer data: according to the data dependence graph, after the nth layer the number of data blocks per layer starts to decrease one by one, so the host node starts no further new nodes; it distributes the (n+1)th-layer data blocks and the nth-layer computation results to n − 1 nodes; the n − 1 nodes compute and send the results back to the host node;
Step 415: the host node distributes the remaining layers of data blocks, together with the computation results they depend on, in turn, until all data blocks have been computed and the results returned; execute step 417;
Step 416: if the data are partitioned into n × m blocks, the data are processed as in the n × n case, except that after the nth layer has been processed, since the partition is now n × m, the following |m − n| + 1 layers each occupy n nodes, during which the host node does not need to start any new node; execute step 417;
Step 417: the host node assembles the results.
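The dispatching of steps 409–415 can be sketched as a layer-by-layer loop in which the host node grows the worker pool one node per layer up to n nodes and then reuses it. This is a schematic simulation under our reading of steps 413–414, not a distributed implementation; all names are ours:

```python
def simulate_dispatch(n: int, m: int):
    """Simulate the host-node schedule for an n x m block partition with
    dependences.  Anti-diagonal layer l (l = 0 .. n+m-2) holds
    min(l+1, n, m, n+m-1-l) blocks; the host starts one new node per layer
    until the pool reaches the widest layer, then keeps the pool fixed.
    Returns (nodes_started, blocks_per_layer)."""
    total_layers = n + m - 1
    nodes = 0
    blocks_per_layer = []
    for layer in range(total_layers):
        blocks = min(layer + 1, n, m, total_layers - layer)
        nodes = max(nodes, blocks)  # grow the pool, never beyond min(n, m)
        blocks_per_layer.append(blocks)
    return nodes, blocks_per_layer

print(simulate_dispatch(3, 3))  # (3, [1, 2, 3, 2, 1])
print(simulate_dispatch(3, 4))  # (3, [1, 2, 3, 3, 2, 1])
```

The simulation reproduces the counts stated in the steps: the block counts rise to n, stay at n for the middle layers of a rectangular partition, then fall, and the node pool never exceeds n.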
Technical characteristics and beneficial effects of the present invention:
1. Based on data-intensive data, the present invention proposes a quantified theoretical foundation to guide the splitting and management of data in parallel algorithm design; it takes into account the key factors of complicated data dependences and task dependences, and of how to carry out the decomposition of data and tasks and the scheduling of their distribution.
2. Based on the data partitioning method, the present invention considers separately whether dependence relations exist among the data blocks and carries out different data-processing parallelizations accordingly; at the same time it proposes the corresponding dispatching algorithm for the parallel computation process.
3. The present invention is fully applicable to high-performance computing scenarios of parallel digital terrain analysis over large-scale massive data, for example parallel interpolation of regular grids, parallel computation of slope gradient and aspect, parallel depression filling, viewshed terrain analysis, and terrain factor extraction; it can be applied to high-performance geographic information processing; it can also be applied to scenarios such as spatial decision-making and data mining based on geographic information, improving processing efficiency.
Accompanying drawing explanation
Fig. 1 is a parallelization processing case for strip-partitioned data in an embodiment of the invention; (a) shows the data partitioned into strips, (b) shows the corresponding parallel modeling of the partitioned data;
Fig. 2 is a parallelization processing case for 3 × 3 partitioned data in an embodiment of the invention; (a) shows the data partitioned into 3 × 3 blocks, (b) shows the corresponding parallel modeling of the partitioned data;
Fig. 3 is a parallelization processing case for 4 × 4 partitioned data in an embodiment of the invention; (a) shows the data partitioned into 4 × 4 blocks, (b) shows the corresponding parallel modeling of the partitioned data;
Fig. 4 is a parallelization processing case for n × n partitioned data in an embodiment of the invention; (a) shows the data partitioned into n × n blocks, (b) shows the corresponding parallel modeling of the partitioned data;
Fig. 5 is a parallelization processing case for 2 × 3 partitioned data in an embodiment of the invention; (a) shows the data partitioned into 2 × 3 blocks, (b) shows the corresponding parallel modeling of the partitioned data;
Fig. 6 is a parallelization processing case for 3 × 4 partitioned data in an embodiment of the invention; (a) shows the data partitioned into 3 × 4 blocks, (b) shows the corresponding parallel modeling of the partitioned data;
Fig. 7 is a flow chart of the dispatching algorithm for the parallel computation of the partitioned data in an embodiment of the invention.
Embodiment
The present invention is illustrated below in conjunction with the accompanying drawings. It should be noted that the described embodiments are intended for illustration only and not as a limitation of the invention.
The embodiments of the invention provide a parallel computing method oriented to data intensity and data dependence, comprising the following steps:
1. Determining the data partitioning method for the parallel computation model
Step 101: import the data; for a parallel computation model with data-intensive characteristics, the data are normally first partitioned effectively, then the computation is performed for each block separately, and finally the results are fused;
Step 102: select the data partitioning method according to the characteristics of the data to be processed: the common partitioning methods are strip partitioning and block partitioning, where strip partitioning is divided into row partitioning and column partitioning, and block partitioning is divided into square partitioning and rectangular partitioning; determine the number of blocks n from the relation between Msize/(mem·k) and the number of computing nodes P: if ⌈Msize/(mem·k)⌉ > P, then n = ⌈Msize/(mem·k)⌉; otherwise n = P; wherein Msize is the size of the data that must be processed, mem is the memory provided by each node, k is the number of processor cores provided by each node, and P is the number of computing nodes;
Step 103: carry out the follow-up work.
If the processing algorithm induces no dependence among the data blocks, the tasks corresponding to the data blocks are independent and can be executed in parallel directly. If dependence exists among the data blocks (see Fig. 1), then since data block 1 and data block n start from the borders, only two data blocks can be computed in parallel at a time, so only two nodes are needed.
2. The parallel computation modeling method with strip partitioning for data-intensive characteristics comprises:
Step 201: divide the data by rows or by columns;
Step 202: parallelization strategy: consider whether the processing algorithm induces dependence relations among the partitioned data blocks; if there is no dependence among the data blocks, execute step 203; if there is dependence among the data blocks, execute step 204;
Step 203: if the tasks of the data blocks are mutually independent, execute them in parallel directly. The efficiency depends on the number m of computing nodes: when m divides n exactly, the computing time is (n/m)t; otherwise it is ([n/m] + 1)t, wherein t is the computing time of a single data block.
Step 204: if the tasks of the data blocks depend on one another, the computation tasks can be executed in order, starting from data block 1 and data block n and proceeding toward the middle; at most two tasks execute in parallel at any time, so two computing nodes are needed. When n is even, the computing time is (n/2)t; otherwise it is ([n/2] + 1)t, wherein t is the computing time of a single data block.
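The computing-time formulas of steps 203 and 204 can be checked with a small helper. This is a sketch for illustration; the function names are ours, and t denotes the computing time of a single block:

```python
def strip_time_independent(n: int, m: int, t: float) -> float:
    """Total time for n independent strip blocks on m nodes (step 203):
    n/m rounds when m divides n exactly, otherwise one extra round."""
    return (n // m) * t if n % m == 0 else (n // m + 1) * t

def strip_time_dependent(n: int, t: float) -> float:
    """Total time for n chained strip blocks computed from both ends toward
    the middle on two nodes (step 204)."""
    return (n // 2) * t if n % 2 == 0 else (n // 2 + 1) * t

print(strip_time_independent(8, 4, 1.0))  # 2.0: 8 blocks, 4 nodes, 2 rounds
print(strip_time_dependent(5, 1.0))       # 3.0: rounds (1,5), (2,4), (3)
```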
3. The parallel computation modeling method with block partitioning for data-intensive characteristics comprises:
Step 301: partition the region data into square blocks; if the region data is square, execute step 302; if the region data is non-square, execute step 305;
Step 302: parallelization strategy: establish an n × n block partitioning; if no dependence exists among the data blocks, execute step 303; if dependence exists among the data blocks, execute step 304;
Step 303: if the tasks corresponding to the data blocks are mutually independent, n² nodes can carry out the parallel computation and obtain the best efficiency; if limited by the computational resources, the computing efficiency is related to the number m of computing nodes, and the computing time is [n²/m]t;
Step 304: if dependence exists among the tasks corresponding to the data blocks, as shown in Fig. 4, the partitioned data blocks can be divided into 2n − 1 layers; the number of blocks per layer first increases progressively and each layer contains at most n data blocks. Therefore, the parallel computation of an n × n block partition needs at most n nodes, and the computing time is T = (2n − 1)t; the data blocks within one layer can be computed in parallel, and the next layer can begin only after the previous layer is complete;
Step 305: parallelization strategy: establish an n × m block partitioning; if no dependence exists among the data blocks, proceed as in step 303; if dependence exists among the data blocks, execute step 306;
Step 306: when dependence exists among the data blocks, the n × m partition has m + n − 1 layers and each layer contains at most n data blocks; the parallel method is the same as in step 304. Therefore at most n nodes are needed to carry out the parallel computation and reach the highest speed-up ratio, and the total time of the parallel computation is T = (m + n − 1)t.
Comparison of the partitioning methods and parallel strategies: when no dependence relation exists among the partitioned data blocks, strip partitioning by rows or columns is fairly simple to handle and its parallel computing efficiency is high. When dependence relations exist among the data blocks, the blocks can only be computed according to their precedence relations; block partitioning then has an obvious advantage, and its parallel computing efficiency is better than that of strip partitioning. Square blocks are more advantageous than rectangular blocks, because for a fixed block perimeter the square block has the largest area; the larger the block, the higher the efficiency.
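The comparison above can be illustrated numerically: for the same number of blocks and the same per-block time t, a square partition yields fewer dependence layers, and hence a shorter total time (m + n − 1)t, than an elongated one. This is a small illustration with our own example numbers:

```python
def total_layers(n: int, m: int) -> int:
    """Number of anti-diagonal dependence layers of an n x m partition,
    i.e. the total parallel computing time in units of t."""
    return n + m - 1

# 16 blocks partitioned two ways:
print(total_layers(4, 4))  # 7 layers for the square 4 x 4 partition
print(total_layers(2, 8))  # 9 layers for the elongated 2 x 8 partition
```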
4. The dispatching method based on the above parallel data model and parallelization strategies comprises:
Step 401: the host node selects an effective data partitioning method to partition the data, then judges according to the processing algorithm whether the data blocks have dependences, and distributes the data.
Step 402: judge according to the processing algorithm whether the data blocks have dependences; if no dependence exists among the data blocks, execute step 403; if dependence exists among the data blocks, execute step 404.
Step 403: the host node distributes all the data blocks according to the computational resources, each node starts computing, and the computation results are sent back to the host node;
Step 404: if the data are partitioned by the strip partitioning method, execute step 405; otherwise execute step 408;
Step 405: the host node distributes the two end data blocks, which have no dependence on each other, i.e. data block 1 and data block n, to two nodes; the nodes start computing until the computation is complete;
Step 406: the host node continues by distributing data block 2 and data block n − 1 to the above two nodes; the nodes start computing on the basis of the previous results until the computation is complete;
Step 407: process data block n/2 and data block n/2 − 1 in the same manner; when all the blocks have been computed, the computation results are sent back to the host node; execute step 417.
Step 408: if the data are partitioned into n × n blocks, execute step 409; otherwise execute step 416;
Step 409: according to the data dependence graph, the computation of a lower layer depends on the results of the upper layer; the host node distributes the 1st-layer data, i.e. data block 1, the node starts computing, and the computation result is sent back to the host node;
Step 410: the host node distributes the 2nd-layer data: first the host node distributes data block 2, which depends on data block 1, to the original computing node; the node computes and sends the result back to the host node;
Step 411: at the same time the host node starts a new node and distributes data block 3, which depends on data block 1, together with the computation result of the previous layer, to the new node; the node computes and sends the result back to the host node;
Step 412: the host node distributes the 3rd-layer data: the host node starts another new node and distributes the data blocks that depend on the upper-layer data, together with the computation results of the upper layer, to the three nodes; the nodes compute and send the results back to the host node;
Step 413: process the following layers in the same manner, starting one new node per layer, until the processing of the nth layer is finished; at this point the number of nodes totals n;
Step 414: the host node distributes the (n+1)th-layer data: according to the data dependence graph, after the nth layer the number of data blocks per layer starts to decrease one by one, so the host node starts no further new nodes; it distributes the (n+1)th-layer data blocks and the nth-layer computation results to n − 1 nodes; the n − 1 nodes compute and send the results back to the host node.
Step 415: the host node distributes the remaining layers of data blocks, together with the computation results they depend on, in turn, until all data blocks have been computed and the results returned; execute step 417.
Step 416: if the data are partitioned into n × m blocks, the data are processed as in the n × n case, except that after the nth layer has been processed, since the partition is now n × m, the following |m − n| + 1 layers each occupy n nodes, during which the host node does not need to start any new node; execute step 417;
Step 417: the host node assembles the results.