CN110727746B - Buffer analysis parallel scheduling method and device based on secondary task division strategy - Google Patents

Buffer analysis parallel scheduling method and device based on secondary task division strategy

Info

Publication number
CN110727746B
Authority
CN
China
Prior art keywords
calculation
buffer
grid
elements
intensity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910782395.8A
Other languages
Chinese (zh)
Other versions
CN110727746A (en)
Inventor
黄颖
郭明强
王均浩
曹威
关庆锋
谢忠
韩成德
耿振坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Space Time Technology Development Co ltd
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences
Priority to CN201910782395.8A
Publication of CN110727746A
Application granted
Publication of CN110727746B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a buffer analysis parallel scheduling method and device based on a secondary task division strategy. The method comprises the following steps: using the SDK of a chosen GIS platform to determine the main factors that influence buffer analysis, testing elements to obtain the relationship between the calculation intensity of buffer analysis and those factors, and fitting a corresponding model with the statistical analysis software SPSS; building a calculation intensity grid from the fitted model; performing a first division of the task area using the calculation intensity grid so that the load is balanced across regions; passing each region's range to a parallel computing node and, on that basis, performing a second division that allocates the elements crossing region boundaries so that the actual amount of calculation is balanced among tasks, each task then generating buffers for the elements allocated to it; and merging the buffer analysis results produced by the parallel tasks.

Description

Buffer analysis parallel scheduling method and device based on secondary task division strategy
Technical Field
The invention relates to the field of geographic information science, in particular to a buffer analysis parallel scheduling method and device based on a secondary task division strategy.
Background
With the unprecedented growth of geographic big data, serial methods of generating buffers can no longer meet demand. At the same time, computer hardware has improved and the number of available processing cores keeps increasing, which makes parallel computing for the spatial analysis of geographic big data feasible. Most researchers working on spatial analysis algorithms for geographic big data have therefore turned to parallel algorithms. According to the parallel strategy adopted, parallel spatial analysis algorithms can be roughly divided into algorithm parallelism and data parallelism. Strategies based on algorithm parallelism mostly modify the algorithm to suit a particular parallel framework or a specific requirement (visualization, fast online display, and the like). Although this approach achieves good results, modifying the algorithm is relatively complex work, and the result does not necessarily scale well or remain stable in all application fields. Moreover, because the basic algorithms are modified, general GIS platforms (QGIS, ArcGIS, MapGIS, etc.) cannot use these algorithms to process geographic big data quickly. Most data-parallel strategies run on the Hadoop or Spark parallel frameworks or on extensions of them, and running on these frameworks accelerates the spatial analysis of geographic big data very effectively. However, the data must be converted before it can be processed on these frameworks, which causes data shift; the larger the data volume, the greater the impact of this conversion-induced data shift, and the operation speed is also seriously affected. Consequently, general GIS platforms (QGIS, ArcGIS, MapGIS, etc.) do not achieve good results when they use these algorithms to process geographic big data.
With the present method, the algorithms and data formats of the general GIS platforms (QGIS, ArcGIS, MapGIS, etc.) remain unchanged, and the method offers high applicability, effectiveness, stability and scalability. Building on the data formats and algorithms of these platforms and on a calculation intensity grid, the method first divides the task area to balance the load across regions, and then performs a second division that allocates the elements crossing region boundaries so that the actual amount of calculation is balanced among tasks. This yields a parallel scheduling method that can be executed effectively on all of these platforms.
Disclosure of Invention
Without changing the algorithms or data formats of the general GIS platforms (QGIS, ArcGIS, MapGIS, etc.), the invention uses a calculation intensity grid to first divide the task area so that the load is balanced across regions, and then performs a second division that allocates the elements crossing region boundaries so that the actual amount of calculation is balanced among tasks, thereby providing a parallel scheduling method that can be executed effectively on all of these platforms.
To solve this technical problem, the invention adopts a buffer analysis parallel scheduling method based on a secondary task division strategy, comprising the following steps:
step (1): analyze the three main internal steps by which a buffer is generated for line and surface elements, namely acquiring the geometric information of the element, generating the buffer, and writing the buffer result, and thereby determine that the main factor influencing buffer generation is the number of break points (vertices) of the element; test the calculation intensity for line elements and for surface elements separately, using data sets of line and surface elements whose numbers of break points vary gradually and taking calculation time as the index of calculation intensity; then perform regression analysis on the relationship between the number of break points and the calculation time of each of the three internal steps, and select the model for each step by best goodness of fit, obtaining a model of the relationship between the number of break points and the calculation time for each of the three internal steps; add the models of the three internal steps to obtain a total model expressing the relationship between the total time spent on the buffer analysis of one element and the number of break points of that element;
step (2): build a calculation intensity grid from the total model established in step (1), specifically comprising the following sub-steps:
(21) First, use the SDK of the corresponding GIS platform to query the bounding rectangle (X_min, Y_min, X_max, Y_max) of the data set for which buffers are to be generated, then expand the bounding rectangle into a square; X_min, Y_min, X_max and Y_max denote the minimum abscissa, minimum ordinate, maximum abscissa and maximum ordinate of the four vertices of the bounding rectangle, respectively;
(22) Next, grid the extent of this square hierarchically in quadtree fashion, i.e. the number of grid cells at level i is 2^i × 2^i and the side length of a level-i grid cell is d_i = edge / 2^i; each grid cell is stored in a Redis database under a KEY formed according to the naming rule of level, row number and column number; edge denotes the side length of the square;
(23) Calculate the calculation intensity of each grid cell with the total model established in step (1): an element lying entirely within the grid cell contributes its full calculation intensity to the cell, and an element lying partly within the grid cell contributes its calculation intensity multiplied by the ratio of the number of its break points inside the cell to its total number of break points; for any grid cell, calculate the contributions of all elements contained in or intersecting the cell, accumulate them to obtain the calculation intensity Value of the cell, and store Value in the Redis database;
step (3): perform the first division according to the calculation intensity grid established in step (2), specifically as follows:
the first division partitions the area covered by the grid into regions of equal calculation intensity: first, traverse the KEY values in the Redis database and use the RedisClient.Get<string>(Key) method to retrieve the corresponding VALUE; accumulate the calculation intensity of the grid cells in each row to obtain the row totals W_i; sum the W_i to obtain the total calculation intensity W_total; and divide W_total by the number of parallel tasks to obtain the calculation intensity W_task required of each parallel task; starting from the first row of the grid, traverse upward row by row; if the calculation intensity of the current row is greater than W_task, take from the bottom of the current row a range whose share of the row equals the ratio of W_task to the row's calculation intensity, and carry the remaining calculation intensity and range of the current row forward as the starting value for the next traversal; if the calculation intensity of the current row is less than W_task, traverse the following rows, accumulating their calculation intensity and checking after each row, until the accumulated value exceeds W_task; the row being traversed is then split from its bottom in proportion to its calculation intensity so that the task receives W_task, the range obtained is merged with the ranges of the complete rows included in the accumulation to form the calculation range of the task, and the remaining calculation intensity and range are carried forward as the starting value for the next traversal; proceeding by this rule to the last row generates the calculation ranges of all parallel tasks;
step (4): pass the calculation ranges of the parallel tasks established in step (3) to the parallel computing nodes; for elements completely contained in a calculation range, query the corresponding data from the spatial data and perform buffer analysis directly; elements lying only partly within a calculation range are first allocated by the following rule and then subjected to buffer analysis: a second division is performed on the basis of the calculation ranges established in step (3), allocating the elements that intersect the boundaries of the calculation ranges so as to balance the load among the parallel tasks; first, an intersection query is made with the straight line separating two adjacent calculation ranges to obtain the elements Feature 1, Feature 2, Feature 3, ..., Feature k that lie in both calculation ranges, where k is the number of such elements; the calculation intensities W_1, W_2, W_3, ..., W_k of Feature 1, Feature 2, Feature 3, ..., Feature k are then calculated with the total model established in step (1); next, the ratio of the number of break points of each feature inside sub-domain n to its total number of break points is used to calculate the partial calculation intensities W_12, W_22, W_32, ..., W_k2 of the features inside sub-domain n, and these are summed to give the total partial calculation intensity W_Sn inside sub-domain n; W_1, W_2, W_3, ..., W_k are then traversed: starting from the first, the value is compared with W_Sn; if it is less than W_Sn, the next value is added, and so on until the accumulated value is greater than or equal to W_Sn, whereupon all the accumulated elements are assigned to this range and the remaining elements to the other range; if the first value is already greater than or equal to W_Sn, that element is assigned to this range and the remaining elements to the other range;
step (5): once all the parallel buffer results of step (4) have been obtained, write the remaining buffer analysis results into the buffer analysis result that was completed first, which ends the buffer analysis parallel scheduling method.
Further, in the buffer analysis parallel scheduling method based on the secondary task division strategy of the present invention, the regression analysis in step (1) of the relationship between the number of break points and the calculation time of the three internal steps is specifically performed with the statistical analysis software SPSS.
Further, in the buffer analysis parallel scheduling method based on the secondary task division strategy of the present invention, in step (1), the models of the relationship between the number of break points and the calculation time of the three internal steps are specifically as follows:
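As a rough illustration of the kind of regression involved, the sketch below fits a power-function model y = a * x^b by ordinary least squares on log-log axes. It is a stand-in written in plain Python, not the SPSS procedure the patent uses; the function name `fit_power_law` and the synthetic timings are assumptions made only for this example.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b on log-log axes; returns (a, b).

    Hypothetical helper: a plain-Python stand-in for the regression
    that the patent performs in SPSS.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                      # slope on log-log axes = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept on log-log axes = ln(a)
    return a, b

# Synthetic data (not measurements from the patent): break-point counts
# and the calculation times a power law with a = 2.0, b = 1.05 would give.
points = [100, 1000, 10000, 100000]
times = [2.0 * p ** 1.05 for p in points]
a, b = fit_power_law(points, times)
```

With real timings the fit would not be exact, and goodness of fit (for example R squared) would be compared across candidate model families before one is chosen for each internal step.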
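CFL(x) = a_1 x + b_1 x^2 + c_1 x^3 + d_1    (1)
[Equation (2), the power-function model CSL(x) with coefficients a_2 and b_2, is available only as an image in the source: Figure BDA0002177000740000051]
CTL(x) = a_3 x + b_3 x^2 + c_3 x^3 + d_3    (3)
CFP(x) = a_4 x + b_4 x^2 + c_4 x^3 + d_4    (4)
[Equation (5), the power-function model CSP(x) with coefficients a_5 and b_5, is available only as an image in the source: Figure BDA0002177000740000052]
CTP(x) = a_6 x + b_6 x^2 + c_6 x^3 + d_6    (6)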
Here CFL, CSL and CTL are the calculation intensities of the first, second and third steps of generating a buffer for a line element, and CFP, CSP and CTP are the calculation intensities of the first, second and third steps of generating a buffer for a surface element; x is the number of break points of the line or surface element; a_1, b_1, c_1, d_1 are the coefficients of the cubic function model of the step of obtaining the geometric information of a line element; a_2, b_2 are the coefficients of the power function model of the step of generating the buffer of a line element; a_3, b_3, c_3, d_3 are the coefficients of the cubic function model of the step of writing the buffer result of a line element; a_4, b_4, c_4, d_4 are the coefficients of the cubic function model of the step of obtaining the geometric information of a surface element; a_5, b_5 are the coefficients of the power function model of the step of generating the buffer of a surface element; and a_6, b_6, c_6, d_6 are the coefficients of the cubic function model of the step of writing the buffer result of a surface element;
adding the models of the three internal steps gives a total model expressing the relationship between the total time spent on the buffer analysis of one element and the number of break points of that element; the total models are:
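[The total-model equations for CL(x) and CP(x) are available only as images in the source: Figure BDA0002177000740000062 and Figure BDA0002177000740000061. By the definition just given, CL(x) = CFL(x) + CSL(x) + CTL(x) and CP(x) = CFP(x) + CSP(x) + CTP(x).]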
In these formulas, CL(x) denotes the total buffer analysis time of a line element with x break points, and CP(x) denotes the total buffer analysis time of a surface element with x break points.
Further, in the buffer analysis parallel scheduling method based on the secondary task division strategy of the present invention, in step (21), the bounding rectangle is expanded into a square as follows: take (X_min, Y_min) as the origin and the larger of (X_max - X_min) and (Y_max - Y_min) as the side length edge of the square.
Further, in the buffer analysis parallel scheduling method based on the secondary task division strategy of the present invention, in step (22), storing the KEY value in the Redis database specifically means: the KEY value is stored in the Redis database with the RedisClient.Set<string>(Key, Value) method.
Further, in the buffer analysis parallel scheduling method based on the secondary task division strategy of the present invention, in step (23), storing Value in the Redis database specifically means: Value is stored in the Redis database with the RedisClient.Set<string>(Key, Value) method.
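The square expansion of step (21) and the cell size of step (22) can be sketched as follows. This is a minimal illustration, not the patent's SDK code: the function names are hypothetical, and numbering rows upward from Y_min and columns rightward from X_min is an assumption made for the example.

```python
def expand_to_square(x_min, y_min, x_max, y_max):
    """Return (origin_x, origin_y, edge) of the square that extends the
    bounding rectangle from its lower-left corner (X_min, Y_min)."""
    edge = max(x_max - x_min, y_max - y_min)
    return x_min, y_min, edge

def cell_side(edge, level):
    """Side length d_i = edge / 2**i of a grid cell at quadtree level i."""
    return edge / (2 ** level)

def cell_of(x, y, x_min, y_min, edge, level):
    """Return (row, col) of the level-i cell containing point (x, y).

    Assumption: row 0 is the bottom row. Points on the top or right
    border are clamped into the last row or column.
    """
    n = 2 ** level
    d = edge / n
    col = min(int((x - x_min) / d), n - 1)
    row = min(int((y - y_min) / d), n - 1)
    return row, col

# An 8 x 4 rectangle grows into an 8 x 8 square; at level 2 there are
# 4 x 4 cells of side 2.
origin_x, origin_y, edge = expand_to_square(0, 0, 8, 4)
```

Because the square shares its origin with the rectangle, any cells that fall wholly outside the original rectangle simply hold no elements and contribute zero calculation intensity.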
The invention also provides a buffer analysis parallel scheduling device based on the secondary task division strategy for solving the technical problem. The device comprises a computer storage medium storing computer-executable instructions for implementing any one of the buffer analysis parallel scheduling methods based on the secondary task division strategy described above.
The buffer analysis parallel scheduling method and device based on the secondary task division strategy have the following beneficial effects: the invention has been rigorously tested and verified; buffer analysis and parallel scheduling performed with the method require no change to the algorithms or data formats of the general GIS platforms (QGIS, ArcGIS, MapGIS, etc.); all functions have been implemented; and the method offers very good applicability, effectiveness, stability and scalability.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flow chart of an embodiment of a secondary task partitioning policy based buffer analysis parallel scheduling method of the present invention;
FIG. 2 is a schematic of a computational intensity calculation for a grid;
FIG. 3 is a schematic diagram of a first division;
FIG. 4 is a schematic diagram of the second division.
Detailed Description
For a more clear understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
To illustrate the idea of the invention more clearly, the following describes, as an embodiment, the parallel scheduling method with the secondary task division strategy applied to generating buffer results for surface elements using the SDK of MapGIS 10.3; see FIG. 1.
Step (1): analyze the three main internal steps by which a buffer is generated for a surface element, namely obtaining the geometric information of the element (the first step), generating the buffer (the second step) and writing the buffer result (the third step), and determine that the main factor influencing buffer generation is the number of break points of the element. Using the SDK of the MapGIS 10.3 platform and data sets of line and surface elements whose numbers of break points vary gradually, test the calculation intensity of the three steps, taking calculation time as the index of calculation intensity. The three tests are: first, the relationship between the time taken to obtain the geoentry attribute of the Recordset and the number of break points; second, the relationship between the time taken by the buffer method of GeoPolygon or GeoPolygons and the number of break points; third, the relationship between the time taken to add the GeoPolygon or GeoPolygons to the simple element class SFeatureCls with its application method and the number of break points. The statistical analysis software SPSS is then used to perform regression analysis on these relationships, and the model for each step is selected by best goodness of fit, giving a model of the relationship between the number of break points and the calculation time for each of the three steps. Adding the models of the three steps gives the total model expressing the relationship between the total time spent on the buffer analysis of a surface element and the number of break points of the element:
CP(x) = 0.05962 x - 6.096×10^{-9} x^2 + 8.3458×10^{-14} x^3 + e^{0.006} x^{1.045} + 1.209    (1)
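The fitted total model of this embodiment can be evaluated directly. The sketch below transcribes formula (1) into Python, reading the fourth term as the power-function term e^{0.006} multiplied by x^{1.045}; that reading, and the function name `cp`, are assumptions of this example. The source does not state the time unit, so none is claimed here.

```python
import math

def cp(x):
    """Total calculation intensity (time) of buffering one surface
    element with x break points, per formula (1) of the embodiment."""
    return (0.05962 * x
            - 6.096e-9 * x ** 2
            + 8.3458e-14 * x ** 3
            + math.exp(0.006) * x ** 1.045   # power-function term (assumed reading)
            + 1.209)

# Relative cost of two elements: the scheduler only needs such ratios
# and sums, not absolute times.
ratio = cp(1000) / cp(100)
```

Since the model is nearly linear over moderate break-point counts, an element with ten times as many break points is estimated to cost a little more than ten times as much.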
Step (2): build the calculation intensity grid from the model (formula 1) relating calculation intensity to the number of break points established in step (1). First, use the Range attribute of SFeatureCls to obtain the bounding rectangle (X_min, Y_min, X_max, Y_max) of the simple element class for which buffers are to be generated, that is, the bounding rectangle of the data set to be buffered. Then expand the bounding rectangle into a square, with (X_min, Y_min) as the origin and the larger of (X_max - X_min) and (Y_max - Y_min) as the side length edge of the square (one side of the bounding rectangle coincides with a side of the square). Grid the extent of the square hierarchically in quadtree fashion, i.e. the number of grid cells at level i is 2^i × 2^i and the side length of a level-i grid cell is d_i = edge / 2^i. Each grid cell is given a KEY according to the level-row-column naming rule, and the KEY is stored in the Redis database with the RedisClient.Set<string>(Key, Value) method. Next, calculate the calculation intensity of each grid cell with the total model: an element lying entirely within the cell contributes its full calculation intensity, and an element lying partly within the cell contributes its calculation intensity multiplied by the ratio of the number of its break points inside the cell to its total number of break points. With the MapGIS 10.3 SDK, the elements entirely within a cell and the elements partly within it are obtained as follows: the SetRec method of the QueryDef class sets the rectangle to be queried and the query mode, the mode being set to SpaQueryMode.Contain and SpaQueryMode.Intersect in turn, and the SFeatureCls.Select(QueryDef) method then returns one RecordSet of elements for each mode. The RecordSet returned by the intersect query holds both the elements that intersect the cell and those contained in it, so to obtain only the intersecting elements, the SubSet method of RecordSet can be used to subtract the contained set from the intersect-and-contain set. As shown in FIG. 2, the cell in row 0, column 0 has four elements: Feature C lies entirely within the cell and has 4 break points, while Feature A, Feature B and Feature D lie partly within the cell, with ratios of break points inside the cell to total break points of 1/2, 1/4 and 1/2, respectively. The calculation intensity W_00 of this cell is
[The equation for W_00 is available only as an image in the source: Figure BDA0002177000740000091]
By this rule, the contributions of all elements contained in or intersecting a grid cell are calculated and accumulated to give the calculation intensity Value of the cell, which is stored in the Redis database with the RedisClient.Set<string>(Key, Value) method.
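The accumulation rule for one grid cell can be sketched as below. This is an illustrative simplification: a plain dictionary stands in for the Redis database, the underscore separator in the key is an assumption (the source gives only the level, row, column ordering), and the toy linear model and the total break-point counts of Features A, B and D are hypothetical values chosen so that the ratios match the FIG. 2 example.

```python
def cell_intensity(features, model):
    """Sum the intensity contributions of the features touching a cell.

    features: list of (total_break_points, break_points_inside_cell).
    A feature wholly inside the cell has equal counts and contributes
    its full intensity; a partial feature contributes pro rata.
    """
    total = 0.0
    for n_total, n_inside in features:
        total += model(n_total) * n_inside / n_total
    return total

def grid_key(level, row, col):
    """Key following the level, row, column naming rule.
    The '_' separator is an assumption of this sketch."""
    return f"{level}_{row}_{col}"

def toy_model(n):
    """Hypothetical stand-in for the fitted total model CP(x)."""
    return 2.0 * n

store = {}  # stand-in for the Redis database

# FIG. 2 ratios: A = 1/2, B = 1/4, C = 1 (4 break points), D = 1/2.
features_00 = [(8, 4), (8, 2), (4, 4), (6, 3)]
store[grid_key(1, 0, 0)] = cell_intensity(features_00, toy_model)
```

Splitting by break-point ratio avoids clipping geometries against cell borders, which keeps the grid construction cheap relative to the buffer analysis it schedules.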
Step (3): perform the first division according to the calculation intensity grid established in step (2). The first division partitions the area covered by the grid into regions of equal calculation intensity. First, traverse the KEY values in the Redis database and use the RedisClient.Get<string>(Key) method to retrieve the corresponding VALUE; accumulate the calculation intensity of the grid cells in each row to obtain the row totals W_i. Then sum the W_i to obtain the total calculation intensity W_total, and divide W_total by the number of parallel tasks required to obtain the calculation intensity W_task required of each parallel task. Starting from the first row of the grid, traverse upward row by row. If the calculation intensity of the current row is greater than W_task, take from the bottom of the current row a range whose share of the row equals the ratio of W_task to the row's calculation intensity, and carry the remaining calculation intensity and range of the current row forward as the starting value for the next traversal. If the calculation intensity of the current row is less than W_task, traverse the following rows, accumulating their calculation intensity and checking after each row, until the accumulated value exceeds W_task; the row being traversed is then split from its bottom in proportion to its calculation intensity so that the task receives W_task, the range obtained is merged with the ranges of the complete rows included in the accumulation to form the calculation range of the task, and the remaining calculation intensity and range are carried forward as the starting value for the next traversal. Proceeding by this rule to the last row generates the calculation ranges of all parallel tasks. FIG. 3 shows the division for 4 tasks, where W_ij denotes the calculation intensity of the grid cell in row i, column j.
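The row-wise first division can be sketched as follows. This is a minimal illustration under stated assumptions rather than the patent's implementation: row totals are passed in as a list ordered from the bottom row upward, intensity is treated as uniformly spread over the height of each row (which is what cutting "in proportion" implies), the total intensity is assumed positive, and the function name and return format (one vertical interval per task) are hypothetical.

```python
def first_division(row_intensity, y_min, row_height, n_tasks):
    """Split the grid into n_tasks horizontal bands of equal intensity.

    row_intensity: row totals W_i, ordered from the bottom row upward.
    Returns a list of (y_start, y_end) intervals, one per task.
    """
    w_task = sum(row_intensity) / n_tasks
    cuts = [y_min]          # band boundaries found so far
    acc = 0.0               # intensity accumulated for the current task
    for i, w in enumerate(row_intensity):
        row_bottom = y_min + i * row_height
        used = 0.0          # intensity of this row already handed out
        # Cut inside this row as often as a full task share fits.
        while len(cuts) < n_tasks and w > 0 and acc + (w - used) >= w_task:
            used += w_task - acc
            cuts.append(row_bottom + row_height * used / w)
            acc = 0.0
        acc += w - used     # remainder carries over to the next row
    cuts.append(y_min + len(row_intensity) * row_height)
    return list(zip(cuts[:-1], cuts[1:]))

# Three rows of height 10 with totals 2, 6 and 8, split into 2 tasks:
# the first task gets rows 0 and 1 (2 + 6 = 8), the second gets row 2.
bands = first_division([2.0, 6.0, 8.0], 0.0, 10.0, 2)
```

Stopping after n_tasks - 1 interior cuts guarantees that the last band always ends at the top of the grid, even when floating-point rounding leaves the final accumulated value marginally short of W_task.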
Step (4): pass the calculation ranges of the parallel tasks established in step (3) to the parallel computing nodes. For elements completely contained in a range, the corresponding data are queried from the spatial data and buffer analysis is performed directly; elements lying only partly within a range are first allocated by the following rule and then subjected to buffer analysis. The fully-contained query and the partly-contained query can both be made with the method of step (3), simply by using the SetRef method of the QueryDef class to change the rectangle to be queried to the current calculation range. As shown in FIG. 4, the second division is performed on the basis of the calculation ranges of the parallel tasks established in step (3), and allocates the elements that intersect the boundaries of the calculation ranges so as to balance the load among the parallel tasks. Taking 4 elements lying in both ranges as an example, the specific method is as follows.
First, an intersection query is made with the straight line separating the adjacent calculation ranges to obtain the elements (Feature A, Feature B, Feature C and Feature D) that lie in both ranges. The line intersection query can also be implemented with the rectangular query of step (3): the straight line is expanded into a thin, line-like rectangle, and the set query method of the QueryDef class sets the rectangle to be queried to this rectangle. The calculation intensities W_A, W_B, W_C and W_D of Feature A, Feature B, Feature C and Feature D are then calculated with the model relating calculation intensity to the number of break points established in step (1).
Next, the ratio of the number of break points of each of Feature A, Feature B, Feature C and Feature D inside sub-domain 2 to its total number of break points is used to calculate the partial calculation intensities W_A2, W_B2, W_C2 and W_D2 of the features inside sub-domain 2, and these are summed to give the total partial calculation intensity W_S2 inside sub-domain 2.
W_A, W_B, W_C and W_D are then traversed: starting from the first, the value is compared with W_S2; if it is less than W_S2, the next value is added, and so on until the accumulated value is greater than or equal to W_S2, whereupon all the accumulated elements are assigned to this range and the remaining elements to the other range; if the first value is already greater than or equal to W_S2, that element is assigned to this range and the remaining elements to the other range.
Step (5): once all the parallel buffer results of step (4) have been obtained, write the remaining buffer results into the buffer result that was completed first, using the application method of SFeatureCls, which ends the parallel scheduling method.
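The allocation rule of the second division can be sketched as follows. The function name and the index-based return format are assumptions of this example; the inputs are the full calculation intensities of the boundary-crossing elements and the summed partial intensity inside the sub-domain, as defined in the preceding paragraphs.

```python
def second_division(intensities, w_s):
    """Allocate boundary-crossing elements between two adjacent ranges.

    intensities: full calculation intensities of the crossing elements,
                 in traversal order.
    w_s: total partial calculation intensity of those elements inside
         the sub-domain being filled.
    Returns (indices assigned to this range, indices assigned to the
    other range).
    """
    acc = 0.0
    taken = []
    for idx, w in enumerate(intensities):
        acc += w
        taken.append(idx)
        if acc >= w_s:      # this range has received its share
            break
    rest = list(range(len(taken), len(intensities)))
    return taken, rest

# Four crossing elements; the sub-domain's share of them is 7.0.
# Accumulating 3.0 then 5.0 reaches 8.0 >= 7.0, so the first two
# elements go to this range and the last two to the neighbour.
mine, theirs = second_division([3.0, 5.0, 2.0, 4.0], 7.0)
```

Whole elements are assigned rather than split, so each crossing element is buffered exactly once and the imbalance between neighbouring tasks is bounded by the intensity of a single element.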
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (7)

1. A buffer analysis parallel scheduling method based on a secondary task division strategy is characterized by comprising the following steps:
step (1): analyze the three main internal steps by which a buffer is generated for line and surface elements, namely acquiring the geometric information of the element, generating the buffer, and writing the buffer result, and thereby determine that the main factor influencing buffer generation is the number of break points (vertices) of the element; test the calculation intensity for line elements and for surface elements separately, using data sets of line and surface elements whose numbers of break points vary gradually and taking calculation time as the index of calculation intensity; then perform regression analysis on the relationship between the number of break points and the calculation time of each of the three internal steps, and select the model for each step by best goodness of fit, obtaining a model of the relationship between the number of break points and the calculation time for each of the three internal steps; add the models of the three internal steps to obtain a total model expressing the relationship between the total time spent on the buffer analysis of one element and the number of break points of that element;
step (2), establishing a calculation intensity grid according to the total model established in step (1), specifically comprising the following substeps:
(21) first, querying the bounding rectangle (X_min, Y_min, X_max, Y_max) of the data set for which buffers are to be generated, using the SDK of the corresponding GIS platform, and then expanding the bounding rectangle into a square; X_min, Y_min, X_max and Y_max respectively represent the minimum abscissa, the minimum ordinate, the maximum abscissa and the maximum ordinate of the four vertices of the bounding rectangle;
(22) then performing quadtree hierarchical gridding on the extent of the square, that is, level i contains 2^i * 2^i grids and the side length of a level-i grid is d_i = edge / 2^i; each grid is stored in a Redis database with a KEY named according to the rule of level, row number and column number; edge represents the side length of the square;
(23) calculating the calculation intensity of each grid using the total model established in step (1): an element lying entirely within a grid gives its full calculation intensity to that grid, and an element lying partly within a grid gives the grid a share of its calculation intensity equal to the ratio of the number of its break points inside the grid to its total number of break points; for any one grid: calculating the calculation intensity of all elements contained in or intersecting the grid, then accumulating these values to obtain the calculation intensity Value of the grid, and storing the Value in the Redis database;
step (3), performing the first division according to the calculation intensity grid established in step (2), specifically as follows:
the first division divides the region covered by the grid so that the calculation intensity of each divided region is equal: traversing the KEY values in the Redis database, finding the corresponding VALUE using the RedisClient.Get<string>(Key) method, and accumulating the calculation intensity of each row of the grid to obtain the row total W_i; then summing the W_i to obtain the total calculation intensity W_total, and dividing W_total by the number of parallel tasks to obtain the calculation intensity W_task required by each parallel task; starting from the first row of the grid and traversing upward row by row: if the calculation intensity of the current row is greater than W_task, obtaining the corresponding range from the bottom of the current row according to the ratio of W_task to the calculation intensity of the current row, and using the remaining calculation intensity and range of the current row as the initial value for the subsequent traversal; if the calculation intensity of the current row is less than W_task, traversing the next row, accumulating its calculation intensity and judging again until the accumulated value is greater than W_task, then obtaining the corresponding range from the bottom of the current row according to the ratio of the portion exceeding W_task to the calculation intensity of the current row, merging the range of the complete rows included in the accumulation with this range to form the calculation range of the task, and using the remaining calculation intensity and range as the initial value for traversing the next row; following this rule until the last row has been traversed generates the calculation ranges of all parallel tasks;
step (4), transmitting the calculation ranges of the parallel tasks established in step (3) to the parallel computing nodes; for elements completely contained in a calculation range, querying the corresponding data from the spatial data and performing buffer analysis; elements lying only partly within a calculation range are divided by the following rule and then subjected to buffer analysis: a second division is performed on the basis of the calculation ranges of the parallel tasks established in step (3), decomposing the elements that intersect the boundaries of the calculation ranges so as to balance the load among the parallel tasks; first, an intersection query is performed using the straight line between adjacent calculation ranges to obtain the elements Feature 1, Feature 2, Feature 3, ..., Feature k that lie in both calculation ranges, where k represents the number of elements lying in both calculation ranges; then the calculation intensities W_1, W_2, W_3, ..., W_k of Feature 1, Feature 2, Feature 3, ..., Feature k are calculated according to the total model established in step (1); then, using the ratio of the number of break points of each element within subdomain n to its total number of break points, its partial calculation intensities within subdomain n, W_12, W_22, W_32, ..., W_k2, are calculated, n being a positive integer, and these are summed to obtain the total partial calculation intensity W_Sn within subdomain n; then W_1, W_2, W_3, ..., W_k are traversed, starting from the first, and compared with W_Sn: if less than W_Sn, the next value is accumulated until the accumulated value is greater than or equal to W_Sn, all accumulated elements are assigned to this range, and the remaining elements are assigned to the other range; if greater than or equal to W_Sn, this element is assigned to this range and the remaining elements are assigned to the other range;
step (5), according to all parallel buffer results obtained in step (4), writing the remaining buffer analysis results into the first completed buffer analysis result, and ending the buffer analysis parallel scheduling method.
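The row-wise first division of step (3) can be illustrated with a short sketch. It makes several assumptions that are not in the claim: the row totals W_i are already available as a list ordered from the bottom row upward, ranges are expressed in row units measured from the bottom of the grid, a row whose remaining intensity exactly equals the remaining need closes the task, and the function name is hypothetical.

```python
def first_division(row_intensity, n_tasks):
    """Split grid rows into n_tasks ranges of equal calculation intensity.

    Returns (start, end) pairs in row units measured from the bottom row.
    Intensity is assumed to be spread evenly over the height of each row.
    """
    w_total = sum(row_intensity)
    w_task = w_total / n_tasks

    bounds = [0.0]
    need = w_task                    # intensity still missing for the current task
    for r, w_row in enumerate(row_intensity):
        left = w_row                 # intensity of this row not yet assigned
        pos = float(r)               # lower edge of the unassigned part of the row
        while len(bounds) < n_tasks and w_row > 0 and left >= need:
            pos += need / w_row      # proportional share of the row height
            bounds.append(pos)
            left -= need
            need = w_task
        need -= left                 # the rest of the row goes to the open task
    bounds.append(float(len(row_intensity)))  # last task ends at the top row
    return list(zip(bounds[:-1], bounds[1:]))


# Made-up row totals: two tasks over three rows.
print(first_division([2, 8, 6], 2))
```

Closing the final range at the top of the grid, instead of relying on the accumulated sum, keeps floating-point rounding from leaving a sliver of the last row unassigned.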
2. The buffer analysis parallel scheduling method based on the secondary task partitioning strategy as claimed in claim 1, wherein in step (1), the regression analysis of the relationship between the number of break points and the calculation time of the three internal steps is specifically: performing the regression analysis of the relationship between the number of break points and the calculation time of the three internal steps using the statistical analysis software SPSS.
3. The buffer analysis parallel scheduling method based on the secondary task partitioning strategy as claimed in claim 1, wherein in step (1), the model of the relationship between the break point number and the computation time in the three internal steps is specifically as follows:
CFL(x) = a_1 x + b_1 x^2 + c_1 x^3 + d_1    (1)
CSL(x) = a_2 x^{b_2}    (2)
CTL(x) = a_3 x + b_3 x^2 + c_3 x^3 + d_3    (3)
CFP(x) = a_4 x + b_4 x^2 + c_4 x^3 + d_4    (4)
CSP(x) = a_5 x^{b_5}    (5)
CTP(x) = a_6 x + b_6 x^2 + c_6 x^3 + d_6    (6)
CFL, CSL and CTL are the calculation intensities of the first, second and third steps of buffer generation for line elements, and CFP, CSP and CTP are the calculation intensities of the first, second and third steps of buffer generation for surface elements; x is the number of break points of the line or surface element; a_1, b_1, c_1, d_1 are the coefficients of the cubic function model of the step of acquiring the geometric information of line elements; a_2, b_2 are the coefficients of the power function model of the step of generating the buffer for line elements; a_3, b_3, c_3, d_3 are the coefficients of the cubic function model of the step of writing the buffer result for line elements; a_4, b_4, c_4, d_4 are the coefficients of the cubic function model of the step of acquiring the geometric information of surface elements; a_5, b_5 are the coefficients of the power function model of the step of generating the buffer for surface elements; a_6, b_6, c_6, d_6 are the coefficients of the cubic function model of the step of writing the buffer result for surface elements;
adding the models of the three internal steps to obtain the total model expressing the relationship between the total time spent on the buffer analysis of one element and the number of break points of that element, the total model being:
CL(x) = CFL(x) + CSL(x) + CTL(x)    (7)
CP(x) = CFP(x) + CSP(x) + CTP(x)    (8)
where CL(x) represents the total buffer analysis time of a line element with x break points, and CP(x) represents the total buffer analysis time of a surface element with x break points.
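As a rough illustration of how the total model could be evaluated, the sketch below sums a cubic model, a power model and a second cubic model. The power form a * x**b is an assumption based on the phrase "power function model", since the source gives that equation only as an image; the function names and all coefficient values are hypothetical placeholders, not fitted results.

```python
def cubic(x, a, b, c, d):
    """Cubic model a*x + b*x^2 + c*x^3 + d, as in equations (1), (3), (4), (6)."""
    return a * x + b * x**2 + c * x**3 + d


def power(x, a, b):
    """Assumed power model a * x^b for the buffer generation step."""
    return a * x**b


def total_intensity(x, first, second, third):
    """Total time for one element with x break points: sum of the three steps."""
    return cubic(x, *first) + power(x, *second) + cubic(x, *third)


# Placeholder coefficients, for illustration only.
line_first = (1, 0, 0, 2)    # a_1, b_1, c_1, d_1
line_second = (3, 2)         # a_2, b_2
line_third = (0, 1, 0, 1)    # a_3, b_3, c_3, d_3
print(total_intensity(2, line_first, line_second, line_third))
```

The same function serves surface elements by passing the coefficients with indices 4 to 6 instead.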
4. The secondary task partitioning strategy-based buffer analysis parallel scheduling method according to claim 1, wherein in step (21), the bounding rectangle is expanded into a square as follows: with (X_min, Y_min) as the origin, the larger of (X_max - X_min) and (Y_max - Y_min) is taken as the side length edge of the square.
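A minimal sketch of the square expansion, together with the level-i gridding of step (22), is given below. The function names and the underscore-separated key format are assumptions: the source states only that the KEY is named by level, row number and column number.

```python
def to_square(x_min, y_min, x_max, y_max):
    """Expand a bounding rectangle into a square anchored at (x_min, y_min)."""
    edge = max(x_max - x_min, y_max - y_min)
    return x_min, y_min, edge


def cell_key(x, y, x_min, y_min, edge, level):
    """Return a 'level_row_col' key (hypothetical format) for the cell holding (x, y)."""
    n = 2 ** level                 # level i has 2^i * 2^i cells
    d = edge / n                   # side length d_i = edge / 2^i
    # Points on the upper or right edge of the square fall into the last cell.
    col = min(int((x - x_min) / d), n - 1)
    row = min(int((y - y_min) / d), n - 1)
    return f"{level}_{row}_{col}"


x0, y0, edge = to_square(0, 0, 8, 4)
print(edge, cell_key(5, 3, x0, y0, edge, level=2))
```

Anchoring the square at the lower-left corner means the extra area is added only above or to the right of the data, so existing coordinates keep their offsets from the origin.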
5. The secondary task partition policy-based buffer analysis parallel scheduling method according to claim 1, wherein in step (22), the storing the KEY value in the Redis database specifically means: the KEY Value is stored in a Redis database by using a Key (Value) method of Redis client.
6. The secondary task partition policy-based buffer analysis parallel scheduling method according to claim 1, wherein in step (23), the storing Value in a Redis database specifically means: the Value is stored in a Redis database by using a Key (Value) method of Redis client.
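The Value accumulated for each grid in step (23) can be sketched with a plain dictionary standing in for the Redis database. Everything named here is hypothetical: `model` is any function returning the total calculation intensity for a given number of break points, and `cell_of` is any function mapping a point to its grid KEY.

```python
from collections import defaultdict


def grid_intensity(elements, model, cell_of):
    """Accumulate per-grid calculation intensity.

    elements: list of elements, each a list of (x, y) break points.
    An element spread over several grids contributes to each grid in
    proportion to the share of its break points that fall inside it.
    """
    grid = defaultdict(float)          # stand-in for the KEY -> Value store
    for points in elements:
        w = model(len(points))         # total intensity of this element
        counts = defaultdict(int)
        for p in points:
            counts[cell_of(p)] += 1
        for key, count in counts.items():
            grid[key] += w * count / len(points)
    return dict(grid)


# Toy setup: unit cells on a 2 x 2 grid and a linear stand-in model.
toy_cell = lambda p: f"1_{int(p[1])}_{int(p[0])}"
toy_model = lambda n: 2.0 * n
print(grid_intensity([[(0.5, 0.5), (1.5, 0.5)]], toy_model, toy_cell))
```

An element that lies wholly inside one grid passes its full intensity to that grid, which matches the first case described in step (23).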
7. A buffer analysis parallel scheduling device based on a secondary task partitioning strategy is provided with a computer storage medium, and is characterized in that the computer storage medium stores computer executable instructions for implementing the buffer analysis parallel scheduling method based on the secondary task partitioning strategy according to any one of claims 1 to 6.
CN201910782395.8A 2019-08-23 2019-08-23 Buffer analysis parallel scheduling method and device based on secondary task division strategy Active CN110727746B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910782395.8A CN110727746B (en) 2019-08-23 2019-08-23 Buffer analysis parallel scheduling method and device based on secondary task division strategy

Publications (2)

Publication Number Publication Date
CN110727746A (en) 2020-01-24
CN110727746B (en) 2023-03-21

Family

ID=69217733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910782395.8A Active CN110727746B (en) 2019-08-23 2019-08-23 Buffer analysis parallel scheduling method and device based on secondary task division strategy

Country Status (1)

Country Link
CN (1) CN110727746B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111459934B (en) * 2020-03-19 2023-09-22 北京图创时代科技有限公司武汉分公司 Multi-source map data slicing system and method
CN111913965B (en) * 2020-08-03 2024-02-27 北京吉威空间信息股份有限公司 Space big data buffer area analysis-oriented method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951325A (en) * 2017-03-10 2017-07-14 中国地质大学(武汉) Space computational fields calculate intensity cube construction method
CN107894992A (en) * 2017-10-12 2018-04-10 武汉中地数码科技有限公司 A kind of GIS dot buffer zones analysis method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
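Zhong Xie et al. The framework and realization of grid node computing pool of China geological spatial information grid. 2009, full text. *
Zhu Yueqin et al. Research on a balanced computational domain decomposition method for geological vector big data considering calculation intensity. 2018, Vol. 27 (No. 9), pp. 88-92. *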

Also Published As

Publication number Publication date
CN110727746A (en) 2020-01-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230829

Address after: Room 415, 4th Floor, Building 35, Erlizhuang, Haidian District, Beijing, 100080

Patentee after: BEIJING SPACE-TIME TECHNOLOGY DEVELOPMENT CO.,LTD.

Address before: 430000 Lu Mill Road, Hongshan District, Wuhan, Hubei Province, No. 388

Patentee before: CHINA University OF GEOSCIENCES (WUHAN CITY)
